cwyers
Joined 12,198 karma
[ my public key: https://keybase.io/colinwyers; my proof: https://keybase.io/colinwyers/sigs/oY3s_sY1T5jtSxebr0LrQSuJ8_6mFLfNR5_RjA_D6yU ]

  1. It's as much of a stretch as describing using an Azure service as "I have to use Halo" or AWS as "I have to use Rings of Power."
  2. "Slop" is at _least_ as fair a description of "we had an LLM rewrite HN headlines" as "we rewrote it in Rust so you have to upvote it" is of "we removed our biggest source of crashes on Android by getting rid of Go FFI issues."
  3. I mean, he's _allowed_. The Compiler Police aren't going to roll up to his house and take away his Jai compiler if there isn't a quorum of HN users blessing his efforts. But people can point out they don't feel the juice is worth the squeeze. Also, Blow is certainly an advocate for his position, which means this kind of public debate is germane to the question of if _other_ people should adopt Jai.
  4. If you have bills to pay, it really is.
  5. The point is that Blow has two blockbuster hits under his belt and can afford to take a decade to ship a single game. Most people would go broke never having shipped a game if they tried to do things Blow's way.
  6. Yeah, there's only two differences between using Markov chains to predict words and LLMs:

    * LLMs don't use Markov chains,
    * LLMs don't predict words.
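
For contrast, here's roughly what "using Markov chains to predict words" looks like: a minimal sketch, assuming a first-order (bigram) chain over whitespace-split words. The function names and the toy corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def build_bigram_model(text):
    """Count, for each word, which words follow it in the text."""
    words = text.split()
    model = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        model[current][following] += 1
    return model


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it has none."""
    followers = model.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Toy corpus, purely illustrative.
corpus = "the cat sat on the mat and the cat slept"
model = build_bigram_model(corpus)
print(predict_next(model, "the"))
```

The prediction depends only on the single previous word, which is the Markov property; nothing earlier in the text can influence it.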

  7. Node.js is the most popular web framework/technology in the StackOverflow developer survey. Express is more popular than FastAPI, Django, Flask and Rails in the same survey. Just... what are you talking about?
  8. I was surprised by that, too, and assumed it was a decade-old article until I saw the date at the bottom. Both being mentioned before Python is wilder, as is the total exclusion of JavaScript.
  9. You can actually look at history and see what happened when IBM tried to wrest control of the PC platform back with the PS/2, which was a flop with consumers because it wasn't backwards compatible enough with IBM's own previous PCs or the wider PC market that had developed. A bunch of PC clone manufacturers got together and came up with the EISA bus standard so they wouldn't have to pay IBM license fees for MCA, and made it backwards-compatible with ISA cards people already had. It was successful enough that IBM ended up adopting EISA for some of their PCs.

    The other notable thing about the situation is that three companies ended up simultaneously responsible for a large part of the PC platform, originally -- IBM, Microsoft and Intel. They all worked in various ways to encourage competition to each other -- the reason we see OS competition on the PC platform is that IBM and Intel both found it in their interests to allow other OSes on the platform to reduce Microsoft's leverage over them. IBM in fact created one of the competing PC OSes out the gate, OS/2, which was originally an IBM/Microsoft joint project until they started feuding. Now, OS/2 is dead, but IBM's interest in being able to support their own OS instead of Microsoft's is a big reason the PC platform was built in an OS agnostic way. People criticize UEFI for locking down the PC platform more than the previous BIOS implementations, but UEFI is still _way_ more open than basically any other platform, most of which don't have a standard for bootloaders at all. It's really the absence of a standard for bootloaders that keeps most Android phones locked down. Two Android phones from the same OEM might have different bootloaders, much less two phones from different manufacturers. We've yet to see an alternate OS with the resources to support implementing their own bootloaders for a majority of Android phones.

  10. Because the original IBM PC was designed to be cheap and built in a hurry. IBM had a mandate for the original PC to use off the shelf components as much as possible. They also neglected to secure an exclusive license from Microsoft for DOS. 95% of building an IBM PC clone was buying the same parts and getting a DOS license from Microsoft (which they were very happy to sell you). Everyone saw what happened to IBM and just didn't do it that way again.
  11. MinIO is absolutely not a passion project, it's a business.
  12. I'm not saying SWE-bench is perfect, and there are reports that suggest there is some contamination of training sets for LLMs with common benchmarks like SWE-bench. But they publish SWE-bench so anyone can run it and have an open leaderboard where they attribute the results to specific models, not just vague groupings:

    https://www.swebench.com/

    ARC-AGI-2 keeps a private set of questions to prevent LLM contamination, but they have a public set of training and eval questions so that people can both evaluate their models before submitting to ARC-AGI and so that people can evaluate what the benchmark is measuring:

    https://github.com/arcprize/ARC-AGI-2

    Cursor is not alone in the field in having to deal with issues of benchmark contamination. Cursor is an outlier in sharing so little when proposing a new benchmark while also not showing performance in the industry standard benchmarks. Without a bigger effort to show what the benchmark is and how other models perform, I think the utility of this benchmark is limited at best.

  13. The short version is: the PC is a historical accident. By "the PC" I mean "the Windows-Intel platform on which most consumer PCs were built." Linux and BSD were both able to exist in the form they did because there was a commodity hardware platform that was standardized (ad-hoc standardization, mind you) and _somewhat_ open. IBM, Microsoft and Intel were all best frenemies, able to exert enough power to standardize the PC platform but also able to exert enough power against each other to prevent them from locking the platform down too much. There is no standard "smartphone" platform like there is with the PC; really the only standard is Android AOSP. Because of this, it's a lot harder to do a third-party phone platform without adopting large parts of Android's code.
  14. The lack of transparency here is wild. They aggregate the scores of the models they test against, which obscures the performance. They only release results on their own internal benchmark that they won't release. They talk about RL training but they don't discuss anything else about how the model was trained, including if they did their own pre-training or fine-tuned an existing model. I'm skeptical of basically everything claimed here until either they share more details or someone is able to independently benchmark this.
  15. If you read the article, one of the buttons on the bar prompted people to upgrade to the latest version of IE.
  16. LLMs are good at pursuing objectives, but they aren't necessarily good at juggling competing objectives at once. So you can picture doing the following, for instance:

    - "Here is a spec for an API endpoint. Implement this spec."

    - "Using these tools, refactor the codebase. Make sure that you are passing all tests from (dead code checker, cyclomatic complexity checker, etc.)"

    The clankers are very good at iteratively moving towards a defined objective (it's how they were post-trained), so you can get them to do basically anything you can define an objective for, as long as you can chunk it up in a way that it fits in their usable context window.

  17. The Doctorow school argument, as best I can tell, would go 'the regulations on black car service were meant for things like limo services that don't compete directly with taxis, and once Uber started competing directly with taxis, regulators and authorities should have moved more aggressively to write new regulations/laws that regulated Uber the same way taxis are regulated.' They would not agree with "the reason why taxis are tightly regulated are for reasons that mostly do not apply to Uber."

    And this is exactly why I think the question of "what is the correct way to regulate car ride services" shouldn't hinge on incumbency bias towards taxis, but actually ask the question of what is best for participants in the market (which doesn't just include taxis and Ubers but also includes public transportation and its users, for instance). But that doesn't fit neatly into Doctorow's enshittification narrative.

  18. Yeah, it's a real thing that happened to me, too. In multiple US cities. And I'm sure we're far from alone.
  19. The opening of the article is laying out the case that the laws are good -- they make the market legible to participants. As he says:

    ``` To navigate all of these technical minefields, you need the help of a third party. In a modern society, that third party is an expert regulator who investigates or anticipates problems in their area of expertise and then makes rules designed to solve these problems.

    To make these rules, the regulator convenes a truth-seeking exercise, in which all affected parties submit evidence about what the best rule should be and then get a chance to read what everyone else wrote and rebut their claims. Sometimes, there are in-person hearings, or successive rounds of comment and counter-comment, but that’s the basic shape of things.

    Once all the evidence is in, the regulator—who is a neutral expert, required to recuse themselves if they have conflicts—makes a rule, citing the evidence on which the rule is based. This whole system is backstopped by courts, which can order the process to begin anew if the new rule isn’t supported by the evidence created while the regulator was developing the record.

    This kind of adversarial process—something between a court case and scientific peer review—has a good track record of producing high-quality regulations. You can thank a process like this for the fact that you weren’t killed today by critters in your tap water or a high-voltage shock from one of your home’s electrical outlets. ```

    And this is central to Doctorow's point, right? The narrow question of the legality of Uber's current service offerings is actually pretty well litigated, and if Uber was as flagrantly illegal as he claims, "we're an app" wouldn't have kept them in business. Doctorow argues that this is happening through regulatory capture -- the case isn't primarily that Uber is violating the currently existing set of laws, regulations, court precedents, etc. It's that Uber is violating what the regulations _would be_ in a world where they had less market power with which to influence regulations.

    And so it's not enough to argue about how the apps get around _current_ laws. By Doctorow's own arguments, we're debating the merits of a counterfactual set of different regulations that we would have if you changed current conditions. And at that point, it is absolutely fair game to ask if this counterfactual set of different regulations is actually better for market participants.

  20. ```When Uber entered the taxi market without securing taxi licenses or extending the workforce protections required under law, it said the move didn’t count because it did it with an app.```

    It's so weird to see the first half of this article written as an ode to the virtues of competition and then see the sharp pivot into defending taxi medallions. Say what you will about Uber, but no Uber driver has ever tried to lie and harass a passenger over whether or not the credit card machine is broken in an effort to cheat on their taxes. It's not even like the anti-consumer hostility of the taxi experience translated into better rights for workers, the high price of a medallion meant in practice your typical cab driver was in a situation damned close to indentured servitude to a medallion company.

    And to top it all off, taxis demonstrate the fallacy of thinking that hundreds of market participants provides meaningful benefits from competition. In a market with a suitably large number of cab drivers and passengers, the odds of repeat business between any pair of driver and passenger is low enough that neither party is incentivized to treat each other well. It's not like anyone was pulling out a Yelp-like site or review book to pick the best-reviewed cab drivers, or like you went out of your way to stick with a cab driver you'd had a good experience with. Meaningful competition requires that people can make _informed_ choices, and without repeat business you don't get participants informed enough to make meaningful choices between market participants. It also requires leverage. It doesn't matter if you threaten to take your business elsewhere next time if you and they both know _you were going to anyway_.

    I'm not saying that Uber is perfect, or even that Uber couldn't be productively regulated better by the government. I'm saying that taxis were a terrible experience, and I don't trust Doctorow to have a good lay of the land when he focuses more on his ideology than the evidence. If subscribing to Doctorow's beliefs requires services to look more like taxis than Ubers, you can count me out.
