
kurtis_reed
Why did you have access to a preview?

simonw
I get access to previews from OpenAI, Anthropic and Gemini pretty often. They're usually accompanied by an NDA and an embargo date - in this case the embargo was 10am Pacific this morning.

I won't accept preview access if it comes with any conditions at all about what I can say about the model once the embargo has lifted.

dzhiurgis
Soooo that leaves xAI that had conditions
Redster
Simonw is a cheerful and straightforward AI journalist who likes to show and not just tell. He has done a good job aggregating and documenting the progress of LLM tools and models. As I understand it, OpenAI and Anthropic have both wisely decided to make sure he has up to date info because they know he'll write about it.

Thanks for all your work, Simon! You're my favorite journalist in this space and I really appreciate your tone.

tootie
Simon has a popular blog, but he's also co-creator of Django and very well-known in the Python community.
michaelt
> As I understand it, OpenAI and Anthropic have both wisely decided to make sure he has up to date info because they know he'll write about it.

And the wisest part is if he writes something they don't like, they can cut off that advanced access.

As is the longstanding tradition in games journalism, travel journalism, and suchlike.

simonw
If they do that I'll go back to writing about them after they ship. Not a big loss for me at all.
tripzilch
I get it, you would trust yourself if you said that, but it doesn't really matter whether you say that or not. What counts for your ongoing credibility is whether you preface every future blog post with a note on whether you got special access, a special deal, or sponsorship, or the fact that you didn't get any of those things.

You're a reviewer. This is how reviewers stay credible. If you don't disclose your relationship with the thing or company you're reviewing, I'm probably better off assuming you're paid.

And if your NDA says you can't write that in your preface, then logically, it is impossible to write a credible review in the first place.

simonw
tripzilch
awesome, thanks a lot that's important but ... sorry I just checked those, and I do think it's better to do it on a per-article basis, because a lot of your audience (I'm guessing) comes from external links, not browsing your website

this is (or should be) a pretty standard thing to do on youtube review channels (that I would trust), and it's not a bad thing to remind people of, on every occasion, plus it can function as a type of "canary" in cases of particularly restrictive NDAs

knowsuchagency
I like Simon, but he's not a journalist. A journalist would not have gone to OpenAI to glaze the GPT-5 release with Theo. I don't say this to discount Simon -- I appreciate his writing and analysis but a journalist, he isn't.
simonw
I don't call myself a journalist, partly because no publication is paying me to do any of this!

If I had an editor I imagine they would have talked me out of going to the OpenAI office for a mysterious product preview session with a film crew.

Redster
That's a fair point. I feel like he's more than a blogger and am not sure the best term!
LudwigNagasena
An influencer.
Guys, he's standing right there
fourthark
Argh
asadotzler
AI blogger seems more appropriate than journalist.
Are you aware of any "AI journalists"? Because simonw does great work, so perhaps blogger is what people should aspire towards?
simonw
I actually talk to journalists on the AI beat quite often - I've had good conversations with them at publications including The Economist and NY Times and Washington Post and ArsTechnica.

They're not going to write up detailed reviews of things like the new Claude code interpreter mode though, because that's not of interest to a general enough audience.

I don't have that restriction: https://simonwillison.net/2025/Sep/9/claude-code-interpreter...

grim_io
Not sure what an AI journalist is supposed to be or do, but a lack of one does not promote someone who is not it automatically into the position.
landl0rd
Kylie Robison recently moved to Wired and is a solid "AI journalist".
minimaxir
Although she is indeed solid as an AI journalist, unfortunately she was recently let go for unknown reasons: https://www.kyliebytes.com/thank-god-i-got-fired/
rapfaria
His "pelican riding a bicycle" tests are now a classic and AI shops are benchmaxxing for it
simonw
They need to benchmaxxx a whole lot harder, the illustrations still all universally suck!
I fully expect a model to output a SVG made up of 1000x1000 rectangles (i.e. pixels) representing a raster image of a beautifully hand-drawn pelican riding a bicycle any day now :)
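
For illustration, the "rectangles as pixels" trick described above can be sketched in a few lines of Python. This is only a toy under stated assumptions: `grid_to_rect_svg` is a hypothetical helper name, and the input is assumed to be a small 2D grid of CSS colour strings rather than a real 1000x1000 image.

```python
def grid_to_rect_svg(pixels):
    """Render a 2D grid of CSS colour strings as one 1x1 <rect> per cell."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    # One rectangle per "pixel": technically vector output, visually a raster.
    rects = [
        f'<rect x="{x}" y="{y}" width="1" height="1" fill="{colour}"/>'
        for y, row in enumerate(pixels)
        for x, colour in enumerate(row)
    ]
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">'
        + "".join(rects)
        + "</svg>"
    )


# Hypothetical 2x2 checkerboard standing in for a real image.
tiny = grid_to_rect_svg([["#fff", "#000"], ["#000", "#fff"]])
```

A 1000x1000 image would produce a million `<rect>` elements this way, so counting rectangles is one crude way to spot it.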
simonw
I got an amazing result from ChatGPT a while back - an SVG with a perfect illustration of a pelican riding a bicycle.

It was suspiciously good in fact... so I downloaded the SVG file and found out it had generated a raster image with its image tool and then embedded it as base64 binary image data inside an SVG wrapper!
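
A rough way to check for this kind of wrapper is to look for `<image>` elements whose `href` is a base64 data URI. The sketch below uses only the Python standard library; the function name `embedded_raster_images` and the sample SVG strings are made up for illustration, and it assumes the file is well-formed XML.

```python
import base64
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"


def embedded_raster_images(svg_text):
    """Return (mime_type, byte_length) for each base64 data-URI <image> in the SVG."""
    root = ET.fromstring(svg_text)
    found = []
    for elem in root.iter():
        # Match <image> with or without the SVG namespace.
        if elem.tag not in ("image", f"{{{SVG_NS}}}image"):
            continue
        # SVG 2 uses plain href; older files use xlink:href.
        href = elem.get("href") or elem.get(f"{{{XLINK_NS}}}href") or ""
        if not href.startswith("data:") or ";base64," not in href:
            continue
        header, payload = href.split(";base64,", 1)
        mime_type = header[len("data:"):]
        found.append((mime_type, len(base64.b64decode(payload))))
    return found


# Made-up sample: a tiny SVG wrapping 4 bytes of fake "PNG" data.
fake_payload = base64.b64encode(b"\x89PNG").decode("ascii")
wrapped = (
    f'<svg xmlns="{SVG_NS}" width="10" height="10">'
    f'<image href="data:image/png;base64,{fake_payload}" width="10" height="10"/>'
    "</svg>"
)
pure_vector = f'<svg xmlns="{SVG_NS}"><circle cx="5" cy="5" r="4"/></svg>'
```

Calling `embedded_raster_images(wrapped)` reports the embedded image, while `embedded_raster_images(pure_vector)` returns an empty list. It would not catch a raster traced into paths or drawn as per-pixel rectangles.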

dhhugley
You’ll just have to move the goalpost then; perhaps it can be a multidimensional pelican saving the multiverse, or an invisible pelican that only you can see and critique.
How would that help, given that ChatGPT has apparently already figured out how to consistently and systematically game the benchmark by working in pixel space and only using SVG as a wrapper for a raster image?

FWIW, I could totally see a not hugely more advanced model using its native image generation capabilities and then running a vector extraction tool on it, maybe iteratively. (And maybe I would not consider that cheating, anymore, since at some point that probably resembles what humans do?)

sixeyes
I've got such pixelated rectangle SVGs a few times.

Also with Cursor: when asked "write me a script that outputs X as an svg", it has given me rectangles a few times.

astrange
If they were testing that it'd work more often.

Other things you can ask that they're still clearly not optimizing for are ASCII art and directions between different locations. Complete fabrications 100% of the time.

Sharlin
Well, I definitely hope they aren't trying to teach LLMs directions between locations, given how idiotic a use of compute and parameter space that would be. We already have excellent AIs for route planning. What they ought to optimize for is, of course, finally teaching them to say they don't know, or just automatically opting to call a route-planning API if the user asks for directions.
minimaxir
Simon tends to write up reports of new LLM releases (with great community respect) and it's much easier with lead time if the provider is able to set up a preview endpoint.
criddell
I believe the criticism is that he's reporting on a pre-release LLM which isn't the same as the one you and I are going to be using a few weeks from now after they've downgraded it enough to work at scale.
lossolo
The same reason YouTube reviewers and influencers get access to hardware or games before release. In this case, the person is a passionate blogger.
runjake
simonw is Simon Willison, who’s well known for a number of things. But these days, he’s best known for his AI-centric blog and his tools. The AI companies give him early access to stuff.

https://simonwillison.net/

kissgyorgy
If you want to keep up with AI progress and model updates, simonw is the man to follow!
lomase
They are an AI evangelist that told me I can replace any technical book created with an LLM.

They are a nice person.

rhizome
You are correct, sir!