mohsen1
- Maybe check that fact, because I've been to a Home Depot in Manhattan myself.
- It’s strange that the article says the white-collar worker in NYC and the small business owner in suburban Texas are not in the same market. To many businesses they are the same market: McDonald’s, Home Depot, etc. don’t make different products for those two individuals.
- Just try Gemini Live on your phone. That's state of the art
- I'm just enjoying the last few years of this career. Let me have fun!
Joking aside, we have to understand that this is the way software is being created and this tool is going to be the tool most trivial software (which most of us make) will be created with.
I feel like the industry is telling me: adopt or become irrelevant.
- Since they are not showing you how this model compares against its competitors on the benchmarks they are showing, here is a quick view with the public numbers from Google and Anthropic. At least this gives some context:

    SWE-Bench (Pro / Verified)

    Model           | Pro (%) | Verified (%)
    ----------------+---------+-------------
    GPT-5.2-Codex   | 56.4    | ~80
    GPT-5.2         | 55.6    | ~80
    Claude Opus 4.5 | n/a     | ~80.9
    Gemini 3 Pro    | n/a     | ~76.2

And for terminal workflows, where agentic steps matter:

    Terminal-Bench 2.0

    Model           | Score (%)
    ----------------+----------
    Claude Opus 4.5 | ~60+
    Gemini 3 Pro    | ~54
    GPT-5.2-Codex   | ~47

So yes, GPT-5.2-Codex is good, but when you put it next to its real competitors:
  - Claude is still ahead on strict coding + terminal-style tasks
  - Gemini is better for huge context + multimodal reasoning
  - GPT-5.2-Codex is strong but not clearly the new state of the art across the board
It feels a bit odd that the page only shows internal numbers instead of placing them next to the other leaders.
- Al Jazeera has been super loud and vocal about how US aggression towards Venezuela is all about oil. That makes sense: if the current regime falls, Venezuela’s future oil exports will hugely impact the price of oil, which funds Qatar, which funds Al Jazeera.
- Unlike Nano Banana, it allows generating photos of children. It's always fun to ask AI to imagine the children of a couple, but it's also kinda concerning that there might be terrible use cases.
- that's not how humans work
- Folks with big titles will always write comments that sound smart and thoughtful but in reality hinder the process. For example:
  - This architecture binds us to AWS. Have we estimated the engineering effort to remain cloud-agnostic in case we need to move to Azure next year?
  - I see we're using Postgres. Have we considered how we’ll handle horizontal sharding if our user base grows by 1000x in Q4?
  - This synchronous API call introduces tight coupling. Shouldn't this be an event-driven architecture to handle back-pressure?
These are all easy to ask and sound prudent to management, but they are impossibly expensive to answer or implement for a feature that just needs to ship.
- Webpack is typed using JSDoc and type-checked via TypeScript -- I started this migration a while ago. It works pretty well
- Having lots of success with Gemini Flash Live 2.5. I am hoping 3.0 comes out soon. The benchmarks here claim better results than Gemini Live, but I have to test it. In the past I've always been disappointed with Qwen Omni models in my English-first case...
- I am an American citizen living in Europe. There are grocery stores here generating billions too. They have to advertise the price including taxes. That alone is a huge advantage for the shopper, and it exists because regulations enforce it.
If you're on the lower-income side and have limited money to buy your groceries, you don't have to guess the total price of your basket. Prices are WYSIWYG.
I didn't realize how bad things are in the US until I had lived here in Europe for some time.
The Dollar General situation is bad, and in a fairer system they would still make billions. Don't assume that regulations that protect people automatically mean bad business.
- Any toast can be an inline message in my experience
- Can you imagine how annoying ads in a voice interface would be? Ugh
- This is the fruit of the Windsurf brain-drain, and I think it might be better than what's out there, since those guys got to start from scratch with everything they learned building Windsurf
- Sometimes I think I should spend $50 on Upwork to get a real human artist to do it first, so we know what it is that we're going for. What does a good SVG of a pelican riding a bicycle actually look like?
- More expensive than the current 2.5 Pro. For >200k tokens, it's at $2.50 input and $15 output right now
"This model is not a modification or a fine-tune of a prior model"
Is it common to mention that? Feels like they built something from scratch