Comment by radial_symmetry

radial_symmetry Nov 18, 2025

SWE-bench is weird because Claude has always underperformed on it relative to other models, despite Claude Code blowing them away. The real test will be whether Gemini CLI beats Claude Code, with each using the agentic framework and tools it was trained on.
