I’ve noticed that the quality of ChatGPT-5.1 occasionally drops substantially. I’m talking GPT-3-level hallucinations: wildly making stuff up or randomly inserting words in a language I do not speak.

In my repeated evaluations on the same datasets, the scores are all over the place: sometimes really high, sometimes very bad.
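
For anyone who wants to put a number on that spread, here is a minimal sketch. It assumes each run produces a single accuracy-style score between 0 and 1; the function name `summarize_runs`, the `tolerance` parameter, and the example scores are all made up for illustration and are not from my actual evaluations.

```python
from statistics import mean, median, stdev


def summarize_runs(scores, tolerance=0.10):
    """Summarize repeated eval scores for one dataset.

    scores: hypothetical per-run scores in the range 0..1.
    tolerance: how far below the median a run may fall before
               it is flagged as degraded (an arbitrary choice).
    """
    if len(scores) < 2:
        raise ValueError("need at least two runs to measure spread")

    mid = median(scores)
    # Compare against the median rather than the mean, so that a single
    # very bad run does not drag the baseline down with it.
    degraded = [i for i, s in enumerate(scores) if mid - s > tolerance]

    return {
        "mean": mean(scores),
        "stdev": stdev(scores),
        "median": mid,
        "min": min(scores),
        "max": max(scores),
        "degraded_runs": degraded,
    }


# Made-up scores from five runs on the same dataset.
runs = [0.91, 0.89, 0.52, 0.90, 0.88]
summary = summarize_runs(runs)
```

Logging the flagged run indices alongside timestamps would make it easier to tell whether the bad runs cluster at particular times or are spread at random.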

Has anyone experienced something similar?

I’m guessing this may be because “GPT-5.1” can sometimes choose to use a much smaller model. Whatever the cause, it makes it unreliable for production use.


I'm mainly using it for rewriting legacy code or helping me understand it, and to me 5.1 is the best yet.
I think ChatGPT as a whole has regressed.
