Comment by WiSaGaN - Hacker Neue

WiSaGaN Aug 17, 2025 parent

That's true. You would think an LLM would condition its surprise completion to be more probable when it's in a joke context. I guess this only gets good when the model really is good. It's similar to how GPT-4.5 has better humor.

moffkalast Aug 17, 2025

Good completely new jokes are like novel ideas: really hard even for humans. I mean fuck, we have an entire profession dedicated just to making up and telling them, and even theirs don't land half the time.

IshKebab Aug 17, 2025

Exactly. It feels like with LLMs, as soon as we achieved the at-the-time astounding breakthrough "LLMs can generate coherent stories" with GPT-2, people have constantly been like "yeah? Well it can't do <this thing that is really hard even for competent humans>."

That breakthrough was only 6 years ago!

https://openai.com/index/better-language-models/

> We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text...

That was big news. I guess this is because it's quite hard for most people to appreciate the enormous difficulty gulf between "generate a coherent paragraph" and "create a novel funny joke".

brookst Aug 17, 2025

Same thing we saw with game playing:

- It can play chess -> but not at a serious level

- It can beat most people -> but not grandmasters

- It can beat grandmasters -> but it can’t play go

…etc, etc

In a way I guess it’s good that there is always some reason the current version isn’t “really” impressive, as it drives innovation.

But as someone more interested in a holistic understanding of the world than proving any particular point, it is frustrating to see the goalposts moved without even acknowledging how much work and progress were involved in meeting the goalposts at their previous location.

nothrabannosir Aug 17, 2025

> it is frustrating to see the goalposts moved without even acknowledging how much work and progress were involved in meeting the goalposts at their previous location.

Half the HN front page for the past few years has been nothing but acknowledging the progress of LLMs in sundry ways. I wish we actually stopped for a second. It’s all people seem to want to talk about anymore.

brookst Aug 17, 2025

I should have been more clear. Let me rephrase as: among those who dismiss the latest innovations as nothing special because there is still further to go, it would be nice to see acknowledgment when goalposts are moved.

ACCount37 Aug 17, 2025

Which is notable, because GPT-4.5 is one of the largest models ever trained. It's larger than today's production models powering GPT-5.

Goes to show that "bad at jokes" is not a fundamental issue of LLMs, and that there are still performance gains from increasing model scale, as expected. But not exactly the same performance gains you get from reasoning or RLVR.
