LLMs are next-token prediction models. They "understand" in the sense that, given the previous 1,000 tokens, they emit a best guess at the 1,001st token.

It "knows" what rsync is because there is a lot of material about rsync in the training data. However, it has no idea about your particular version, because very little of that training data states the exact version in use or elaborates on the differences between versions.

What would probably produce a much better result is including the man page for the specific version you have on your system. Then you're not relying on the model having "memorized" the relationships among the specific tokens you're trying to get it to focus on; instead, you're just passing them all in as part of the input sequence to be completed.
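A minimal sketch of that approach (the `build_prompt` helper and the truncation limit are hypothetical illustrations, not part of any particular tool): capture the local man page and prepend it to the question, so the version-specific option names are in the context window rather than left to the model's memory.

```python
def build_prompt(man_page: str, question: str, max_chars: int = 20_000) -> str:
    """Prepend a (possibly truncated) man page to the user's question."""
    return (
        "Here is the man page for the exact version installed on my system:\n\n"
        + man_page[:max_chars]
        + "\n\nUsing only the options documented above, answer:\n"
        + question
    )

# On a real system you might capture the page with something like:
#   man_page = subprocess.run(["man", "rsync"], capture_output=True, text=True).stdout
man_page = "RSYNC(1)  --archive  archive mode; equals -rlptgoD"
prompt = build_prompt(man_page, "How do I mirror a directory while preserving permissions?")
```

Truncating from the front is a crude assumption here; for long man pages you would more likely select the relevant sections first.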

It is absolutely astounding that LLMs work at all, but they're not magic, and some understanding of how they actually work can be helpful when it comes to using them effectively.


Our low-code expression language is not well represented in the pre-training data, so as a baseline we get lots of syntax errors and really bad-looking UIs. But we're getting much better results by setting up our design system documentation as an MCP server. Our docs include curated guidance and code samples, so when the LLM uses the server, it can competently search for what it needs and call the relevant tools. With this small but high-quality dataset, the output also looks better than some of our experiments with fine-tuning. I imagine this could work for other docs use cases that are more dynamic (i.e., we're actively updating the docs, so having the LLM call APIs for what it needs seems more appropriate than a static RAG setup).
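A sketch of the kind of docs-search tool such a server might expose (the snippet corpus and the keyword-overlap scoring are hypothetical placeholders; a real MCP server would register a function like this as a tool via an SDK, with the curated guidance and code samples as the corpus):

```python
def search_docs(query: str, docs: list[dict], top_k: int = 3) -> list[dict]:
    """Rank curated doc snippets by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    scored = [
        (len(terms & set((d["title"] + " " + d["body"]).lower().split())), d)
        for d in docs
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop snippets that share no terms with the query at all.
    return [d for score, d in scored[:top_k] if score > 0]

# Hypothetical curated snippets standing in for real design-system docs.
docs = [
    {"title": "Buttons", "body": "primary button syntax and variants"},
    {"title": "Layout grid", "body": "responsive grid expressions"},
]
hits = search_docs("button syntax", docs)
```

Because the corpus is small and curated, even naive retrieval like this can return a complete, correct code sample for the model to imitate, which is the point of the approach described above.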

