I am surprised because it's such a simple task. Any reasonably diligent human would be able to figure it out, since the model is given both the original and the modified version.

However, it feels a bit like counting letters, so maybe it can be solved with post-training. We'll know in 3 to 6 months whether it was easy for the labs to "fix" this.

In my daily use of LLMs I regularly get overly optimistic answers because the models fail to consider information that may be absent or missing (which is even harder to catch when that information is outside the context).

