
potatolicious
I think this is at least somewhat true anecdotally. We do know that as context length increases, adherence to the system prompt decreases. Whether that drift amounts to reversion toward the base model I'm not really qualified to say, but it certainly feels that way from observing the outputs.

Pure speculation on my part, but it feels like this may be a major component of the recent stories of people being driven mad by ChatGPT: they have extremely long conversations with the chatbot, where the outputs start to resemble the "spicy autocomplete" fever-dream creative writing of pre-RLHF models, which feeds and reinforces the user's delusions.

Many journalists have complained that they can't replicate this kind of behavior in their own attempts, but maybe they just need a sufficiently long conversation to fill the context window?
