maybe someday soon, LLMs will learn to smoothly forget their own irrelevant context

imagine if, instead of predicting just the next token, the LLM also predicted a mask over the previous tokens, which is then thresholded so that only the "relevant" tokens are kept for the next inference step
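
a minimal sketch of what that could look like, assuming a small PyTorch relevance head reading the model's hidden states; `RelevanceMaskHead`, `prune_context`, and the 0.5 threshold are illustrative names and values, not anything that exists today:

```python
import torch
import torch.nn as nn

class RelevanceMaskHead(nn.Module):
    """Hypothetical head: scores how relevant each previous token still is."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (seq_len, hidden_dim) -> keep-probabilities in [0, 1]
        return torch.sigmoid(self.scorer(hidden_states)).squeeze(-1)


def prune_context(token_ids: torch.Tensor,
                  hidden_states: torch.Tensor,
                  head: RelevanceMaskHead,
                  threshold: float = 0.5) -> torch.Tensor:
    """Threshold the predicted mask and keep only the 'relevant' tokens."""
    keep_prob = head(hidden_states)      # (seq_len,)
    keep_mask = keep_prob >= threshold   # boolean mask over previous tokens
    return token_ids[keep_mask]          # pruned context for the next step


if __name__ == "__main__":
    # toy usage: random activations stand in for a real model's hidden states
    hidden_dim, seq_len = 64, 10
    head = RelevanceMaskHead(hidden_dim)
    token_ids = torch.arange(seq_len)
    hidden_states = torch.randn(seq_len, hidden_dim)
    kept = prune_context(token_ids, hidden_states, head)
    print("kept tokens:", kept.tolist())
```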

one key distinction between humans and LLMs is that humans are excellent at forgetting irrelevant data: we forget tens of times a second and keep only what's necessary
