Preferences

I wonder if you would want to use an earlier layer as opposed to the penultimate layer, I would imagine that the LLM uses that layer to "prepare" for the final dimensionality reduction to clean the signal such that it scores well on the loss function.

This item has no comments currently.

Keyboard Shortcuts

Story Lists

j
Next story
k
Previous story
Shift+j
Last story
Shift+k
First story
o Enter
Go to story URL
c
Go to comments
u
Go to author

Navigation

Shift+t
Go to top stories
Shift+n
Go to new stories
Shift+b
Go to best stories
Shift+a
Go to Ask HN
Shift+s
Go to Show HN

Miscellaneous

?
Show this modal