I'm not an expert, but some thoughts:

* The problem with large vectors is that they have large dot products with every other vector, which would imply that they are more similar to everything, and that doesn't make sense.

* Adding the requirement that "length==1" doesn't matter much in high-dimensional spaces, since that only removes one degree of freedom. Don't try to use too much 3D intuition here.

* It might be intuitive to think that "large" should have implications for the size of the vectors, but that really only applies to a couple of examples. We want vectors to represent thousands of unrelated concepts, so this one case is really not that relevant or important.

* In reality what ends up happening is partially the "very" dimension you're suggesting, but also just a "largeness" dimension. Individual dimensions can still have a scale!
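To make the first bullet concrete, here is a minimal sketch in plain Python. The vectors are made-up toy values, not real embeddings, and the helper names (`dot`, `norm`, `cosine`, `scale`) are just for illustration. It scales one vector by 10 and compares its dot products and cosine similarities against the same set of other vectors.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean (L2) length of a vector."""
    return math.sqrt(dot(a, a))

def cosine(a, b):
    """Dot product after dividing out both lengths."""
    return dot(a, b) / (norm(a) * norm(b))

def scale(a, k):
    """Multiply every component of a by k."""
    return [k * x for x in a]

# Hypothetical toy vectors, chosen only for illustration.
query = [1.0, 2.0, 0.5, -1.0]
others = [
    [0.5, 1.0, 0.0, 0.0],
    [2.0, 1.0, 1.0, 0.5],
    [0.0, 0.3, 0.4, -0.2],
]

big_query = scale(query, 10.0)  # same direction, 10x the length

dots_small = [dot(query, v) for v in others]
dots_big = [dot(big_query, v) for v in others]
cos_small = [cosine(query, v) for v in others]
cos_big = [cosine(big_query, v) for v in others]

for v, ds, db, cs, cb in zip(others, dots_small, dots_big, cos_small, cos_big):
    print(v, "dot:", ds, "->", db, "cosine:", round(cs, 6), "->", round(cb, 6))
```

Every dot product grows by the same factor of 10, while the cosine similarities do not move, so the extra length adds no information about which vectors are actually close in direction. Note that a negative dot product would become more negative under the same scaling, so "more similar to everything" only holds for vectors that already point roughly the same way.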


Very good points here, especially that giving up the single "length" degree of freedom costs very little in very high-dimensional spaces. However, I don't agree that large vectors would end up being "more similar to everything" -- really what's happening is that the dot product stops being a good measure of similarity, but we already knew that using it that way relied on everything being normalized anyway! L1 and L2 distances still work just fine.
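A small sketch of that last point, again with made-up two-dimensional vectors and illustrative helper names. Without normalization, the raw dot product can rank a long vector as "closer" to `a` than `a` is to itself, while L1 and L2 distances still put `a` nearest to itself. After normalizing, the dot product behaves again, and for unit vectors it carries the same information as L2 distance via ||a - b||^2 = 2 - 2(a . b).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def l1(a, b):
    """Manhattan distance."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalize(a):
    """Rescale a to unit length."""
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

# Hypothetical toy vectors: b is much longer than a.
a = [1.0, 0.0]
b = [10.0, 1.0]

# Raw dot product: b scores higher against a than a does against itself.
raw_self, raw_other = dot(a, a), dot(a, b)

# Distances do not have this problem: a is at distance 0 from itself.
l1_self, l1_other = l1(a, a), l1(a, b)
l2_self, l2_other = l2(a, a), l2(a, b)

# After normalization the dot product ranks a closest to itself again.
ua, ub = normalize(a), normalize(b)
unit_self, unit_other = dot(ua, ua), dot(ua, ub)

# For unit vectors, squared L2 distance and dot product are tied together.
identity_gap = abs(l2(ua, ub) ** 2 - (2.0 - 2.0 * dot(ua, ub)))

print("raw dot:  self", raw_self, "other", raw_other)
print("L1:       self", l1_self, "other", l1_other)
print("L2:       self", l2_self, "other", l2_other)
print("unit dot: self", unit_self, "other", unit_other)
print("identity gap:", identity_gap)
```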
