
mlpro
I work in Machine Learning, LLMs and 3D.

  1. Lol. Trying to copy the Universal Weight Subspace paper's naming to get famous.
  2. Lol, yeah.
  3. Oh, look - a new 3D model with a new idea - more data.
  4. I don't understand.
  5. Waymo should do a bit more research in reliability and explainability of their AI models.
  6. Read the paper end to end today. I think it's the most outrageous idea of 2025 - at least among the papers I've read. So counterintuitive initially, and yet so intuitive. Personally, I kinda hate the implications. But a paper like this was definitely needed.
  7. They are not trained on the same data. Even a skim of the paper shows very disjoint data.

    The LLMs are finetuned on very disjoint data: I checked, and some are finetuned on Chinese text while others are for math. The pretrained model provides a good initialization. I'm convinced.

  8. I think it's very surprising, although I would like the paper to show more experiments (they already have a lot, I know).

    The ViT models are never really trained from scratch - they are always finetuned, since they require large amounts of data to converge nicely. The pretraining just provides a nice initialization. Why would one expect two ViTs finetuned on two different tasks - image and text classification - to end up in the same subspace, as they show? I think this is groundbreaking.

    I don't really agree with the idea that models don't drift far from the parent model. I think they drift pretty far in terms of their norms. Even the small LoRA adapters drift pretty far from the base model.

  9. Why would they be similar if they are trained on very different data? Also, trained-from-scratch models are also analyzed, imo.
  10. It's about weights/parameters, not representations.
  11. The analysis is on image classification, LLMs, Diffusion models, etc.
  12. It does seem to be working for novel tasks.
  13. Not really. If the models are trained on different datasets - like one ViT trained on satellite images and another on medical X-rays - one would expect their parameters, which were randomly initialized, to be completely different or even orthogonal.
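The orthogonality intuition in the last comment is easy to sanity-check numerically: two independent random Gaussian vectors in high dimensions have cosine similarity concentrating around zero (roughly 1/sqrt(d)). A minimal NumPy sketch, where the dimensionality and seed are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 100_000  # stand-in for the parameter count of one layer

# Two independent random initializations, like two from-scratch ViTs
w1 = rng.standard_normal(d)
w2 = rng.standard_normal(d)

cos = w1 @ w2 / (np.linalg.norm(w1) * np.linalg.norm(w2))
print(cos)  # concentrates near 0 as d grows
```

So at initialization the two parameter vectors really are near-orthogonal; any alignment measured after training has to come from the data or the architecture, not from the starting point.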

