mlpro
I work in Machine Learning, LLMs and 3D.
- Lol. Trying to copy the Universal Weight Subspace paper's naming to get famous.
- Lol, yeah.
- Oh, look - a new 3D model with a new idea - more data.
- I don't understand.
- Waymo should do a bit more research into the reliability and explainability of its AI models.
- Read the paper end to end today. I think it's the most outrageous idea of 2025, at least among the papers I've read. So counterintuitive initially and yet so intuitive. Personally, I kinda hate the implications. But a paper like this was definitely needed.
- They are not trained on the same data. Even a skim of the paper shows very disjoint data.
The LLMs are fine-tuned on very disjoint data. I checked: some are fine-tuned on Chinese, others on math. The pretrained model provides a good initialization. I'm convinced.
- I think it's very surprising, although I would like the paper to show more experiments (they already have a lot, I know).
The ViT models are never really trained from scratch; they are always fine-tuned, since they require large amounts of data to converge nicely. The pretraining just provides a nice initialization. Why would one expect two ViTs fine-tuned on two different things, image and text classification, to end up in the same subspace, as they show? I think this is groundbreaking.
I don't really agree with the "they don't drift far from the parent model" idea. I think they drift pretty far in terms of their norms. Even the small LoRA adapters drift pretty far from the base model (rough sketch of what I mean below).
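Here's a minimal sketch of the kind of check I mean, with toy random matrices standing in for real checkpoints (the low-rank updates, sizes, and scales are my own assumptions, not the paper's setup): relative weight-norm drift from a base model, plus subspace overlap between two fine-tuning deltas via principal angles.
```python
# Minimal sketch: toy matrices stand in for real checkpoints; for real models
# you would load per-layer weights (e.g. via torch.load) and run this per layer.
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 256, 256, 16

# Stand-in "base" weight matrix and two "fine-tuned" variants obtained by
# adding low-rank updates (roughly what LoRA-style fine-tuning does).
W_base = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)
delta_a = rng.standard_normal((d_out, rank)) @ rng.standard_normal((rank, d_in)) * 0.05
delta_b = rng.standard_normal((d_out, rank)) @ rng.standard_normal((rank, d_in)) * 0.05
W_a, W_b = W_base + delta_a, W_base + delta_b

# 1) "Drift": relative Frobenius norm of each update w.r.t. the base weights.
drift_a = np.linalg.norm(W_a - W_base) / np.linalg.norm(W_base)
drift_b = np.linalg.norm(W_b - W_base) / np.linalg.norm(W_base)
print(f"relative drift: {drift_a:.3f}, {drift_b:.3f}")

# 2) Subspace overlap between the two updates: cosines of the principal angles
#    between the top-k left singular subspaces of the deltas. Values near 1.0
#    mean shared directions; independent random updates give much smaller values.
k = rank
Ua = np.linalg.svd(delta_a, full_matrices=False)[0][:, :k]
Ub = np.linalg.svd(delta_b, full_matrices=False)[0][:, :k]
overlap = np.linalg.svd(Ua.T @ Ub, compute_uv=False)
print(f"mean subspace overlap: {overlap.mean():.3f}")
```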
- Why would they be similar if they are trained on very different data? Also, models trained from scratch are analyzed too, imo.
- It's about weights/parameters, not representations.
- The analysis covers image classification models, LLMs, diffusion models, etc.
- It does seem to be working for novel tasks.
- Not really. If the models are trained on different datasets, like one ViT trained on satellite images and another on medical X-rays, one would expect their parameters, which were randomly initialized, to be completely different or even orthogonal (quick toy check below).
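A quick toy check of that intuition (my own stand-in numbers, nothing from the paper): two independently drawn high-dimensional weight vectors are nearly orthogonal, which is what the naive "random init, different data" expectation predicts.
```python
# Toy illustration: independent random weight vectors in high dimension have
# cosine similarity near zero, i.e. they are almost orthogonal.
import numpy as np

rng = np.random.default_rng(0)
d = 100_000  # stand-in for a large parameter count
w1, w2 = rng.standard_normal(d), rng.standard_normal(d)

cos = w1 @ w2 / (np.linalg.norm(w1) * np.linalg.norm(w2))
print(f"cosine similarity: {cos:.4f}")  # ~0 for large d
```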