> train on bad data, get a bad model

Right: in the context of supervised learning, this statement is a good starting point. After all, how can you build a good supervised model if you can't train it on good examples?

But even in that context, it isn't an incisive framing of the problem. Lots of supervised models are resilient to some kinds of error. A better question, I think, is: which kinds of errors, at what prevalence, tend to degrade performance, and why?
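
One quick way to make that question concrete is to inject label noise at increasing rates and watch held-out accuracy. Here is a minimal sketch, assuming a toy scikit-learn setup with symmetric label flips; the dataset, model, and noise rates are all illustrative choices, not anything from the original comment.

```python
# Sketch: how does label-noise prevalence affect a supervised model?
# Everything here (dataset, model, noise rates) is an illustrative assumption.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic binary classification problem with a clean held-out test set.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

for noise_rate in [0.0, 0.05, 0.1, 0.2, 0.4]:
    # Flip a fraction of training labels uniformly at random
    # (one simple, symmetric error model among many possible ones).
    y_noisy = y_train.copy()
    flip = rng.random(len(y_noisy)) < noise_rate
    y_noisy[flip] = 1 - y_noisy[flip]

    model = LogisticRegression(max_iter=1000).fit(X_train, y_noisy)
    acc = model.score(X_test, y_test)
    print(f"label noise {noise_rate:4.0%} -> test accuracy {acc:.3f}")
```

In a run like this, logistic regression typically sheds little accuracy at low flip rates and degrades only gradually as noise grows, which is the "resilient to some kinds of error" point; systematic or asymmetric errors would tell a different story.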

When it comes to LLMs and their data-ingestion pipelines, there is a lot more going on than purely supervised learning, so it seems reasonable to me that researchers would want to try to tease the problem apart.
