There's some domains where the bitter lesson has big impacts on theory. The Peter Norvig vs Noam Chomsky debate on the merits of brute force compute finding a full and complete theory of language is an example. That's a case where the path of "get a ton of data and handle it statistically" competes with the path of "build a complete and abstract understanding of the domain." Lots of resources and lifetimes of work are decided by which path to take.
Agreed that overfitting the bitter lesson often leads slopping piles of compute and hardware at problems that could just be deterministic.
Agreed that overfitting the bitter lesson often leads slopping piles of compute and hardware at problems that could just be deterministic.