Preferences

But this is just a detail, right? If we went and painstakingly catalogued millions of proteins, we'd be able to use the simple model without needing a complex model to generated data, no?

connorbrinton
Technically yes. But it can take months to years to experimentally obtain the structure for a single protein, and that assumes that it's possible to crystallize (X-ray), prepare grids (cryo-EM) or highly concentrate (NMR) the protein at all.

On the other hand, validating a predicted protein structure to a good level of accuracy is much easier (solvent accessibility, mutagenesis, etc.). So having a complex model that can be trained on a small dataset drastically expands the set of accurate protein structure samples available to future models, both through direct predictions and validated protein structures.

So technically yes, this dataset could have been collected solely experimentally, but in practice, AlphaFold is now part of the experimental process. Without it, the world would have less protein structure data, in terms of both directly predicted and experimentally verified protein structures

stavros OP
I agree, I guess I'm saying that it's more of a quantitative improvement, rather than a qualitative one.

This item has no comments currently.