Meta ESM-2 Fold - AI faster than Alphafold 2

In this video I go over the latest progress from Meta on protein folding: their ESM-2 model, used to predict the structures of metagenomic proteins. This is a big step forward in the field.
In the paper they compare its performance to AlphaFold 2 and RoseTTAFold.

Paper Title: Evolutionary-scale prediction of atomic level protein structure with a language model

Abstract:
Artificial intelligence has the potential to open insight into the structure of proteins at the scale of evolution. It has only recently been possible to extend protein structure prediction to two hundred million cataloged proteins. Characterizing the structures of the exponentially growing billions of protein sequences revealed by large scale gene sequencing experiments would necessitate a breakthrough in the speed of folding. Here we show that direct inference of structure from primary sequence using a large language model enables an order of magnitude speed-up in high resolution structure prediction. Leveraging the insight that language models learn evolutionary patterns across millions of sequences, we train models up to 15B parameters, the largest language model of proteins to date. As the language models are scaled they learn information that enables prediction of the three-dimensional structure of a protein at the resolution of individual atoms. This results in prediction that is up to 60x faster than state-of-the-art while maintaining resolution and accuracy. Building on this, we present the ESM Metagenomic Atlas. This is the first large-scale structural characterization of metagenomic proteins, with more than 617 million structures. The atlas reveals more than 225 million high confidence predictions, including millions whose structures are novel in comparison with experimentally determined structures, giving an unprecedented view into the vast breadth and diversity of the structures of some of the least understood proteins on earth.

Timestamps:
0:00 Intro
1:18 ESMfold Intro
1:45 Metagenomics
2:11 Protein Folding Background
3:07 Model Architecture
4:51 Results
6:42 Results vs. AlphaFold 2
7:47 Why is Meta doing this?
8:31 Practical Uses

Comments:

Great job, Matej! You were always good at explaining things!

aogozen

Excellent. Thanks for making this informative video. You should have 100x the subscribers.

BrienDunn

Great content, keep up the good work.

filipsand

ESM-3 came out this year, 2024. The same year as AlphaFold 3.

squamish

Hi, in addition to what the video covers, there are ColabFold and other variants of AlphaFold2, such as OmegaFold2 (which is superior on short sequences).

ESMFold is the fastest method for sequences of length 50 and 100. However, ESMFold is not as accurate as OmegaFold or ColabFold; both are more accurate methods than ESMFold.

OpenFold is still missing from the comparison.

samuelsaldana

Does this resolve the quality issues mentioned on the wiki page of protein folding that claims 1/3 of the data produced by AlphaFold was unusable?

almor

OK, so what is the difference between these *Fold systems and the LLMs that have gotten so much attention in the last few weeks? Are they different from the systems that can produce the most efficient structural designs for things like vehicles or furniture frames? What is AlphaTensor? What other types of systems are there?

eaudesolero

Your "honestly I'm not entirely sure" at 7:53 speaks volumes... 😀 That's all fascinating and fine. But it remains an estimation tool that will always need empirical verification to check whether the prediction is correct. And, at the end of the day, apart from new theoretical insights (yes, evolutionary biology might take great advantage of it), I doubt it will lead us to real-life practical applications such as curing diseases, because we are light years away from bridging the gap between protein structure and organic function. We have no idea why a specific architecture leads to a specific functionality. In fact, even once we know all the protein structures, that will not automatically tell us how to design new drugs from them. Protein structure by itself will be no more informative for designing new drugs than the mapping of the genome was for designing drugs against genetic diseases. As usual, again and again, it turns out that the map is not the territory.
Anyway, thank you for updating us.

The-Wide-Angle

This is also why people worry about vaccines.

itsjaysenofficial