AI Predicts 1 Billion Protein Structures in Open Atlas

Scientists just released a free database of over 1 billion protein structures predicted by AI, more than quadrupling the size of previous collections. The breakthrough tool, ESMFold2, could accelerate discoveries in medicine and help researchers design new treatments for cancer and immune diseases.

The catalog of known proteins just expanded by more than a billion entries, thanks to a powerful new AI tool that anyone can use for free.

Researchers at the Chan Zuckerberg Initiative's Biohub in San Francisco unveiled ESMFold2, an artificial intelligence system that predicted the structures of 1.1 billion proteins. The team also released information on 6.8 billion protein sequences, most from poorly understood environments like soil and ocean samples.

The new ESM Atlas holds over 800 million more entries than Google DeepMind's AlphaFold Database. Even more exciting, ESMFold2 outperforms AlphaFold3 at predicting how proteins interact with each other, a crucial ability for developing new medicines.

"What this atlas does is it shows the totality of protein biology and especially the parts that are most unknown," says Alex Rives, Biohub's science head who led the project. The team trained ESMFold2 on billions of proteins from across the tree of life, including genetic material from environments that previous databases missed entirely.

The researchers put their tool to the test by designing new antibodies and proteins that attach to targets linked to cancer and immune disorders. When they produced these designs in the lab, a high proportion bound their targets as predicted.

Using their atlas, the team has already discovered unexpected connections in nature. They found structural similarities between CRISPR defense proteins in microbes and a gene-editing protein identified in a soil fungus, revealing how evolution repurposes similar protein shapes across vastly different organisms.

Why This Inspires

What makes this breakthrough truly special is its accessibility. Unlike many cutting-edge AI tools locked behind corporate paywalls, ESMFold2 is completely open source with no restrictions on commercial use.

Researchers outside the project have welcomed the release. "It's exciting to see how large scale protein language models can capture fundamental rules of protein biology," says Gemma Atkinson, a computational biologist at Lund University in Sweden.

Christine Orengo at University College London notes the predictions could help uncover entirely new protein shapes and functions, advancing both practical protein design and our basic understanding of life itself. The atlas gives researchers a bridge between well-studied proteins and the vast unknown territory of biology waiting to be explored.

Computational biologist Sergey Ovchinnikov at MIT expects widespread enthusiasm for trying ESMFold2, particularly because its open nature allows researchers to build upon it freely. The tool arrives at a moment when protein AI models are advancing at breakneck speed, each breakthrough building on the last.

This massive expansion of humanity's protein knowledge puts powerful discovery tools in the hands of researchers everywhere, from major universities to small labs chasing the next medical breakthrough.

Based on reporting by Scientific American

This story was written by BrightWire based on verified news reports.
