 
Evolution has been fine-tuning life at the molecular level for billions of years. Proteins, the fundamental building blocks of life, have evolved through this process to perform various biological functions, from fighting infections to digesting food. These complex molecules comprise long chains of amino acids arranged in precise sequences that dictate their structure and function. While nature has produced an extraordinary diversity of proteins, understanding their structure and designing entirely new proteins has long been a complex challenge for scientists.
Recent advances in artificial intelligence are transforming our ability to tackle some of biology’s most significant challenges. Previously, AI was used mainly to predict how a given protein sequence would fold and behave, a hard problem because of the vast number of possible configurations. More recently, AI has advanced to generating entirely new proteins at an unprecedented scale. This milestone has been achieved with ESM3, a multimodal generative language model developed by EvolutionaryScale. Unlike conventional language models built for text, ESM3 has been trained on protein sequences, structures, and functions. What makes it remarkable is its ability to simulate the equivalent of 500 million years of evolution, a feat that has led to the creation of a completely new fluorescent protein, something never before seen in nature.
This breakthrough is a significant step toward making biology more programmable, opening new possibilities for designing custom proteins with applications in medicine, materials science, and beyond. In this article, we explore how ESM3 works, what it has achieved, and why this advancement is reshaping our understanding of biology and evolution.
Meet ESM3: The AI That Simulates Evolution
ESM3 is a multimodal language model trained to understand and generate proteins by analyzing their sequences, structures, and functions. Unlike AlphaFold, which can predict the structure of existing proteins, ESM3 is essentially a protein engineering model, allowing researchers to specify functional and structural requirements to design entirely new proteins.
ESM3 combines what it has learned about protein sequences, structures, and functions with the ability to generate proteins interactively, in response to prompts from users. This allows it to produce proteins that may not exist in nature yet remain biologically viable. The creation of a novel green fluorescent protein (esmGFP) is a striking demonstration of this capability. Fluorescent proteins, initially discovered in jellyfish and corals, are widely used in medical research and biotechnology. To develop esmGFP, researchers provided ESM3 with key structural and functional characteristics of known fluorescent proteins. The model then iteratively refined the design, applying a chain-of-thought reasoning approach to optimize the sequence. While natural evolution could take millions of years to produce a similar protein, ESM3 accelerates this process to achieve it in days or weeks.
The AI-Driven Protein Design Process
Here is how researchers have used ESM3 to develop esmGFP:
- Prompting the AI – Initially, they input sequence and structural cues to guide ESM3 toward fluorescence-related features.
- Generating Novel Proteins – ESM3 explored a vast space of potential sequences to produce thousands of candidate proteins.
- Filtering and Refinement – The most promising designs were filtered and synthesized for laboratory testing.
- Validation in Living Cells – Selected AI-designed proteins were expressed in bacteria to confirm their fluorescence and functionality.
This process has resulted in a fluorescent protein (esmGFP) unlike anything in nature.
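The prompt, generate, filter, and shortlist steps above can be sketched as a simple loop. This is only a toy illustration, not ESM3 and not the researchers' code: `propose_candidates` fills masked positions with random residues where a real model would sample from learned distributions, and `toy_score` is a hypothetical placeholder ranking, not a fluorescence predictor. The prompt string and all function names are assumptions made for this example.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def propose_candidates(prompt, n, rng):
    """Toy stand-in for the generative step: fill each masked position ('_')
    in the prompt with a random residue and keep fixed residues unchanged."""
    return [
        "".join(rng.choice(AMINO_ACIDS) if ch == "_" else ch for ch in prompt)
        for _ in range(n)
    ]


def toy_score(seq):
    """Hypothetical in-silico filter: fraction of aromatic residues (F, W, Y).
    A placeholder ranking only; it does not predict fluorescence."""
    return sum(seq.count(a) for a in "FWY") / len(seq)


def design_round(prompt, n_candidates=1000, keep=5, seed=0):
    """Generate many candidates, rank them, and return a short list."""
    rng = random.Random(seed)
    candidates = set(propose_candidates(prompt, n_candidates, rng))
    # Highest score first; ties broken alphabetically so results are repeatable.
    ranked = sorted(candidates, key=lambda s: (-toy_score(s), s))
    return ranked[:keep]  # these would move on to synthesis and lab testing


# Fixed residues act as the "prompt"; underscores mark positions to design.
shortlist = design_round("MS__GE__FT__TYG__LK")
for seq in shortlist:
    print(seq, round(toy_score(seq), 3))
```

In the real workflow, the final step is experimental: the shortlist is synthesized and expressed in cells, which no scoring function in software can replace.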
How esmGFP Compares to Natural Proteins
What makes esmGFP extraordinary is how distant it is from known fluorescent proteins. While most newly discovered GFPs differ only slightly from existing ones, esmGFP has a sequence identity of only 58% to its closest natural relative. In evolutionary terms, such a difference corresponds to a divergence time of over 500 million years.
To put this into perspective, the last time proteins with similar evolutionary distances emerged, dinosaurs had not yet appeared, and multicellular life was still in its early stages. This means AI has not just accelerated evolution – it has simulated an entirely new evolutionary pathway, producing proteins that nature might never have created.
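The 58% sequence identity figure is a measure of how many positions two aligned proteins share. A minimal sketch of the calculation follows, assuming the two sequences have already been aligned (with `-` marking gaps) and using one common convention: matches divided by the number of aligned columns. The article does not say which alignment method or convention was used for esmGFP, and the short sequences here are made up for illustration.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns where both sequences carry the same residue.

    Assumes the inputs are already aligned and of equal length, with '-'
    marking gaps. Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    # A gap never counts as a match, even against another gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


# Made-up 10-column alignment: 8 identical positions, 1 mismatch, 1 gap.
identity = percent_identity("MSKGEELFTG", "MSKGAELF-G")
print(f"{identity:.1f}% identity")  # prints "80.0% identity"
```

By this measure, a protein at 58% identity differs from its nearest relative at roughly four out of every ten aligned positions.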
Why This Discovery Matters
This development is a significant step forward in protein engineering and deepens our understanding of evolution. By simulating millions of years of evolution in just days, AI is opening doors to exciting new possibilities:
- Faster Drug Discovery: Many medicines work by targeting specific proteins, but finding the right ones is slow and expensive. AI-designed proteins could speed up this process, helping researchers discover new treatments more efficiently.
- New Solutions in Bioengineering: Proteins are used in everything from breaking down plastic waste to detecting diseases. With AI-driven design, scientists can create custom proteins for healthcare, environmental protection, and even new materials.
- AI as an Evolutionary Simulator: One of the most intriguing aspects of this research is that it positions AI as a simulator of evolution rather than just a tool for analysis. Traditional evolutionary simulations involve iterating through genetic mutations, often taking months or years to generate viable candidates. ESM3, however, bypasses these slow constraints by predicting functional proteins directly. This shift in approach means that AI could not just mimic evolution but actively explore evolutionary possibilities beyond nature. Given enough computational power, AI-driven evolution could uncover new biochemical properties that have never existed in the natural world.
Ethical Considerations and Responsible AI Development
While the potential benefits of AI-driven protein engineering are immense, this technology also raises ethical and safety questions. What happens when AI starts designing proteins beyond human understanding? How do we ensure these proteins are safe for medical or environmental use?
We need to focus on responsible AI development and thorough testing to tackle these concerns. AI-generated proteins, like esmGFP, should undergo extensive laboratory testing before being considered for real-world applications. Additionally, ethical frameworks for AI-driven biology are being developed to ensure transparency, safety, and public trust.
The Bottom Line
The launch of ESM3 is a major development in biotechnology. ESM3 demonstrates that evolution need not be a slow, trial-and-error process. Compressing the equivalent of 500 million years of protein evolution into just days opens a future where scientists can design brand-new proteins with remarkable speed and accuracy. It means that we can use AI not just to understand biology but also to reshape it. This breakthrough advances our ability to program biology the way we program software, unlocking possibilities we’re only beginning to imagine.
The post AI Just Simulated 500 Million Years of Evolution – And Created a New Protein! appeared first on Unite.AI.

 
			 
			