
Using an AI system called DefensePredictor, researchers at the Massachusetts Institute of Technology (MIT) have discovered thousands of new proteins that protect bacteria from virus attacks. A search that would usually take months of lab work can now be narrowed down to promising candidates in minutes.
Viral defenses
Bacteria are under constant attack from viruses called bacteriophages. One of their most powerful defenses is CRISPR-Cas, a system that cuts up viral DNA to stop an infection and is now a valuable biotechnology tool for precisely editing genes in a lab.
Traditional methods of finding these defenses are slow and laborious, like looking for a needle in a haystack. They involve searching for genes located near known defensive genes and manually testing thousands of DNA fragments. But now, AI can take the strain.
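The first of these traditional approaches, looking in the genomic neighborhood of known defense genes, can be sketched in a few lines. This is a minimal illustration rather than the procedure of any particular study: the gene names, the `window` parameter and the function `candidates_near_known` are all hypothetical, and a genome is simplified to an ordered list of gene names.

```python
def candidates_near_known(genome, known_defense, window=2):
    """Return genes within `window` positions of a known defense gene.

    genome: list of gene names in chromosomal order (a simplification).
    known_defense: set of gene names already known to be defensive.
    """
    hits = set()
    for i, gene in enumerate(genome):
        if gene in known_defense:
            start = max(0, i - window)
            stop = min(len(genome), i + window + 1)
            hits.update(genome[start:stop])
    # Keep genome order and drop the genes that were already known.
    return [g for g in genome if g in hits and g not in known_defense]


# Toy genome: every name here is invented for illustration only.
genome = ["geneA", "geneB", "cas9", "geneC", "geneD", "geneE", "geneF"]
print(candidates_near_known(genome, {"cas9"}, window=1))
```

Each candidate returned this way still has to be tested in the lab, which is why the approach scales poorly.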
To develop their machine learning tool, the scientists trained it on 17,000 different bacterial genomes, as they describe in a paper published in the journal Science. Because genes contain instructions for making proteins, the system identifies the proteins encoded in each genome and analyzes them using a protein language model called ESM2. It can distinguish between a normal protein and a defensive one by examining specific characteristics, such as gene length, nearby genes and patterns in the DNA sequences surrounding each gene.
To further refine DefensePredictor, the team trained it on 15,000 proteins already known to fight viruses and 186,000 normal proteins that perform everyday tasks. By comparing these two groups, the AI learned to rapidly distinguish defensive proteins from non-defensive ones.
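Learning from two groups of such unequal size (roughly one defensive protein for every twelve normal ones) is a standard imbalanced classification problem. The sketch below shows one common way to handle it, class-weighted logistic regression, on synthetic data. It is not DefensePredictor's architecture, which is not detailed here: the two-number feature vectors, the function names and the training settings are all assumptions chosen for illustration, and only the 15-to-186 class ratio mirrors the figures above.

```python
import math
import random


def train_weighted_logistic(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by gradient descent with class weights."""
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    # Weight each class inversely to its frequency so that the rare
    # "defensive" class contributes as much to the loss as the common one.
    w_pos = len(y) / (2 * n_pos)
    w_neg = len(y) / (2 * n_neg)
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = predict(weights, bias, xi)
            err = (w_pos if yi == 1 else w_neg) * (p - yi)
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(y) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(y)
    return weights, bias


def predict(weights, bias, x):
    """Return the predicted probability that x belongs to class 1."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


# Synthetic stand-in features (assumption): real inputs would be far richer.
random.seed(0)
pos = [[random.gauss(2.0, 1.0), random.gauss(2.0, 1.0)] for _ in range(15)]
neg = [[random.gauss(-2.0, 1.0), random.gauss(-2.0, 1.0)] for _ in range(186)]
X = pos + neg
y = [1] * len(pos) + [0] * len(neg)

weights, bias = train_weighted_logistic(X, y)
print(predict(weights, bias, [2.0, 2.0]) > 0.5)    # typical "defensive" point
print(predict(weights, bias, [-2.0, -2.0]) < 0.5)  # typical "normal" point
```

Without the class weights, a model can score well simply by labeling almost everything as normal, which is the failure mode the weighting is meant to avoid.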
Identifying new defenses
Next came the system’s big test. DefensePredictor scanned 69 diverse E. coli strains and identified 624 protein clusters as defensive. This included more than 100 that had no previously known connection to bacterial immune systems. The researchers then cloned 94 of these predicted systems into E. coli cells and exposed them to 24 different phages. Nearly 45% protected the bacteria from infection.
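To put those figures in proportion, the arithmetic can be worked through directly. The variable names below are illustrative, and because the text gives only "more than 100" and "nearly 45%", both results are approximations rather than reported counts.

```python
clusters_flagged = 624     # protein clusters predicted as defensive
novel_clusters_min = 100   # "more than 100" had no known immune link
systems_tested = 94        # predicted systems cloned and challenged
protective_rate = 0.45     # "nearly 45%", treated here as an upper bound

# Lower bound on the share of flagged clusters that were new to science.
novel_share_min = novel_clusters_min / clusters_flagged

# Approximate number of tested systems that protected against phage.
protective_estimate = round(protective_rate * systems_tested)

print(f"At least {novel_share_min:.0%} of flagged clusters were novel")
print(f"Roughly {protective_estimate} of {systems_tested} tested systems were protective")
```

On these assumptions, at least about one in six flagged clusters was new, and roughly 42 of the 94 tested systems conferred protection.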
“Our results demonstrate that DefensePredictor is a powerful tool for discovering new prokaryotic immune systems,” commented the study authors in their paper.
“The new systems that we identified indicate that E. coli harbors a much broader landscape of antiphage defense than previously realized.”
Not only did the AI tool identify new bacterial defense systems, but it did so far more quickly than traditional methods. Beyond E. coli, the scientists tested their system on 1,000 different microorganisms, and it identified nearly 3,000 protein clusters with no similarity to previously known bacterial immune systems.
The researchers have released DefensePredictor as a resource for the global scientific community and will continue to refine it as new data arrives.
Written by Paul Arnold, edited by Lisa Lock.
Publication details
Peter C. DeWeirdt et al., DefensePredictor: A machine learning model to discover prokaryotic immune systems, Science (2026). DOI: 10.1126/science.adv7924
On GitHub: github.com/PeterDeWeirdt/defense_predictor
