Publications
Cell Systems, Mar 2025 | 16(3), 101236
DOI: 10.1016/j.cels.2025.101236

Engineering highly active nuclease enzymes with machine learning and high-throughput screening

Thomas, Neil; Belanger, David; Xu, Chenling; Lee, Hanson; Hirano, Kathleen; Iwai, Kosuke; Polic, Vanja; Nyberg, Kendra D; Hoff, Kevin G; Frenz, Lucas; Emrich, Charlie A; Kim, Jun W; Chavarha, Mariya; Ramanan, Abi; Agresti, Jeremy J; Colwell, Lucy J
Product Used
Variant Libraries
Abstract
Optimizing enzymes to function in novel chemical environments is a central goal of synthetic biology, but optimization is often hindered by a rugged fitness landscape and costly experiments. In this work, we present TeleProt, a machine learning (ML) framework that blends evolutionary and experimental data to design diverse protein libraries, and employ it to improve the catalytic activity of a nuclease enzyme that degrades biofilms that accumulate on chronic wounds. After multiple rounds of high-throughput experiments, TeleProt found a significantly better top-performing enzyme than directed evolution (DE), had a better hit rate at finding diverse, high-activity variants, and was even able to design a high-performance initial library using no prior experimental data. We have released a dataset of 55,000 nuclease variants, one of the most extensive genotype-phenotype enzyme activity landscapes to date, to drive further progress in ML-guided design. A record of this paper's transparent peer review process is included in the supplemental information.
