Publications
Exploring “dark matter” protein folds using deep learning
Abstract
De novoprotein design aims to explore uncharted sequence-and structure areas to generate novel proteins that have not been sampled by evolution. One of the main challenges inde novodesign involves crafting “designable” structural templates that can guide the sequence search towards adopting the target structures. Here, we present an approach to learn patterns of protein structure based on a convolutional variational autoencoder, dubbed Genesis. We coupled Genesis with trRosetta to design sequences for a set of protein folds and found that Genesis is capable of reconstructing native-like distance-and angle distributions for five native folds and three novel, so-called “dark-matter” folds as a demonstration of generalizability. We used a high-throughput assay to characterize protease resistance of the designs, obtaining encouraging success rates for folded proteins and further biochemically characterized folded designs. The Genesis framework enables the exploration of the protein sequence and fold space within minutes and is not bound to specific protein topologies. Our approach addresses the backbone designability problem, showing that structural patterns in proteins can be efficiently learned by small neural networks and could ultimately contribute to thede novodesign of proteins with new functions.
Product Used
Genes
Related Publications