Abstract
Understanding the effects of genetic variation is a fundamental problem in biology that requires methods to analyse both physical and functional consequences of sequence changes at systems-wide and mechanistic scales. To achieve a systems view, protein interaction networks map which proteins physically interact, while genetic interaction networks inform on the phenotypic consequences of perturbing these protein interactions. Until recently, understanding the molecular mechanisms that underlie these interactions often required biophysical methods to determine the structures of the proteins involved. The past decade has seen the emergence of new approaches based on coevolution, deep mutational scanning and genome-scale genetic or chemical–genetic interaction mapping that enable modelling of the structures of individual proteins or protein complexes. Here, we review the emerging use of large-scale genetic datasets and deep learning approaches to model protein structures and their interactions, and discuss the integration of structural data from different sources.
Acknowledgements
The authors thank P. Beltrao and R. B. Babu for helpful discussion and comments on the manuscript. This research was funded by grants from the National Institutes of Health (NIH) (U54CA209891, U54NS100717, 1U01MH115747, U19AI135990, U19AI135972 and P50AI150476 to N.J.K.; R01GM083960 and P41GM109824 to A.S.). This work was supported by the Defense Advanced Research Projects Agency (DARPA) under Cooperative Agreements HR00111920020 and HR00112020029 to N.J.K. The views, opinions and/or findings contained in this material are those of the authors and should not be interpreted as representing the official views or policies of the Department of Defense or the US Government.
Author information
Contributions
The authors contributed equally to all aspects of the article.
Ethics declarations
Competing interests
The Krogan Laboratory has received research support from Vir Biotechnology and F. Hoffmann-La Roche. N.J.K. has consulting agreements with the Icahn School of Medicine at Mount Sinai, New York, Maze Therapeutics and Interline Therapeutics. N.J.K. is a shareholder in Tenaya Therapeutics, Maze Therapeutics and Interline Therapeutics, and a financially compensated Scientific Advisory Board Member for GEn1E Lifesciences, Inc. The other authors declare no competing interests.
Additional information
Peer review information
Nature Reviews Genetics thanks the anonymous reviewers for their contribution to the peer review of this work.
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Glossary
- Multiple sequence alignment
  An alignment of the sequences from multiple proteins. The multiple sequence alignment defines how the residue positions in each protein relate to those of the other proteins.
- Protein family
  A group of evolutionarily related proteins. The members of a protein family will typically have similar sequences and/or structures and related functions.
- Orthologues
  Evolutionarily related genes in different species. The proteins encoded by orthologous genes are typically responsible for the same function in the respective organisms.
- Paralogues
  Genes with similar sequences that originated via a duplication event within a genome. Paralogues belong to the same species and their encoded proteins are typically not involved in the same function.
- Neural network
  A category of machine learning that is inspired by the human brain and is central to deep learning algorithms.
- Homology modelling
  A method for determining the structure of a protein on the basis of sequence similarity with another protein of known structure by satisfying spatial restraints.
- Subunits
  Single proteins in the context of a protein complex.
- Knockdowns
  Genes whose expression has been reduced.
- Complex haploinsufficiencies
  Negative genetic interactions observed in cells that are hemizygous for two different genes. The phenotype of the two hemizygous loci combined is more severe than expected if the genes were unrelated.
- Hemizygous
  A diploid cell is hemizygous for a gene if it harbours only one functional allele of the gene.
- Allostery
  A process whereby an active site in a protein (enzyme) is regulated by the binding of a molecule to a different site (typically distal in space).
- Knockouts
  Genes that have been inactivated (for example, deleted).
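As a rough illustration of the "more severe than expected" criterion in the glossary entry on complex haploinsufficiencies, the sketch below scores a genetic interaction as the difference between the observed double-mutant fitness and the product of the two single-mutant fitness values. The multiplicative expectation is one commonly used null model and is an assumption here, not something this glossary specifies. The function names, the example fitness values and the 0.05 tolerance are hypothetical.

```python
def interaction_score(w_a: float, w_b: float, w_ab: float) -> float:
    """Observed double-mutant fitness minus the multiplicative expectation.

    w_a, w_b: fitness of each single mutant relative to wild type (wild type = 1.0).
    w_ab: observed fitness of the double mutant.
    Assumption: unrelated genes combine multiplicatively (expected = w_a * w_b).
    """
    expected = w_a * w_b
    return w_ab - expected


def classify_interaction(score: float, tol: float = 0.05) -> str:
    """Label a score as negative, positive or neutral.

    tol is a hypothetical cutoff; real screens derive it from replicate noise.
    """
    if score < -tol:
        return "negative"  # double mutant is sicker than expected
    if score > tol:
        return "positive"  # double mutant is healthier than expected
    return "neutral"


# Hypothetical example: two mild single mutants, one sick double mutant.
score = interaction_score(w_a=0.9, w_b=0.8, w_ab=0.5)
label = classify_interaction(score)
print(f"score={score:.2f}, label={label}")
```

With these made-up values the expected double-mutant fitness is 0.72, so an observed fitness of 0.5 gives a score of about -0.22 and is labelled as a negative interaction. Other null models (for example, additive or minimum-based) would give different scores for the same measurements.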
About this article
Cite this article
Braberg, H., Echeverria, I., Kaake, R.M. et al. From systems to structure — using genetic data to model protein structures. Nat Rev Genet 23, 342–354 (2022). https://doi.org/10.1038/s41576-021-00441-w
DOI: https://doi.org/10.1038/s41576-021-00441-w