Template-based protein structure modeling using the RaptorX web server

Källberg, Morten; Wang, Haipeng; Wang, Sheng; Peng, Jian; Wang, Zhiyong; Lu, Hui; Xu, Jinbo

doi:10.1038/nprot.2012.085

Protocol
Published: 19 July 2012

Morten Källberg¹,², Haipeng Wang¹, Sheng Wang¹, Jian Peng¹, Zhiyong Wang¹, Hui Lu² & Jinbo Xu¹

Nature Protocols volume 7, pages 1511–1522 (2012)

Abstract

A key challenge of modern biology is to uncover the functional role of the protein entities that compose cellular proteomes. To this end, the availability of reliable three-dimensional atomic models of proteins is often crucial. This protocol presents a community-wide web-based method using RaptorX (http://raptorx.uchicago.edu/) for protein secondary structure prediction, template-based tertiary structure modeling, alignment quality assessment and sophisticated probabilistic alignment sampling. RaptorX distinguishes itself from other servers by the quality of the alignment between a target sequence and one or multiple distantly related template proteins (especially those with sparse sequence profiles) and by a novel nonlinear scoring function and a probabilistic-consistency algorithm. Consequently, RaptorX delivers high-quality structural models for many targets with only remote templates. At present, it takes RaptorX ∼35 min to finish processing a sequence of 200 amino acids. Since its official release in August 2011, RaptorX has processed ∼6,000 sequences submitted by ∼1,600 users from around the world.
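To make the notion of a target-template alignment concrete, the following is a minimal sketch of global pairwise alignment by dynamic programming (Needleman-Wunsch). It is not the RaptorX method: the abstract describes a nonlinear scoring function and probabilistic alignment sampling, whereas this sketch swaps in a plain match/mismatch score with a linear gap penalty. The function name `global_align` and all parameter values are assumptions chosen for illustration.

```python
def global_align(target, template, match=2, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a linear gap penalty.

    Illustrative only: the scores are hypothetical and far simpler than
    the scoring function used by a real threading server.
    """
    n, m = len(target), len(template)
    # score[i][j] = best score for aligning target[:i] with template[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if target[i - 1] == template[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in the template
                score[i][j - 1] + gap,      # gap in the target
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    aligned_target, aligned_template = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if target[i - 1] == template[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                aligned_target.append(target[i - 1])
                aligned_template.append(template[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_target.append(target[i - 1])
            aligned_template.append("-")
            i -= 1
        else:
            aligned_target.append("-")
            aligned_template.append(template[j - 1])
            j -= 1
    return (
        score[n][m],
        "".join(reversed(aligned_target)),
        "".join(reversed(aligned_template)),
    )


if __name__ == "__main__":
    best, row_target, row_template = global_align("ACGT", "AGT")
    print(best)
    print(row_target)
    print(row_template)
```

In template-based modeling, an alignment of this kind determines which template residues supply coordinates for which target residues, which is why alignment quality matters so much when only remote templates are available.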

**Figure 1: Performance assessment of core prediction modules in the RaptorX server.**

**Figure 2: Workflow used by the RaptorX server.**

**Figure 4: Secondary structure result interface.**

**Figure 5: Tertiary structure result interface.**

**Figure 6: Disorder prediction result display.**

**Figure 7: Custom alignment result interface.**

**Figure 8: Domain parsing result display.**

Acknowledgements

This work is supported by US National Institutes of Health grant R01GM0897532, US National Science Foundation grant DBI-0960390, a Microsoft PhD Research Fellowship, an FMC Educational Fund Fellowship and the Toyota Technological Institute at Chicago summer intern program. We are grateful to the University of Chicago Beagle team, TeraGrid and Canada's Shared Hierarchical Academic Research Computing Network (SHARCNet) for their support of computational resources.

Author information

Morten Källberg and Haipeng Wang: These authors contributed equally to this work.

Authors and Affiliations

Toyota Technological Institute at Chicago, Chicago, Illinois, USA
Morten Källberg, Haipeng Wang, Sheng Wang, Jian Peng, Zhiyong Wang & Jinbo Xu
Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, USA
Morten Källberg & Hui Lu

Contributions

J.X. conceived and supervised the project. M.K. and H.W. designed and developed the web server. H.L. oversaw server development. J.P. developed the threading algorithm. S.W. designed the template database. Z.W. developed the protein secondary structure prediction algorithm. M.K. and J.X. wrote the paper.

Corresponding author

Correspondence to Jinbo Xu.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

About this article

Cite this article

Källberg, M., Wang, H., Wang, S. et al. Template-based protein structure modeling using the RaptorX web server. Nat Protoc 7, 1511–1522 (2012). https://doi.org/10.1038/nprot.2012.085

Issue Date: August 2012
DOI: https://doi.org/10.1038/nprot.2012.085
