2004 | 51 | 2 | 349-371
Article title

Protein modeling and structure prediction with a reduced representation.

Protein modeling can be carried out at various levels of structural detail, from simplified lattice or continuous representations, through high-resolution reduced models that employ a united-atom representation, to the all-atom models of molecular mechanics. Here I describe a new high-resolution reduced model, its force field, and its applications in structural proteomics. The model uses a lattice representation with 800 possible orientations of the virtual alpha carbon-alpha carbon bonds. The conformational space is sampled with the Replica Exchange Monte Carlo method. The knowledge-based potentials of the force field include generic protein-like conformational biases, statistical potentials for short-range conformational propensities, a model of main-chain hydrogen bonds, and context-dependent statistical potentials describing side-group interactions. The model is more accurate than previously designed lattice models, and in many applications it is complementary to and competitive with all-atom techniques. The test applications include ab initio structure prediction, multitemplate comparative modeling, and structure prediction based on sparse experimental data. In particular, the new approach to comparative modeling could be a valuable tool for structural proteomics. It is shown that the new approach extends beyond the range of applicability of traditional methods of protein comparative modeling.
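The abstract names Replica Exchange Monte Carlo as the sampling scheme but gives no details of it. The following is only a minimal sketch of the general technique, not the author's implementation: several replicas of a system are simulated at different temperatures with Metropolis moves, and neighbouring replicas occasionally attempt to swap configurations. The one-dimensional `energy` function, the temperature ladder, the step size, and all function names are hypothetical stand-ins chosen for illustration; the real model samples lattice chain conformations under a knowledge-based force field.

```python
import math
import random


def energy(x):
    """Toy rugged 1D landscape (hypothetical stand-in for a protein force field)."""
    return 0.1 * x * x + math.sin(5.0 * x)


def metropolis_accept(delta, rng):
    """Metropolis rule: accept with probability min(1, exp(-delta))."""
    return delta <= 0.0 or rng.random() < math.exp(-delta)


def replica_exchange(temperatures, n_sweeps=2000, step=0.5, seed=0):
    """Run replica exchange Monte Carlo on the toy landscape.

    temperatures must be sorted from coldest to hottest.
    Returns the final replica positions and the lowest-energy position seen.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(-5.0, 5.0) for _ in temperatures]
    best_x = min(xs, key=energy)

    for _ in range(n_sweeps):
        # Local Metropolis move in every replica at its own temperature.
        for i, temp in enumerate(temperatures):
            trial = xs[i] + rng.uniform(-step, step)
            if metropolis_accept((energy(trial) - energy(xs[i])) / temp, rng):
                xs[i] = trial
                if energy(trial) < energy(best_x):
                    best_x = trial

        # Attempt one swap between a random pair of neighbouring temperatures.
        i = rng.randrange(len(temperatures) - 1)
        beta_cold = 1.0 / temperatures[i]
        beta_hot = 1.0 / temperatures[i + 1]
        # Swap is accepted with probability
        # min(1, exp((beta_cold - beta_hot) * (E_cold - E_hot))).
        delta = (beta_cold - beta_hot) * (energy(xs[i + 1]) - energy(xs[i]))
        if metropolis_accept(delta, rng):
            xs[i], xs[i + 1] = xs[i + 1], xs[i]

    return xs, best_x


if __name__ == "__main__":
    final_xs, best = replica_exchange([0.1, 0.3, 1.0, 3.0])
    print("final positions:", final_xs)
    print("best position:", best, "energy:", energy(best))
```

The hot replicas cross energy barriers easily, while the swap step lets low-energy configurations they discover migrate toward the cold replicas, which is the usual motivation for using this scheme on rugged landscapes.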
  • Faculty of Chemistry, Warsaw University, Warszawa, Poland