Comparison of proteins based on segments structural similarity.
We present a simple method for fast and accurate comparison of proteins based on their structures. The algorithm relies on structural alignment of segments of Cα chains (99 or 199 residues in size). The method is optimized for speed and accuracy. We test it on 97 representative proteins, using a similarity measure based on the SCOP classification, and compare it with the automatic LGscore2 method. Our method matches the accuracy of LGscore2 while processing the whole test set much faster. A second test, performed with the ToolShop structure prediction evaluation program, shows that our tool is on average slightly less sensitive than the DALI server. Both algorithms yield a similar number of correct models; however, the final alignment quality is better for DALI. The method is implemented under the name 3D-Hit as a web server at http://3dhit.bioinfo.pl/, free for academic use, with a weekly updated database containing a set of 5000 structures from the Protein Data Bank with non-homologous sequences.
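The abstract does not spell out how segment similarity is scored, so the following is only an illustrative sketch of segment-based comparison, not the 3D-Hit implementation. It assumes each chain is given as a list of Cα coordinates and scores fixed-length windows with distance RMSD (dRMSD), a superposition-free measure swapped in here for simplicity. The function names `segments`, `drmsd`, and `best_segment_pair` are hypothetical.

```python
import math
from itertools import combinations


def segments(coords, size):
    """Yield (start, window) for every contiguous window of `size` C-alpha atoms."""
    for start in range(len(coords) - size + 1):
        yield start, coords[start:start + size]


def drmsd(seg_a, seg_b):
    """Distance RMSD between two equal-length segments.

    Compares the internal C-alpha/C-alpha distances of each segment,
    so no rotation or translation has to be fitted first.
    (Assumption: this stands in for whatever score the real method uses.)
    """
    if len(seg_a) != len(seg_b) or len(seg_a) < 2:
        raise ValueError("segments must have equal length of at least 2")
    pairs = list(combinations(range(len(seg_a)), 2))
    total = 0.0
    for i, j in pairs:
        diff = math.dist(seg_a[i], seg_a[j]) - math.dist(seg_b[i], seg_b[j])
        total += diff * diff
    return math.sqrt(total / len(pairs))


def best_segment_pair(chain_a, chain_b, size):
    """Exhaustively find the most similar pair of windows.

    Returns (score, start_in_a, start_in_b), or None if either chain
    is shorter than `size`. Lower scores mean more similar segments.
    """
    best = None
    for i, seg_a in segments(chain_a, size):
        for j, seg_b in segments(chain_b, size):
            score = drmsd(seg_a, seg_b)
            if best is None or score < best[0]:
                best = (score, i, j)
    return best


if __name__ == "__main__":
    # Toy coordinates, not taken from any real structure.
    chain = [(float(i), (i * i % 7) * 0.5, (i * 3 % 5) * 0.7) for i in range(10)]
    moved = [(x + 5.0, y - 2.0, z + 1.0) for x, y, z in chain]
    print(best_segment_pair(chain, moved, 4))
```

This brute-force search compares every window against every other window; a method tuned for speed would need to prune that search. Note also that dRMSD cannot tell a segment from its mirror image, which superposition-based scores can.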