StructAnalyzer - a tool for sequence versus structure similarity analysis
In the world of RNAs and proteins, similarity between the primary structures of two molecules usually corresponds to similarity at the tertiary level. In other words, measures of sequence and structure similarity are generally correlated: high sequence similarity tends to imply high structural similarity. However, there are important exceptions to this general rule. It is possible to find similar structures with very different sequences, as well as similar sequences with very different structures. In this paper, we focus on the latter case and propose a tool, called StructAnalyzer, that supports analysis of the relationship between sequence and structure similarity. Recognizing tertiary structure diversity among molecules with very similar primary structures may be key to a better understanding of the mechanisms that influence the folding of RNAs and proteins and, as a result, of their function. StructAnalyzer allows exploration and visualization of structural diversity in relation to sequence similarity. We show how this tool can be used to screen RNA structures in the Protein Data Bank (PDB) for sequences with structural variants.
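The screening idea described above, looking for molecules whose sequences are very similar but whose structures differ, can be illustrated with a minimal sketch. This is not StructAnalyzer's actual implementation. It assumes the sequences are already aligned to equal length, that a pairwise structural distance (normalized here to the range 0 to 1) has been computed elsewhere, and that the function names, molecule identifiers, and both thresholds are hypothetical values chosen only for illustration.

```python
from itertools import combinations


def sequence_identity(a: str, b: str) -> float:
    """Fraction of matching positions in two equal-length, pre-aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def flag_structural_variants(sequences, structural_distance,
                             min_identity=0.9, min_distance=0.5):
    """Return pairs with similar sequences but dissimilar structures.

    sequences: dict mapping an identifier to its aligned sequence.
    structural_distance: dict mapping a sorted identifier pair to a
        precomputed distance (assumed: 0 = identical, 1 = very different).
    Both thresholds are illustrative assumptions, not published values.
    """
    flagged = []
    for id_a, id_b in combinations(sorted(sequences), 2):
        identity = sequence_identity(sequences[id_a], sequences[id_b])
        distance = structural_distance[(id_a, id_b)]
        if identity >= min_identity and distance >= min_distance:
            flagged.append((id_a, id_b, identity, distance))
    return flagged


# Toy input: hypothetical identifiers, sequences, and distance values.
sequences = {
    "mol_A": "GGCUAGCUAGCC",
    "mol_B": "GGCUAGCUAGCU",
    "mol_C": "AUAUAUCGCGAU",
}
structural_distance = {
    ("mol_A", "mol_B"): 0.8,
    ("mol_A", "mol_C"): 0.7,
    ("mol_B", "mol_C"): 0.1,
}
variants = flag_structural_variants(sequences, structural_distance)
```

In this toy data only the pair mol_A and mol_B is flagged: the two sequences differ at a single position, yet their assumed structural distance is large. In practice the identity would come from a proper alignment and the distance from a dedicated structure comparison measure.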