Smooth muscle contamination analysis in clinical oncology gene expression research
Gene expression profiling is among the most widely used methods for studying cancer, and microarray data repositories have become a rich and important resource. The most common human cancers develop in organs whose walls contain smooth muscle. Microdissection is the only method of sample extraction free of unintentional contamination with surrounding tissue, yet it is implemented infrequently. A large portion of publicly available data may therefore be contaminated with smooth muscle. In this study, 2292 publicly available microarrays were analysed to develop a simple screening method for detecting smooth muscle contamination. Microarray Inspector software was used to perform the tests because it has the unique ability to use many selected genes and probesets, combined in a single group, as a tissue definition. Furthermore, the test was dataset-independent. Two strategies of tissue definition were explored and compared. The first relied on the Tissue Specific Genes Database (TiSGeD) and BioGPS web resources, which are themselves based on meta-analysis of thousands of microarrays. The second was based on a differential gene expression analysis of a few hundred preselected arrays. The comparison showed the second strategy to be superior. Among the tested samples of undefined contamination status, nearly half were identified as possibly containing significant smooth muscle traces. The results equip researchers with a simple method of examining microarray data for smooth muscle contamination. The presented work also serves as an example of how to create tissue definitions when screening for other possible contaminants.
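The idea of a dataset-independent screen built on a tissue definition can be illustrated with a minimal sketch. It is not the algorithm implemented in Microarray Inspector. It assumes a simple rank-based score computed within one sample: the probability that a randomly chosen marker gene is expressed above a randomly chosen non-marker gene. The marker list, the `GENE_*` names, the expression values, and the `threshold` of 0.9 are all hypothetical and chosen only for illustration; they are not the tissue definitions derived in the study.

```python
from typing import Dict, Iterable

# Hypothetical tissue definition: a few widely used smooth muscle markers.
# This is NOT the definition derived in the study.
SMOOTH_MUSCLE_MARKERS = ["ACTA2", "MYH11", "DES"]


def marker_auc(expression: Dict[str, float], markers: Iterable[str]) -> float:
    """Probability that a marker gene outranks a non-marker gene in one sample.

    Only within-sample comparisons are used, so the score does not depend
    on any other array in the dataset. Ties count as half a win.
    """
    marker_set = {g for g in markers if g in expression}
    marker_vals = [expression[g] for g in marker_set]
    other_vals = [v for g, v in expression.items() if g not in marker_set]
    if not marker_vals or not other_vals:
        raise ValueError("need at least one marker and one non-marker gene")

    wins = 0.0
    for m in marker_vals:
        for o in other_vals:
            if m > o:
                wins += 1.0
            elif m == o:
                wins += 0.5
    return wins / (len(marker_vals) * len(other_vals))


def flag_contamination(expression: Dict[str, float],
                       markers: Iterable[str],
                       threshold: float = 0.9) -> bool:
    """Flag a sample whose marker genes rank near the top (assumed cut-off)."""
    return marker_auc(expression, markers) >= threshold


# Made-up log-scale expression values for two samples.
clean_sample = {"ACTA2": 3.0, "MYH11": 2.5, "DES": 2.0,
                "GENE_A": 8.0, "GENE_B": 7.0, "GENE_C": 9.0, "GENE_D": 6.5}
suspect_sample = {"ACTA2": 11.0, "MYH11": 10.5, "DES": 9.5,
                  "GENE_A": 8.0, "GENE_B": 7.0, "GENE_C": 9.0, "GENE_D": 6.5}
```

Calling `flag_contamination(suspect_sample, SMOOTH_MUSCLE_MARKERS)` flags the second sample and leaves the first unflagged. A real screen would work on probeset-level data and would need a cut-off calibrated against arrays of known composition.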
- Department of Gastroenterology and Hepatology, Medical Center for Postgraduate Education, Warsaw, Poland and Transition Technologies S.A., Warsaw, Poland
- Transition Technologies S.A., Warsaw, Poland
- Transition Technologies S.A., Warsaw, Poland and Institute of Heat Engineering, Warsaw University of Technology, Warszawa, Poland
- Transition Technologies S.A., Warsaw, Poland
- Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B Stat Methodol 57: 289-300.
- Carlson M (2012) GO.db: A set of annotation maps describing the entire Gene Ontology.
- Carlson M (2012) KEGG.db: A set of annotation maps for KEGG.
- Chi JT, Rodriguez EH, Wang Z, Nuyten DS, Mukherjee S, van de Rijn M, van de Vijver MJ, Hastie T, Brown PO (2007) Gene expression programs of human smooth muscle cells: tissue-specific differentiation and prognostic significance in breast cancers. PLoS Genet 3: 1770-1784.
- Wu C, Orozco C, Boyer J, Leglise M, Goodale J, Batalov S, Hodge CL, Haase J, Janes J, Huss JW, Su AI (2009) BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources. Genome Biol 10: R130.
- Förster S, Gretschel S, Jöns T, Yashiro M, Kemmner W (2011) THBS4, a novel stromal molecule of diffuse-type gastric adenocarcinomas, identified by transcriptome-wide expression profiling. Mod Pathol 24: 1390-1403.
- Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY, Zhang J (2004) Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5: R80.
- Gentleman R, Carey V, Huber W, Hahne F (2012) Genefilter: methods for filtering genes from microarray experiments.
- Gentleman R (2012) annotate: Annotation for microarrays.
- Huret JL, Ahmad M, Arsaban M, Bernheim A, Cigna J, Desangles F, Guignard JC, Jacquemot-Perbal MC, Labarussias M, Leberre V, Malo A, Morel-Pair C, Mossafa H, Potier JC, Texier G, Viguié F, Yau Chun Wan-Senon S, Zasadzinski A, Dessen P (2013) Atlas of genetics and cytogenetics in oncology and haematology in 2013. Nucleic Acids Res 41 (Database issue): D920-D924.
- Kornmann M, Ishiwata T, Beger HG, Korc M (1997) Fibroblast growth factor-5 stimulates mitogenic signaling and is overexpressed in human pancreatic cancer: evidence for autocrine and paracrine actions. Oncogene 15: 1417-1424.
- Lang DT (2012) XML: Tools for parsing and generating XML within R and S-Plus.
- Lähdesmäki H, Shmulevich I, Dunmire V, Yli-Harja O, Zhang W (2005) In silico microdissection of microarray data from heterogeneous cell populations. BMC Bioinformatics 6: 54.
- Mengual L, Burset M, Ars E, Lozano JJ, Villavicencio H, Ribal MJ, Alcaraz A (2009) DNA microarray expression profiling of bladder cancer allows identification of noninvasive diagnostic markers. J Urol 182: 741-748.
- Parkinson H, Kapushesky M, Shojatalab M, Abeygunawardena N, Coulson R, Farne A, Holloway E, Kolesnykov N, Lilja P, Lukk M, Mani R, Rayner T, Sharma A, William E, Sarkans U, Brazma A (2007) ArrayExpress - a public database of microarray experiments and gene expression profiles. Nucleic Acids Res 35: D747-D750.
- Paweletz CP, Liotta LA, Petricoin EF 3rd (2001) New technologies for biomarker analysis of prostate cancer progression: Laser capture microdissection and tissue proteomics. Urology 57: 160-163.
- Riester M, Taylor JM, Feifer A, Koppie T, Rosenberg JE, Downey RJ, Bochner BH, Michor F (2012) Combination of a novel gene expression signature with a clinical nomogram improves the prediction of survival in high-risk bladder cancer. Clin Cancer Res 18: 1323-1333.
- Roepman P, De Koning E, van Leenen D, De Weger RA, Kummer JA, Slootweg PJ, Holstege FC (2006) Dissection of a metastatic gene expression signature into distinct components. Genome Biol 7: R117.
- Stępniak P, Maycock M, Wojdan K, Markowska M, Perun S, Srivastava A, Wyrwicz LS, Świrski K (2013) Microarray Inspector: tissue cross contamination detection tool for microarray data. Acta Biochim Pol 60: 647-655.
- Safran M, Dalah I, Alexander J, Rosen N, Iny Stein T, Shmoish M, Nativ N, Bahir I, Doniger T, Krug H, Sirota-Madi A, Olender T, Golan Y, Steltzer G, Harel A, Lancet D (2010) GeneCards Version 3: the human gene integrator. Database (Oxford) 2010: baq020.
- Shi L, Reid LH, Jones WD et al. (2006) The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nat Biotechnol 24: 1151-1161.
- Smith CA (2010) Annaffy: Annotation tools for Affymetrix biological metadata.
- R Core Team (2012) R: A Language and Environment for Statistical Computing.
- Wang M, Master SR, Chodosh LA (2006) Computational expression deconvolution in a complex mammalian organ. BMC Bioinformatics 7: 328.
- Wang Y, Xia XQ, Jia Z, Sawyers A, Yao H, Wang-Rodriquez J, Mercola D, McClelland M (2010) In silico estimates of tissue components in surgical samples based on expression profiling data. Cancer Res 70: 6448-6455.
- Wu J, Irizarry R, with contributions from MacDonald J, Gentry J (2012) gcrma: Background Adjustment Using Sequence Information.
- Xiao SJ, Zhang C, Zou Q, Ji ZL (2010) TiSGeD: a database for tissue-specific genes. Bioinformatics 26: 1273-1275.