2009 | 56 | 2 | 271-277
Article title

Calculation of reliable transcript levels of annotated genes on the basis of multiple probe-sets in Affymetrix microarrays

Microarray methods have become a basic tool in studies of global gene expression and changes in transcript levels. Affymetrix microarrays from the HGU133 series contain multiple probe-sets complementary to the same gene (4742 genes are represented by more than one probe-set on the HGU133A microarray). Individual probe-sets annotated to the same gene often show different hybridization signals and even opposite trends, which may result from some of them matching transcripts of more than one gene and from the existence of different splice-variant transcripts. Existing methods that redefine probe-sets and develop custom probe-set definitions use mathematical tools such as Matlab or the R statistical environment with the Bioconductor package (Gentleman et al., 2004, Genome Biol. 5: R80) and thus are directed to researchers with a good knowledge of bioinformatics. We propose here a new approach based on the principle that a probe-set which hybridizes to more than one transcript can be recognized because it produces a signal significantly different from the others assigned to the same gene, allowing it to be detected as an outlier in the group and eliminated from subsequent analyses. A simple freeware application has been developed that detects and removes outlying probe-sets and calculates average signal values for individual genes using the latest annotation database provided by Affymetrix. We illustrate this procedure using microarray data from our experiments, which aimed to study changes in the transcription profile induced by ionizing radiation in human cells.
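The principle described in the abstract (flag a probe-set whose signal differs markedly from the others annotated to the same gene, drop it, and average the rest) can be illustrated with a minimal sketch. The abstract does not specify the outlier test used by the application, so the median-absolute-deviation rule below is a swapped-in assumption rather than the authors' method; the function name, the threshold of 3.5, and the probe-set identifiers are likewise hypothetical.

```python
from statistics import mean, median


def gene_level_signal(probe_set_signals, threshold=3.5):
    """Average the probe-set signals of one gene after dropping outliers.

    probe_set_signals: dict mapping a probe-set ID to its signal
    (for example on a log2 scale).
    Returns (gene_signal, kept_ids, removed_ids).

    ASSUMPTION: outliers are flagged with a modified z-score based on the
    median absolute deviation (MAD). This is an illustrative rule, not the
    test used by the application described in the abstract.
    """
    ids = list(probe_set_signals)
    values = [probe_set_signals[i] for i in ids]

    # With fewer than three probe-sets there is no group to compare against.
    if len(values) < 3:
        return mean(values), ids, []

    center = median(values)
    mad = median(abs(v - center) for v in values)

    # If the MAD is zero the score is undefined; keep every probe-set.
    if mad == 0:
        return mean(values), ids, []

    kept, removed = [], []
    for probe_id, value in zip(ids, values):
        score = 0.6745 * abs(value - center) / mad
        (removed if score > threshold else kept).append(probe_id)

    return mean(probe_set_signals[i] for i in kept), kept, removed


# Hypothetical signals for four probe-sets annotated to the same gene.
example = {
    "probe_set_A": 8.0,
    "probe_set_B": 8.2,
    "probe_set_C": 7.9,
    "probe_set_D": 12.5,
}
signal, kept, removed = gene_level_signal(example)
print(removed, round(signal, 3))
```

In this made-up example the fourth probe-set lies far from the other three and is excluded before averaging. A rule of this kind needs at least three probe-sets per gene, so genes with one or two probe-sets are simply averaged.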
Physical description
  • System Engineering Group, Institute of Automation, Silesian University of Technology, Gliwice, Poland
  • System Engineering Group, Institute of Automation, Silesian University of Technology, Gliwice, Poland
  • Department of Experimental and Clinical Radiobiology, Maria Skłodowska-Curie Memorial Cancer Center and Institute of Oncology, Gliwice, Poland
  • System Engineering Group, Institute of Automation, Silesian University of Technology, Gliwice, Poland
  • Affymetrix (2004) GeneChip Expression Analysis - Technical Manual. pp 185.
  • Bolstad B (2008) RMAExpress.
  • Bourquin J, Subramanian A, Langebrake C, Reinhardt D, Bernard O, Ballerini P, Baruchel A, Cave H, Dastugue N, Hasle H, Kaspers G, Lessard M, Michaux L, Vyas P, Wering E, Zwaan C, Golub T, Orkinar S (2006) Identification of distinct molecular phenotypes in acute megakaryoblastic leukemia by gene expression profiling. Proc Natl Acad Sci USA 103: 3339-3344.
  • Buck K, Vanek M, Groner B, Ball RK (1992) Multiple forms of prolactin receptor messenger ribonucleic acid are specifically expressed and regulated in murine tissues and the mammary cell line HC11. Endocrinology 130: 1108-1114.
  • Carter SL, Eklund AC, Mecham BH, Kohane IS, Szallasi Z (2005) Redefinition of Affymetrix probe sets by sequence overlap with cDNA microarray probes reduces cross-platform inconsistencies in cancer-associated gene expression measurements. BMC Bioinformatics 6: 107.
  • Chalifa-Caspi V, Yanai I, Ophir R, Rosen N, Shmoish M, Benjamin-Rodrig H, Shklar M, Stein TI, Shmueli O, Safran M, Lancet D (2004) GeneAnnot: comprehensive two-way linking between oligonucleotide array probesets and GeneCards genes. Bioinformatics 20: 1457-1458.
  • Dai M, Wang P, Boyd AD, Kostov G, Athey B, Jones EG, Bunney WE, Myers RM, Speed TP, Akil H, Watson SJ, Meng F (2005) Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data. Nucleic Acids Res 33: e175.
  • Dixon WJ (1953) Processing data for outliers. Biometrics 9: 74-89.
  • Elbez Y, Farkash-Amar S, Simon I (2006) An analysis of intra array repeat: the good, the bad and the noninformative. BMC Genomics 7: 136.
  • Elo LL, Lahti L, Skottman H, Kylaniemi M, Lahesmaa R, Aittokallio T (2005) Integrating probe-level expression changes across generations of Affymetrix arrays. Nucleic Acids Res 33: e193.
  • Ferrari F, Bortoluzzi S, Coppe A, Sirota A, Safran M, Shmoish M, Ferrari S, Lancet D, Danieli GA, Bicciato S (2007) Novel definition files for human GeneChips based on GeneAnnot. BMC Bioinformatics 8: 446-452.
  • Gautier L, Møller M, Friis-Hansen L, Knudsen S (2004) Alternative mapping of probes to genes for Affymetrix chips. BMC Bioinformatics 5: 111.
  • Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, Hornik K, Hothorn T, Huber W, Iacus S, Irizarry R, Leisch F, Li C, Maechler M, Rossini AJ, Sawitzki G, Smith C, Smyth G, Tierney L, Yang JY, Zhang J (2004) Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5: R80.
  • Györffy B, Schäfer R (2008) Meta-analysis of gene expression profiles related to relapse-free survival in 1,079 breast cancer patients. Breast Cancer Res Treat Epub ahead of print.
  • Harbig J, Sprinkle R, Enkemann SA (2005) A sequence-based identification of the genes detected by probesets on the Affymetrix U133 plus 2.0 array. Nucleic Acids Res 33: e31.
  • Heydebreck A, Huber W, Gentleman R (2004) Differential Expression with the Bioconductor Project. Bioconductor Project Working Papers Working Paper 7.
  • Hwang KB, Kong SW, Greenberg SA, Park PJ (2004) Combining gene expression data from different generations of oligonucleotide arrays. BMC Bioinformatics 5: 159.
  • Irizarry RA, Hobbs B, Collin F, Beazer-Barclay YD, Antonellis KJ, Scherf U, Speed TP (2003) Exploration, normalization, and summaries of high density oligonucleotide array probe level data. Biostatistics 4: 249-264.
  • Jordan I, Marino-Ramirez L, Koonin E (2005) Evolutionary significance of gene expression divergence. Gene 345: 119-126.
  • Karlsson E, Delle U, Danielsson A, Olsson B, Abel F, Karlsson P, Helou K (2008) Gene expression variation to predict 10-year survival in lymph-node-negative breast cancer. BMC Cancer 8: 254.
  • Kothapalli R, Yoder SJ, Mane S, Loughran TPJ (2002) Microarray results: how accurate are they? BMC Bioinformatics 3: 22.
  • Kuo WP, Jenssen TK, Butte AJ, Ohno-Machado L, Kohane IS (2002) Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics 18: 405-412.
  • Ladd AN, Cooper TA (2002) Finding signals that regulate alternative splicing in the post-genomic era. Genome Biol 3: reviews0008.
  • Li C, Wong WH (2001) Model-based analysis of oligonucleotide arrays: Expression index computation and outlier detection. Proc Natl Acad Sci USA 98: 31-36.
  • Li H, Zhu D, Cook M (2008) A statistical framework for consolidating sibling probe sets for Affymetrix GeneChip data. BMC Genomics 9: 188.
  • Liao B, Zhang J (2006) Evolutionary conservation of expression profiles between human and mouse orthologous genes. Mol Biol Evol 23: 530-540.
  • Lim SJ, Jung HH, Cho YA (2006) Postnatal development of myosin heavy chain isoforms in rat extraocular muscles. Mol Vis 12: 243-250.
  • Liu G, Loraine AE, Shigeta R, Cline M, Cheng J, Valmeekam V, Sun S, Kulp D, Siani-Rose MA (2003) NetAffx: Affymetrix probesets and annotations. Nucleic Acids Res 31: 82-86.
  • Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang C, Kobayashi M, Horton H, Brown EL (1996) Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol 14: 1675-1680.
  • Lu X, Zhang X (2006) The effect of GeneChip gene definitions on the microarray study of cancers. Bioessays 28: 739-746.
  • Lu J, Lee JC, Salit ML, Cam MC (2007) Transcript-based redefinition of grouped oligonucleotide probe sets using AceView: high-resolution annotation for microarrays. BMC Bioinformatics 8: 108.
  • Mecham BH, Klus GT, Strovel J, Augustus M, Byrne D, Bozso P, Wetmore DZ, Mariani TJ, Kohane IS, Szallasi Z (2004) Sequence-matched probes produce increased cross-platform consistency and more reproducible biological results in microarray-based gene expression measurements. Nucleic Acids Res 32: e74.
  • Okoniewski MJ, Miller CJ (2006) Hybridization interactions between probesets in short oligo microarrays lead to spurious correlations. BMC Bioinformatics 7: 276.
  • Perez-Iratxeta C, Palidwor G, Porter CJ, Sanche NA, Huska MR, Suomela BP, Muro EM, Krzyzanowski PM, Hughes E, Campbell PA, Rudnicki MA, Andrade MA (2005) Study of stem cell function using microarray experiments. FEBS Lett 579: 1795-1801.
  • Ramsay G (1998) DNA chips: state-of-the art. Nat Biotechnol 16: 40-44.
  • Stalteri MA, Harrison AP (2007) Interpretation of multiple probe sets mapping to the same gene in Affymetrix GeneChips. BMC Bioinformatics 8: 13-27.
  • Stoughton RB (2005) Applications of DNA microarrays in biology. Annu Rev Biochem 74: 53-82.
  • Yu H, Wang F, Tu K, Xie L, Li YY, Li YX (2007) Transcript-level annotation of Affymetrix probesets improves the interpretation of gene expression data. BMC Bioinformatics 8: 194.
  • Zhang J, Finney RP, Clifford RJ, Derr LK, Buetow KH (2005) Detecting false expression signals in high-density oligonucleotide arrays by an in silico approach. Genomics 85: 297-308.