2022 | 44 | 43-62

Article title

Applications and Limitations of Integrative Robust Approaches in Multiple Omics Analysis

Content

Languages of publication

EN

Abstracts

EN
Extensive biological research shows that the genomic information gathered over the years is not enough to fully understand biological systems, even at the cellular level. Integrative omics addresses this gap by combining multiple omics data types, supported by the continuing improvement of high-content, real-time, multimodal, multi-omics technologies, with the aim of a deeper understanding of biological systems. Multi-omics can be used to profile genetic, transcriptomic, epigenetic, spatial, proteomic and lineage information in single cells. The approach draws on bioinformatics and integrative methods that can be applied across multiple types of data; it can identify relationships within cellular modalities, provide a richer representation of cell state, and aid the assembly of data sets into useful knowledge. Here, we discuss the challenges of integrating multiple omics data types, the limitations of complex machine learning models, and recent technological advances in multi-omics data integration.

Year

2022

Volume

44

Pages

43-62

Contributors

  • Covenant Applied Informatics and Communication African Centre of Excellence, Department of Computer and Information Sciences, Covenant University, Ota, Ogun State, Nigeria
author
  • Covenant University Bioinformatics Research (CUBRe), Covenant University, Ota, Ogun State, Nigeria
  • Covenant Applied Informatics and Communication African Centre of Excellence, Department of Biochemistry, Covenant University, Ota, Ogun State, Nigeria
  • Covenant Applied Informatics and Communication African Centre of Excellence, Department of Computer and Information Sciences, Covenant University, Ota, Ogun State, Nigeria
  • Department of Computer and Information Sciences, Covenant University, Ota, Ogun State, Nigeria

References

  • [1] S. T. O’Donnell, R. P. Ross, and C. Stanton, The Progress of Multi-Omics Technologies: Determining Function in Lactic Acid Bacteria Using a Systems Level Approach, Frontiers in Microbiology, vol. 10. Frontiers Media S.A., p. 3084, Jan. 28, 2020, doi: 10.3389/fmicb.2019.03084
  • [2] P. S. Reel, S. Reel, E. Pearson, E. Trucco, and E. Jefferson, Using machine learning approaches for multi-omics data analysis: A review, Biotechnology Advances, vol. 49. Elsevier Inc., p. 107739, Jul. 01, 2021, doi: 10.1016/j.biotechadv.2021.107739
  • [3] A. Tebani, C. Afonso, S. Marret, and S. Bekri, Omics-based strategies in precision medicine: Toward a paradigm shift in inborn errors of metabolism investigations, International Journal of Molecular Sciences, vol. 17, no. 9. MDPI AG, Sep. 14, 2016, doi: 10.3390/ijms17091555
  • [4] D. M. Rotroff and A. A. Motsinger-Reif, Embracing Integrative Multiomics Approaches, Int. J. Genomics, vol. 2016, 2016, doi: 10.1155/2016/1715985
  • [5] Z. Y. Yang, Y. Liang, H. Zhang, H. Chai, B. Zhang, and C. Peng, Robust Sparse Logistic Regression with the Lq (0 < q < 1) Regularization for Feature Selection Using Gene Expression Data, IEEE Access, vol. 6, pp. 68586–68595, 2018, doi: 10.1109/ACCESS.2018.2880198
  • [6] C. Wu, F. Zhou, J. Ren, X. Li, Y. Jiang, and S. Ma, A selective review of multi-level omics data integration using variable selection, High-Throughput, vol. 8, no. 1, pp. 1–25, 2019, doi: 10.3390/ht8010004
  • [7] N. J. Mulder et al., Development of Bioinformatics Infrastructure for Genomics Research in H3Africa, Glob. Heart, pp. 1–8, 2017, doi: 10.1016/j.gheart.2017.01.005
  • [8] J. Oyelade et al., Clustering Algorithms: Their Application to Gene Expression Data, Bioinform. Biol. Insights, vol. 10, p. 237, Nov. 2016, doi: 10.4137/BBI.S38316
  • [9] B. Mirza, W. Wang, J. Wang, H. Choi, N. C. Chung, and P. Ping, Machine learning and integrative analysis of biomedical big data, Genes, vol. 10, no. 2. MDPI AG, Jan. 01, 2019, doi: 10.3390/genes10020087
  • [10] B. De Meulder et al., A computational framework for complex disease stratification from multiple large-scale datasets, BMC Syst. Biol., vol. 12, no. 1, p. 60, Dec. 2018, doi: 10.1186/s12918-018-0556-z
  • [11] N. Rappoport and R. Shamir, Multi-omic and multi-view clustering algorithms: review and cancer benchmark, Nucleic Acids Res., vol. 46, no. 20, pp. 10546–10562, Nov. 2018, doi: 10.1093/nar/gky889
  • [12] Z. M. Hira and D. F. Gillies, A review of feature selection and feature extraction methods applied on microarray data, Adv. Bioinformatics, vol. 2015, 2015, doi: 10.1155/2015/198363
  • [13] L. Wang, Y. Wang, and Q. Chang, Feature selection methods for big data bioinformatics: A survey from the search perspective, Methods, vol. 111, pp. 21–31, 2016, doi: 10.1016/j.ymeth.2016.08.014
  • [14] I. Subramanian, S. Verma, S. Kumar, A. Jere, and K. Anamika, Multi-omics Data Integration, Interpretation, and Its Application, Bioinform. Biol. Insights, vol. 14, p. 117793221989905, Jan. 2020, doi: 10.1177/1177932219899051
  • [15] B. Wang et al., Similarity network fusion for aggregating data types on a genomic scale, Nat. Methods, vol. 11, no. 3, pp. 333–337, 2014, doi: 10.1038/nmeth.2810.
  • [16] M. W. Libbrecht and W. S. Noble, Machine learning applications in genetics and genomics, Nat. Rev. Genet., vol. 16, no. 6, pp. 321–332, Jun. 2015, doi: 10.1038/nrg3920
  • [17] I. Triguero, S. Del Río, V. López, J. Bacardit, J. M. Benítez, and F. Herrera, ROSEFW-RF: The winner algorithm for the ECBDL’14 big data competition: An extremely imbalanced big data bioinformatics problem, Knowledge-Based Syst., vol. 87, pp. 69–79, 2015, doi: 10.1016/j.knosys.2015.05.027
  • [18] J. C. Aledo, F. R. Cantón, and F. J. Veredas, A machine learning approach for predicting methionine oxidation sites, BMC Bioinformatics, vol. 18, no. 1, p. 430, Sep. 2017, doi: 10.1186/s12859-017-1848-9
  • [19] J. Hu, Y. Li, M. Zhang, X. Yang, H. Bin Shen, and D. J. Yu, Predicting Protein-DNA Binding Residues by Weightedly Combining Sequence-Based Features and Boosting Multiple SVMs, IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 14, no. 6, pp. 1389–1398, 2017, doi: 10.1109/TCBB.2016.2616469
  • [20] Z. Liu, X. Xiao, W. R. Qiu, and K. C. Chou, IDNA-Methyl: Identifying DNA methylation sites via pseudo trinucleotide composition, Anal. Biochem., vol. 474, pp. 69–77, Apr. 2015, doi: 10.1016/j.ab.2014.12.009
  • [21] W. Zhang, T. D. Spector, P. Deloukas, J. T. Bell, and B. E. Engelhardt, Predicting genome-wide DNA methylation using methylation marks, genomic position, and DNA regulatory elements, Genome Biol., vol. 16, no. 1, p. 14, Jan. 2015, doi: 10.1186/s13059-015-0581-9
  • [22] Z.-S. Wei, J.-Y. Yang, H.-B. Shen, and D.-J. Yu, A Cascade Random Forests Algorithm for Predicting Protein-Protein Interaction Sites, IEEE Trans. Nanobioscience, vol. 14, no. 7, pp. 746–60, Oct. 2015, doi: 10.1109/TNB.2015.2475359
  • [23] Z.-S. Wei, K. Han, J.-Y. Yang, H.-B. Shen, and D.-J. Yu, Protein–protein interaction sites prediction by ensembling SVM and sample-weighted random forests, Neurocomputing, vol. 193, pp. 201–212, Jun. 2016, doi: 10.1016/j.neucom.2016.02.022
  • [24] W. Lin and D. Xu, Imbalanced multi-label learning for identifying antimicrobial peptides and their functional types, Bioinformatics, vol. 32, no. 24, pp. 3745–3752, Dec. 2016, doi: 10.1093/bioinformatics/btw560
  • [25] R. Argelaguet et al., Multi‐Omics Factor Analysis—a framework for unsupervised integration of multi‐omics data sets, Mol. Syst. Biol., vol. 14, no. 6, Jun. 2018, doi: 10.15252/msb.20178124
  • [26] L. De Cecco et al., Integrative miRNA-Gene expression analysis enables refinement of associated biology and prediction of response to cetuximab in head and neck squamous cell cancer, Genes (Basel)., vol. 8, no. 1, p. 35, 2017, doi: 10.3390/genes8010035
  • [27] K. Kira and L. A. Rendell, Feature selection problem: traditional methods and a new algorithm, in Proceedings Tenth National Conference on Artificial Intelligence, 1992, pp. 129–134
  • [28] A. Adorada, R. Permatasari, P. W. Wirawan, A. Wibowo, and A. Sujiwo, Support Vector Machine - Recursive Feature Elimination (SVM - RFE) for Selection of MicroRNA Expression Features of Breast Cancer, in 2018 2nd International Conference on Informatics and Computational Sciences (ICICoS), Oct. 2018, pp. 1–4, doi: 10.1109/ICICOS.2018.8621708
  • [29] M. B. Kursa, A. Jankowski, and W. R. Rudnicki, Boruta – A System for Feature Selection, Fundam. Informaticae, vol. 101, no. 4, pp. 271–285, 2010, doi: 10.3233/FI-2010-288
  • [30] F. Santosa and W. W. Symes, Linear Inversion of Band-Limited Reflection Seismograms, SIAM J. Sci. Stat. Comput., vol. 7, no. 4, pp. 1307–1330, Oct. 1986, doi: 10.1137/0907087
  • [31] H. Zou and T. Hastie, ‘Regularization and variable selection via the elastic net’, J. R. Stat. Soc. Ser. B Stat. Methodol., vol. 67, no. 2, pp. 301–320, 2005, doi: 10.1111/j.1467-9868.2005.00503.x
  • [32] C. Meng, O. A. Zeleznik, G. G. Thallinger, B. Kuster, A. M. Gholami, and A. C. Culhane, ‘Dimension reduction techniques for the integrative analysis of multi-omics data’, Brief. Bioinform., vol. 17, no. 4, pp. 628–641, Jul. 2016, doi: 10.1093/bib/bbv108
  • [33] Q. Mo, R. Shen, C. Guo, M. Vannucci, K. S. Chan, and S. G. Hilsenbeck, ‘A fully Bayesian latent variable model for integrative clustering analysis of multi-type omics data’, Biostatistics, vol. 19, no. 1, pp. 71–86, 2018, doi: 10.1093/biostatistics/kxx017
  • [34] J. C. Costello et al., ‘A community effort to assess and improve drug sensitivity prediction algorithms’, Nat. Biotechnol., vol. 32, no. 12, pp. 1202–1212, Dec. 2014, doi: 10.1038/nbt.2877
  • [35] G. Tini, L. Marchetti, C. Priami, and M. P. Scott-Boyer, ‘Multi-omics integration-A comparison of unsupervised clustering methodologies’, Brief. Bioinform., vol. 20, no. 4, pp. 1269–1279, 2018, doi: 10.1093/bib/bbx167
  • [36] C. Dimitrakopoulos et al., ‘Network-based integration of multi-omics data for prioritizing cancer genes’, Bioinformatics, vol. 34, no. 14, pp. 2441–2448, Jul. 2018, doi: 10.1093/bioinformatics/bty148
  • [37] C. J. Vaske et al., ‘Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM’, Bioinformatics, vol. 26, no. 12, Jun. 2010, doi: 10.1093/bioinformatics/btq182
  • [38] A. Hosseini, T. Chen, W. Wu, Y. Sun, and M. Sarrafzadeh, ‘Heteromed: Heterogeneous information network for medical diagnosis’, Int. Conf. Inf. Knowl. Manag. Proc., pp. 763–772, 2018, doi: 10.1145/3269206.3271805
  • [39] Y. Zhang, A. Li, C. Peng, and M. Wang, ‘Improve Glioblastoma Multiforme Prognosis Prediction by Using Feature Selection and Multiple Kernel Learning’, IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 13, no. 5, pp. 825–835, Sep. 2016, doi: 10.1109/TCBB.2016.2551745
  • [40] A. Rakotomamonjy, ‘SimpleMKL’, J. Mach. Learn. Res., vol. 9, pp. 2491–2521, 2008
  • [41] J. Barretina et al., ‘The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity’, Nature, vol. 483, no. 7391, pp. 603–607, Mar. 2012, doi: 10.1038/nature11003
  • [42] Y. Zhang et al., ‘Integrative functional genomics identifies regulatory genetic variant modulating RAB31 expression and altering susceptibility to breast cancer.’, Mol. Carcinog., vol. 57, no. 12, pp. 1845–1854, Dec. 2018, doi: 10.1002/mc.22902
  • [43] N. Aben, D. J. Vis, M. Michaut, and L. F. A. Wessels, ‘TANDEM: A two-stage approach to maximize interpretability of drug response models based on multiple molecular data types’, in Bioinformatics, Sep. 2016, vol. 32, no. 17, pp. i413–i420, doi: 10.1093/bioinformatics/btw449
  • [44] R. Shen, A. B. Olshen, and M. Ladanyi, ‘Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis’, Bioinformatics, vol. 25, no. 22, pp. 2906–2912, Nov. 2009, doi: 10.1093/bioinformatics/btp543
  • [45] K. A. Hoadley et al., ‘Cell-of-Origin Patterns Dominate the Molecular Classification of 10,000 Tumors from 33 Types of Cancer’, Cell, vol. 173, no. 2, pp. 291-304.e6, Apr. 2018, doi: 10.1016/j.cell.2018.03.022
  • [46] Q. Mo et al., ‘Pattern discovery and cancer gene identification in integrated cancer genomic data’, Proc. Natl. Acad. Sci. U. S. A., vol. 110, no. 11, pp. 4245–4250, Mar. 2013, doi: 10.1073/pnas.1208949110
  • [47] J. Barretina et al., ‘The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity’, Nature, vol. 483, no. 7391, pp. 603–607, Mar. 2012, doi: 10.1038/nature11003
  • [48] T. D. Nguyen, T. Tran, D. Phung, and S. Venkatesh, ‘Latent patient profile modelling and applications with mixed-variate restricted Boltzmann machine’, in Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2013, vol. 7818 LNAI, no. PART 1, pp. 123–135, doi: 10.1007/978-3-642-37453-1_11
  • [49] M. Liang, Z. Li, T. Chen, and J. Zeng, ‘Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach’, IEEE/ACM Trans. Comput. Biol. Bioinforma., vol. 12, no. 4, pp. 928–937, 2015, doi: 10.1109/TCBB.2014.2377729
  • [50] M. Kim, I. Oh, and J. Ahn, ‘An Improved Method for Prediction of Cancer Prognosis by Network Learning’, Genes (Basel)., vol. 9, no. 10, p. 478, Oct. 2018, doi: 10.3390/genes9100478
  • [51] L. Zhang et al., ‘Deep Learning-Based Multi-Omics Data Integration Reveals Two Prognostic Subtypes in High-Risk Neuroblastoma’, Front. Genet., vol. 9, no. OCT, p. 477, Oct. 2018, doi: 10.3389/fgene.2018.00477
  • [52] L. Nanni, C. Fantozzi, and N. Lazzarini, ‘Coupling different methods for overcoming the class imbalance problem’, Neurocomputing, vol. 158, pp. 48–61, Jun. 2015, doi: 10.1016/j.neucom.2015.01.068
  • [53] V. H. Barella, E. P. Costa, and A. C. P. L. F. Carvalho, ‘ClusterOSS : a new undersampling method for imbalanced learning’, Brazilian Conf. Intell. Syst., pp. 1–6, 2014
  • [54] L. Nanni, C. Fantozzi, and N. Lazzarini, ‘Coupling different methods for overcoming the class imbalance problem’, Neurocomputing, vol. 158, pp. 48–61, 2015, doi: 10.1016/j.neucom.2015.01.068
  • [55] B. Gu, X. Quan, Y. Gu, V. S. Sheng, and G. Zheng, ‘Chunk incremental learning for cost-sensitive hinge loss support vector machine’, Pattern Recognit., vol. 83, pp. 196–208, 2018, doi: 10.1016/j.patcog.2018.05.023
  • [56] L. Zhang and P. N. Suganthan, ‘A comprehensive evaluation of random vector functional link networks’, Inf. Sci. (Ny)., vol. 367–368, pp. 1094–1105, 2016, doi: 10.1016/j.ins.2015.09.025
  • [57] T. V Nguyen and B. Mirza, ‘Dual-layer kernel extreme learning machine for action recognition’, Neurocomputing, vol. 260, pp. 123–130, 2017, doi: 10.1016/j.neucom.2017.04.007
  • [58] M. Zaharia et al., ‘Apache Spark: A unified engine for big data processing’, Commun. ACM, vol. 59, no. 11, 2016, doi: 10.1145/2934664
  • [59] X. Meng et al., ‘MLlib: Machine Learning in Apache Spark’, J. Mach. Learn. Res., vol. 17, pp. 1–7, 2016
  • [60] R. Anil et al., ‘Apache Mahout: Machine Learning on Distributed Dataflow Systems’, J. Mach. Learn. Res., vol. 21, pp. 1–6, 2020
  • [61] M. Sherar and F. Zulkernine, ‘Particle swarm optimization for large-scale clustering on apache spark’, 2017 IEEE Symp. Ser. Comput. Intell. SSCI 2017 - Proc., vol. 2018-Janua, pp. 1–8, 2018, doi: 10.1109/SSCI.2017.8285208
  • [62] A. H. Foss and M. Markatou, ‘kamila: Clustering mixed-type data in R and hadoop’, J. Stat. Softw., vol. 83, pp. 1–44, 2018, doi: 10.18637/jss.v083.i13
  • [63] A. L’Heureux, K. Grolinger, H. F. Elyamany, and M. A. M. Capretz, ‘Machine Learning with Big Data: Challenges and Approaches’, IEEE Access, vol. 5, no. April, pp. 7776–7797, 2017, doi: 10.1109/ACCESS.2017.2696365
  • [64] P. Gupta, A. Sharma, and R. Jindal, ‘Scalable machine-learning algorithms for big data analytics: a comprehensive review’, Wiley Interdiscip. Rev. Data Min. Knowl. Discov., vol. 6, no. 6, pp. 194–214, 2016, doi: 10.1002/widm.1194
  • [65] I. A. T. Hashem, I. Yaqoob, N. B. Anuar, S. Mokhtar, A. Gani, and S. Ullah Khan, ‘The rise of “big data” on cloud computing: Review and open research issues’, Inf. Syst., vol. 47, pp. 98–115, 2015, doi: 10.1016/j.is.2014.07.006
  • [66] M. Oh, S. Park, S. Kim, and H. Chae, ‘Machine learning-based analysis of multi-omics data on the cloud for investigating gene regulations’, Briefings in Bioinformatics, vol. 22, no. 1. Oxford University Press, pp. 66–76, Jan. 01, 2021, doi: 10.1093/bib/bbaa032
  • [67] K. M. Fisch et al., ‘Omics Pipe: a community-based framework for reproducible multi-omics data analysis’, Bioinformatics, vol. 31, no. 11, p. 1724, Jun. 2015, doi: 10.1093/BIOINFORMATICS/BTV061
  • [68] J. Chong et al., ‘MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis’, Nucleic Acids Res., vol. 46, no. Web Server issue, p. W486, Jul. 2018, doi: 10.1093/NAR/GKY310
  • [69] E. M. Forsberg et al., ‘Data processing, multi-omic pathway mapping, and metabolite activity analysis using XCMS Online’, Nat. Protoc., vol. 13, no. 4, pp. 633–651, Apr. 2018, doi: 10.1038/nprot.2017.151
  • [70] E. G. Dada, J. S. Bassi, H. Chiroma, S. M. Abdulhamid, A. O. Adetunmbi, and O. E. Ajibuwa, ‘Machine learning for email spam filtering: review, approaches and open research problems’, Heliyon, vol. 5, no. 6, p. e01802, 2019, doi: 10.1016/j.heliyon.2019.e01802
  • [71] J. Fan, F. Han, and H. Liu, ‘Challenges of Big Data Analysis.’, Natl. Sci. Rev., vol. 1, no. 2, pp. 293–314, Jun. 2014, doi: 10.1093/nsr/nwt032
  • [72] J. Martorell-Marugán et al., ‘Deep Learning in Omics Data Analysis and Precision Medicine’, in Computational Biology, Codon Publications, 2019, pp. 37–53
  • [73] D. Grapov, J. Fahrmann, K. Wanichthanarak, and S. Khoomrung, ‘Rise of Deep Learning for Genomic, Proteomic, and Metabolomic Data Integration in Precision Medicine’, doi: 10.1089/omi.2018.0097
  • [74] D. Ravi et al., ‘Deep Learning for Health Informatics’, IEEE J. Biomed. Heal. Informatics, vol. 21, no. 1, pp. 4–21, Jan. 2017, doi: 10.1109/JBHI.2016.2636665
  • [75] R. Miotto, F. Wang, S. Wang, X. Jiang, and J. T. Dudley, ‘Deep learning for healthcare: Review, opportunities and challenges’, Brief. Bioinform., vol. 19, no. 6, pp. 1236–1246, May 2017, doi: 10.1093/bib/bbx044
  • [76] D. Ruiz-Perez et al., ‘Dynamic Bayesian Networks for Integrating Multi-omics Time Series Microbiome Data’, mSystems, vol. 6, no. 2, Apr. 2021, doi: 10.1128/msystems.01105-20
  • [77] Q. Shi, B. Hu, T. Zeng, and C. Zhang, ‘Multi-view subspace clustering analysis for aggregating multiple heterogeneous omics data’, Front. Genet., vol. 10, no. JUL, p. 744, Aug. 2019, doi: 10.3389/fgene.2019.00744
  • [78] T. Ma and A. Zhang, ‘Integrate multi-omic data using affinity network fusion (ANF) for cancer patient clustering’, in 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Nov. 2017, vol. 2017-Janua, pp. 398–403, doi: 10.1109/BIBM.2017.8217682
  • [79] N. D. Nguyen and D. Wang, ‘Multiview learning for understanding functional multiomics’, PLOS Comput. Biol., vol. 16, no. 4, p. e1007677, Apr. 2020, doi: 10.1371/journal.pcbi.1007677
  • [80] L. Y. Guo, A. H. Wu, Y. X. Wang, L. P. Zhang, H. Chai, and X. F. Liang, ‘Deep learning-based ovarian cancer subtypes identification using multi-omics data’, BioData Min., vol. 13, no. 1, pp. 1–12, Aug. 2020, doi: 10.1186/s13040-020-00222-x

Document Type

article

Identifiers

YADDA identifier

bwmeta1.element.psjd-71755c1c-9c3a-433c-bb3c-d976ee66faa1