Vol. 1 No. 1 (2023): Jan/June - Issue - 01
Articles

Bi-Cluster Based Analysis on Gene Ontology

Meenakshi Sundaram A
School of Computer Science and Engineering, REVA University, Bangalore, Karnataka, India.
Anooja Ali
School of Computer Science and Engineering, REVA University, Bangalore, Karnataka, India.
S S Patil
Department of Agricultural Statistics, University of Agricultural Sciences, Bengaluru, India.
Ajil A
School of Computer Science and Engineering, REVA University, Bangalore, Karnataka, India.

Published 2023-06-23

Keywords

  • Bicluster Centrality
  • Gene Ontology
  • Mutual Information
  • Medical

How to Cite

Meenakshi Sundaram A, Anooja Ali, S S Patil, & Ajil A. (2023). Bi-Cluster Based Analysis on Gene Ontology. Milestone Transactions on Medical Technometrics, 1(1), 10–17. https://doi.org/10.5281/zenodo.8073114

Abstract

Understanding biological activity requires the detection of essential proteins. Identifying significant genes across the entire genome is valuable for several reasons, including the categorization of genes critical to health and disease and the rational design of drugs. Statistical methods have been proposed for predicting essential proteins, genes, and GO terms in protein networks. However, computational approaches that rely only on topological characteristics or centrality measures ignore the biologically relevant intrinsic features of essential proteins. Incorporating biological information such as expression data, subcellular localization, annotation data, and orthologous relationships can therefore improve accuracy. In this research, a bi-clustering algorithm is used to detect essential Gene Ontology (GO) terms in molecular, cellular, and biological processes by evaluating protein associations and encoding those associations with ontology terms and pathways. The proposed method encodes each protein in terms of its Mutual Information (MI) score and GO annotation, generates a vector-based GO-encoded matrix, and extracts the essential proteins from it. The proposed method is validated on the datasets using several statistical measures.
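As a rough illustration of the encoding step, the sketch below builds a binary protein-by-GO-term matrix and computes the mutual information between two term columns. It is a minimal sketch under stated assumptions, not the authors' implementation: the protein names, the placeholder term labels (`GO:A`, `GO:B`, `GO:C`), and the helper functions `encode` and `mutual_information` are all hypothetical, and the abstract does not specify how the MI score is actually computed or combined with the annotations.

```python
import math
from collections import Counter

# Hypothetical toy annotations: protein -> set of GO term labels.
# The labels are placeholders, not real GO identifiers or real data.
annotations = {
    "P1": {"GO:A", "GO:B"},
    "P2": {"GO:A", "GO:B", "GO:C"},
    "P3": {"GO:C"},
    "P4": set(),
}


def encode(annotations):
    """Return (proteins, terms, matrix); matrix[i][j] is 1 if protein i has term j."""
    proteins = sorted(annotations)
    terms = sorted(set().union(*annotations.values()))
    matrix = [[1 if t in annotations[p] else 0 for t in terms] for p in proteins]
    return proteins, terms, matrix


def mutual_information(xs, ys):
    """Mutual information in bits between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def column(matrix, j):
    """Extract column j (one GO term across all proteins)."""
    return [row[j] for row in matrix]


proteins, terms, matrix = encode(annotations)
mi_ab = mutual_information(column(matrix, 0), column(matrix, 1))  # GO:A vs GO:B
mi_ac = mutual_information(column(matrix, 0), column(matrix, 2))  # GO:A vs GO:C
print(terms, mi_ab, mi_ac)
```

In this toy data, `GO:A` and `GO:B` annotate exactly the same proteins, so their mutual information is high, while `GO:A` and `GO:C` co-occur no more often than chance. A bi-clustering step would then look for subsets of proteins and terms that form coherent blocks in such a matrix.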
