Supervised Gene Function Prediction Using Spectral Clustering on Gene Co-expression Networks

Miguel Romero, Óscar Ramírez, Jorge Finke, Camilo Rocha

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Gene annotation addresses the problem of predicting unknown functions (e.g., biological processes) that are associated with the genes of a specific organism. Despite recent advances, the cost and time demanded by annotation procedures that rely largely on in vivo biological experiments remain prohibitively high. This paper presents an in silico approach to the annotation of genes that follows a network-based representation and combines techniques from multivariate statistics (spectral clustering) and machine learning (gradient boosting). Spectral clustering is used to enrich the gene co-expression network (GCN) with currently known gene annotations. Gradient boosting is trained on features of the GCN to build an estimator of the probability that a gene is involved in a given biological process. The proposed approach is applied to a case study on Zea mays, one of the world’s most dominant and productive crops. Broadly speaking, the main results illustrate how computational experimentation reduces the time and cost of annotating gene functions. More specifically, the results highlight the importance of network science, multivariate statistics, and machine learning techniques in reducing type I and type II prediction errors.
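The two-stage idea in the abstract (cluster the co-expression network, then train a boosted classifier on network features) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the synthetic network, the three features, and every variable name are assumptions chosen for illustration, and scikit-learn's `SpectralClustering` and `GradientBoostingClassifier` stand in for whatever tooling the paper actually used.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical toy data: 120 genes in 3 planted co-expression modules.
n_genes, n_modules = 120, 3
module = np.repeat(np.arange(n_modules), n_genes // n_modules)

# Symmetric, non-negative affinity matrix standing in for a GCN:
# genes in the same module are more strongly co-expressed.
base = rng.uniform(0.0, 0.2, size=(n_genes, n_genes))
base += 0.6 * (module[:, None] == module[None, :])
affinity = (base + base.T) / 2.0
np.fill_diagonal(affinity, 0.0)

# Synthetic binary annotation for one biological process:
# mostly module 0, with about 10% of labels flipped as noise.
labels = ((module == 0) ^ (rng.random(n_genes) < 0.1)).astype(int)

# Only the training genes count as "currently known" annotations.
train_idx, test_idx = train_test_split(
    np.arange(n_genes), test_size=0.3, random_state=0, stratify=labels
)
train_mask = np.zeros(n_genes, dtype=bool)
train_mask[train_idx] = True

# Step 1: spectral clustering on the precomputed affinity matrix.
clusters = SpectralClustering(
    n_clusters=n_modules, affinity="precomputed", random_state=0
).fit_predict(affinity)

# Step 2: per-gene features (assumed for this sketch).
# (a) weighted degree in the network
degree = affinity.sum(axis=1)
# (b) fraction of known-annotated training genes in the gene's cluster
cluster_rate = np.zeros(n_genes)
for c in np.unique(clusters):
    members = clusters == c
    known = members & train_mask
    if known.any():
        cluster_rate[members] = labels[known].mean()
# (c) affinity-weighted fraction of annotated training neighbours
w_train = affinity[:, train_idx]
neighbour_rate = (w_train @ labels[train_idx]) / w_train.sum(axis=1)

X = np.column_stack([degree, cluster_rate, neighbour_rate])

# Step 3: gradient boosting estimates P(gene is involved in the process).
model = GradientBoostingClassifier(random_state=0)
model.fit(X[train_idx], labels[train_idx])
proba = model.predict_proba(X[test_idx])[:, 1]
```

Thresholding `proba` turns the estimates into predicted annotations; moving the threshold trades false positives (type I errors) against false negatives (type II errors). In this simplified sketch a training gene's own label contributes to its cluster-level feature, which a careful evaluation would avoid.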

Original language: English
Title of host publication: Complex Networks and Their Applications X - Volume 2, Proceedings of the 10th International Conference on Complex Networks and Their Applications COMPLEX NETWORKS 2021
Editors: Rosa Maria Benito, Chantal Cherifi, Hocine Cherifi, Esteban Moro, Luis M. Rocha, Marta Sales-Pardo
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 652-663
Number of pages: 12
ISBN (Print): 9783030934125
State: Published - 2022
Event: 10th International Conference on Complex Networks and Their Applications, COMPLEX NETWORKS 2021 - Madrid, Spain
Duration: 30 Nov 2021 to 02 Dec 2021

Publication series

Name: Studies in Computational Intelligence
Volume: 1016
ISSN (Print): 1860-949X
ISSN (Electronic): 1860-9503

Conference

Conference: 10th International Conference on Complex Networks and Their Applications, COMPLEX NETWORKS 2021
Country/Territory: Spain
City: Madrid
Period: 30/11/21 to 02/12/21
