TY - GEN
T1 - Machine learning techniques applied to the cleavage site prediction problem
AU - Alvarez, Gloria Inés
AU - Bravo, Enrique
AU - Linares, Diego
AU - Vargas, Jheyson Faride
AU - Velasco, Jairo Andrés
N1 - Funding Information: The translation for publication in English was done by John Field Palencia Roth, assistant professor in the Department of Communication and Language of the Faculty of Humanities and Social Sciences at the Pontificia Universidad Javeriana Cali. This work is funded by the Departamento Administrativo de Ciencia, Tecnología e Innovación de Colombia (COLCIENCIAS) under the grant project code 1251-521-28290.
PY - 2013
Y1 - 2013
AB - The genome of the Potyviridae virus family is usually expressed as a polyprotein that can be divided into ten proteins through the action of proteases, enzymes that cut the chain at specific places called cleavage sites. Three techniques were employed to model each cleavage site: Hidden Markov Models (HMM), the Order Independent Language (OIL) grammatical inference algorithm, and Artificial Neural Networks (ANN). In the experiments, the HMM had the best classification performance and was highly robust to class imbalance. The OIL algorithm, however, was found to improve when models were trained on a greater number of samples, regardless of their strong imbalance.
UR - http://www.scopus.com/inward/record.url?scp=84894110937&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-45114-0_39
DO - 10.1007/978-3-642-45114-0_39
M3 - Conference contribution
AN - SCOPUS:84894110937
SN - 9783642451133
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 497
EP - 507
BT - Advances in Artificial Intelligence and Its Applications - 12th Mexican International Conference on Artificial Intelligence, MICAI 2013, Proceedings
T2 - 12th Mexican International Conference on Artificial Intelligence, MICAI 2013
Y2 - 24 November 2013 through 30 November 2013
ER -