Orphelia is a metagenomic ORF finding tool for the prediction of protein coding genes in short, environmental DNA sequences with unknown phylogenetic origin . Orphelia is based on a two-stage machine learning approach that was recently introduced by our group. After the initial extraction of open reading frames (ORFs), linear discriminants are used to extract features from those ORFs. Subsequently, an artificial neural network combines the features and computes a gene probability for each ORF in a fragment. A greedy strategy computes a likely combination of high scoring ORFs with an overlap constraint.
the Department of Bioinformatics of the University of Göttingen
:: MORE INFORMATION
K. J. Hoff, T. Lingner, P. Meinicke, M. Tech (2009)
Orphelia: predicting genes in metagenomic sequencing reads
Nucleic Acids Research, 37:W101-W105.