
Achraf El Allali

Assistant Professor

Faculty

Computer and Information Sciences
Building 31, 2nd floor, room 2119
Publications
Journal Article
2018
Published in:

Feature selection for gene prediction in metagenomic fragments

Background

Computational approaches, specifically machine-learning techniques, play an important role in many metagenomic analysis algorithms, such as gene prediction. Due to the large feature space, current de novo gene prediction algorithms use different combinations of classification algorithms to distinguish between coding and non-coding sequences.

Results

In this study, we apply a filter method to select relevant features from a large set of known features instead of combining them using linear classifiers or ignoring their individual coding potential. We use minimum redundancy maximum relevance (mRMR) to select the most relevant features. Support vector machines (SVM) are trained using these features, and the classification score is transformed into the posterior probability of the coding class. A greedy algorithm uses the probabilities of overlapping candidate genes to select the final genes. Instead of using one model for all sequences, we train an ensemble of SVM models on mutually exclusive datasets based on GC content and use the appropriate model to classify each candidate gene based on the GC content of its read.
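The last two steps described above, routing a read to a GC-specific model and greedily resolving overlapping candidates, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Candidate` type, the bin upper bounds, the 0.5 probability threshold, and the rule that any overlap at all disqualifies a lower-scoring candidate are assumptions made for illustration only. The posterior probabilities are taken as given, standing in for the output of the trained SVM models.

```python
from typing import List, NamedTuple, Sequence


class Candidate(NamedTuple):
    start: int          # inclusive start position on the read
    end: int            # exclusive end position on the read
    probability: float  # posterior probability of the coding class (assumed given)


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    if not sequence:
        return 0.0
    upper = sequence.upper()
    return (upper.count("G") + upper.count("C")) / len(upper)


def pick_model_index(gc: float, upper_bounds: Sequence[float]) -> int:
    """Index of the first GC bin whose upper bound is >= gc.

    The bounds are hypothetical; the source does not state the bin edges.
    """
    for index, bound in enumerate(upper_bounds):
        if gc <= bound:
            return index
    return len(upper_bounds) - 1


def overlaps(a: Candidate, b: Candidate) -> bool:
    """True if the two half-open intervals share at least one position."""
    return a.start < b.end and b.start < a.end


def select_genes(candidates: Sequence[Candidate],
                 threshold: float = 0.5) -> List[Candidate]:
    """Greedy selection: visit candidates from highest to lowest probability
    and keep each one that clears the threshold and does not overlap a
    candidate that was already kept."""
    selected: List[Candidate] = []
    for cand in sorted(candidates, key=lambda c: c.probability, reverse=True):
        if cand.probability < threshold:
            break  # every remaining candidate scores lower still
        if not any(overlaps(cand, kept) for kept in selected):
            selected.append(cand)
    return sorted(selected, key=lambda c: c.start)
```

For example, `pick_model_index(gc_content(read), [0.4, 0.6, 1.0])` would choose among three hypothetical models for a read, and `select_genes` would then be applied to the candidates scored by that model.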

Conclusion

Our proposed algorithm achieves an improvement over some existing algorithms. mRMR produces promising results in gene prediction. It improves classification performance and feature interpretation. Our research serves as a basis for future studies on feature selection for gene prediction.

Volume number
11
Issue number
9
More publications
publications

Next-generation sequencing approaches and genome-wide studies have become essential for characterizing the mechanisms of human diseases.

2019
publications

The development of next-generation sequencing facilitates the study of metagenomics. Computational gene prediction aims to find the location of genes in a given DNA sequence. Gene prediction in…

By Achraf El Allali
2019
publications

Accurate gene prediction in metagenomic fragments is a computationally challenging task due to the short read length and the incomplete, fragmented nature of the data. Most gene-prediction programs…

2018