Achraf El Allali

Assistant Professor

Faculty

Computer and Information Sciences
Building 31, 2nd floor, room 2119
Publications
Journal Article
2018

CNN-MGP: Convolutional Neural Networks for Metagenomics Gene Prediction

Accurate gene prediction in metagenomics fragments is computationally challenging because the reads are short and the data are incomplete and fragmented. Most gene-prediction programs extract a large number of features and then apply statistical or supervised classification approaches to predict genes. In our study, we introduce CNN-MGP, a convolutional neural network program for metagenomics gene prediction that predicts genes in metagenomics fragments directly from raw DNA sequences, without manual feature extraction or feature selection stages. CNN-MGP learns the characteristics of coding and non-coding regions and distinguishes coding from non-coding open reading frames (ORFs). We train 10 CNN models on 10 mutually exclusive datasets defined by pre-defined GC content ranges. We extract the ORFs from each fragment, encode them numerically, and feed them into the CNN model that matches the fragment's GC content. The output of the CNN is the probability that an ORF encodes a gene. Finally, a greedy algorithm selects the final gene list. Overall, CNN-MGP achieves 91% accuracy on the testing dataset. CNN-MGP demonstrates that deep learning can predict genes in metagenomics fragments, with accuracy higher than or comparable to state-of-the-art gene-prediction programs that use pre-defined features.
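
The pipeline described in the abstract (ORF extraction, numerical encoding, GC-based model choice, greedy selection) can be illustrated with a minimal Python sketch. This is not the CNN-MGP implementation. The abstract does not specify the GC ranges, the encoding, the ORF definition, or the greedy criterion, so the sketch assumes ten equal-width GC bins, one-hot encoding, complete forward-strand ORFs (ATG to stop codon), and a greedy rule that keeps the highest-probability ORFs that do not overlap. All function names are hypothetical, and each "model" is any callable that returns a probability, standing in for a trained CNN.

```python
from typing import Callable, List, Sequence, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
ONE_HOT = {"A": [1, 0, 0, 0], "C": [0, 1, 0, 0], "G": [0, 0, 1, 0], "T": [0, 0, 0, 1]}
Scored = Tuple[int, int, float]  # (start, end, probability)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in the sequence (0.0 for an empty string)."""
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_bin(gc: float, n_bins: int = 10) -> int:
    """Map a GC fraction to an equal-width bin index (assumed binning scheme)."""
    return min(int(gc * n_bins), n_bins - 1)


def find_orfs(seq: str, min_len: int = 6) -> List[Tuple[int, int]]:
    """Complete forward-strand ORFs as (start, end), end exclusive (simplified)."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3))
                start = None
    return orfs


def one_hot(seq: str) -> List[List[int]]:
    """Encode each base as a 4-element vector; unknown bases become all zeros."""
    return [ONE_HOT.get(base, [0, 0, 0, 0]) for base in seq]


def greedy_select(scored: Sequence[Scored], threshold: float = 0.5) -> List[Scored]:
    """Keep the highest-probability ORFs that do not overlap one already kept."""
    kept: List[Scored] = []
    for start, end, prob in sorted(scored, key=lambda x: x[2], reverse=True):
        if prob < threshold:
            break  # remaining candidates score even lower
        if all(end <= s or start >= e for s, e, _ in kept):
            kept.append((start, end, prob))
    return sorted(kept)


def predict_genes(
    fragment: str,
    models: Sequence[Callable[[List[List[int]]], float]],
    threshold: float = 0.5,
) -> List[Scored]:
    """Score every ORF with the model for the fragment's GC bin, then select."""
    model = models[gc_bin(gc_content(fragment), len(models))]
    scored = [(s, e, model(one_hot(fragment[s:e]))) for s, e in find_orfs(fragment)]
    return greedy_select(scored, threshold)
```

Calling `predict_genes(fragment, models)` with a list of ten callables returns the selected ORFs as `(start, end, probability)` tuples. A real system would also need to handle the reverse strand and the incomplete ORFs that are common in short fragments, which this sketch leaves out.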

Journal
Interdisciplinary Sciences: Computational Life Sciences
More publications

Next-generation sequencing approaches and genome-wide studies have become essential for characterizing the mechanisms of human diseases.

2019

The development of next-generation sequencing facilitates the study of metagenomics. Computational gene prediction aims to find the location of genes in a given DNA sequence. Gene prediction in…

By Achraf El Allali
2019