AJUNTAMENT D'ALCOI
Generalitat Valenciana
Ayuntamiento de Valencia
Cicloplast
Ayuntamiento de Onil
Anarpla
Ayuntamiento de Mislata
NLWA, North London Waste Authority
Ayuntamiento de Salinas
Zicla
Fondazione Ecosistemi
PEFC
ALQUIENVAS
DIPUTACIÓ DE VALÈNCIA
AYUNTAMIENTO DE REQUENA
UNIVERSIDAD DE ZARAGOZA
OBSERVATORIO CONTRATACIÓN PÚBLICA
AYUNTAMIENTO DE PAIPORTA
AYUNTAMIENTO DE CUENCA
BERL� S.A.
CM PLASTIK
TRANSFORMADORES INDUSTRIALES ECOLÓGICOS
INDUSTRIAS AGAPITO
RUBI KANGURO
If you want to support our LIFE project as a STAKEHOLDER, please contact us at life-future-project@aimplas.es
In this section, you can access the latest technical information related to the FUTURE project topic.
Classification of bacterial plasmid and chromosome derived sequences using machine learning
Plasmids are important genetic elements that facilitate horizontal gene transfer between bacteria and contribute to the spread of virulence and antimicrobial resistance. Most bacterial genome sequences in the public archives exist in draft form with many contigs, making it difficult to determine whether a contig is of chromosomal or plasmid origin. Using a training set of contigs comprising 10,584 chromosomes and 10,654 plasmids from the PATRIC database, we evaluated several machine learning models, including random forest, logistic regression, XGBoost, and a neural network, for their ability to classify chromosomal and plasmid sequences using nucleotide k-mers as features. Of the methods tested, a neural network model that used nucleotide 6-mers as features and was trained on randomly selected chromosomal and plasmid subsequences 5 kb in length achieved the best performance, outperforming existing out-of-the-box methods, with an average accuracy of 89.38% ± 2.16% over 10-fold cross-validation. The model accuracy can be improved to 92.08% by using a voting strategy when classifying holdout sequences. In both plasmids and chromosomes, subsequences encoding functions involved in horizontal gene transfer (including hypothetical proteins, transporters, phage, mobile elements, and CRISPR elements) were most likely to be misclassified by the model. This study provides a straightforward approach for identifying plasmid-encoding sequences in short read assemblies without the need for sequence alignment-based tools.
» Authors: Xiaohui Zou, Marcus Nguyen, Jamie Overbeek, Bin Cao, James J. Davis
» Reference: https://doi.org/10.1371/journal.pone.0279280
» Publication Date: 16/12/2022
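The abstract mentions two ingredients that are easy to picture in code: nucleotide 6-mer frequencies as features, and a voting strategy over 5 kb subsequences of a contig. The sketch below is only an illustration under stated assumptions and is not the authors' implementation. The function names (`kmer_frequencies`, `split_windows`, `majority_vote`, `classify_contig`) are hypothetical, the windows are taken as consecutive non-overlapping slices rather than randomly selected subsequences, and `classify_window` stands in for whatever trained model is available.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=6):
    """Return a list of length 4**k with normalized k-mer counts.

    k-mers containing characters other than A, C, G, T are skipped.
    """
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = [0] * (4 ** k)
    total = 0
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:
            counts[pos] += 1
            total += 1
    if total == 0:
        return [0.0] * (4 ** k)
    return [c / total for c in counts]


def split_windows(seq, window=5000, k=6):
    """Cut a contig into consecutive windows (assumption: non-overlapping).

    Trailing pieces shorter than k are dropped because they hold no k-mer.
    """
    pieces = [seq[i:i + window] for i in range(0, len(seq), window)]
    return [p for p in pieces if len(p) >= k]


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("no labels to vote on")
    return Counter(labels).most_common(1)[0][0]


def classify_contig(seq, classify_window, window=5000, k=6):
    """Label each window with `classify_window` and vote on the contig.

    `classify_window` is a hypothetical callable that maps a feature
    vector to a label such as "plasmid" or "chromosome".
    """
    windows = split_windows(seq, window=window, k=k)
    labels = [classify_window(kmer_frequencies(w, k=k)) for w in windows]
    return majority_vote(labels)
```

In practice `classify_window` would wrap a trained model's prediction call; any callable that accepts the feature vector and returns a label works with this sketch.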
C/ Gustave Eiffel, 4
(València Parc Tecnològic) - 46980
PATERNA (Valencia) - SPAIN
(+34) 96 136 60 40
Project Management department - Sustainability and Industrial Recovery
life-future-project@aimplas.es