Use este identificador para citar ou linkar para este item: https://locus.ufv.br//handle/123456789/12157
Tipo: Artigo
Título: MUMAL2: Improving sensitivity in shotgun proteomics using cost sensitive artificial neural networks and a threshold selector algorithm
Autor(es): Cerqueira, Fabio Ribeiro
Ricardo, Adilson Mendes
Oliveira, Alcione de Paiva
Graber, Armin
Baumgartner, Christian
Abstract: This work presents a machine learning strategy to increase sensitivity in tandem mass spectrometry (MS/MS) data analysis for peptide/protein identification. MS/MS yields thousands of spectra in a single run which are then interpreted by software. Most of these computer programs use a protein database to match peptide sequences to the observed spectra. The peptide-spectrum matches (PSMs) must also be assessed by computational tools since manual evaluation is not practicable. The target-decoy database strategy is largely used for error estimation in PSM assessment. However, in general, that strategy does not account for sensitivity. In a previous study, we proposed the method MUMAL that applies an artificial neural network to effectively generate a model to classify PSMs using decoy hits with increased sensitivity. Nevertheless, the present approach shows that the sensitivity can be further improved with the use of a cost matrix associated with the learning algorithm. We also demonstrate that using a threshold selector algorithm for probability adjustment leads to more coherent probability values assigned to the PSMs. Our new approach, termed MUMAL2, provides a two-fold contribution to shotgun proteomics. First, the increase in the number of correctly interpreted spectra in the peptide level augments the chance of identifying more proteins. Second, the more appropriate PSM probability values that are produced by the threshold selector algorithm impact the protein inference stage performed by programs that take probabilities into account, such as ProteinProphet. Our experiments demonstrate that MUMAL2 reached around 15% of improvement in sensitivity compared to the best current method. Furthermore, the area under the ROC curve obtained was 0.93, demonstrating that the probabilities generated by our model are in fact appropriate. Finally, Venn diagrams comparing MUMAL2 with the best current method show that the number of exclusive peptides found by our method was nearly 4-fold higher, which directly impacts the proteome coverage. The inclusion of a cost matrix and a probability threshold selector algorithm to the learning task further improves the target-decoy database analysis for identifying peptides, which optimally contributes to the challenging task of protein level identification, resulting in a powerful computational tool for shotgun proteomics.
Palavras-chave: Artificial neural network
Cost sensitive classification
Peptide/protein identification
Phosphoproteomics
Shotgun proteomics
Data mining
Editor: BMC Bioinformatics
Tipo de Acesso: Open Access
URI: https://doi-org.ez35.periodicos.capes.gov.br/10.1186/s12859-016-1341-x
http://www.locus.ufv.br/handle/123456789/12157
Data do documento: 15-Dez-2016
Aparece nas coleções:Artigos

Arquivos associados a este item:
Arquivo Descrição TamanhoFormato 
document.pdftexto completo1,94 MBAdobe PDFThumbnail
Visualizar/Abrir


Os itens no repositório estão protegidos por copyright, com todos os direitos reservados, salvo quando é indicado o contrário.