Comparison of methodologies for identifying differentially expressed genes in swine RNA-Seq experiments
Date
2015-04-08
Publisher
Universidade Federal de Viçosa
Abstract
One of the main challenges in molecular biology is to measure and evaluate gene expression profiles under different conditions in order to understand the mechanisms of molecular transformation. To this end, RNA-Seq profiles the transcriptome with next-generation sequencing (NGS) technologies, in which RNA is converted into a library of cDNA fragments that is sequenced to produce millions of reads. Once gene expression levels have been quantified by read mapping, hypotheses about differential expression (DE) between the conditions under study need to be tested. This calls for the development and refinement of statistical methods for analysing data generated by sequencing platforms. The overall objective of this study was to evaluate the behaviour of three methods (DEGSeq, baySeq and DESeq) for detecting differential expression in the longissimus dorsi (LD) muscle of pigs of the Piau and Commercial breeds at 21 and 90 days post coitum (dpc), using RNA-Seq data in scenarios without replicates. Under the conditions of the experiment, the comparisons of baySeq with DEGSeq and of baySeq with DESeq, based on the expression ratio (fold change) between the two breeds (Commercial and Piau), showed that the methods performed differently, yielding unequal expression levels. DESeq and DEGSeq, in contrast, performed comparably and agreed with each other.
Overall, most of the DE genes were identified in the late prenatal phase, at 90 dpc. In addition, across breeds and methods, most of them were down-regulated in the early prenatal phase (21 dpc) and up-regulated in the late prenatal phase (90 dpc).
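As a rough illustration of the two quantities the abstract relies on, the sketch below computes a per-gene log2 fold change between two unreplicated samples and measures how far two lists of DE genes agree. It is not the procedure used in the thesis: DEGSeq, baySeq and DESeq are R/Bioconductor packages with their own statistical models, whereas this is plain Python. The gene names, read counts, pseudocount and the Jaccard index as an agreement measure are all assumptions made for the example.

```python
import math


def log2_fold_change(counts_a, counts_b, pseudocount=1.0):
    """Per-gene log2 ratio (sample B over sample A) of counts per million.

    Assumes both dictionaries contain the same genes. The pseudocount is an
    illustrative choice that avoids taking the logarithm of zero.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    result = {}
    for gene in counts_a:
        # Scale by library size so samples of different depth are comparable.
        cpm_a = (counts_a[gene] + pseudocount) / total_a * 1e6
        cpm_b = (counts_b[gene] + pseudocount) / total_b * 1e6
        result[gene] = math.log2(cpm_b / cpm_a)
    return result


def jaccard(set_a, set_b):
    """Share of genes called by both methods among genes called by either."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


# Hypothetical read counts for one sample per breed (no replicates).
commercial = {"geneA": 100, "geneB": 50, "geneC": 10, "geneD": 840}
piau = {"geneA": 100, "geneB": 200, "geneC": 10, "geneD": 690}

lfc = log2_fold_change(commercial, piau)

# Hypothetical DE gene lists reported by two different methods.
called_by_method_1 = {"g1", "g2", "g3"}
called_by_method_2 = {"g2", "g3", "g4"}
agreement = jaccard(called_by_method_1, called_by_method_2)
```

A positive value in `lfc` means higher expression in the second sample (Piau here), a negative value means lower expression. Without replicates there is no within-group variance estimate, which is why the methods compared in the study must make different modelling assumptions and may disagree.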
Keywords
Statistics, Biometry, Methodologies - Analysis, Bioinformatics, Swine, Nucleotide sequencing, Gene expression regulation
Citation
SOUZA, Pâmela Tamiris Caldas Serra de. Comparação de metodologias para identificação de genes diferencialmente expressos em experimentos de RNA-Seq de suínos. 2015. 37 f. Dissertation (Master's in Applied Statistics and Biometry) - Universidade Federal de Viçosa, Viçosa, 2015.