Informática
Permanent URI for this community: https://locus.ufv.br/handle/123456789/11780
Search results: 9 items
Item: Ambiente para busca e visualização de documentos históricos na Web (Perspectivas em Ciência da Informação, 2011-07)
Possi, Maurilio de Araujo; Oliveira, Alcione de Paiva; Mendes, Fábio; Queiroz, Jonas Marçal; Moreira, Alexandra
Historical documents are essential tools for historians. In many cases, access to these documents can be difficult due to numerous factors, such as distance, security, and the fragility of the document itself. One way around this problem is to digitize the documents and make them accessible over a communication network, the Internet. This article presents an environment for viewing historical documents, developed within the project for the digitization of manuscripts from the collections of the 1st and 2nd Notary Offices of the Historical Archive of the Casa Setecentista de Mariana - PHAN.

Item: Erratum to: Mirnacle: machine learning with SMOTE and random forest for improving selectivity in pre-miRNA ab initio prediction (BMC Bioinformatics, 2017)
Marques, Yuri Bento; Oliveira, Alcione de Paiva; Vasconcelos, Ana Tereza Ribeiro; Cerqueira, Fabio Ribeiro
MicroRNAs (miRNAs) are key gene expression regulators in plants and animals. Therefore, miRNAs are involved in several biological processes, making the study of these molecules one of the most relevant topics of molecular biology today. However, characterizing miRNAs in vivo is still a complex task. As a consequence, in silico methods have been developed to predict miRNA loci. A common ab initio strategy to find miRNAs in genomic data is to search for sequences that can fold into the typical hairpin structure of miRNA precursors (pre-miRNAs). Current ab initio approaches, however, have selectivity issues, i.e., they report a high number of false positives, which can lead to laborious and costly attempts at biological validation. This study presents an extension of the ab initio method miRNAFold that aims to improve selectivity through machine learning techniques, namely random forest combined with the SMOTE procedure to cope with imbalanced datasets. By comparing our method, termed Mirnacle, with other important approaches in the literature, we demonstrate that Mirnacle substantially improves selectivity without compromising sensitivity. For the three datasets used in our experiments, our method achieved at least 97% sensitivity and delivered two-fold, 20-fold, and 6-fold increases in selectivity, respectively, compared with the best results of current computational tools. The extension of miRNAFold through the introduction of machine learning techniques significantly increases selectivity in pre-miRNA ab initio prediction, which contributes to advanced studies on miRNAs by reducing the need for biological validation. Hopefully, new research, such as studies of severe diseases caused by miRNA malfunction, will benefit from the proposed computational tool.
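The class-balancing strategy named in the Mirnacle abstract above (SMOTE oversampling followed by a random forest) can be sketched in a few lines. The snippet below is a minimal, generic illustration of that recipe, assuming scikit-learn and imbalanced-learn; the feature matrix, labels, and hyperparameters are placeholders, not Mirnacle's actual pipeline.

```python
# Illustrative sketch only: SMOTE oversampling followed by a random forest,
# a generic recipe for imbalanced classification (not Mirnacle's own code).
from imblearn.over_sampling import SMOTE
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

def train_smote_random_forest(X, y, random_state=42):
    """X: hairpin-derived feature matrix; y: 1 for true pre-miRNA, 0 otherwise."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=random_state)
    # Oversample the minority (true pre-miRNA) class in the training fold only,
    # so the test fold keeps the original class imbalance.
    X_bal, y_bal = SMOTE(random_state=random_state).fit_resample(X_train, y_train)
    clf = RandomForestClassifier(n_estimators=500, random_state=random_state)
    clf.fit(X_bal, y_bal)
    print(classification_report(y_test, clf.predict(X_test)))
    return clf
```

Resampling only the training fold is the key design point: applying SMOTE before the split would leak synthetic neighbors of test examples into training and inflate the measured selectivity.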
Item: Multi-objective variable neighborhood search algorithms for a single machine scheduling problem with distinct due windows (Electronic Notes in Theoretical Computer Science, 2011-12-29)
Arroyo, José Elias Claudio; Ottoni, Rafael dos Santos; Oliveira, Alcione de Paiva
In this paper, we compare three multi-objective algorithms based on the Variable Neighborhood Search (VNS) heuristic. The algorithms are applied to the single machine scheduling problem with sequence-dependent setup times and distinct due windows, in which we minimize the total weighted earliness/tardiness and the total flowtime criteria. We introduce two intensification procedures to improve a multi-objective VNS (MOVNS) algorithm proposed in the literature. The performance of the algorithms is tested on a set of medium and large instances of the problem. The computational results show that the proposed algorithms outperform the original MOVNS algorithm in terms of solution quality. A statistical analysis is conducted to assess the performance of the proposed methods.
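As a rough illustration of the multi-objective VNS idea in the abstract above, the sketch below keeps a Pareto archive of non-dominated solutions and escalates the neighborhood size when no non-dominated candidate is found. The `evaluate`, `shake`, and `local_search` callbacks are placeholders supplied by the user; nothing here reproduces the paper's specific neighborhoods or intensification procedures.

```python
# Hedged sketch of a generic multi-objective VNS loop with a Pareto archive.
# Objectives, neighborhoods, and problem data are illustrative placeholders.
import random

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def update_archive(archive, sol, objs):
    """Keep only mutually non-dominated (solution, objectives) pairs."""
    archive = [(s, o) for s, o in archive if not dominates(objs, o)]
    archive.append((sol, objs))
    return archive

def movns(initial, evaluate, shake, local_search, k_max=3, iterations=1000):
    """Shake in neighborhood k, locally improve, and restart from k=1 whenever
    the candidate is not dominated by the current archive."""
    archive = [(initial, evaluate(initial))]
    current = initial
    for _ in range(iterations):
        k = 1
        while k <= k_max:
            candidate = local_search(shake(current, k))
            objs = evaluate(candidate)
            if not any(dominates(o, objs) for _, o in archive):
                archive = update_archive(archive, candidate, objs)
                current, k = candidate, 1   # non-dominated move: restart neighborhoods
            else:
                k += 1                      # dominated move: try a larger neighborhood
        current, _ = random.choice(archive) # diversify by restarting from the archive
    return archive
```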
Item: NICeSim: An open-source simulator based on machine learning techniques to support medical research on prenatal and perinatal care decision making (Artificial Intelligence in Medicine, 2014-10-05)
Cerqueira, Fabio Ribeiro; Ferreira, Tiago Geraldo; Oliveira, Alcione de Paiva; Augusto, Douglas Adriano; Krempser, Eduardo; Barbosa, Helio José Corrêa; Franceschini, Sylvia do Carmo Castro; Freitas, Brunnella Alcantara Chagas de; Gomes, Andreia Patricia; Siqueira-Batista, Rodrigo
This paper describes NICeSim, an open-source simulator that uses machine learning (ML) techniques to help health professionals better understand the treatment and prognosis of premature newborns. The application was developed and tested using data collected in a Brazilian hospital. The available data were used to feed an ML pipeline designed to create a simulator capable of predicting the outcome (death probability) for newborns admitted to neonatal intensive care units. Unlike previous scoring systems, however, our computational tool is not primarily intended for use at the patient's bedside, although that is possible. Our primary goal is to deliver a computational system to aid medical research in understanding the correlation of key variables with the studied outcome, so that new standards can be established for future clinical decisions. In the implemented simulation environment, the values of key attributes can be changed through a user-friendly interface, and the impact of each change on the outcome is immediately reported, allowing a quantitative analysis in addition to a qualitative investigation and delivering a fully interactive computational tool that facilitates hypothesis construction and testing. Our statistical experiments showed that the resulting model for death prediction achieved an accuracy of 86.7% and an area under the receiver operating characteristic curve of 0.84 for the positive class. Using this model, three physicians and a neonatal nutritionist performed simulations with key variables correlated with chance of death. The results indicated important tendencies for the effect of each variable, and of combinations of variables, on prognosis. We could also observe values of gestational age and birth weight for which a low Apgar score and the occurrence of respiratory distress syndrome (RDS) could be more or less severe. For instance, we noticed that for a newborn weighing 2000 g or more, the occurrence of RDS is far less problematic than for lighter neonates. The significant accuracy demonstrated by our predictive model shows that NICeSim might be used for hypothesis testing to minimize in vivo experiments. We observed that the model delivers predictions in very good agreement with the literature, demonstrating that NICeSim might be an important tool for supporting decision making in medical practice. Other very important characteristics of NICeSim are its flexibility and dynamism. NICeSim is flexible because it allows the inclusion and deletion of variables according to the requirements of a particular study. It is also dynamic because it trains a just-in-time model; therefore, the system improves as data from new patients become available. Finally, NICeSim can be extended in a cooperative manner because it is an open-source system.
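The core interaction described above (change a key attribute, immediately see the new predicted death probability) amounts to a what-if query against a trained probabilistic model. The sketch below illustrates that pattern with scikit-learn and pandas; the classifier choice, column names such as birth_weight and apgar5, and the data layout are assumptions for illustration, not NICeSim's actual implementation.

```python
# Hedged sketch: a what-if simulation on top of a trained risk model,
# in the spirit of the simulator described above (not NICeSim's actual code).
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier  # stand-in model choice

def train_risk_model(df: pd.DataFrame, outcome_col: str = "death"):
    """Fit a probabilistic classifier on neonatal records (placeholder columns)."""
    X, y = df.drop(columns=[outcome_col]), df[outcome_col]
    model = GradientBoostingClassifier().fit(X, y)
    return model, list(X.columns)

def what_if(model, feature_names, baseline: dict, **overrides):
    """Return the predicted death probability after changing selected attributes,
    e.g. what_if(model, cols, baseline, birth_weight=2000, apgar5=4)."""
    record = {**baseline, **overrides}           # copy the baseline patient, apply changes
    X = pd.DataFrame([record], columns=feature_names)
    return float(model.predict_proba(X)[0, 1])   # probability of the positive (death) class
```

A user-facing interface only needs to call `what_if` repeatedly as sliders or fields change, which is what makes the quantitative, interactive exploration described in the abstract possible.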
Item: Uso de ontologias para a extração de informações em atos jurídicos em uma instituição pública (Revista Eletrônica de Biblioteconomia e Ciência da Informação, 2004-11-16)
Batres, Eduardo Jaime Quirós; Oliveira, Alcione de Paiva; Gabrielli, Bruno Ventorim; Amorim, Vinci Pegoretti; Moreira, Alexandra
With the expansion of the Internet and the general availability of information, citizens and organizations increasingly expect to have at their disposal not only information concerning third parties, but also information about themselves or that directly affects them. This context includes regulations in general and, more specifically, the acts issued by the public service. This work presents an automated tool, based on automatic information extraction techniques, intended to extract the main information contained in the administrative acts of the Universidade Federal de Viçosa (UFV), aiming to facilitate the broad use of this information which, being public in nature, is of interest beyond the boundaries of the issuing body. This required extracting and structuring the information contained in the many electronic documents scattered across the issuing offices. The tool uses an ontology built specifically for this purpose, enabling the generation of a knowledge base whose content reflects the mandatory fields required to characterize an administrative act.

Item: Domain class diagram validation procedure based on mereological analysis for part-whole relations (Revista Brasileira de Computação Aplicada, 2014-10)
Catossi, Bruna Carolina de Melo; Oliveira, Alcione de Paiva; Lisboa Filho, Jugurta; Braga, José Luis
The difficulty software developers face in building conceptual models faithful to reality is long-standing. Some ontological analysis techniques exist to help the modeler during the creation of class diagrams. However, they end up not being practical and fail to deliver their real benefits in practice, since they involve many philosophical concepts, which makes them complex for ordinary modelers. For this reason, procedures that simplify the understanding of these concepts and that are closer to developers' practical reality have emerged, such as PrOntoCon, which is discussed in this work. The main goal of PrOntoCon is to guide the modeler through the validation of a UML class diagram for any domain, focusing especially on aggregation/composition and simple association relationships, since these are the relationship types that generate the most doubt and controversy during modeling. The procedure thus provides the support needed for the correct identification of these relations, promoting a deeper study of the constraints of the domain at hand. PrOntoCon therefore combines the modeling power of UML with the theory of ontological analysis of part-whole and association relationships to create a procedure capable of producing clearer and more reliable conceptual models, which can in turn lead to more robust and maintainable systems.

Item: Ontology supported system for searching evidence of wild animals trafficking in social network posts (Revista Brasileira de Computação Aplicada, 2014-04)
Carrasco, Rafael da Silva; Oliveira, Alcione de Paiva; Lisboa Filho, Jugurta; Moreira, Alexandra
The illegal trade in wild animals is one of the most lucrative criminal activities today. In Brazil, the great variety of native fauna has fed the illegal market, with serious environmental and social implications. Fighting the illegal wildlife trade is crucial to help protect natural resources and prevent the spread of other forms of crime. This kind of illegal trade increasingly uses the Internet to carry out its activities. To combat such crime, an automatic monitoring system is essential. However, to perform this task effectively, the system must be able to analyze the messages exchanged during this practice, which requires knowledge of the concepts and relations that occur in this domain. This article presents a multi-agent system supported by a domain ontology and semantic frames to search for evidence of illegal trade in wild animals. The article shows how the system can be used to track the illegal wildlife trade and presents the results of applying the system to a small corpus.
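To make the ontology-supported search idea above concrete, the toy snippet below flags a post as potential evidence when it mentions both a protected-species concept and a commercial-transaction concept. The term lists are invented examples, and this keyword matching is far simpler than the domain ontology and semantic frames the system actually uses.

```python
# Toy illustration of ontology-supported evidence search in post text.
# Concept term lists are invented examples, not the system's ontology.
import re

ONTOLOGY_TERMS = {
    "wild_animal": {"arara", "papagaio", "jabuti", "mico", "tucano"},
    "commerce": {"vendo", "compro", "troco", "encomenda", "frete"},
}

def matched_concepts(post: str) -> dict:
    """Map each ontology concept to the terms it matched in the post."""
    tokens = set(re.findall(r"\w+", post.lower()))
    return {concept: terms & tokens for concept, terms in ONTOLOGY_TERMS.items()}

def is_potential_evidence(post: str) -> bool:
    """Flag posts that mention both an animal concept and a commerce concept."""
    hits = matched_concepts(post)
    return bool(hits["wild_animal"]) and bool(hits["commerce"])

# Example: is_potential_evidence("Vendo filhote de arara, entrego na região") -> True
```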
Item: Mirnacle: machine learning with SMOTE and random forest for improving selectivity in pre-miRNA ab initio prediction (BMC Bioinformatics, 2016-12-15)
Marques, Yuri Bento; Oliveira, Alcione de Paiva; Vasconcelos, Ana Tereza Ribeiro; Cerqueira, Fabio Ribeiro
MicroRNAs (miRNAs) are key gene expression regulators in plants and animals. Therefore, miRNAs are involved in several biological processes, making the study of these molecules one of the most relevant topics of molecular biology today. However, characterizing miRNAs in vivo is still a complex task. As a consequence, in silico methods have been developed to predict miRNA loci. A common ab initio strategy to find miRNAs in genomic data is to search for sequences that can fold into the typical hairpin structure of miRNA precursors (pre-miRNAs). Current ab initio approaches, however, have selectivity issues, i.e., they report a high number of false positives, which can lead to laborious and costly attempts at biological validation. This study presents an extension of the ab initio method miRNAFold that aims to improve selectivity through machine learning techniques, namely random forest combined with the SMOTE procedure to cope with imbalanced datasets. By comparing our method, termed Mirnacle, with other important approaches in the literature, we demonstrate that Mirnacle substantially improves selectivity without compromising sensitivity. For the three datasets used in our experiments, our method achieved at least 97% sensitivity and delivered two-fold, 20-fold, and 6-fold increases in selectivity, respectively, compared with the best results of current computational tools. The extension of miRNAFold through the introduction of machine learning techniques significantly increases selectivity in pre-miRNA ab initio prediction, which contributes to advanced studies on miRNAs by reducing the need for biological validation. Hopefully, new research, such as studies of severe diseases caused by miRNA malfunction, will benefit from the proposed computational tool.

Item: MUMAL2: Improving sensitivity in shotgun proteomics using cost sensitive artificial neural networks and a threshold selector algorithm (BMC Bioinformatics, 2016-12-15)
Cerqueira, Fabio Ribeiro; Ricardo, Adilson Mendes; Oliveira, Alcione de Paiva; Graber, Armin; Baumgartner, Christian
This work presents a machine learning strategy to increase sensitivity in tandem mass spectrometry (MS/MS) data analysis for peptide/protein identification. MS/MS yields thousands of spectra in a single run, which are then interpreted by software. Most of these computer programs use a protein database to match peptide sequences to the observed spectra. The peptide-spectrum matches (PSMs) must also be assessed by computational tools, since manual evaluation is not practicable. The target-decoy database strategy is widely used for error estimation in PSM assessment. However, in general, that strategy does not account for sensitivity. In a previous study, we proposed the method MUMAL, which applies an artificial neural network and decoy hits to generate a model that classifies PSMs with increased sensitivity. The present approach shows that sensitivity can be further improved by associating a cost matrix with the learning algorithm. We also demonstrate that using a threshold selector algorithm for probability adjustment leads to more coherent probability values assigned to the PSMs. Our new approach, termed MUMAL2, provides a two-fold contribution to shotgun proteomics. First, the increase in the number of correctly interpreted spectra at the peptide level raises the chance of identifying more proteins. Second, the more appropriate PSM probability values produced by the threshold selector algorithm benefit the protein inference stage performed by programs that take probabilities into account, such as ProteinProphet. Our experiments demonstrate that MUMAL2 achieved an improvement of around 15% in sensitivity compared to the best current method. Furthermore, the area under the ROC curve obtained was 0.93, demonstrating that the probabilities generated by our model are in fact appropriate. Finally, Venn diagrams comparing MUMAL2 with the best current method show that the number of exclusive peptides found by our method was nearly 4-fold higher, which directly impacts proteome coverage. The inclusion of a cost matrix and a probability threshold selector algorithm in the learning task further improves target-decoy database analysis for identifying peptides, which contributes to the challenging task of protein-level identification and results in a powerful computational tool for shotgun proteomics.
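The two ideas named in the MUMAL2 abstract (a cost matrix attached to the learning algorithm and a threshold selector for the output probabilities) can be sketched generically. The snippet below assumes PyTorch and NumPy; the network architecture, the cost ratio, and the F-measure criterion for the threshold are illustrative assumptions and do not reproduce MUMAL2's actual model or selector.

```python
# Hedged sketch: cost-sensitive binary neural network plus a simple probability
# threshold selector (illustrative assumptions, not MUMAL2's implementation).
import numpy as np
import torch
import torch.nn as nn

def train_cost_sensitive_net(X_train, y_train, false_negative_cost=5.0, epochs=200):
    """For binary classification, an asymmetric 2x2 cost matrix reduces to a class
    weight; here it is encoded as pos_weight so that missing a true PSM costs more
    than accepting a decoy."""
    X = torch.tensor(X_train, dtype=torch.float32)
    y = torch.tensor(y_train, dtype=torch.float32).unsqueeze(1)
    model = nn.Sequential(nn.Linear(X.shape[1], 16), nn.ReLU(), nn.Linear(16, 1))
    loss_fn = nn.BCEWithLogitsLoss(pos_weight=torch.tensor(false_negative_cost))
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
    return model

def select_threshold(model, X_val, y_val):
    """Scan candidate probability cut-offs on a validation set and keep the one
    maximizing F-measure, one simple criterion a threshold selector might use."""
    with torch.no_grad():
        p = torch.sigmoid(model(torch.tensor(X_val, dtype=torch.float32))).numpy().ravel()
    best_t, best_f1 = 0.5, -1.0
    for t in np.linspace(0.05, 0.95, 19):
        pred = (p >= t).astype(int)
        tp = int(((pred == 1) & (y_val == 1)).sum())
        fp = int(((pred == 1) & (y_val == 0)).sum())
        fn = int(((pred == 0) & (y_val == 1)).sum())
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t
```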