Machine learning and multivariate techniques in the study of charcoal quality
Date
2019-02-18
Publisher
Universidade Federal de Viçosa
Abstract
Studies of the variables that determine charcoal quality, and of the influence of the source material, guide the planning of programs to select the best genotypes for charcoal production. New methods of analysis suited to the study of charcoal properties allow the data to be evaluated from different angles and broaden the possibilities for research in the area. The objective of this work was therefore to use machine learning procedures and multivariate techniques to analyze the yield and quality of charcoal produced from Corymbia clones. The samples were obtained from a seven-year-old clonal plantation established in the municipality of Dionísio, MG.

The first chapter presents the results of using the random forest algorithm to study the influence of wood properties on the yield and quality properties of charcoal, and compares the accuracy of the values predicted by random forest with those predicted by support vector regression and multiple linear regression. Holocellulose content, heartwood/sapwood ratio and wood basic density were the most important variables for modeling via machine learning. In terms of accuracy, random forest was superior to the other methods according to the coefficient of determination, the linear correlation between observed and predicted values, the mean absolute error and the root mean squared error, and its performance was adequate for the algorithm to be a viable means of estimating charcoal properties.

The second chapter reports the use of Fisher's discriminant function to classify the Corymbia clones by their potential for charcoal production in terms of yield and quality. The data were first tested for the assumptions of multivariate normality and homogeneity of variance/covariance matrices, and multivariate analysis of variance (MANOVA) was then applied. The MANOVA results showed a multivariate difference among the clones. From the sum of squares and products matrices of the residual and of the clone effect, the coefficients of the first two discriminant functions were estimated; together these retained approximately 80% of the information contained in the data set. The two discriminant functions were used to calculate two canonical variables, which are functions of the observed charcoal variables. Comparing the clones by the means of the canonical variables showed that genotype AMF 1119 is the most suitable for charcoal production.
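The four accuracy criteria named in the abstract for comparing random forest, support vector regression and multiple linear regression can be illustrated with a minimal sketch. It assumes the coefficient of determination is computed as 1 - SS_res/SS_tot (the dissertation may define it differently, for example as the squared correlation). The function name `regression_metrics` and the sample values are hypothetical and are not taken from the dissertation.

```python
import math


def regression_metrics(observed, predicted):
    """Return R2, Pearson r, MAE and RMSE for paired observed/predicted values.

    Assumes neither sequence is constant, otherwise R2 and r are undefined.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")

    n = len(observed)
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in residuals)                    # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)       # total sum of squares
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((o - mean_obs) * (p - mean_pred)
              for o, p in zip(observed, predicted))

    return {
        "r2": 1 - ss_res / ss_tot,                # coefficient of determination (assumed form)
        "r": cov / math.sqrt(ss_tot * ss_pred),   # linear correlation, observed vs predicted
        "mae": sum(abs(e) for e in residuals) / n,  # mean absolute error
        "rmse": math.sqrt(ss_res / n),            # root mean squared error
    }


# Hypothetical charcoal yield values (%), for illustration only.
observed = [30.0, 32.0, 34.0, 36.0]
predicted = [31.0, 32.0, 33.0, 36.0]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Running the same function on the predictions of each method, over the same observations, gives directly comparable values: higher R2 and r, and lower MAE and RMSE, indicate the more accurate method.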
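The discriminant analysis summarized in the abstract can be written out in its standard textbook form. The symbols below are assumptions introduced for illustration and are not taken from the dissertation: E is the residual sum of squares and products matrix, H is the corresponding matrix for the clone effect, p is the number of charcoal variables, g is the number of clones, and y is the vector of observed charcoal variables. The dissertation may scale the coefficient vectors differently.

```latex
% Coefficients a_j of the discriminant functions: eigenvectors of E^{-1} H
(\mathbf{H} - \lambda_j \mathbf{E})\,\mathbf{a}_j = \mathbf{0},
\qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_s \ge 0,
\qquad s = \min(p,\; g - 1)

% Canonical variables built from the first two discriminant functions
Z_j = \mathbf{a}_j^{\top}\mathbf{y}, \qquad j = 1, 2

% Share of the between-clone variation retained by the first two functions
\frac{\lambda_1 + \lambda_2}{\sum_{j=1}^{s} \lambda_j} \approx 0.80
```

Under this formulation, the figure of approximately 80% reported in the abstract corresponds to the ratio in the last line, and the clones are compared through the clone means of Z_1 and Z_2.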
Keywords
Fisher functions, Corymbia, Algorithms, Variables (Mathematics), Charcoal - Quality
Citation
PEREIRA, Kaléo Dias. Aprendizagem de máquina e técnicas multivariadas no estudo da qualidade do carvão vegetal. 2019. 53 f. Dissertação (Mestrado em Estatística Aplicada e Biometria) - Universidade Federal de Viçosa, Viçosa. 2019.