Artificial intelligence applied to the global pedological dimension to model, map, and understand biota-soil-climate interactions and organic carbon dynamics
Publisher
Universidade Federal de Viçosa
Abstract
Soil is a central component of the terrestrial biosphere, supporting critical ecological functions such as food production, regulation of biogeochemical cycles, and carbon sequestration. However, its functional complexity, particularly in the subsurface compartments, has historically been neglected in global carbon models. This gap is especially critical in the face of climate change, which may shift soils from carbon sinks to sources. In this context, the objective of this research was to develop a functional and explainable approach, based on artificial intelligence, to model and understand the dynamics of soil organic carbon (SOC) at a global scale, with a focus on biological predictors: fungi, roots, and respiration. The research was structured into four complementary studies, all anchored in the quantile random forest (QRF) algorithm as the predictive modeling strategy. This approach enabled robust estimation not only of median values but also of quantiles and uncertainty intervals. The first study investigated the global distribution of soil fungal biodiversity using GSM database records and machine learning algorithms; the second mapped functional root traits using integrated global datasets; the third quantified current and future rates of soil respiration (Rs and Rh) and their edaphoclimatic predictors; and the fourth integrated soil, biological, and climatic variables into an explainable SOC model, using Shapley values, partial dependence plots, and functional dominance maps to interpret the results. The findings revealed that fungal distribution was highly heterogeneous, with poor representation in tropical and arid regions, precisely where fungal diversity and functional specialization are most critical. Roots exerted strong control over rhizosphere structure and nutrient cycling, with mycorrhizal colonization emerging as a key predictor in tropical environments.
Soil respiration modeling highlighted the moderating role of soil texture: clay soils protected SOC and buffered warming responses, whereas sandy soils amplified climatic vulnerability. Finally, the SOC models uncovered nonlinear and context-dependent interactions between biota, climate, and soil, with distinct spatial patterns of predictive dominance. These results exposed the fragility of traditional models that ignore soil biota and rely solely on climatic or vegetation proxies derived from remote sensing. The inclusion of functional variables and explainable modeling approaches revealed hierarchies and synergies among carbon-controlling factors, pointing to a new methodological paradigm: Soil-Informed Machine Learning. The findings also showed that conditions promoting SOC accumulation in certain contexts (e.g., high mycorrhizal colonization in moist clay-rich soils) may have opposite effects in others (e.g., dry sandy soils). Moreover, modeling uncertainties were strongly associated with data underrepresentation in critical regions, such as tropical, boreal, and semi-arid zones. This research advances our understanding of the pedoecological dimension of soil carbon by integrating subsurface processes, biological interactions, and edaphological knowledge into globally consistent and interpretable models. Explainable artificial intelligence proved to be a promising tool to disentangle complex relationships and guide more robust conservation and mitigation policies. Nonetheless, long-term progress will depend on the consolidation of harmonized databases, open science protocols, and the institutional recognition of soil as a central component of global climate strategies.
Keywords: machine learning; pedometry; soil fungi; root traits; soil respiration.
Citation
SANTOS, Cássio Marques Moquedace dos. Inteligência artificial aplicada à dimensão pedológica global para modelar, mapear e compreender interações biota-solo-clima e dinâmicas do carbono orgânico. 2025. 25 f. Thesis (Doctorate in Soils and Plant Nutrition) - Universidade Federal de Viçosa, Viçosa. 2025.
