Machine-Learning Emulation of Satellite Greenhouse Gas Retrievals: Stability over Time
2026-06-08 • Machine Learning
AI summary
The authors examined how well machine-learning models can emulate satellite retrievals of greenhouse gas concentrations, such as CO2 and methane. They found that accuracy generally drops as the test period moves further from the training period, an effect that most earlier studies did not measure because they tested only on data from the training period. Adding time as an input substantially improved methane (XCH4) predictions for the Lasso and neural-network models. A simple Lasso model performed as well as or better than more complex neural networks and was more stable over time. The authors further validated the results against ground-based measurements from the TCCON network.
greenhouse gases, carbon dioxide, methane, satellite retrieval, machine learning, Lasso regression, neural networks, inverse problems, GOSAT, TCCON
Authors
Nugzar Gognadze, Motonobu Kanagawa, Yu Someya, Hisashi Yashiro
Abstract
Retrieval algorithms are used to estimate atmospheric concentrations of greenhouse gases (GHGs), such as carbon dioxide (CO2) and methane (CH4), by solving inverse problems from high-spectral-resolution satellite radiance measurements. However, these algorithms are computationally expensive, which makes real-time estimation at scale difficult. Machine-learning models have therefore been proposed as fast emulators of retrieval algorithms. Most existing studies, however, evaluate them only on test data from the same period as the training data. We study the stability over time of such emulators using data from the Greenhouse Gases Observing SATellite (GOSAT). We show that prediction accuracy generally deteriorates when the test period moves away from the training period. We also show that including time as an input feature substantially improves XCH4 prediction for Lasso and neural-network models. Among the methods considered, a simple Lasso model performs as well as or better than more complex methods such as neural networks, and yields more stable predictions over time. We further validate the results using the Total Carbon Column Observing Network (TCCON), a ground-based observation network. On the TCCON-matched dataset, the time-augmented Lasso achieves errors against TCCON that are comparable to the disagreement between GOSAT and TCCON for both XCO2 and XCH4.
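The evaluation protocol described in the abstract (train on one period, test on periods progressively further away, with and without time as an input) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic with a linear upward drift, the five features stand in for radiance-derived inputs, and `fit_lasso` is a small hand-written coordinate-descent Lasso rather than the authors' implementation or hyperparameters. Because the drift is exactly linear, the time-augmented model extrapolates far more easily here than it would on real GOSAT data.

```python
import numpy as np

def fit_lasso(X, y, alpha=0.01, n_iter=200):
    """Minimise (1/2n)||y - b - Xs w||^2 + alpha*||w||_1 by coordinate descent.

    Columns are standardised first, so each coordinate update reduces to
    a soft-threshold of the feature/residual correlation.
    """
    n, d = X.shape
    mu, sd = X.mean(axis=0), X.std(axis=0)
    Xs = (X - mu) / sd
    b = y.mean()              # optimal intercept for centred features
    w = np.zeros(d)
    r = y - b                 # current residual
    for _ in range(n_iter):
        for j in range(d):
            r = r + Xs[:, j] * w[j]          # remove feature j's contribution
            rho = Xs[:, j] @ r / n
            w[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0)
            r = r - Xs[:, j] * w[j]          # add the updated contribution back
    return mu, sd, w, b

def predict(model, X):
    mu, sd, w, b = model
    return ((X - mu) / sd) @ w + b

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Synthetic stand-in data (hypothetical): 10 "years" of soundings whose
# target drifts upward linearly in time, loosely mimicking a rising GHG level.
rng = np.random.default_rng(0)
n = 3000
t = rng.uniform(0.0, 10.0, n)                      # observation time in years
X = rng.normal(size=(n, 5))                        # placeholder input features
true_w = np.array([1.0, -0.5, 0.0, 0.0, 0.3])
y = 400.0 + X @ true_w + 2.0 * t + rng.normal(scale=0.5, size=n)

# Train on half of the first two years; test on the held-out half and on
# two later periods at increasing distance from the training period.
in_period = t < 2.0
train = in_period & (rng.random(n) < 0.5)
test_blocks = {
    "same period (held out)": in_period & ~train,
    "years 4-6": (t >= 4.0) & (t < 6.0),
    "years 8-10": t >= 8.0,
}

X_time = np.column_stack([X, t])                   # time-augmented inputs
results = {}
for name, feats in [("no_time", X), ("with_time", X_time)]:
    model = fit_lasso(feats[train], y[train])
    results[name] = [rmse(predict(model, feats[m]), y[m])
                     for m in test_blocks.values()]

for name, errs in results.items():
    print(name, dict(zip(test_blocks, (round(e, 2) for e in errs))))
```

In this toy setting, the model without time has an error that grows with distance from the training period, while the time-augmented model stays close to its in-period error. The sketch is meant only to make the train/test split by period concrete; it says nothing about the magnitudes reported in the paper.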