Machine learning (ML) and Artificial Intelligence (AI) offer powerful tools to address long-standing scientific challenges. At the molecular scale, we’ve seen projects like AlphaFold discover unknown protein structures and how they might interact with other molecules. At the planetary scale, ML-driven models like GraphCast (Google), AIFS (ECMWF) and ACE (Allen Institute for AI) are revolutionizing weather forecasting with predictive skill and lead times outperforming our leading numerical weather prediction systems. The recent development of AI foundation models, such as Aurora for atmospheric composition, signals the advent of ML techniques that are now mature enough to start reproducing systems of immense chemical and physical complexity.
In this review, we take a snapshot of this rapidly changing field and catalog the contributions that ML/AI techniques have made in atmospheric chemistry research. We identify some common limitations across the scientific literature and make expert recommendations on how ML/AI can improve the study of tropospheric ozone.
Ozone is both an important greenhouse gas in the free troposphere and, at the surface, a pollutant harmful to human and plant health. Ozone remains difficult to simulate accurately with computer models because it is not directly emitted: it is photochemically produced from precursors (NOx, VOCs, CH4) in the presence of sunlight and controlled by a complex coupling of chemical and physical processes across scales. Our review extensively covers three core areas where ML has made the largest impact in advancing tropospheric ozone science: Air Pollution Forecasting, Emulating Atmospheric Models, and Enhancing Satellite Observations. We highlight how ML can be used to bypass computational bottlenecks to resolve complex processes, unravel underlying chemical drivers, and bridge spatiotemporal scales that challenge conventional modeling.
Our review identifies Critical Challenges and proposes research priorities for future inquiry. These are:
- The Need for Benchmarks: A major hurdle is the lack of harmonized benchmark datasets for ozone, which are necessary to enable robust comparisons and advance ML methodology (similar to breakthroughs seen in ML weather forecasting).
- Generalizability and Extremes: ML models struggle with generalization when applied to new regions or different climate regimes, and accurately predicting extreme ozone concentrations (low-likelihood, high-impact events) remains challenging because these events occur rarely in training data.
- Interpretability: Deep neural networks (DNNs) can be highly accurate but are often “black boxes” that are difficult to interpret. Efforts are underway to incorporate physical constraints (like mass conservation) into DNNs to increase scientific realism, which can also make a model more interpretable. However, there is limited evidence on whether these choices actually improve ML model stability or long-term accuracy.
- Foundation Models: The atmospheric science community is moving towards large ML models that are trained on diverse, massive datasets and can integrate different data scales and pollutants simultaneously. Realizing this potential requires close collaboration between atmospheric chemistry, computational science, and ML experts, who together must answer questions such as: how trustworthy can a tropospheric ozone foundation model be if it omits NOx emissions?
Our paper summarizes the state of a rapidly accelerating field, showing how ML/AI techniques are ‘quietly revolutionizing’ ozone research, and will be useful to the community as it prepares for the third phase of the Tropospheric Ozone Assessment Report. As AI for numerical weather prediction continues to mature, ozone emerges as a priority frontier for AI in atmospheric modeling. Realistic tropospheric ozone prediction demands high accuracy across the coupled components of Earth system models, including emissions, meteorology, transport, chemistry, and deposition, making it an excellent benchmark for both advancing and understanding the skill of next-generation AI Earth system models.
Read the full article: Applications of Machine Learning and Artificial Intelligence in Tropospheric Ozone Research.