RESEARCH

Earth Systems Lab 2025

We chose our mission statement carefully because it acknowledges our responsibility to the future. A steward nurtures and protects what already exists, while carefully guiding new development in a responsible manner. The principle of ‘optimistic stewardship’ is our guiding star for developing impactful technology and for growing our community.

Our FDL community of brilliant minds, together with leading organisations, is uniquely placed to show how emerging technologies can be a force for good in the world. Over the last eight years, Earth Systems Lab teams have pushed the boundaries of Earth observation: from world-first demonstrations of machine learning in orbit for mapping floods and landscape changes, to predicting global-scale rainfall, to understanding the drivers of extreme wildfires.

3D CLOUDS FOR CLIMATE EXTREMES 

Can we dynamically model 3D cloud structure and quantify its importance for climate extremes?

Advancing global 3D cloud reconstruction is needed to improve our understanding of atmospheric phenomena - and climate.  

Predictions could support a wide variety of scientific use-cases: from better forecasting of hurricane intensity, to more discriminative cloud classification or a more nuanced understanding of how deforestation affects cloud cover and type.

Results

Predicting the intensity and path of tropical cyclones is a major challenge in meteorology. The devastating impacts of storms like Hurricane Dorian, which rapidly intensified from a Category 1 to a Category 5 superstorm, highlight the critical need for more accurate forecasting. A key reason for this difficulty is that the crucial microphysical properties of cyclones, such as ice and water content, are poorly represented in operational forecasting systems. Direct, three-dimensional observations of these storms are extremely limited, as most satellite data only provides a two-dimensional view of cloud tops. This lack of detailed, continuous information makes it difficult to understand the complex processes that drive cyclone intensification and to provide timely warnings to affected populations.

To address these limitations, the 3D Clouds for Climate Extremes team developed a machine learning pipeline capable of reconstructing the three-dimensional structure of tropical cyclone clouds from satellite imagery. The approach leverages two primary data sources: high-temporal-resolution geostationary satellites (GOES, MSG, Himawari), which provide a constant view of cloud tops, and CloudSat, which offers detailed vertical profiles of cloud properties. Adopting a two-step data preparation and training process, the team used a large dataset of geostationary images to pre-train a sensor-independent model, making it adaptable to different satellite systems. Following this, the model was fine-tuned on a smaller dataset of collocated geostationary and CloudSat observations specifically for tropical cyclones. This process allows the machine learning pipeline to maximize the strengths of both data sources, enabling it to infer a cyclone's full 3D structure from cloud-top imagery.
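The two-step recipe described above (pre-train broadly, then fine-tune on a small collocated dataset) can be illustrated with a deliberately toy analogue. The sketch below is not the team's pipeline: it stands in a one-parameter linear model for the neural network, a synthetic proxy dataset for the geostationary imagery, and three synthetic pairs for the collocated geostationary/CloudSat cyclone data, purely to show how fine-tuning warm-starts from pre-trained weights.

```python
import random

def train(weights, data, lr, steps):
    """One-feature linear model y = w*x + b trained by plain SGD."""
    w, b = weights
    for _ in range(steps):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b

# Step 1: "pre-train" on a large, cheap proxy dataset (stand-in for the
# geostationary archive; the true relation here is y = 2x).
random.seed(0)
pretrain = [(x, 2.0 * x) for x in [random.uniform(-1, 1) for _ in range(200)]]
w, b = train((0.0, 0.0), pretrain, lr=0.1, steps=20)

# Step 2: fine-tune, starting from the pre-trained weights, on a small
# collocated dataset (stand-in for the geostationary/CloudSat cyclone
# pairs; a slightly shifted relation, y = 2x + 0.5).
finetune = [(x, 2.0 * x + 0.5) for x in (-0.5, 0.0, 0.5)]
w, b = train((w, b), finetune, lr=0.1, steps=200)
print(round(w, 2), round(b, 2))  # approaches 2.0 and 0.5
```

The design point carried over from the text is that step 2 never starts from scratch: the small, expensive dataset only has to nudge weights that step 1 already placed close to a good solution.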

The outputs, however, go beyond structure alone: the model can simultaneously predict key variables such as radar reflectivity, ice water content, and effective droplet radius, providing a comprehensive and continuous view of a storm's internal dynamics. Because the model is sensor-independent, it can achieve near-global coverage by incorporating imagery from various satellites. The system was successfully applied to Hurricane Dorian, where it was able to reconstruct the storm's 3D structure during its rapid intensification phase - a period for which direct observations were previously missing. By producing high-resolution, 3D reconstructions of tropical cyclones, this work has created a massive new dataset that could significantly improve our understanding of intensification processes. This information has the potential to enhance forecasting models and create more reliable early warning systems, ultimately helping to save lives and protect communities.

See this work as a spotlight talk at the NeurIPS 2025 Tackling Climate Change with Machine Learning Workshop: Global 3D Reconstruction of Clouds & Tropical Cyclones.

FOUNDATION MODELS FOR EXTREME ENVIRONMENTS

How can we appropriately use foundation models for uncertainty-aware decision making in poorly sampled environments?

Contemporary Earth Observation foundation models have poor predictive power in environments like the Antarctic - where sparse sampling during training leads to poor generalization (i.e. when the target region is significantly different from the training data).

We aim to reliably assess where a model is likely to perform poorly in advance and explore methods to condition or adjust predictions, such as introducing heterogeneous sources of ground truth.


Results

In order to enhance the reliability of geospatial foundation models in real-world, often extreme, scenarios where confident but incorrect predictions can have severe consequences, the Foundation Models in Extreme Environments team developed SHRUG-FM - a framework to help foundation models flag when they may fail. This framework addresses two primary types of uncertainty, i.e., “not knowing because of the data” and “not knowing because of the model”.

To assess data-related uncertainty, SHRUG-FM compares input images to the foundation model's training data, both in raw input space and in the model's embedding space. By clustering training data and measuring the distance of new inputs to these clusters, the framework identifies whether the model has encountered similar situations before or if the input represents uncharted territory. Findings indicate that out-of-distribution (OOD) signals, particularly Nearest Centroid Distance Deficit (NCDD), correlate strongly with decreased model performance in specific environmental conditions, such as low elevation or large river areas, suggesting data underrepresentation in pretraining.
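The clustering-and-distance idea above can be sketched in a few lines. This is a minimal illustration, not SHRUG-FM's implementation: the 2-D "embeddings", the cluster assignments, and the threshold rule (maximum in-cluster distance seen during training) are all hypothetical stand-ins, and NCDD is reduced here to a simple nearest-centroid distance check.

```python
import math

def centroid(points):
    # Component-wise mean of a list of equal-length vectors.
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_centroid_distance(x, centroids):
    return min(dist(x, c) for c in centroids)

# Toy "training" embedding clusters (hypothetical 2-D embeddings).
clusters = [
    [(0.0, 0.0), (0.1, -0.1), (-0.1, 0.1)],   # e.g. one scene type
    [(5.0, 5.0), (5.1, 4.9), (4.9, 5.1)],     # e.g. another scene type
]
centroids = [centroid(c) for c in clusters]

# Simple threshold: the largest in-cluster distance seen at training time.
threshold = max(dist(p, centroid(c)) for c in clusters for p in c)

in_dist = nearest_centroid_distance((0.05, 0.0), centroids)   # familiar
out_dist = nearest_centroid_distance((20.0, -3.0), centroids)  # uncharted
print(in_dist <= threshold)   # True: the model has seen similar inputs
print(out_dist > threshold)   # True: flagged as out-of-distribution
```

Running the same check in both raw input space and embedding space, as the text describes, would just mean maintaining two sets of centroids and thresholds.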

For model-related uncertainty, SHRUG-FM employs ensemble techniques, training multiple models with randomness and analyzing their agreement or disagreement on predictions. Higher predictive variance among ensemble members indicates greater uncertainty. This uncertainty-based flagging effectively discards unreliable predictions, especially those with high predictive variance, thereby improving the trustworthiness of the foundation model’s outputs.
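The ensemble-disagreement signal can be made concrete with a small sketch. The member predictions and the variance threshold below are invented for illustration; the real framework's ensemble size and flagging rule are not specified in this summary.

```python
def ensemble_flag(predictions, variance_threshold):
    """Flag a prediction as unreliable when ensemble members disagree."""
    n = len(predictions)
    mean = sum(predictions) / n
    variance = sum((p - mean) ** 2 for p in predictions) / n
    return {"mean": mean, "variance": variance,
            "reliable": variance <= variance_threshold}

# Five hypothetical ensemble members predicting the same probability.
agree = ensemble_flag([0.91, 0.93, 0.90, 0.92, 0.94], 0.01)
disagree = ensemble_flag([0.10, 0.85, 0.40, 0.95, 0.20], 0.01)
print(agree["reliable"])     # True: members agree, prediction kept
print(disagree["reliable"])  # False: high predictive variance, discarded
```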

These three complementary uncertainty signals (input OOD, embedding OOD, and task-specific predictive uncertainty) are integrated into the SHRUG-FM system. This system provides a reliability-aware prediction mechanism that can either provide a prediction, raise a warning, or “shrug” (indicating it doesn't know) when uncertainty is high. This adaptable framework aims to make geospatial foundation models more transparent and reliable for critical climate-sensitive applications like burn scar segmentation, with future work planned to extend its utility to flood mapping and landslide detection.
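One way to picture the predict/warn/shrug mechanism is a simple escalation rule over the three signals. The counting rule below is an assumption for illustration only; the summary does not say how SHRUG-FM actually combines its signals.

```python
def shrug_fm_decision(input_ood, embedding_ood, predictive_variance,
                      variance_threshold=0.05):
    """Combine three uncertainty signals into predict / warn / shrug.

    input_ood, embedding_ood: booleans from the two OOD checks.
    predictive_variance: disagreement among ensemble members.
    The escalation rule (count of raised flags) is hypothetical.
    """
    flags = sum([input_ood, embedding_ood,
                 predictive_variance > variance_threshold])
    if flags == 0:
        return "predict"   # all signals clear: emit the prediction
    if flags == 1:
        return "warn"      # one signal raised: prediction with a warning
    return "shrug"         # multiple signals raised: decline to predict

print(shrug_fm_decision(False, False, 0.01))  # predict
print(shrug_fm_decision(True, False, 0.01))   # warn
print(shrug_fm_decision(True, True, 0.20))    # shrug
```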

See this work as a poster presentation at the NeurIPS 2025 Tackling Climate Change with Machine Learning Workshop: SHRUG-FM: Reliability-Aware Foundation Models for Earth Observations.

STARCOP 2.0: ATMOSPHERIC ANOMALY DETECTION ONBOARD

Can we push the limits of weak signal detection to map transient phenomena like anthropogenic greenhouse gas leaks?

Constellations of small satellites offer great potential for low-latency Earth Observation tasks like alerting or tip-and-cue. 

We will assess the detection limits for weak signals, leveraging rich multi- and hyperspectral data to better identify difficult-to-detect phenomena like anthropogenic greenhouse gas leaks or other atmospheric anomalies.

Results

The accelerating rate of global warming, largely driven by anthropogenic greenhouse gas emissions, demands a faster and more efficient way to detect and track these emissions. Methane, in particular, is a potent warming agent, and its short atmospheric lifetime makes rapid leak detection and remediation critical. While hyperspectral satellites can identify methane plumes by their unique spectral signatures, a significant challenge remains: knowing when and where to look in a timely manner. Traditional methods involve transmitting large image files back to ground stations for processing, a time-consuming step that delays the identification of leaks and hinders the ability to act quickly. To effectively mitigate methane emissions, a solution is needed that bypasses these delays and provides near real-time data to stakeholders.

To overcome the limitations of traditional methods, the STARCOP 2.0 team proposed a novel “Tip and Cue” satellite system. This two-part system leverages onboard machine learning to provide rapid, end-to-end detection and analysis of methane plumes. The “Tip” satellite, equipped with a Vision Transformer model, is designed for fast and accurate plume classification. It scans broad areas and identifies specific image tiles that contain potential methane plumes. Upon detection, it sends an alert to the “Cue” satellite. The “Cue” satellite, with its more sophisticated UNet architecture, then performs detailed plume segmentation and predicts the concentration of the methane. This dual-satellite approach eliminates the need to transmit large image files back to a ground station for processing, significantly reducing the time from detection to data delivery.
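The control flow of the Tip-and-Cue hand-off can be sketched as below. The classifier, segmenter, tile scores, and threshold are all hypothetical stand-ins (the real system uses a Vision Transformer and a UNet on hyperspectral tiles); the sketch only shows the staging: a cheap screen over every tile, with the expensive model run only on flagged tiles.

```python
def tip_classify(tile):
    """Stand-in for the Tip satellite's plume classifier (hypothetical)."""
    return tile["plume_score"] > 0.5

def cue_segment(tile):
    """Stand-in for the Cue satellite's segmentation + concentration model."""
    return {"tile_id": tile["id"], "mask": "segmentation-mask",
            "concentration_ppb": tile["plume_score"] * 1000}

def tip_and_cue(tiles):
    # Tip screens every tile cheaply; Cue only processes the flagged ones,
    # so full scenes never need to be downlinked just for screening.
    alerts = [t for t in tiles if tip_classify(t)]
    return [cue_segment(t) for t in alerts]

scene = [{"id": 0, "plume_score": 0.1},
         {"id": 1, "plume_score": 0.9},
         {"id": 2, "plume_score": 0.3}]
reports = tip_and_cue(scene)
print([r["tile_id"] for r in reports])  # [1]
```

Only the compact reports, not the raw imagery, would then be transmitted to end-users.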

This research provides a framework for deploying AI models directly on spacecraft, a task that requires careful consideration of limited computational, memory, and energy resources. To this end, the team has developed and released two machine-learning-ready datasets, one orthorectified and one unorthorectified, to accelerate future research in this area, with the latter being particularly optimized for onboard processing by eliminating a time-consuming correction step. The models, converted to the ONNX format and tested on hardware simulating spacecraft capabilities, demonstrate promising results: a compressed Vision Transformer model is able to detect plumes in just 5-10 milliseconds. This approach enables a system where the “Tip” satellite quickly alerts the “Cue” satellite, which in turn sends precise, data-driven information on plume location, size, and concentration directly to end-users like the UN International Methane Emissions Observatory. Ultimately, this work represents a significant step towards creating an early warning system for greenhouse gas leaks, empowering policymakers with the timely information needed to take decisive action in the fight against climate change.

See this work as a poster presentation at the NeurIPS 2025 Machine Learning and the Physical Sciences Workshop: Towards Methane Detection Onboard Satellites.