nvidia/earth2studio

Open-source deep-learning framework for exploring, building and deploying AI weather/climate workflows.

725 stars | Python | 10 components | 6 connections

AI weather/climate modeling framework with pre-trained models and workflows

Weather data flows from external sources through standardized interfaces, is processed by AI models, and is written to various storage backends.

Under the hood, the system uses 2 feedback loops, 3 data pools, and 3 control points to manage its runtime behavior.

A 10-component weather/climate codebase with 6 connections. 341 files analyzed. Data flows through 5 distinct pipeline stages.

How Data Flows Through the System

The pipeline has five stages:

  1. Data Ingestion — Fetch weather data from sources like GFS, IFS, or satellite feeds using async caching
  2. Preprocessing — Transform data to model-expected format using lexicon mappings and coordinate systems
  3. Model Inference — Run AI weather models (FCN3, GraphCast, AIFS) to generate forecasts
  4. Postprocessing — Apply perturbations, statistics, or ensemble aggregation to model outputs
  5. Output Storage — Save results to Zarr, NetCDF, or other formats for analysis and visualization
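The five stages above can be sketched as a chain of plain functions. This is a toy illustration only: every function name, the in-memory "field" type, and the fake model that warms values by 0.5 per step are assumptions made for the example, not the earth2studio API.

```python
from typing import Dict, List

# Hypothetical toy types and functions; none of these are earth2studio names.
Field = Dict[str, List[float]]  # variable name -> flat list of grid values


def ingest(time: str) -> Field:
    """Stage 1: stand-in for fetching data from a source such as GFS."""
    return {"TMP_2m": [280.0, 281.5, 279.0]}  # source-specific variable name


def preprocess(raw: Field, lexicon: Dict[str, str]) -> Field:
    """Stage 2: rename source variables to the model's vocabulary."""
    return {lexicon[name]: values for name, values in raw.items()}


def infer(state: Field, nsteps: int) -> List[Field]:
    """Stage 3: toy autoregressive 'model' that adds 0.5 to every value per step."""
    outputs = []
    for _ in range(nsteps):
        state = {k: [v + 0.5 for v in vals] for k, vals in state.items()}
        outputs.append(state)
    return outputs


def postprocess(forecast: List[Field]) -> List[Dict[str, float]]:
    """Stage 4: reduce each forecast step to a per-variable mean."""
    return [{k: sum(v) / len(v) for k, v in step.items()} for step in forecast]


def store(summary: List[Dict[str, float]], sink: list) -> None:
    """Stage 5: append to an in-memory list standing in for Zarr or NetCDF."""
    sink.extend(summary)


def run_pipeline(time: str, nsteps: int) -> list:
    """Run the five stages in order and return what was stored."""
    sink: list = []
    raw = ingest(time)
    state = preprocess(raw, {"TMP_2m": "t2m"})
    forecast = infer(state, nsteps)
    store(postprocess(forecast), sink)
    return sink
```

Calling `run_pipeline("2024-01-01T00:00", 2)` returns one summary dictionary per forecast step. Keeping each stage as a separate function means a source, model, or storage backend can be swapped without touching the others.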

System Behavior

How the system operates at runtime — where data accumulates, what loops, what waits, and what controls what.

Data Pools

Local Cache (file-store)
Downloaded weather data files cached locally to avoid repeated downloads
Redis Job Queue (queue)
Pending and active inference jobs managed by RQ
Model Checkpoints (file-store)
Pre-trained model weights downloaded from HuggingFace Hub
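The local cache idea can be sketched as follows, assuming files are keyed by a hash of their URL. The function name `cached_fetch`, the injected `download` callable, and the example URL are all hypothetical and chosen for illustration; this is not how earth2studio necessarily implements its cache.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, cache_dir: Path, download: Callable[[str], bytes]) -> Path:
    """Return a local path for `url`, downloading only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hash the URL so any URL maps to a safe, fixed-length file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = cache_dir / key
    if not path.exists():
        path.write_bytes(download(url))
    return path


# Usage with a fake downloader that records how often it is called.
download_calls = []


def fake_download(url: str) -> bytes:
    download_calls.append(url)
    return b"weather-bytes"


demo_dir = Path(tempfile.mkdtemp())
first = cached_fetch("https://example.invalid/gfs/file.grib2", demo_dir, fake_download)
second = cached_fetch("https://example.invalid/gfs/file.grib2", demo_dir, fake_download)
```

The second call finds the file on disk and skips the download, which is the behavior the Local Cache pool describes.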

Technology Stack

PyTorch (framework)
Deep learning framework for model inference
Xarray (library)
N-dimensional labeled arrays for weather data
FastAPI (framework)
REST API framework for production deployment
Redis (database)
Job queuing and caching for scalable inference
HuggingFace Hub (library)
Model repository and automatic downloading
Zarr (library)
Chunked array storage for large weather datasets
Hydra (library)
Configuration management for complex workflows
Prometheus (infra)
Metrics collection for production monitoring

Frequently Asked Questions

What is earth2studio used for?

earth2studio is an AI weather/climate modeling framework with pre-trained models and workflows. The nvidia/earth2studio repository is written in Python and was analyzed as 10 components across 341 files, with data flowing through 5 distinct pipeline stages.

How is earth2studio architected?

earth2studio is organized into 5 architecture layers: Data Layer, Model Layer, Workflow Layer, I/O Layer, and 1 more. Data flows through 5 distinct pipeline stages. This layered structure keeps concerns separated and modules independent.

How does data flow through earth2studio?

Data moves through 5 stages: Data Ingestion → Preprocessing → Model Inference → Postprocessing → Output Storage. Weather data flows from external sources through standardized interfaces, is processed by AI models, and is written to various storage backends.

What technologies does earth2studio use?

The core stack includes PyTorch (deep learning framework for model inference), Xarray (N-dimensional labeled arrays for weather data), FastAPI (REST API framework for production deployment), Redis (job queuing and caching for scalable inference), HuggingFace Hub (model repository and automatic downloading), Zarr (chunked array storage for large weather datasets), Hydra (configuration management for complex workflows), and Prometheus (metrics collection for production monitoring).

What system dynamics does earth2studio have?

earth2studio exhibits 3 data pools (Local Cache, Redis Job Queue, and Model Checkpoints), 2 feedback loops, 3 control points, and 3 delays. The feedback loops handle cache invalidation and retry. These runtime behaviors shape how the system responds to load, failures, and configuration changes.
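A retry feedback loop of the kind mentioned above might look like the sketch below, assuming exponential backoff between attempts. The helper name `retry`, its parameters, and the backoff policy are assumptions for illustration; the source does not describe how earth2studio implements retries.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.1,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("attempts must be at least 1")


# Usage: an action that fails twice, then succeeds. Waits are recorded
# instead of slept so the example runs instantly.
state = {"calls": 0}
waits = []


def flaky_download() -> str:
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


outcome = retry(flaky_download, attempts=5, sleep=waits.append)
```

Injecting `sleep` keeps the backoff policy testable without real delays.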

What design patterns does earth2studio use?

5 design patterns detected: Protocol-based Interfaces, Lexicon Translation, Async Caching, AutoModel Pattern, Hydra Configuration.
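Two of these patterns, protocol-based interfaces and lexicon translation, can be illustrated together. The sketch below is hypothetical throughout: `DataSource`, `ToySource`, `TOY_LEXICON`, and the variable names are invented for the example and should not be read as earth2studio's actual classes or mappings.

```python
from typing import Dict, List, Protocol, runtime_checkable


@runtime_checkable
class DataSource(Protocol):
    """Structural interface: anything callable with this shape qualifies."""

    def __call__(self, time: str, variable: List[str]) -> Dict[str, float]: ...


# Hypothetical lexicon: shared variable name -> name used by one source.
TOY_LEXICON = {"t2m": "TMP_2m", "u10m": "UGRD_10m"}


class ToySource:
    """Satisfies DataSource without inheriting from it."""

    _store = {"TMP_2m": 280.0, "UGRD_10m": 3.5}

    def __call__(self, time: str, variable: List[str]) -> Dict[str, float]:
        # Translate each shared name to the source's own name before lookup.
        return {name: self._store[TOY_LEXICON[name]] for name in variable}


def fetch(source: DataSource, time: str, variable: List[str]) -> Dict[str, float]:
    """Accept any object that matches the DataSource protocol."""
    if not isinstance(source, DataSource):
        raise TypeError("source does not implement the DataSource protocol")
    return source(time, variable)


values = fetch(ToySource(), "2024-01-01T00:00", ["t2m", "u10m"])
```

Because the protocol is structural, a new source only needs a matching `__call__`; the lexicon keeps source-specific naming out of the calling code. Note that a `runtime_checkable` check only verifies that the method exists, not its signature.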

How does earth2studio compare to alternatives?

CodeSea has a side-by-side architecture comparison of earth2studio with graphcast. The comparison covers tech stack differences, pipeline design, system behavior, and code patterns.

Analyzed on March 25, 2026 by CodeSea.