Samudra 2: Scaling Ocean Emulators across Resolutions

Yuan Yuan (1,*), Jesse Rusak (2), Alexander Merose (2), Adam Subel (1), Pavel Perezhogin (1), Alistair Adcroft (3), Carlos Fernandez-Granda (1), Laure Zanna (1)
(1) New York University   (2) Open Athena   (3) Princeton University
*Corresponding author

TL;DR. Samudra 2 is a neural ocean emulator that scales to 1°, 1/2°, and 1/4° resolution with stable ~8-year autoregressive rollouts.

Gulf Stream surface kinetic energy across resolutions

Surface kinetic energy in the Gulf Stream region from GFDL OM4 (top) and Samudra 2 (bottom) at 1°, 1/2°, and 1/4° resolution near the end of an 8-year autoregressive rollout. At finer grids, the emulator progressively captures mesoscale eddies, meanders, and filamentary western boundary current structure.

Abstract

Ocean general circulation models (OGCMs) are essential to climate science but computationally expensive, limiting ensemble size and forcing scenarios. Neural emulators promise orders-of-magnitude speedups, yet existing ocean emulators have not combined fine spatial resolution with multi-year autoregressive rollouts. Samudra, the first autoregressive neural ocean emulator to produce multi-decade global rollouts, is limited to 1° resolution and exhibits two long-horizon failure modes: variance collapse (the loss of temporal variability) and imprinting artifacts (in which velocity patterns leak into deep-ocean fields). We present Samudra 2, which introduces a wider U-Net backbone with modified ConvNeXt-style blocks and a reduced block-internal expansion factor, together with a dynamic loss that reweights output channels according to their prediction errors, strengthening gradients for slow-evolving deep-ocean fields. At 1°, Samudra 2 increases upper-ocean global-mean temperature R² from 0.56 to 0.87 and reduces deep-ocean temperature error by roughly sevenfold. The same architecture scales to 1/2° and 1/4° over approximately 8-year autoregressive rollouts, recovering mesoscale eddies and sharp western boundary currents. Running on a single GPU, Samudra 2 enables larger ensembles for sea-level projections, ocean heat uptake, and climate variability studies. Project page: https://m2lines.github.io/Samudra/.

Core ideas

  Wider ConvNeXt U-Net

Stage widths increase from [200, 250, 300, 400] to [280, 380, 480, 520] and the block-internal expansion factor is reduced from 4 to 2, shifting capacity toward inter-stage features that higher resolutions need.
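To make the capacity shift concrete, the arithmetic below compares the two configurations. It is only a rough sketch: it assumes each block contains a standard ConvNeXt-style pointwise MLP (width to expansion × width and back, biases ignored) and counts one block per stage. The paper's blocks are described as modified, so the real parameter counts may differ, and the helper names `pointwise_weights` and `summarize` are hypothetical.

```python
# Stage widths and expansion factors quoted in the text above.
OLD_WIDTHS, OLD_EXPANSION = [200, 250, 300, 400], 4
NEW_WIDTHS, NEW_EXPANSION = [280, 380, 480, 520], 2


def pointwise_weights(width, expansion):
    """Weights in an assumed width -> expansion*width -> width MLP (no biases)."""
    hidden = expansion * width
    return width * hidden + hidden * width


def summarize(widths, expansion):
    """Per-stage hidden widths and pointwise weight counts for one block per stage."""
    return {
        "hidden_widths": [expansion * w for w in widths],
        "weights_per_block": [pointwise_weights(w, expansion) for w in widths],
    }


old = summarize(OLD_WIDTHS, OLD_EXPANSION)
new = summarize(NEW_WIDTHS, NEW_EXPANSION)
old_total = sum(old["weights_per_block"])
new_total = sum(new["weights_per_block"])

print("old hidden widths:", old["hidden_widths"], "total weights:", old_total)
print("new hidden widths:", new["hidden_widths"], "total weights:", new_total)
```

Under these assumptions the pointwise weight total is nearly unchanged (2,820,000 versus 2,894,400), while the features passed between stages become wider and the block-internal hidden layers become narrower.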

  Dynamic variance-weighted loss

Per-channel MSE weights are updated online using an exponential moving average of inverse prediction error, amplifying the gradient signal from slow-evolving deep-ocean fields.
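One way such a reweighting could look is sketched below. This is an illustration rather than the paper's implementation: the class name `DynamicChannelWeights`, the decay value, the epsilon, the initial EMA value, and the normalization (weights averaging to 1) are all assumptions.

```python
class DynamicChannelWeights:
    """Tracks an EMA of inverse per-channel MSE and turns it into loss weights."""

    def __init__(self, n_channels, decay=0.99, eps=1e-8):
        self.decay = decay                     # assumed EMA decay
        self.eps = eps                         # guards against division by zero
        self.ema_inv_err = [1.0] * n_channels  # assumed initial value

    def update(self, channel_mse):
        """Fold the latest per-channel MSE values into the EMA, return new weights."""
        for c, mse in enumerate(channel_mse):
            inv = 1.0 / (mse + self.eps)
            self.ema_inv_err[c] = (
                self.decay * self.ema_inv_err[c] + (1.0 - self.decay) * inv
            )
        return self.weights()

    def weights(self):
        """Normalize so that the weights average to 1 across channels."""
        total = sum(self.ema_inv_err)
        n = len(self.ema_inv_err)
        return [n * v / total for v in self.ema_inv_err]


def weighted_loss(channel_mse, weights):
    """Channel-weighted mean of per-channel MSE values."""
    return sum(w * m for w, m in zip(weights, channel_mse)) / len(channel_mse)


# Toy example: channel 1 has a much smaller error, as a slow deep-ocean field might.
tracker = DynamicChannelWeights(n_channels=2, decay=0.9)
w = tracker.update([1.0, 0.01])
loss = weighted_loss([1.0, 0.01], w)
```

In the toy example the low-error channel receives the larger weight, so its gradient contribution is amplified relative to a plain unweighted MSE. In a real training loop the weights would be treated as constants when backpropagating.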

  Scaling to higher resolutions

The paper demonstrates multi-year ocean emulation at 1/2° and 1/4° on GFDL OM4 data, where mesoscale eddies and western boundary currents emerge at eddy-permitting resolution.

Method

The emulator is an autoregressive function $g_\theta$ that maps two consecutive ocean states plus atmospheric forcing to the next two states:

$$(\hat{x}_{t+1}, \hat{x}_{t+2}) = g_\theta(x_{t-1}, x_t, f_{t-1}, f_t)$$

The state $x_t$ contains four 3D prognostic variables on 19 depth levels (potential temperature, salinity, zonal velocity, and meridional velocity) plus sea surface height, for 4 × 19 + 1 = 77 prognostic channels. Training uses short autoregressive rollouts with K = 4 steps, while evaluation runs freely for about 580 steps, roughly 8 years, from 2014 to 2022 with no ground-truth feedback.
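The free-running loop described above can be sketched as follows. This is a minimal illustration, not the released code: `rollout` and `toy_g` are hypothetical names, states are plain floats rather than 77-channel fields, and the forcing indexing is an assumption.

```python
N_LEVELS = 19
N_3D_VARS = 4                            # temperature, salinity, zonal and meridional velocity
N_CHANNELS = N_3D_VARS * N_LEVELS + 1    # plus sea surface height


def rollout(g, x_prev, x_curr, forcings, n_calls):
    """Apply g autoregressively, feeding its own predictions back as inputs.

    x_prev and x_curr are the states at time indices 0 and 1, and
    forcings[i] is the forcing at time index i. Each call predicts two states.
    """
    states = []
    t = 1  # time index of x_curr
    for _ in range(n_calls):
        x_next1, x_next2 = g(x_prev, x_curr, forcings[t - 1], forcings[t])
        states.extend([x_next1, x_next2])
        # No ground truth is used here: predictions become the next inputs.
        x_prev, x_curr = x_next1, x_next2
        t += 2
    return states


def toy_g(x_prev, x_curr, f_prev, f_curr):
    """Stand-in for the trained network: adds the current forcing twice."""
    x_next1 = x_curr + f_curr
    x_next2 = x_next1 + f_curr
    return x_next1, x_next2


trajectory = rollout(toy_g, 0.0, 1.0, [1.0] * 4, n_calls=2)
```

The same loop serves both regimes: a short unroll during training, with the loss accumulated over the predicted states, and a long unroll at evaluation time. Whether a "step" here counts one model call or one predicted state is not specified on this page, so `n_calls` should be set accordingly.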

Figure 2. Training with short rollouts versus long-horizon free-running inference.

Results

Quantitative summary at 1°

Against the original Samudra at 1° resolution, Samudra 2 improves the main long-horizon diagnostics reported in the paper. It tracks the Niño 3.4 index with higher skill (R² 0.93 versus 0.90, RMSE 0.222 °C versus 0.268 °C) and substantially improves detrended global-mean temperature in the upper ocean.

Upper-ocean R² (0-700 m): 0.56 to 0.87
Error reduction at 700-2000 m: about 10x
Error reduction at 2000-7000 m: about 7x
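Skill scores of this kind can be computed from a reference and an emulated time series along the following lines. This is a minimal sketch, assuming R² denotes the coefficient of determination and that detrending removes a least-squares linear fit; the function names are hypothetical, the numbers are made up for illustration, and the paper's exact metric definitions may differ.

```python
import math


def detrend(series):
    """Remove a least-squares linear trend (needs at least two points)."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


def rmse(truth, pred):
    """Root-mean-square error between two equal-length series."""
    return math.sqrt(sum((p - y) ** 2 for y, p in zip(truth, pred)) / len(truth))


def r2_score(truth, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(truth) / len(truth)
    ss_res = sum((y - p) ** 2 for y, p in zip(truth, pred))
    ss_tot = sum((y - mean) ** 2 for y in truth)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not data from the paper.
truth = [1.0, 2.0, 3.0, 4.0]
pred = [1.1, 1.9, 3.2, 3.8]
print("RMSE:", rmse(truth, pred), "R2:", r2_score(truth, pred))
```

For the detrended diagnostics, both series would be passed through `detrend` before scoring, so that the score reflects variability rather than a shared long-term drift.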

The deepest layers remain the hardest regime, but the paper shows that Samudra 2 sharply reduces imprinting artifacts, where velocity-field patterns leak into temperature and salinity at depth.

Rollout demo

The same architecture, trained independently at each resolution, recovers progressively finer dynamical structure. At 1/4°, mesoscale eddies, meanders, and sharper western boundary currents become visible, and their evolution is easier to appreciate in motion than in a single static snapshot.

The video below demonstrates the paper's central result: long-horizon behavior that remains organized, energetic, and spatially coherent throughout the rollout.

Long-horizon rollout video for Samudra 2.

Why it matters

A multi-year eddy-permitting ocean rollout that would take millions of core-hours on a traditional OGCM completes on a single GPU with Samudra 2. That is roughly two orders of magnitude of speedup, which changes the workflow from running a few scenarios to running hundreds or thousands of ensemble members for sea-level projections, ocean heat uptake, and climate variability studies such as ENSO.

BibTeX

@misc{yuan2026samudra2scalingocean,
      title={Samudra 2: Scaling Ocean Emulators across Resolutions},
      author={Yuan Yuan and Jesse Rusak and Alexander Merose and Adam Subel and Pavel Perezhogin and Alistair Adcroft and Carlos Fernandez-Granda and Laure Zanna},
      year={2026},
      eprint={2606.02610},
      archivePrefix={arXiv},
      primaryClass={cs.CE},
      url={https://arxiv.org/abs/2606.02610},
}

Acknowledgments and Disclosure of Funding

This project is supported by Schmidt Sciences, as part of the M2 LInES project. We also acknowledge support from the NSF CAIG program via grant 2530958. We thank NVIDIA for a GPU hardware grant, ongoing support, and helpful advice; Lambda for a grant that provided the hardware for developing these models; and AWS for infrastructure grants, which provided data storage and engineering lifecycle support. This research was also supported in part through the NYU IT High Performance Computing resources, services, and staff expertise.