Running Simulations and Making Analyzations With pyqg#

Using pyqg, we can run unique high-resolution simulations of our own. To start, import the following modules/packages:

import numpy as np
import xarray as xr
import fsspec
import matplotlib
import matplotlib.pyplot as plt
import pyqg 
import pyqg.diagnostic_tools as tools
import seaborn as sns

%matplotlib inline

We will use the pyqg.QGModel class to generate the two layer quasigeostrophic models we will be running simulations on. There is a base class from which all other models inherit, pyqg.Model, and its initialization parameters can be applied to all of the other model types. Although there are numerous parameters that can be set and played around with, the following parameters from the base class are of main focus to reproducing the high resolution simulations from the paper.

  • nx: number of real space grid points in the \(x\) direction (affects the resolution of the simulations)

  • dt: numerical timestep

  • tmax: total time of integration (units: model time)

  • tavestart: start time for averaging (units: model time)

Setting these parameters will result in a model under an eddy-like configuration. To create a model under a jet-structured configuration, you must initialize the following parameters decidedly:

  • rek: linear drag in lower layer. Units: seconds-1

  • delta: layer thickness ratio (H1/H2)

  • beta: gradient (slope) of Coriolis parameter. Units: meters-1 seconds-1

In the paper, the following arguments were passed to create the eddy and jet models that were simulated:

eddy_model = pyqg.QGModel(nx=256, dt=3600.0, tmax=311040000.0, tavestart=155520000.0)
jet_model = pyqg.QGModel(nx=256, dt=3600.0, tmax=311040000.0, tavestart=155520000.0, rek=7e-08, delta=0.1, beta=1e-11)
INFO:  Logger initialized
INFO:  Logger initialized

The argument values can be played around with to produce different effects and behaviours from the simulation. For instance, decreasing the value on the nx parameter will result in a lower resolution simulation. To better understand the other parameters and default argument values on the model class, take a further look at pyqg’s official and latest API documentation page.

We can then call run() to run the respective models forward without stopping until the end. Each of the above models are run over a span of 10 years with averages starting to be taken at 5 years in numerical time steps of 1 hour.

# From here, you can call .run() to run a new simulation
eddy_model.run()
jet_model.run()

# Convert to xarray Datasets
eddy_high_res = eddy_model.to_dataset()
jet_high_res = jet_model.to_dataset()

To run a simulation with snapshot variables saved periodically, we use generate_snapshots() and concatenate each of the resulting snapshots for each configuration setting into cumulative xarray.Dataset objects. Below, we are saving snapshots every 1000 hours.

def generate_snapshots(model):
    snapshots = []
    snapshots.append(model.to_dataset())
    for _ in model.run_with_snapshots(tsnapint=1000*model.dt):
        snapshots.append(model.to_dataset())
    return xr.concat(snapshots, dim='time')

eddy_high_res = generate_snapshots(eddy_model)
jet_high_res = generate_snapshots(jet_model)

The xarray.Dataset objects additionally store some documentation for each variable:

eddy_high_res
<xarray.Dataset>
Dimensions:            (l: 256, k: 129, lev: 2, time: 87, y: 256, x: 256,
                        lev_mid: 1)
Coordinates:
  * k                  (k) float32 0.0 6.283e-06 ... 0.000798 0.0008042
  * l                  (l) float32 0.0 6.283e-06 ... -1.257e-05 -6.283e-06
  * lev                (lev) int32 1 2
  * lev_mid            (lev_mid) float32 1.5
  * time               (time) float32 0.0 3.6e+06 7.2e+06 ... 3.06e+08 3.096e+08
  * x                  (x) float32 1.953e+03 5.859e+03 ... 9.941e+05 9.98e+05
  * y                  (y) float32 1.953e+03 5.859e+03 ... 9.941e+05 9.98e+05
Data variables: (12/27)
    APEflux            (l, k) float32 -0.0 5.671e-15 ... 1.012e-37 5.839e-39
    APEgen             float32 7.518e-11
    APEgenspec         (l, k) float32 0.0 -2.749e-15 -1.629e-14 ... 0.0 0.0 -0.0
    Dissspec           (l, k) float32 -0.0 -0.0 -0.0 ... -5.605e-34 -2.88e-35
    EKE                (lev) float32 0.002667 7.86e-05
    EKEdiss            float32 7.277e-11
    ...                 ...
    q                  (time, lev, y, x) float32 9.808e-07 ... -1.629e-06
    streamfunction     (time, lev, y, x) float32 0.0 0.0 0.0 ... 27.0 60.59
    u                  (time, lev, y, x) float32 0.0 0.0 0.0 ... 0.02634 0.02545
    ufull              (time, lev, y, x) float32 0.025 0.025 ... 0.02634 0.02545
    v                  (time, lev, y, x) float32 0.0 0.0 0.0 ... 0.01063 0.00668
    vfull              (time, lev, y, x) float32 0.0 0.0 0.0 ... 0.01063 0.00668
Attributes: (12/24)
    pyqg:L:          1000000.0
    pyqg:M:          65536
    pyqg:W:          1000000.0
    pyqg:beta:       1.5e-11
    pyqg:del2:       0.8
    pyqg:delta:      0.25
    ...              ...
    pyqg:tc:         0
    pyqg:tmax:       311040000.0
    pyqg:twrite:     1000.0
    pyqg_params:     {"nx": 256.0, "dt": 3600.0, "tmax": 311040000.0, "tavest...
    reference:       https://pyqg.readthedocs.io/en/latest/index.html
    title:           pyqg: Python Quasigeostrophic Model

Retrieving Preexisting Datasets#

Alternatively, with preexisting datasets, we can replicate the procedures followed within the paper using the same high resolution datasets that were generated in the study. We can retrieve the exact datasets stored on a persistent storage bucket within Jupyter Hub. The datasets are organized as zarr files in the following top-down fashion:

eddy/
    high_res.zarr
    low_res.zarr
    forcing1.zarr
    forcing2.zarr
    forcing3.zarr
jet/
    high_res.zarr
    low_res.zarr
    forcing1.zarr
    forcing2.zarr
    forcing3.zarr

Where the top-level directories (eddy and jet) delineate the configurations under which the pyqg.QGModel simulations were run under.

# Fives runs total were made, the following gets the model output from the first run
eddy_high_res = xr.open_zarr('gs://leap-persistent/pbluc/eddy/high_res').isel(run=0).load()
jet_high_res = xr.open_zarr('gs://leap-persistent/pbluc/jet/high_res').isel(run=0).load()

high_res.zarr contains snapshots and diagnostics for high resolution eddy- and jet-configured models (where nx=256). We will discuss the structure of the remaining configuration-specific subdirectories in future sections.

Visualizing the Output#

State Variables#

The xarray.Dataset objects contain various variables including state and diagnostic variables. Using the state variables, we can visualize the results of the simulations we run. The following state variables are stored in each dataset:

  • q: potential vorticity (PV) in real space

  • u: the zonal velocity anomaly or the \(x\)-velocity relative to the background flow

  • v: meridional velocity anomaly or the \(y\)-velocity relative to the background flow

  • ufull and vfull - corresponding variables of the full velocities in real space, including the background flow

  • streamfunction

Examples of snapshot depictions that we can make using these variables include snapshots of the upper and lower PV, kinetic energy density, and enstrophy for the simulations, in both eddy and jet configurations.

We can plot snapshots of the PV in both levels simply using q from each dataset.

plt.figure(figsize=(12,5)).suptitle("Final-timestep upper PV snapshots")
plt.subplot(121); eddy_high_res.q.isel(lev=0, time=-1).plot(vmin=-2e-5, vmax=2e-5, cmap='bwr'); plt.title("Eddy config")
plt.subplot(122);  jet_high_res.q.isel(lev=0, time=-1).plot(vmin=-2e-5, vmax=2e-5, cmap='bwr'); plt.title( "Jet config")
plt.tight_layout()
../_images/9e951b3edb9c76d08fdb60f1450788c8ae204bee16f7347d1ee18b6c3f5625b9.png
plt.figure(figsize=(12,5)).suptitle("Final-timestep lower PV snapshots")
plt.subplot(121); eddy_high_res.q.isel(lev=1, time=-1).plot(vmin=-2e-6, vmax=2e-6, cmap='bwr'); plt.title("Eddy config")
plt.subplot(122);  jet_high_res.q.isel(lev=1, time=-1).plot(vmin=-2e-6, vmax=2e-6, cmap='bwr'); plt.title( "Jet config")
plt.tight_layout()
../_images/85ed6f27872e1ca8ae666822cbefe405856424c97651ccc0c7dc6e1be384fc25.png

To plot snapshots of the kinetic energy density, we use the state variables u and v, to calculate the sum of the kinetic energy density, \((u^2 + v^2) / 2\), at each layer.

eddy_ke_density = ((eddy_high_res.u * eddy_high_res.u) + (eddy_high_res.v * eddy_high_res.v)) / 2
jet_ke_density = ((jet_high_res.u * jet_high_res.u) + (jet_high_res.v * jet_high_res.v)) / 2

plt.figure(figsize=(12,5)).suptitle("Final-timestep KE density snapshots")
plt.subplot(121); eddy_ke_density.sum(dim=['lev']).isel(time=-1).plot(vmax=1e-2, cmap='magma'); plt.title("Eddy config")
plt.subplot(122);  jet_ke_density.sum(dim=['lev']).isel(time=-1).plot(vmax=1e-2, cmap='magma'); plt.title( "Jet config")
plt.tight_layout()
../_images/130c98fb71f4c1f0d14e001d4ae98dee4abc3ee66e3690f06e13b7c6ea77ffd9.png

Traditionally, the enstrophy is calculated as the \(\text{curl}^{2}(\vec{u})/2\). However pygq’s calculation of enstrophy is made without taking any gradient operations as it is based on the PV instead. Thus, to plot snapshots of the enstrophy, we use only the state variable q to calculate the enstrophy as \(q^{2}/2\), at each layer. Though it is to be noted that this calculation of enstrophy has an extra term that is specific to QG models thus using the gradient calculated enstrophy is more standard in practice.

eddy_enstrophy = (eddy_high_res.q * eddy_high_res.q) / 2
jet_enstrophy = (jet_high_res.q * jet_high_res.q) / 2

plt.figure(figsize=(12,5)).suptitle("Final-timestep enstrophy snapshots")
plt.subplot(121); eddy_enstrophy.sum(dim=['lev']).isel(time=-1).plot(vmax=1.25e-10, cmap='magma'); plt.title("Eddy config")
plt.subplot(122);  jet_enstrophy.sum(dim=['lev']).isel(time=-1).plot(vmax=1.25e-10, cmap='magma'); plt.title( "Jet config")
plt.tight_layout()
../_images/98df66adc9cf32b45f050fc2a2a1dc57be8f218a3cd64eecba38cdd61cf04a24.png

Diagnostic Variables#

Now we can also employ more diagnostic-type variables to aide in quantifying how entities such as energy, and, enstrophy, are distributed and transferred across scales. These variables are most useful when plotted for high and low resolution simulations as well as parameterized simulations.

Power Spectra#

  • KEspec: how much kinetic energy is stored at each \(x\) and \(y\) lengthscale

  • Ensspec: how much enstrophy is stored at each \(x\) and \(y\) lengthscale

Energy budget#

  • KEflux: how kinetic energy is being transferred across lengthscales

  • APEflux: how available potential energy is being transferred across lengthscales

  • APEgenspec: how much new available potential energy is being generated at each scale

  • KEfrictionspec: how much energy is being lost to bottom drag at each lengthscale

  • Dissspec: how much energy is being lost due to numerical dissipation at each lengthscale

Enstrophy budget#

  • ENSflux: how enstrophy is being transferred across lengthscales

  • ENSgenspec: how much new enstrophy is being generated

  • ENSfrictionspec: how much enstrophy is lost to bottom drag

  • ENSDissspec: how much enstrophy is lost to numerical dissipation

Using a selection of these variables, we will plot the time-averaged kinetic energy power spectrum summed over fluid layers, time-series of total kinetic energy, spatially flattened probability distribution of upper layer PV, and spectral energy flux terms for eddy configuration simulations as examples of how these diagnostics can be plotted.

# Using diagnostics from final timestamp of eddy configuration
eddy_model_ds = eddy_high_res.isel(time=-1)
# Time-averaged kinetic energy power spectra summed over fluid layers
kr, KEspec_upper = tools.calc_ispec(eddy_model, eddy_model_ds.KEspec.isel(lev=0).data)
_, KEspec_lower  = tools.calc_ispec(eddy_model, eddy_model_ds.KEspec.isel(lev=1).data)

plt.loglog(kr, KEspec_upper + KEspec_lower)
plt.xlabel(r'Isotropic wavenumber [$m^{-1}$]')
plt.ylabel(r'Power spectrum [$m^{3}s^{-2}$]')
plt.title('KE power spectrum')
plt.grid()
../_images/336582c855bbd4a45e6266231783599d7fd6414b70f27ddbf7f17fef59ee6e69.png
# Time-series of total kinetic energy
eddy_ke_density = ((eddy_high_res.u * eddy_high_res.u) + (eddy_high_res.v * eddy_high_res.v)) / 2
eddy_ke_density = eddy_ke_density.sum(dim=['lev']).mean(dim=['x', 'y'])

plt.plot(eddy_high_res['time'] / 86400,  eddy_ke_density)
plt.ticklabel_format(axis='y', style='sci', scilimits=(0,0))
plt.xlabel('Model time [days]')
plt.ylabel(r'KE density [$m^{2}s^{-2}$]')
plt.title('Total KE')
plt.grid()
../_images/9bc0d8d32ad4525ba3376990f1e250511c329ee2e665a833f6be301a0878cb1b.png
# Spatially flattened probability distributiom of upper layer PV
sns.kdeplot(np.reshape(eddy_model_ds.q.isel(lev=0).values, (-1)))
plt.ticklabel_format(axis='y', style='sci', scilimits=(0,0))
plt.xlabel(r'$q_{1}$ [$s^{-1}$]')
plt.ylabel('probability density')
plt.title('Upper PV PDF')
plt.grid()
../_images/bca60ea656a3b812395ff5ca4c626ff4a1516db5ede3f3e611d8b43058561531.png
# Spectral energy flux terms 
kr, APEgenspec = tools.calc_ispec(eddy_model, eddy_model_ds.APEgenspec.data)
_, APEflux     = tools.calc_ispec(eddy_model, eddy_model_ds.APEflux.data)
_, KEflux      = tools.calc_ispec(eddy_model, eddy_model_ds.KEflux.data)
_, KEfrictionspec = tools.calc_ispec(eddy_model, eddy_model_ds.KEfrictionspec.data)

energy_budget = [ APEgenspec,
                  APEflux,
                  KEflux,
                  KEfrictionspec]
energy_budget_labels = ['APE gen','APE flux','KE flux','Bottom drag']
[plt.semilogx(kr, term) for term in energy_budget]
plt.axhline(0, color='gray', ls='--')
plt.legend(energy_budget_labels, loc='upper right')
plt.xlabel(r'Isotropic wavenumber [$m^{-1}$]'); 
plt.ylabel(r'Energy flux [$m^{3}s^{-3}$]'); 
plt.title('Spectral Energy Transfer');
plt.grid()
../_images/5e98e16943e094265cbaf0486117399385eb8390fb305c06c4b91a3eaf9814fc.png