Visualizing and Comparing LIS Output


LIS Output Primer

LIS writes model state variables to disk at a frequency selected by the user (e.g., 6-hourly, daily, monthly). The LIS output we will be exploring was originally generated as daily NetCDF files, meaning one NetCDF file was written per simulated day. We have converted these NetCDF files into a Zarr store for improved performance in the cloud.

Import Libraries

# interface to Amazon S3 filesystem
import s3fs

# interact with n-d arrays
import numpy as np
import xarray as xr

# interact with tabular data (incl. spatial)
import pandas as pd
import geopandas as gpd

# interactive plots
import holoviews as hv
import geoviews as gv
import hvplot.pandas
import hvplot.xarray

# used to find nearest grid cell to a given location
from scipy.spatial import distance

# set bokeh as the holoviews plotting backend
hv.extension('bokeh')

Load the LIS Output

The xarray library makes working with labelled n-dimensional arrays easy and efficient. If you’re familiar with the pandas library, the xarray API should feel similar.

Here we load the LIS output into an xarray.Dataset object:

# create S3 filesystem object
s3 = s3fs.S3FileSystem(anon=False)

# define the name of our S3 bucket
bucket_name = 'eis-dh-hydro/SNOWEX-HACKWEEK'

# define path to store on S3
lis_output_s3_path = f's3://{bucket_name}/DA_SNODAS/SURFACEMODEL/LIS_HIST.d01.zarr/'

# create key-value mapper for S3 object (required to read data stored on S3)
lis_output_mapper = s3.get_mapper(lis_output_s3_path)

# open the dataset
lis_output_ds = xr.open_zarr(lis_output_mapper, consolidated=True)

# drop some unneeded variables
lis_output_ds = lis_output_ds.drop_vars(['_history', '_eis_source_path'])

Explore the Data

Display an interactive widget for inspecting the dataset by running a cell containing the variable name. Expand the dropdown menus and click on the document and database icons to inspect the variables and attributes.

lis_output_ds
<xarray.Dataset>
Dimensions:           (time: 730, north_south: 215, east_west: 361, SoilMoist_profiles: 4)
Coordinates:
  * time              (time) datetime64[ns] 2016-10-01 2016-10-02 ... 2018-09-30
Dimensions without coordinates: north_south, east_west, SoilMoist_profiles
Data variables: (12/26)
    Albedo_tavg       (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    CanopInt_tavg     (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ECanop_tavg       (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ESoil_tavg        (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    GPP_tavg          (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    LAI_tavg          (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ...                ...
    Swnet_tavg        (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TVeg_tavg         (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TWS_tavg          (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TotalPrecip_tavg  (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    lat               (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    lon               (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
Attributes: (12/14)
    DX:                      0.10000000149011612
    DY:                      0.10000000149011612
    MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
    NUM_SOIL_LAYERS:         4
    SOIL_LAYER_THICKNESSES:  [10.0, 30.000001907348633, 60.000003814697266, 1...
    SOUTH_WEST_CORNER_LAT:   28.549999237060547
    ...                      ...
    conventions:             CF-1.6
    institution:             NASA GSFC
    missing_value:           -9999.0
    references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
    source:                  Noah-MP.4.0.1
    title:                   LIS land surface model output

Accessing Attributes

Dataset attributes (metadata) are accessible via the attrs attribute:

lis_output_ds.attrs
{'DX': 0.10000000149011612,
 'DY': 0.10000000149011612,
 'MAP_PROJECTION': 'EQUIDISTANT CYLINDRICAL',
 'NUM_SOIL_LAYERS': 4,
 'SOIL_LAYER_THICKNESSES': [10.0,
  30.000001907348633,
  60.000003814697266,
  100.0],
 'SOUTH_WEST_CORNER_LAT': 28.549999237060547,
 'SOUTH_WEST_CORNER_LON': -113.94999694824219,
 'comment': 'website: http://lis.gsfc.nasa.gov/',
 'conventions': 'CF-1.6',
 'institution': 'NASA GSFC',
 'missing_value': -9999.0,
 'references': 'Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007',
 'source': 'Noah-MP.4.0.1',
 'title': 'LIS land surface model output'}
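Because attrs is an ordinary Python dictionary, individual entries can be read by key, and each variable carries its own attrs as well. The sketch below uses a small synthetic dataset so it runs without S3 access; the demo_ds name and its values are made up for illustration and only loosely mirror the LIS metadata:

```python
import numpy as np
import xarray as xr

# hypothetical stand-in for the LIS output (values are illustrative only)
demo_ds = xr.Dataset(
    data_vars={
        'SnowDepth_tavg': (
            ('north_south', 'east_west'),
            np.zeros((2, 3), dtype=np.float32),
            {'units': 'm', 'long_name': 'snow depth'},
        )
    },
    attrs={'DX': 0.1, 'DY': 0.1, 'missing_value': -9999.0},
)

# dataset-level attributes behave like a dictionary
dx = demo_ds.attrs['DX']
missing = demo_ds.attrs.get('missing_value')

# variable-level attributes live on each DataArray
units = demo_ds['SnowDepth_tavg'].attrs['units']

print(dx, missing, units)
```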

Accessing Variables

Variables can be accessed using either dot notation or square bracket notation:

# dot notation
lis_output_ds.SnowDepth_tavg
<xarray.DataArray 'SnowDepth_tavg' (time: 730, north_south: 215, east_west: 361)>
dask.array<open_dataset-7d66e42249419f6b85d6bd66542e643aSnowDepth_tavg, shape=(730, 215, 361), dtype=float32, chunksize=(1, 215, 361), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2016-10-01 2016-10-02 ... 2018-09-30
Dimensions without coordinates: north_south, east_west
Attributes:
    long_name:      snow depth
    standard_name:  snow_depth
    units:          m
    vmax:           999999986991104.0
    vmin:           -999999986991104.0

# square bracket notation
lis_output_ds['SnowDepth_tavg']
<xarray.DataArray 'SnowDepth_tavg' (time: 730, north_south: 215, east_west: 361)>
dask.array<open_dataset-7d66e42249419f6b85d6bd66542e643aSnowDepth_tavg, shape=(730, 215, 361), dtype=float32, chunksize=(1, 215, 361), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2016-10-01 2016-10-02 ... 2018-09-30
Dimensions without coordinates: north_south, east_west
Attributes:
    long_name:      snow depth
    standard_name:  snow_depth
    units:          m
    vmax:           999999986991104.0
    vmin:           -999999986991104.0

Which syntax should I use?

While both syntaxes perform the same function, the square bracket syntax is useful when interacting with a dataset programmatically. For example, we can define a variable varname that stores the name of the variable in the dataset we want to access and then use it with the square bracket notation:

varname = 'SnowDepth_tavg'

lis_output_ds[varname]
<xarray.DataArray 'SnowDepth_tavg' (time: 730, north_south: 215, east_west: 361)>
dask.array<open_dataset-7d66e42249419f6b85d6bd66542e643aSnowDepth_tavg, shape=(730, 215, 361), dtype=float32, chunksize=(1, 215, 361), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2016-10-01 2016-10-02 ... 2018-09-30
Dimensions without coordinates: north_south, east_west
Attributes:
    long_name:      snow depth
    standard_name:  snow_depth
    units:          m
    vmax:           999999986991104.0
    vmin:           -999999986991104.0

The dot notation syntax will not work this way because xarray looks for a variable in the dataset literally named varname instead of using the value stored in the varname variable. When xarray can’t find this variable, it raises an AttributeError:

# uncomment and run the code below to see the error

# varname = 'SnowDepth_tavg'

# lis_output_ds.varname
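
When the variable name is held in a Python variable, square brackets are the direct route, and Python's built-in getattr() also works. A minimal sketch on a synthetic dataset (demo_ds is a hypothetical stand-in, not the LIS output):

```python
import numpy as np
import xarray as xr

# hypothetical one-variable dataset used only for this illustration
demo_ds = xr.Dataset({'SnowDepth_tavg': (('time',), np.arange(3, dtype=np.float32))})

varname = 'SnowDepth_tavg'

# square brackets look up the value stored in varname
by_brackets = demo_ds[varname]

# getattr() is the programmatic equivalent of dot notation
by_getattr = getattr(demo_ds, varname)

# dot notation looks for a variable literally named "varname"
try:
    demo_ds.varname
    dot_lookup_failed = False
except AttributeError:
    dot_lookup_failed = True

print(by_brackets.identical(by_getattr), dot_lookup_failed)
```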

Dimensions and Coordinate Variables

The dimensions and coordinate variable fields put the “labelled” in “labelled n-dimensional arrays”:

  • Dimensions: labels for each dimension in the dataset (e.g., time)

  • Coordinates: labels for indexing along dimensions (e.g., '2019-01-01')

We can use these labels to select, slice, and aggregate the dataset.
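
Labels also drive aggregation: reductions such as .mean() accept dimension names rather than axis numbers. A small sketch with synthetic values (the demo_da array is hypothetical and only mimics the layout of a LIS variable):

```python
import numpy as np
import pandas as pd
import xarray as xr

# 4 daily timesteps on a 2 x 3 grid, filled with the values 0..23
demo_da = xr.DataArray(
    np.arange(24, dtype=np.float32).reshape(4, 2, 3),
    dims=('time', 'north_south', 'east_west'),
    coords={'time': pd.date_range('2018-01-01', periods=4, freq='D')},
    name='SnowDepth_tavg',
)

# average over time: one value per grid cell
time_mean = demo_da.mean(dim='time')

# average over space: one value per timestep
spatial_mean = demo_da.mean(dim=['north_south', 'east_west'])

print(time_mean.shape, spatial_mean.shape)
```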

Selecting/Subsetting

xarray provides two methods for selecting or subsetting along coordinate variables:

  • index selection: ds.isel(time=0)

  • value selection: ds.sel(time='2019-01-01')

For example, we can select the first timestep from our dataset using index selection by passing the dimension name as a keyword argument:

# remember: python indexes start at 0
lis_output_ds.isel(time=0)
<xarray.Dataset>
Dimensions:           (north_south: 215, east_west: 361, SoilMoist_profiles: 4)
Coordinates:
    time              datetime64[ns] 2016-10-01
Dimensions without coordinates: north_south, east_west, SoilMoist_profiles
Data variables: (12/26)
    Albedo_tavg       (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    CanopInt_tavg     (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    ECanop_tavg       (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    ESoil_tavg        (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    GPP_tavg          (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    LAI_tavg          (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    ...                ...
    Swnet_tavg        (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    TVeg_tavg         (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    TWS_tavg          (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    TotalPrecip_tavg  (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    lat               (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    lon               (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
Attributes: (12/14)
    DX:                      0.10000000149011612
    DY:                      0.10000000149011612
    MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
    NUM_SOIL_LAYERS:         4
    SOIL_LAYER_THICKNESSES:  [10.0, 30.000001907348633, 60.000003814697266, 1...
    SOUTH_WEST_CORNER_LAT:   28.549999237060547
    ...                      ...
    conventions:             CF-1.6
    institution:             NASA GSFC
    missing_value:           -9999.0
    references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
    source:                  Noah-MP.4.0.1
    title:                   LIS land surface model output

Or we can use value selection to select based on the coordinate(s) (think “labels”) of a given dimension:

lis_output_ds.sel(time='2018-01-01')
<xarray.Dataset>
Dimensions:           (north_south: 215, east_west: 361, SoilMoist_profiles: 4)
Coordinates:
    time              datetime64[ns] 2018-01-01
Dimensions without coordinates: north_south, east_west, SoilMoist_profiles
Data variables: (12/26)
    Albedo_tavg       (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    CanopInt_tavg     (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    ECanop_tavg       (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    ESoil_tavg        (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    GPP_tavg          (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    LAI_tavg          (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    ...                ...
    Swnet_tavg        (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    TVeg_tavg         (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    TWS_tavg          (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    TotalPrecip_tavg  (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    lat               (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
    lon               (north_south, east_west) float32 dask.array<chunksize=(215, 361), meta=np.ndarray>
Attributes: (12/14)
    DX:                      0.10000000149011612
    DY:                      0.10000000149011612
    MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
    NUM_SOIL_LAYERS:         4
    SOIL_LAYER_THICKNESSES:  [10.0, 30.000001907348633, 60.000003814697266, 1...
    SOUTH_WEST_CORNER_LAT:   28.549999237060547
    ...                      ...
    conventions:             CF-1.6
    institution:             NASA GSFC
    missing_value:           -9999.0
    references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
    source:                  Noah-MP.4.0.1
    title:                   LIS land surface model output

The .sel() approach also accepts partial datetime strings as shortcuts. For example, here we select all timesteps in the month of January 2018:

lis_output_ds.sel(time='2018-01')
<xarray.Dataset>
Dimensions:           (time: 31, north_south: 215, east_west: 361, SoilMoist_profiles: 4)
Coordinates:
  * time              (time) datetime64[ns] 2018-01-01 2018-01-02 ... 2018-01-31
Dimensions without coordinates: north_south, east_west, SoilMoist_profiles
Data variables: (12/26)
    Albedo_tavg       (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    CanopInt_tavg     (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ECanop_tavg       (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ESoil_tavg        (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    GPP_tavg          (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    LAI_tavg          (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ...                ...
    Swnet_tavg        (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TVeg_tavg         (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TWS_tavg          (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TotalPrecip_tavg  (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    lat               (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    lon               (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
Attributes: (12/14)
    DX:                      0.10000000149011612
    DY:                      0.10000000149011612
    MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
    NUM_SOIL_LAYERS:         4
    SOIL_LAYER_THICKNESSES:  [10.0, 30.000001907348633, 60.000003814697266, 1...
    SOUTH_WEST_CORNER_LAT:   28.549999237060547
    ...                      ...
    conventions:             CF-1.6
    institution:             NASA GSFC
    missing_value:           -9999.0
    references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
    source:                  Noah-MP.4.0.1
    title:                   LIS land surface model output

Select a custom range of dates using Python’s built-in slice() function:

lis_output_ds.sel(time=slice('2018-01-01', '2018-01-15'))
<xarray.Dataset>
Dimensions:           (time: 15, north_south: 215, east_west: 361, SoilMoist_profiles: 4)
Coordinates:
  * time              (time) datetime64[ns] 2018-01-01 2018-01-02 ... 2018-01-15
Dimensions without coordinates: north_south, east_west, SoilMoist_profiles
Data variables: (12/26)
    Albedo_tavg       (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    CanopInt_tavg     (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ECanop_tavg       (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ESoil_tavg        (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    GPP_tavg          (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    LAI_tavg          (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ...                ...
    Swnet_tavg        (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TVeg_tavg         (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TWS_tavg          (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TotalPrecip_tavg  (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    lat               (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    lon               (time, north_south, east_west) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
Attributes: (12/14)
    DX:                      0.10000000149011612
    DY:                      0.10000000149011612
    MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
    NUM_SOIL_LAYERS:         4
    SOIL_LAYER_THICKNESSES:  [10.0, 30.000001907348633, 60.000003814697266, 1...
    SOUTH_WEST_CORNER_LAT:   28.549999237060547
    ...                      ...
    conventions:             CF-1.6
    institution:             NASA GSFC
    missing_value:           -9999.0
    references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
    source:                  Noah-MP.4.0.1
    title:                   LIS land surface model output
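
One detail worth noting: label-based slices in .sel() include both endpoints, which is why the selection above returns 15 timesteps, whereas positional slices in .isel() follow the usual Python convention and exclude the stop index. A minimal sketch on a synthetic daily time axis (demo_da is made up for illustration):

```python
import numpy as np
import pandas as pd
import xarray as xr

# hypothetical daily time axis covering January and February 2018
times = pd.date_range('2018-01-01', '2018-02-28', freq='D')
demo_da = xr.DataArray(np.arange(times.size), dims='time', coords={'time': times})

# label-based slice: both '2018-01-01' and '2018-01-15' are included
by_label = demo_da.sel(time=slice('2018-01-01', '2018-01-15'))

# position-based slice: stop index 15 is excluded, so indexes 0..14
by_position = demo_da.isel(time=slice(0, 15))

# partial datetime string: every timestep in January 2018
january = demo_da.sel(time='2018-01')

print(by_label.sizes['time'], by_position.sizes['time'], january.sizes['time'])
```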

Latitude and Longitude

You may have noticed that latitude (lat) and longitude (lon) are listed as data variables, not coordinate variables. This dataset would be easier to work with if lat and lon were coordinate variables and dimensions. Here we define a helper function that reads the spatial information from the dataset attributes, generates arrays containing the lat and lon values, and appends them to the dataset:

def add_latlon_coords(dataset: xr.Dataset) -> xr.Dataset:
    """Adds lat/lon as dimensions and coordinates to an xarray.Dataset object."""
    
    # get attributes from dataset
    attrs = dataset.attrs
    
    # get x, y resolutions
    dx = round(float(attrs['DX']), 3)
    dy = round(float(attrs['DY']), 3)
    
    # get grid cells in x, y dimensions
    ew_len = len(dataset['east_west'])
    ns_len = len(dataset['north_south'])
    
    # get lower-left lat and lon
    ll_lat = round(float(attrs['SOUTH_WEST_CORNER_LAT']), 3)
    ll_lon = round(float(attrs['SOUTH_WEST_CORNER_LON']), 3)
    
    # calculate upper-right lat and lon
    ur_lat = ll_lat + (dy * ns_len)
    ur_lon = ll_lon + (dx * ew_len)
    
    # define the new coordinates
    coords = {
        # create arrays containing the lat/lon at each grid cell
        'lat': np.linspace(ll_lat, ur_lat, ns_len, dtype=np.float32, endpoint=False),
        'lon': np.linspace(ll_lon, ur_lon, ew_len, dtype=np.float32, endpoint=False)
    }
    
    lon_attrs = dataset.lon.attrs
    lat_attrs = dataset.lat.attrs
    
    # rename the original lat and lon variables
    dataset = dataset.rename({'lon':'orig_lon', 'lat':'orig_lat'})
    # rename the grid dimensions to lat and lon
    dataset = dataset.rename({'north_south': 'lat', 'east_west': 'lon'})
    # assign the coords above as coordinates
    dataset = dataset.assign_coords(coords)
    dataset.lon.attrs = lon_attrs
    dataset.lat.attrs = lat_attrs
    
    return dataset

Now that the function is defined, let’s use it to append lat and lon coordinates to the LIS output:

lis_output_ds = add_latlon_coords(lis_output_ds)

Inspect the dataset:

lis_output_ds
<xarray.Dataset>
Dimensions:           (time: 730, lat: 215, lon: 361, SoilMoist_profiles: 4)
Coordinates:
  * time              (time) datetime64[ns] 2016-10-01 2016-10-02 ... 2018-09-30
  * lat               (lat) float32 28.55 28.65 28.75 ... 49.75 49.85 49.95
  * lon               (lon) float32 -113.9 -113.8 -113.8 ... -78.05 -77.95
Dimensions without coordinates: SoilMoist_profiles
Data variables: (12/26)
    Albedo_tavg       (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    CanopInt_tavg     (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ECanop_tavg       (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ESoil_tavg        (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    GPP_tavg          (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    LAI_tavg          (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    ...                ...
    Swnet_tavg        (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TVeg_tavg         (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TWS_tavg          (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    TotalPrecip_tavg  (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    orig_lat          (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
    orig_lon          (time, lat, lon) float32 dask.array<chunksize=(1, 215, 361), meta=np.ndarray>
Attributes: (12/14)
    DX:                      0.10000000149011612
    DY:                      0.10000000149011612
    MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
    NUM_SOIL_LAYERS:         4
    SOIL_LAYER_THICKNESSES:  [10.0, 30.000001907348633, 60.000003814697266, 1...
    SOUTH_WEST_CORNER_LAT:   28.549999237060547
    ...                      ...
    conventions:             CF-1.6
    institution:             NASA GSFC
    missing_value:           -9999.0
    references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
    source:                  Noah-MP.4.0.1
    title:                   LIS land surface model output

Now lat and lon are listed as coordinate variables and have replaced the north_south and east_west dimensions. This will make it easier to spatially subset the dataset!
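
To see what the helper computes, the coordinate arithmetic can be reproduced on its own. The sketch below plugs in the rounded attribute values shown earlier (28.55 for the south-west latitude, 0.1 for DY, 215 rows); it is only a check of the arithmetic, not part of the LIS tooling:

```python
import numpy as np

ll_lat = 28.55   # SOUTH_WEST_CORNER_LAT rounded to 3 decimals
dy = 0.1         # DY rounded to 3 decimals
ns_len = 215     # size of the north_south dimension

# one step past the last grid cell; endpoint=False keeps it out of the array
ur_lat = ll_lat + dy * ns_len

lat = np.linspace(ll_lat, ur_lat, ns_len, dtype=np.float32, endpoint=False)

# spacing equals dy, and the last value is ll_lat + dy * (ns_len - 1)
print(lat[0], lat[-1], lat.size)
```

The first and last values agree with the lat coordinate in the dataset summary above (28.55 through 49.95).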

Basic Spatial Subsetting

We can use the slice() function we used above on the lat and lon dimensions to select data between a range of latitudes and longitudes:

lis_output_ds.sel(lat=slice(37, 41), lon=slice(-110, -101))
<xarray.Dataset>
Dimensions:           (time: 730, lat: 40, lon: 90, SoilMoist_profiles: 4)
Coordinates:
  * time              (time) datetime64[ns] 2016-10-01 2016-10-02 ... 2018-09-30
  * lat               (lat) float32 37.05 37.15 37.25 ... 40.75 40.85 40.95
  * lon               (lon) float32 -109.9 -109.8 -109.8 ... -101.2 -101.1
Dimensions without coordinates: SoilMoist_profiles
Data variables: (12/26)
    Albedo_tavg       (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    CanopInt_tavg     (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    ECanop_tavg       (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    ESoil_tavg        (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    GPP_tavg          (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    LAI_tavg          (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    ...                ...
    Swnet_tavg        (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    TVeg_tavg         (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    TWS_tavg          (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    TotalPrecip_tavg  (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    orig_lat          (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
    orig_lon          (time, lat, lon) float32 dask.array<chunksize=(1, 40, 90), meta=np.ndarray>
Attributes: (12/14)
    DX:                      0.10000000149011612
    DY:                      0.10000000149011612
    MAP_PROJECTION:          EQUIDISTANT CYLINDRICAL
    NUM_SOIL_LAYERS:         4
    SOIL_LAYER_THICKNESSES:  [10.0, 30.000001907348633, 60.000003814697266, 1...
    SOUTH_WEST_CORNER_LAT:   28.549999237060547
    ...                      ...
    conventions:             CF-1.6
    institution:             NASA GSFC
    missing_value:           -9999.0
    references:              Kumar_etal_EMS_2006, Peters-Lidard_etal_ISSE_2007
    source:                  Noah-MP.4.0.1
    title:                   LIS land surface model output

Notice how the sizes of the lat and lon dimensions have decreased.
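
The new sizes follow directly from the grid: cell coordinates sit at 37.05, 37.15, ..., 40.95, so 40 latitudes fall between 37 and 41, and likewise 90 longitudes fall between -110 and -101. A sketch that rebuilds only the coordinate axes (no LIS data; grid is a hypothetical stand-in) confirms the counts:

```python
import numpy as np
import xarray as xr

# coordinate axes rebuilt from the rounded dataset attributes
lat = np.linspace(28.55, 28.55 + 0.1 * 215, 215, dtype=np.float32, endpoint=False)
lon = np.linspace(-113.95, -113.95 + 0.1 * 361, 361, dtype=np.float32, endpoint=False)

# a dataset holding only the two coordinate axes
grid = xr.Dataset(coords={'lat': lat, 'lon': lon})

# same label-based slices as in the selection above
subset = grid.sel(lat=slice(37, 41), lon=slice(-110, -101))

print(subset.sizes['lat'], subset.sizes['lon'])
```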

Subset Across Multiple Dimensions

Select snow depth for water year 2018 (October 1, 2017 through September 30, 2018) within a range of lat/lon:

# define a range of dates to select
wy_2018_slice = slice('2017-10-01', '2018-09-30')
lat_slice = slice(37, 41)
lon_slice = slice(-109, -102)

# select the snow depth and subset by time, lat, and lon
snd_CO_wy2018_ds = lis_output_ds['SnowDepth_tavg'].sel(time=wy_2018_slice, lat=lat_slice, lon=lon_slice)

# inspect the resulting DataArray
snd_CO_wy2018_ds
<xarray.DataArray 'SnowDepth_tavg' (time: 365, lat: 40, lon: 70)>
dask.array<getitem, shape=(365, 40, 70), dtype=float32, chunksize=(1, 40, 70), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 2017-10-01 2017-10-02 ... 2018-09-30
  * lat      (lat) float32 37.05 37.15 37.25 37.35 ... 40.65 40.75 40.85 40.95
  * lon      (lon) float32 -108.9 -108.8 -108.8 -108.7 ... -102.2 -102.2 -102.1
Attributes:
    long_name:      snow depth
    standard_name:  snow_depth
    units:          m
    vmax:           999999986991104.0
    vmin:           -999999986991104.0

Plotting

We’ll use two plotting libraries (matplotlib is used by xarray behind the scenes, so it does not need to be imported directly):

  • matplotlib: static plots

  • hvplot: interactive plots

We can make a quick matplotlib-based plot of the subsetted data using the .plot() method supplied by xarray.DataArray objects. For this example, we’ll select one day and plot it:

# simple matplotlib plot
snd_CO_wy2018_ds.sel(time='2018-01-01').plot()
<matplotlib.collections.QuadMesh at 0x7f14d27015e0>

Similarly, we can make an interactive plot using the hvplot accessor and specifying a quadmesh plot type:

# hvplot based map
snd_CO_20180101_plot = snd_CO_wy2018_ds.sel(time='2018-01-01').hvplot.quadmesh(
    geo=True, rasterize=True, project=True,
    xlabel='lon', ylabel='lat', cmap='viridis',
    tiles='EsriImagery'
)

snd_CO_20180101_plot