Depths and Snow Pit Data Package Contents

(12 minutes)

Learning Objectives:

  • Tools to access data.

  • Code snippets to extract and prep data.

# standard imports
import os
from pathlib import Path
import glob
import pandas as pd
import numpy as np

#plotting imports
import matplotlib as mpl
import matplotlib.pyplot as plt
plt.style.use(['seaborn-notebook'])

Access tutorial data from Zenodo

We’ve archived datasets used for 2021 Hackweek tutorials on Zenodo, to ensure that these tutorials can be run in the future. The following code pulls data from the Zenodo ‘record’ and unzips it:

%%bash 

# Retrieve a copy of data files used in this tutorial from Zenodo.org:
# Re-running this cell will not re-download things if they already exist

mkdir -p /tmp/tutorial-data
cd /tmp/tutorial-data
wget -q -nc -O data.zip https://zenodo.org/record/5504396/files/core-datasets.zip
unzip -q -n data.zip
rm data.zip
TUTORIAL_DATA = '/tmp/tutorial-data/core-datasets'

Download snow depth data from NSIDC

From the SnowEx20 Depth Probe Landing Page, you can download data and access the User’s Guide ../../_images/SD-landingPage.png

The Community Snow Depth Probe data package is a single CSV with over 36,000 geolocated snow depths! Three different instrument types were used to measure depths and are recorded in the Measurement Tool column.

Programmatically download snow depth data from NSIDC

#os.chmod('/home/jovyan/.netrc', 0o600) #only necessary on snowex hackweek jupyterhub
#%run './scripts/nsidc-download_SNEX20_SD.001.py' 
#print('Grand Mesa 2020 Snow Depth data download complete') 
#path = Path('./data/depths/')

#for filename in path.glob('*.csv'):
#    print(filename.name)

Read the Depth File

df = pd.read_csv(f'{TUTORIAL_DATA}/depths/SnowEx2020_SnowDepths_COGM_alldepths_v01.csv', sep=',', header=0, parse_dates=[[2,3]]) #parse the date[2] and time[3] columns such that they are read in as datetime dtypes
    
print('file has been read, and is ready to use.')
file has been read, and is ready to use.
# check data types for each column
df.dtypes
Date (yyyymmdd)_Time (hh:mm, local, MST)                           datetime64[ns]
Measurement Tool (MP = Magnaprobe; M2 = Mesa 2; PR = Pit Ruler)            object
ID                                                                          int64
PitID                                                                      object
Longitude                                                                 float64
Latitude                                                                  float64
Easting                                                                   float64
Northing                                                                  float64
Depth (cm)                                                                  int64
elevation (m)                                                             float64
equipment                                                                  object
Version Number                                                              int64
dtype: object

Prep for Data Analysis

# rename some columns for ease further down
df.rename(columns = {
    'Measurement Tool (MP = Magnaprobe; M2 = Mesa 2; PR = Pit Ruler)':'Measurement Tool', 
    'Date (yyyymmdd)_Time (hh:mm, local, MST)': "Datetime"},
          inplace = True)

# set up filter for IOP date range
start = pd.to_datetime('1/28/2020') #first day of GM IOP campaign
end   = pd.to_datetime('2/12/2020') #last day of GM IOP campaign

# filter the IOP date range
df = df[(df['Datetime'] >= start) & (df['Datetime'] <= end)]

print('DataFrame shape is: ', df.shape)
df.head()
DataFrame shape is:  (36388, 12)
Datetime Measurement Tool ID PitID Longitude Latitude Easting Northing Depth (cm) elevation (m) equipment Version Number
0 2020-01-28 11:48:00 MP 100000 8N58 -108.13515 39.03045 747987.62 4324061.71 94 3148.2 CRREL_B 1
1 2020-01-28 11:48:00 MP 100001 8N58 -108.13516 39.03045 747986.75 4324061.68 74 3148.3 CRREL_B 1
2 2020-01-28 11:48:00 MP 100002 8N58 -108.13517 39.03045 747985.89 4324061.65 90 3148.2 CRREL_B 1
3 2020-01-28 11:48:00 MP 100003 8N58 -108.13519 39.03044 747984.19 4324060.49 87 3148.6 CRREL_B 1
4 2020-01-28 11:48:00 MP 100004 8N58 -108.13519 39.03042 747984.26 4324058.27 90 3150.1 CRREL_B 1

Use .groupby() to sort the data set

# group data by the measurement tool 
gb = df.groupby('Measurement Tool', as_index=False).mean().round(1)

# show mean snow depth from each tool
gb[['Measurement Tool', 'Depth (cm)']]
Measurement Tool Depth (cm)
0 M2 97.0
1 MP 94.8
2 PR 94.6

Your turn

# group data by the snow pit ID


# show mean snow depth around each snow pit


#(hint: what is the pit id column called? If you're not sure you can use df.columns to see a list of column names. Consider using .head() to display only the first 5 returns)

Find depths associated with a certain measurement tool

print('List of Measurement Tools: ', df['Measurement Tool'].unique())
List of Measurement Tools:  ['MP' 'M2' 'PR']
r = df.loc[df['Measurement Tool'] == 'PR']
print('DataFrame shape is: ', r.shape)
r.head()
DataFrame shape is:  (148, 12)
Datetime Measurement Tool ID PitID Longitude Latitude Easting Northing Depth (cm) elevation (m) equipment Version Number
37755 2020-01-30 11:24:00 PR 300001 7C15 -108.19593 39.04563 742673.94 4325582.37 100 3048.699951 ruler 1
37756 2020-01-29 15:00:00 PR 300002 6C37 -108.14791 39.00760 746962.00 4321491.00 117 3087.709961 ruler 1
37757 2020-02-09 12:30:00 PR 300003 8C31 -108.16401 39.02144 745520.00 4322983.00 98 3099.639893 ruler 1
37758 2020-01-28 09:13:00 PR 300004 6N18 -108.19103 39.03404 743137.23 4324309.43 92 3055.590088 ruler 1
37760 2020-02-10 10:30:00 PR 300006 8S41 -108.14962 39.01659 746783.00 4322484.00 95 3113.870117 ruler 1

Your turn

# find all depths recorded by the Mesa2


# (hint: copy/paste is your friend)

Let’s make sure we all have the same pd.DataFrame() again

# pit ruler snow depths from Grand Mesa IOP
r = df.loc[df['Measurement Tool'] == 'PR'] 
print( 'DataFrame is back to only pit ruler depths')
DataFrame is back to only pit ruler depths

Plotting

# plot pit ruler depths 
ax = r.plot(x='Easting', y='Northing', c='Depth (cm)', kind='scatter', alpha=0.7, colorbar=True, colormap='PuBu', legend=True)
ax.set_title('Grand Mesa Pit Ruler Depths')
ax.set_xlabel('Easting [m]')
ax.set_ylabel('Northing [m]')
plt.show()

print('Notice the point on the far right - that is the "GML" or Grand Mesa Lodge pit where all instruments were deployed for a comparison study. pitID=GML')
../../_images/02_data-package_27_0.png
Notice the point on the far right - that is the "GML" or Grand Mesa Lodge pit where all instruments were deployed for a comparison study. pitID=GML
# plot histogram of pit ruler depths
ax = r['Depth (cm)'].plot.hist(bins=25)
ax.grid()
ax.set_title('Grand Mesa Pit Ruler Depths')
ax.set_xlabel('Snow Depth (cm)')
ax.set_ylabel('Frequency')
Text(0, 0.5, 'Frequency')
../../_images/02_data-package_28_1.png

Download snow pit data from NSIDC

From the SnowEx20 Snow Pit Landing Page, you can download data and access the User’s Guide.

../../_images/SP-landingPage.png

Programmatically download snow pit data from NSIDC

# load snow pit data
#%run 'scripts/nsidc-download_SNEX20_GM_SP.001.py'
#print('Grand Mesa 2020 Snow Pit data download complete')

Don’t want to work with all the files? Method to filter files

# what files would you like to find?
parameter = 'temperature'
pitID = '5N19'
date = '20200128'

path = '{}/pits/SnowEx20_SnowPits_GMIOP_{}_{}_{}_v01.csv'.format(TUTORIAL_DATA, date, pitID, parameter)

Read the Pit Parameter File

t = pd.read_csv(path, header=7)
t
# Height (cm) Temperature (deg C)
0 102 -9.5
1 100 -9.4
2 90 -11.5
3 80 -7.5
4 70 -5.2
5 60 -4.1
6 50 -3.5
7 40 -3.0
8 30 -2.5
9 20 -2.1
10 10 -1.5
11 0 -0.8

Plotting

# plot temperature
ax = t.plot(x='Temperature (deg C)',y='# Height (cm)', legend=False)
ax.set_aspect(0.4)
ax.grid()
ax.set_title('pit {}: Temperature Profile'.format(pitID))
ax.set_xlabel('Temperature (deg C)')
ax.set_ylabel('Snow Depth (cm)')
Text(0, 0.5, 'Snow Depth (cm)')
../../_images/02_data-package_37_1.png
# grab a different pit parameter file
parameter = 'density'
path = '/tmp/tutorial-data/core-datasets/pits/SnowEx20_SnowPits_GMIOP_{}_{}_{}_v01.csv'.format(date, pitID, parameter)
d = pd.read_csv(path, header=7)
d
# Top (cm) Bottom (cm) Density A (kg/m3) Density B (kg/m3) Density C (kg/m3)
0 102.0 92.0 136.0 138.0 NaN
1 92.0 82.0 193.0 192.0 NaN
2 82.0 72.0 232.0 221.0 NaN
3 72.0 62.0 262.0 260.0 NaN
4 62.0 52.0 275.0 278.0 NaN
5 52.0 42.0 261.0 252.0 NaN
6 42.0 32.0 267.0 269.0 NaN
7 32.0 22.0 336.0 367.0 NaN
8 22.0 12.0 268.0 246.0 NaN
9 12.0 2.0 262.0 271.0 NaN
# get the average density 
d['Avg Density (kg/m3)'] = d[['Density A (kg/m3)', 'Density B (kg/m3)', 'Density C (kg/m3)']].mean(axis=1, skipna=True)
d
# Top (cm) Bottom (cm) Density A (kg/m3) Density B (kg/m3) Density C (kg/m3) Avg Density (kg/m3)
0 102.0 92.0 136.0 138.0 NaN 137.0
1 92.0 82.0 193.0 192.0 NaN 192.5
2 82.0 72.0 232.0 221.0 NaN 226.5
3 72.0 62.0 262.0 260.0 NaN 261.0
4 62.0 52.0 275.0 278.0 NaN 276.5
5 52.0 42.0 261.0 252.0 NaN 256.5
6 42.0 32.0 267.0 269.0 NaN 268.0
7 32.0 22.0 336.0 367.0 NaN 351.5
8 22.0 12.0 268.0 246.0 NaN 257.0
9 12.0 2.0 262.0 271.0 NaN 266.5
def plot_density(ax, dataframe):
    
    '''
    This function helps you plot density profiles from snow pits. Use it to iterate through 
    DataFrame rows and plot the density for each top and bottom segment.
    
    '''
    
    for index, row in dataframe.iterrows():
        # plot blue bars to represent 10cm density intervals
        top = row['# Top (cm)']
        bottom = row['Bottom (cm)']
        dens = row["Avg Density (kg/m3)"]
        ax.plot([dens, dens],[bottom, top], color='blue', linewidth=4)

        # plot a red cross to show the spread between A and B samples
        densA = row["Density A (kg/m3)"]
        densB = row["Density B (kg/m3)"]
        middle = bottom + 5.
        ax.plot([densA, densB],[middle,middle], color='red')
        
    return ax

fig, ax = plt.subplots()
ax = plot_density(ax, d)
ax.set_xlim(50, 400)
ax.set_ylim(0, 140)
ax.grid(color='lightgray', alpha=.5)
ax.set_aspect(2)
ax.set_title('pit {}: Density Profile'.format(pitID))
ax.set_xlabel('Density kg/m3')
ax.set_ylabel('Snow Depth (cm)')   
Text(0, 0.5, 'Snow Depth (cm)')
../../_images/02_data-package_40_1.png