tobac example: Compute bulk statistics during feature detection

You can compute bulk statistics for features from the feature detection or the masked features from the segmentation mask.

This example shows how to derive some basic statistics for precipitation features associated with isolated deep convective clouds, using the same data as in our example for precipitation tracking. The data used in this example is downloaded automatically as part of the notebook.

[1]:
# Import libraries
import iris
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import datetime
import shutil
from six.moves import urllib
from pathlib import Path
%matplotlib inline
[2]:
# Import tobac itself
import tobac
print('using tobac version', str(tobac.__version__))
using tobac version 1.5.3
[3]:
# Disable a few warnings:
import warnings
warnings.filterwarnings('ignore', category=UserWarning, append=True)
warnings.filterwarnings('ignore', category=RuntimeWarning, append=True)
warnings.filterwarnings('ignore', category=FutureWarning, append=True)
warnings.filterwarnings('ignore', category=pd.io.pytables.PerformanceWarning)

Download example data:

The actual download only has to be performed once for all example notebooks!

[4]:
data_out=Path('../')
[5]:
# Download the data: This only has to be done once for all tobac examples and can take a while
data_file = list(data_out.rglob('data/Example_input_Precip.nc'))
if len(data_file) == 0:
    file_path='https://zenodo.org/records/3195910/files/climate-processes/tobac_example_data-v1.0.1.zip'
    tempfile=Path('temp.zip')
    print('start downloading data')
    request=urllib.request.urlretrieve(file_path, tempfile)
    print('start extracting data')
    shutil.unpack_archive(tempfile, data_out)
    tempfile.unlink()
    print('data extracted')
    data_file = list(data_out.rglob('data/Example_input_Precip.nc'))

Load Data from downloaded file:

[6]:
Precip=iris.load_cube(str(data_file[0]),'surface_precipitation_average')
[7]:
#display information about the iris cube containing the surface precipitation data:
display(Precip)
Surface Precipitation Average (mm h-1)    time    south_north    west_east
    Shape                                   47         198          198
    Dimension coordinates
        time                                 x          -            -
        south_north                          -          x            -
        west_east                            -          -            x
    Auxiliary coordinates
        projection_y_coordinate              -          x            -
        y                                    -          x            -
        latitude                             -          x            x
        longitude                            -          x            x
        projection_x_coordinate              -          -            x
        x                                    -          -            x
    Attributes
        Conventions                         'CF-1.5'
[8]:
#Set up directory to save output:
savedir=Path("Save")
if not savedir.is_dir():
    savedir.mkdir()
plot_dir=Path("Plot")
if not plot_dir.is_dir():
    plot_dir.mkdir()

Feature detection with bulk statistics

Feature detection is performed based on the surface precipitation field and a range of thresholds. The statistics are calculated for each individual feature region and added to the feature output dataframe. You can decide which statistics to calculate by providing a dictionary with the names of the metrics as keys (these will be the names of the columns added to the dataframe) and functions as values. Note that it is also possible to provide input parameters to these functions.
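As a quick illustration (not part of the original notebook), any callable that reduces the feature's pixel values to a scalar or a fixed-size array can serve as a statistic; a hypothetical custom function might look like this:

```python
import numpy as np

# A hypothetical custom statistic: the spread of precipitation values
# within a feature region (max minus min).
def precip_range(values):
    return np.max(values) - np.min(values)

# It would be registered in the statistics dictionary just like the
# built-in NumPy reductions used below:
statistics_example = {'precip_range': precip_range}

# Applied to a feature's pixel values, it returns a single number:
feature_values = np.array([0.5, 1.2, 3.5, 2.1])
print(precip_range(feature_values))  # 3.0
```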

Setting parameters for feature detection:

[9]:
parameters_features={}
parameters_features['position_threshold']='weighted_diff'
parameters_features['sigma_threshold']=1
parameters_features['min_distance']=0
parameters_features['threshold']=[1,2,3,4,5,10,15] #mm/h
parameters_features['n_erosion_threshold']=0
parameters_features['n_min_threshold']=3

# get temporal and spatial resolution of the data
dxy,dt=tobac.get_spacings(Precip)

Defining the dictionary for the statistics to be calculated

[10]:
statistics = {}
statistics['mean_precip'] = np.mean
statistics['total_precip'] = np.sum
statistics['max_precip'] = np.max

For some functions, we need to provide additional input parameters, e.g. for np.percentile(). These can be provided as keyword arguments in the form of a dictionary: instead of the function alone, you can provide a tuple with both the function and its respective input parameters:

[11]:
statistics['percentiles'] = (np.percentile, {'q': [95,99]})
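To illustrate what the tuple form means, here is a minimal sketch (not part of the original notebook) of how such a (function, kwargs) pair can be applied to an array of values: the function is simply called with the stored keyword arguments.

```python
import numpy as np

# Sketch of applying a (function, kwargs) tuple: the function is called
# on the feature's pixel values with the stored keyword arguments.
func, kwargs = (np.percentile, {'q': [95, 99]})
values = np.arange(101)  # 0..100, so the q-th percentile is exactly q
result = func(values, **kwargs)
print(result)  # [95. 99.]
```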
[12]:
# Feature detection based on the surface precipitation field and a range of thresholds
print('starting feature detection based on multiple thresholds')
Features= tobac.feature_detection_multithreshold(Precip,dxy,**parameters_features, statistic=statistics)
print('feature detection done')
Features.to_hdf(savedir / 'Features.h5','table')
print('features saved')
starting feature detection based on multiple thresholds
feature detection done
features saved

Look at the output:

[13]:
Features.head()
[13]:
frame idx hdim_1 hdim_2 num threshold_value mean_precip total_precip max_precip percentiles ... time timestr south_north west_east projection_y_coordinate y latitude longitude projection_x_coordinate x
0 0 1 50.065727 139.857477 9 1 1.241012 11.169106 1.528488 ([1.4821563005447387, 1.5192213106155394],) ... 2013-06-19 20:05:00 2013-06-19 20:05:00 331.065727 420.857477 165782.863285 331.065727 29.846362 -94.172015 210678.738492 420.857477
1 0 15 120.527119 172.500325 4 1 1.25016 5.000638 1.267255 ([1.266197031736374, 1.267043651342392],) ... 2013-06-19 20:05:00 2013-06-19 20:05:00 401.527119 453.500325 201013.559414 401.527119 30.166929 -93.996892 227000.162468 453.500325
2 0 18 126.779273 145.368401 15 1 1.564113 23.461691 2.321664 ([2.268769121170044, 2.311084909439087],) ... 2013-06-19 20:05:00 2013-06-19 20:05:00 407.779273 426.368401 204139.636582 407.779273 30.196499 -94.139960 213434.200454 426.368401
3 0 34 111.611369 155.452030 4 2 2.313658 9.25463 2.409467 ([2.4016830801963804, 2.4079100108146667],) ... 2013-06-19 20:05:00 2013-06-19 20:05:00 392.611369 436.452030 196555.684682 392.611369 30.126871 -94.087317 218476.015240 436.452030
4 0 35 111.765231 164.938866 8 2 2.610886 20.887089 3.081343 ([2.995926022529602, 3.064259362220764],) ... 2013-06-19 20:05:00 2013-06-19 20:05:00 392.765231 445.938866 196632.615461 392.765231 30.127221 -94.037226 223219.433218 445.938866

5 rows × 21 columns
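Note that the percentiles column stores the whole array returned by np.percentile wrapped in a tuple, so it can be convenient to split it into one column per percentile. A minimal pandas sketch on synthetic rows (the column names p95 and p99 are our own choice, not produced by tobac):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the 'percentiles' column: each entry is a tuple
# wrapping the array of requested percentiles, as in the output above.
df = pd.DataFrame({'percentiles': [([1.48, 1.52],), ([1.27, 1.27],)]})

# Stack the per-feature arrays and assign them to one column each
# (hypothetical names p95/p99, matching q=[95, 99]):
df[['p95', 'p99']] = np.vstack([np.asarray(t[0]) for t in df['percentiles']])
print(df[['p95', 'p99']])
```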
