tobac example: Compute bulk statistics during feature detection
You can compute bulk statistics for features from the feature detection or the masked features from the segmentation mask.
This example shows how to derive some basic statistics for precipitation features associated with isolated deep convective clouds using the same data as in our example for precipitation tracking. The data used in this example is downloaded from automatically as part of the notebook.
[1]:
# Import libraries
import iris
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import datetime
import shutil
from six.moves import urllib
from pathlib import Path
%matplotlib inline
[2]:
# Import tobac itself
import tobac
print('using tobac version', str(tobac.__version__))
using tobac version 1.5.3
[3]:
# Disable a few warnings:
import warnings
warnings.filterwarnings('ignore', category=UserWarning, append=True)
warnings.filterwarnings('ignore', category=RuntimeWarning, append=True)
warnings.filterwarnings('ignore', category=FutureWarning, append=True)
warnings.filterwarnings('ignore',category=pd.io.pytables.PerformanceWarning)
Download example data:
Actual download has to be performed only once for all example notebooks!
[4]:
data_out=Path('../')
[5]:
# Download the data: This only has to be done once for all tobac examples and can take a while
data_file = list(data_out.rglob('data/Example_input_Precip.nc'))
if len(data_file) == 0:
file_path='https://zenodo.org/records/3195910/files/climate-processes/tobac_example_data-v1.0.1.zip'
#file_path='http://zenodo..'
tempfile=Path('temp.zip')
print('start downloading data')
request=urllib.request.urlretrieve(file_path, tempfile)
print('start extracting data')
shutil.unpack_archive(tempfile, data_out)
tempfile.unlink()
print('data extracted')
data_file = list(data_out.rglob('data/Example_input_Precip.nc'))
Load Data from downloaded file:
[6]:
Precip=iris.load_cube(str(data_file[0]),'surface_precipitation_average')
[7]:
#display information about the iris cube containing the surface precipitation data:
display(Precip)
Surface Precipitation Average (mm h-1) | time | south_north | west_east |
---|---|---|---|
Shape | 47 | 198 | 198 |
Dimension coordinates | |||
time | x | - | - |
south_north | - | x | - |
west_east | - | - | x |
Auxiliary coordinates | |||
projection_y_coordinate | - | x | - |
y | - | x | - |
latitude | - | x | x |
longitude | - | x | x |
projection_x_coordinate | - | - | x |
x | - | - | x |
Attributes | |||
Conventions | 'CF-1.5' |
[8]:
#Set up directory to save output:
savedir=Path("Save")
if not savedir.is_dir():
savedir.mkdir()
plot_dir=Path("Plot")
if not plot_dir.is_dir():
plot_dir.mkdir()
Feature detection with bulk statistics
Feature detection is perfomed based on surface precipitation field and a range of thresholds. The statistics are calculated for each individual feature region and added to the feature output dataframe. You can decide which statistics to calculate by providing a dictionary with the name of the metric as keys (this will be the name of the column added to the dataframe) and functions as values. Note that it is also possible to provide input parameter to these functions.
Setting parameters for feature detection:
[9]:
parameters_features={}
parameters_features['position_threshold']='weighted_diff'
parameters_features['sigma_threshold']=0.5
parameters_features['min_distance']=0
parameters_features['sigma_threshold']=1
parameters_features['threshold']=[1,2,3,4,5,10,15] #mm/h
parameters_features['n_erosion_threshold']=0
parameters_features['n_min_threshold']=3
# get temporal and spation resolution of the data
dxy,dt=tobac.get_spacings(Precip)
Defining the dictionary for the statistics to be calculated
[10]:
statistics = {}
statistics['mean_precip'] = np.mean
statistics['total_precip'] = np.sum
statistics['max_precip'] = np.max
for some functions, we need to provide additional input parameters, e.g. np.percentile(). These can be provided as key word arguments in form of a dictionary. So instead of the function, you can provide a tuple with both the function and its respective input parameters:
[11]:
statistics['percentiles'] = (np.percentile, {'q': [95,99]})
[12]:
# Feature detection based on based on surface precipitation field and a range of thresholds
print('starting feature detection based on multiple thresholds')
Features= tobac.feature_detection_multithreshold(Precip,dxy,**parameters_features, statistic=statistics)
print('feature detection done')
Features.to_hdf(savedir / 'Features.h5','table')
print('features saved')
starting feature detection based on multiple thresholds
feature detection done
features saved
Look at the output:
[13]:
Features.head()
[13]:
frame | idx | hdim_1 | hdim_2 | num | threshold_value | mean_precip | total_precip | max_precip | percentiles | ... | time | timestr | south_north | west_east | projection_y_coordinate | y | latitude | longitude | projection_x_coordinate | x | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 1 | 50.065727 | 139.857477 | 9 | 1 | 1.241012 | 11.169106 | 1.528488 | ([1.4821563005447387, 1.5192213106155394],) | ... | 2013-06-19 20:05:00 | 2013-06-19 20:05:00 | 331.065727 | 420.857477 | 165782.863285 | 331.065727 | 29.846362 | -94.172015 | 210678.738492 | 420.857477 |
1 | 0 | 15 | 120.527119 | 172.500325 | 4 | 1 | 1.25016 | 5.000638 | 1.267255 | ([1.266197031736374, 1.267043651342392],) | ... | 2013-06-19 20:05:00 | 2013-06-19 20:05:00 | 401.527119 | 453.500325 | 201013.559414 | 401.527119 | 30.166929 | -93.996892 | 227000.162468 | 453.500325 |
2 | 0 | 18 | 126.779273 | 145.368401 | 15 | 1 | 1.564113 | 23.461691 | 2.321664 | ([2.268769121170044, 2.311084909439087],) | ... | 2013-06-19 20:05:00 | 2013-06-19 20:05:00 | 407.779273 | 426.368401 | 204139.636582 | 407.779273 | 30.196499 | -94.139960 | 213434.200454 | 426.368401 |
3 | 0 | 34 | 111.611369 | 155.452030 | 4 | 2 | 2.313658 | 9.25463 | 2.409467 | ([2.4016830801963804, 2.4079100108146667],) | ... | 2013-06-19 20:05:00 | 2013-06-19 20:05:00 | 392.611369 | 436.452030 | 196555.684682 | 392.611369 | 30.126871 | -94.087317 | 218476.015240 | 436.452030 |
4 | 0 | 35 | 111.765231 | 164.938866 | 8 | 2 | 2.610886 | 20.887089 | 3.081343 | ([2.995926022529602, 3.064259362220764],) | ... | 2013-06-19 20:05:00 | 2013-06-19 20:05:00 | 392.765231 | 445.938866 | 196632.615461 | 392.765231 | 30.127221 | -94.037226 | 223219.433218 | 445.938866 |
5 rows × 21 columns
[ ]: