tobac.utils.general

Description

General tobac utilities

tobac.utils.general.add_coordinates_3D(t, variable_cube, vertical_coord=None, vertical_axis=None, assume_coords_fixed_in_time=True, use_standard_names=None)

Function adding coordinates from the tracking cube to the trajectories for the 3D case: time, longitude and latitude, x and y dimensions, and altitude.

Parameters:
  • t (pandas DataFrame) – Input features

  • variable_cube (iris.cube.Cube) – Cube (usually the one you are tracking on) containing at least the ‘time’ dimension. The coordinates typically expected are ‘longitude’, ‘latitude’, ‘x_projection_coordinate’, ‘y_projection_coordinate’, and ‘altitude’ (if 3D), although this function will interpolate along any dimension coordinates you give.

  • vertical_coord (str or int) – Name or axis number of the vertical coordinate. If None, tries to auto-detect. If it is a string, it looks for the coordinate or the dimension name corresponding to the string. If it is an int, it assumes that it is the vertical axis. Note that if you only have a 2D or 3D coordinate for altitude, you must pass in an int.

  • vertical_axis (int or None) – Axis number of the vertical.

  • assume_coords_fixed_in_time (bool) – If True, the coordinates are assumed to be fixed in time, even if the coordinates say they vary in time. This is the default, to preserve legacy functionality. If False, a coordinate that says it varies in time is treated as time-varying.

  • use_standard_names (bool) – If true, when interpolating a coordinate, it looks for a standard_name and uses that to name the output coordinate, to mimic iris functionality. If false, uses the actual name of the coordinate to output.

Returns:

trajectories with added coordinates

Return type:

pandas DataFrame
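
The interpolation idea described above can be illustrated with a minimal sketch. It is not tobac's implementation: it assumes a feature position is stored as a fractional index along one dimension, and the variable names (latitude, feature_positions) are made up for the example. The coordinate value at each feature is then obtained by linear interpolation between grid points.

```python
import numpy as np

# Hypothetical 1D dimension coordinate: latitude at each grid index
latitude = np.linspace(30.0, 31.0, 5)  # 30.0, 30.25, 30.5, 30.75, 31.0

# Assumed feature positions, given as fractional indices along that dimension
feature_positions = np.array([0.0, 1.5, 4.0])

# Linearly interpolate the coordinate to each feature position
feature_latitude = np.interp(
    feature_positions, np.arange(latitude.size), latitude
)
```

The same pattern applies to any other dimension coordinate, for example altitude along the vertical axis.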

tobac.utils.general.combine_feature_dataframes(feature_df_list, renumber_features=True, old_feature_column_name=None, sort_features_by=None)

Function to combine a list of tobac feature detection dataframes into one combined dataframe that can be used for tracking or segmentation.

Parameters:
  • feature_df_list (array-like of Pandas DataFrames) – A list of dataframes (generated, for example, by running feature detection on multiple nodes).

  • renumber_features (bool, optional (default: True)) – If True, features are renumbered with contiguous integers. If False, the old feature numbers are retained, but an exception is raised if any feature numbers are non-unique. If you have non-unique feature numbers and want to preserve them, use old_feature_column_name to save the old feature numbers under a different column name.

  • old_feature_column_name (str or None, optional (default: None)) – The column name in which to preserve old feature numbers. If None, the old numbers are deleted. Users may want to enable this feature if they have run segmentation with the separate dataframes and therefore need the old feature numbers.

  • sort_features_by (list, str or None, optional (default: None)) – The sorting order to pass to DataFrame.sort_values for the merged dataframe. If None, defaults to [“frame”, “idx”] if renumber_features is True, or “feature” if renumber_features is False.

Returns:

One combined DataFrame.

Return type:

pd.DataFrame
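
The renumbering behaviour described above can be sketched with plain pandas. This is an illustration of the idea only, not the function's actual code: the column name old_feature is hypothetical, and starting the new numbers at 1 is an assumption.

```python
import pandas as pd

# Two feature dataframes, e.g. from separate detection runs,
# whose feature numbers overlap
df_a = pd.DataFrame({"frame": [0, 1], "idx": [1, 1], "feature": [1, 2]})
df_b = pd.DataFrame({"frame": [0, 2], "idx": [2, 1], "feature": [1, 2]})

# Merge and sort as described for renumber_features=True
combined = pd.concat([df_a, df_b], ignore_index=True)
combined = combined.sort_values(["frame", "idx"]).reset_index(drop=True)

# Keep the old numbers under a separate (hypothetical) column name,
# then assign contiguous integers (assumed here to start at 1)
combined["old_feature"] = combined["feature"]
combined["feature"] = range(1, len(combined) + 1)
```

After this step every row has a unique feature number, while old_feature still allows a lookup against segmentation output produced from the separate dataframes.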

tobac.utils.general.get_bounding_box(x, buffer=1)

Finds the bounding box of an ndarray

This is the smallest bounding rectangle for nonzero values as explained here: https://stackoverflow.com/questions/31400769/bounding-box-of-numpy-array

Parameters:
  • x (numpy.ndarray) – Array for which the bounding box is to be determined.

  • buffer (int, optional) – Number to set a buffer between the nonzero values and the edges of the box. Default is 1.

Returns:

bbox – Dimensionwise list of the indices representing the edges of the bounding box.

Return type:

list
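
A minimal sketch of the bounding-box idea follows. It is written independently of tobac, under the name bounding_box_sketch (hypothetical), and its index convention (exclusive upper edge, full extent returned for an all-zero array) is an assumption that may differ from what get_bounding_box returns.

```python
import numpy as np

def bounding_box_sketch(x, buffer=1):
    """Return [[start, stop], ...] per dimension; stop is exclusive (assumed)."""
    x = np.asarray(x)
    bbox = []
    for axis in range(x.ndim):
        other_axes = tuple(a for a in range(x.ndim) if a != axis)
        # True where this index along `axis` holds at least one nonzero value
        occupied = np.any(x != 0, axis=other_axes)
        idx = np.flatnonzero(occupied)
        if idx.size == 0:
            # No nonzero values at all: fall back to the full extent
            bbox.append([0, x.shape[axis]])
            continue
        # Widen by the buffer, clipped to the array edges
        start = max(0, int(idx[0]) - buffer)
        stop = min(x.shape[axis], int(idx[-1]) + 1 + buffer)
        bbox.append([start, stop])
    return bbox

field = np.zeros((6, 8))
field[2:4, 3:6] = 1.0
box = bounding_box_sketch(field, buffer=1)
```

With this convention, field[box[0][0]:box[0][1], box[1][0]:box[1][1]] selects the nonzero region plus a one-cell margin.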

tobac.utils.general.get_spacings(field_in, grid_spacing=None, time_spacing=None, average_method='arithmetic')

Determine spatial and temporal grid spacing of the input data.

Parameters:
  • field_in (iris.cube.Cube) – Input field from which to determine the spacings.

  • grid_spacing (float, optional) – Manually sets the grid spacing if specified. Default is None.

  • time_spacing (float, optional) – Manually sets the time spacing if specified. Default is None.

  • average_method (string, optional) –

    Defines how spacings in x- and y-direction are combined.

    • ’arithmetic’ : standard arithmetic mean like (dx+dy)/2

    • ’geometric’ : geometric mean; conserves gridbox area

    Default is ‘arithmetic’.

Returns:

  • dxy (float) – Grid spacing in metres.

  • dt (float) – Time resolution in seconds.

Raises:

ValueError – If field_in does not contain projection_x_coord and projection_y_coord and the keyword argument grid_spacing is not given.
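
The two average_method options can be compared with a short sketch. The helper name combine_spacings is hypothetical and the code only illustrates the two formulas; it does not read spacings from a cube.

```python
import math

def combine_spacings(dx, dy, average_method="arithmetic"):
    """Combine x- and y-spacing (in metres) into a single grid spacing."""
    if average_method == "arithmetic":
        return 0.5 * (dx + dy)
    if average_method == "geometric":
        # sqrt(dx * dy): a square with this side has the gridbox area dx * dy
        return math.sqrt(dx * dy)
    raise ValueError(f"Unknown average_method: {average_method!r}")

dxy_arithmetic = combine_spacings(1000.0, 4000.0, "arithmetic")
dxy_geometric = combine_spacings(1000.0, 4000.0, "geometric")
```

For a strongly anisotropic grid such as this one (1 km by 4 km), the two methods differ noticeably (2500 m versus 2000 m); for a square grid they agree.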

tobac.utils.general.spectral_filtering(dxy, field_in, lambda_min, lambda_max, return_transfer_function=False)

This function creates and applies a 2D transfer function that can be used as a bandpass filter to remove certain wavelengths of an atmospheric input field (e.g. vorticity, IVT, etc).

Parameters:
  • dxy (float) – Grid spacing in m.

  • field_in (numpy.array) – 2D field with input data.

  • lambda_min (float) – Minimum wavelength in m.

  • lambda_max (float) – Maximum wavelength in m.

  • return_transfer_function (boolean, optional) – If set to True, the 2D transfer function and the corresponding wavelengths are also returned. Default is False.

Returns:

  • filtered_field (numpy.array) – Spectrally filtered 2D field of data (with the same shape as the input data).

  • transfer_function (tuple) – Two 2D fields, where the first one corresponds to the wavelengths in the spectral space of the domain and the second one to the 2D transfer function of the bandpass filter. Only returned if return_transfer_function is True.
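
To make the bandpass idea concrete, here is a minimal sketch using NumPy's FFT with an ideal (sharp-edged) transfer function. This is a swapped-in technique chosen for brevity and is not necessarily how spectral_filtering builds its transfer function; the name bandpass_sketch and the test field are assumptions made for the example.

```python
import numpy as np

def bandpass_sketch(dxy, field_in, lambda_min, lambda_max):
    """Keep only wavelengths between lambda_min and lambda_max (in m)."""
    ny, nx = field_in.shape
    # Wavenumbers in cycles per metre along each axis
    ky = np.fft.fftfreq(ny, d=dxy)
    kx = np.fft.fftfreq(nx, d=dxy)
    k = np.sqrt(ky[:, None] ** 2 + kx[None, :] ** 2)
    # Wavelength of each spectral component; the mean (k = 0) maps to infinity
    with np.errstate(divide="ignore"):
        wavelength = 1.0 / k
    # Ideal transfer function: 1 inside the band, 0 outside
    transfer = ((wavelength >= lambda_min) & (wavelength <= lambda_max)).astype(float)
    filtered = np.fft.ifft2(np.fft.fft2(field_in) * transfer).real
    return filtered, (wavelength, transfer)

# Test field on a 1 km grid: an 8 km wave plus a 32 km wave, varying in x only
dxy = 1000.0
x = np.arange(64) * dxy
short_wave = np.sin(2 * np.pi * x / 8000.0)
long_wave = np.sin(2 * np.pi * x / 32000.0)
field = np.tile(short_wave + long_wave, (32, 1))

filtered, (wavelength, transfer) = bandpass_sketch(dxy, field, 16000.0, 64000.0)
```

In this example the 16 km to 64 km band retains the 32 km wave and removes the 8 km wave. A sharp-edged mask like this one can cause ringing on real data, which is one reason a smoother transfer function may be preferable in practice.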

tobac.utils.general.standardize_track_dataset(TrackedFeatures, Mask, Projection=None)

Combine the feature mask returned by tobac.segmentation with the TrackedFeatures dataset returned by tobac.linking_trackpy into a common dataset.

CAUTION: this function is experimental. No output data structures are guaranteed to be supported in future versions of tobac.

The function also renames the variables to be more descriptive and comply with cf-tree, converts the default cell parent ID to an integer table, and adds a cell dimension. Projection is an xarray DataArray.

TODO: Add metadata attributes

Parameters:
  • TrackedFeatures (xarray.core.dataset.Dataset) – xarray dataset of tobac Track information, the xarray dataset returned by tobac.tracking.linking_trackpy

  • Mask (xarray.core.dataset.Dataset) – xarray dataset of tobac segmentation mask information, the xarray dataset returned by tobac.segmentation.segmentation

  • Projection (xarray.core.dataarray.DataArray, default = None) – xarray.DataArray of the original input dataset (gridded nexrad data for example). If using gridded nexrad data, this can be input as data['ProjectionCoordinateSystem']. An example of the type of information in the DataArray includes the following attributes:

    • latitude_of_projection_origin: 29.471900939941406

    • longitude_of_projection_origin: -95.0787353515625

    • _CoordinateTransformType: Projection

    • _CoordinateAxes: x y z time

    • _CoordinateAxesTypes: GeoX GeoY Height Time

    • grid_mapping_name: azimuthal_equidistant

    • semi_major_axis: 6370997.0

    • inverse_flattening: 298.25

    • longitude_of_prime_meridian: 0.0

    • false_easting: 0.0

    • false_northing: 0.0

Returns:

ds – xarray dataset of merged Track and Segmentation Mask datasets with renamed variables.

Return type:

xarray.core.dataset.Dataset

tobac.utils.general.transform_feature_points(features, new_dataset, latitude_name=None, longitude_name=None, altitude_name=None, max_time_away=None, max_space_away=None, max_vspace_away=None, warn_dropped_features=True)

Function to transform input feature dataset horizontal grid points to a different grid. The typical use case for this function is to transform detected features to perform segmentation on a different grid.

The existing feature dataset must have some latitude/longitude coordinates associated with each feature, and the new_dataset must have latitude/longitude available with the same name. Note that due to xarray/iris incompatibilities, we suggest that the input coordinates match the standard_name from Iris.

Parameters:
  • features (pd.DataFrame) – Input feature dataframe

  • new_dataset (iris.cube.Cube or xarray) – The dataset to transform the features to.

  • latitude_name (str) – The name of the latitude coordinate. If None, tries to auto-detect.

  • longitude_name (str) – The name of the longitude coordinate. If None, tries to auto-detect.

  • altitude_name (str) – The name of the altitude coordinate. If None, tries to auto-detect.

  • max_time_away (datetime.timedelta) – The maximum time difference allowed when associating feature points with times in new_dataset.

  • max_space_away (float) – The maximum horizontal distance (in meters) over which features may be transformed.

  • max_vspace_away (float) – The maximum vertical distance (in meters) over which features may be transformed.

  • warn_dropped_features (bool) – Whether or not to print a warning message if one of the max_* options is going to result in features that are dropped.

Returns:

transformed_features – A new feature dataframe, with the coordinates transformed to the new grid, suitable for use in segmentation

Return type:

pd.DataFrame
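
The core idea of mapping a feature onto a different grid, with a distance cut-off in the spirit of max_space_away, can be sketched as a nearest-grid-point search. This is an illustration under stated assumptions, not tobac's implementation: the helper names (haversine_m, nearest_grid_point), the Earth radius, and the example grid are all made up for the sketch.

```python
import numpy as np

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * np.arcsin(np.sqrt(a))

def nearest_grid_point(feat_lat, feat_lon, grid_lat, grid_lon, max_space_away=None):
    """Return the (row, col) of the closest grid point, or None if too far."""
    dist = haversine_m(feat_lat, feat_lon, grid_lat, grid_lon)
    idx = np.unravel_index(np.argmin(dist), dist.shape)
    if max_space_away is not None and dist[idx] > max_space_away:
        return None  # this feature would be dropped
    return tuple(int(i) for i in idx)

# Hypothetical target grid: 3 x 3 points at 0.5 degree spacing
grid_lat, grid_lon = np.meshgrid(
    np.linspace(30.0, 31.0, 3), np.linspace(-96.0, -95.0, 3), indexing="ij"
)

near = nearest_grid_point(30.52, -95.48, grid_lat, grid_lon, max_space_away=10000.0)
far = nearest_grid_point(35.0, -95.5, grid_lat, grid_lon, max_space_away=10000.0)
```

Here the first feature lies about 3 km from the central grid point and is kept, while the second lies several hundred kilometres outside the grid and is rejected, which is the situation warn_dropped_features reports.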

Functions

add_coordinates(features, variable_cube[, ...])

Add coordinates from the input cube of the feature detection to the trajectories/features.

add_coordinates_3D(t, variable_cube[, ...])

Function adding coordinates from the tracking cube to the trajectories

combine_feature_dataframes(feature_df_list)

Function to combine a list of tobac feature detection dataframes into one combined dataframe that can be used for tracking or segmentation.

combine_tobac_feats(list_of_feats[, ...])

WARNING: This function has been deprecated and will be removed in a future release, please use 'combine_feature_dataframes' instead

get_bounding_box(x[, buffer])

Finds the bounding box of an ndarray

get_spacings(field_in[, grid_spacing, ...])

Determine spatial and temporal grid spacing of the input data.

spectral_filtering(dxy, field_in, ...[, ...])

This function creates and applies a 2D transfer function that can be used as a bandpass filter to remove certain wavelengths of an atmospheric input field (e.g. vorticity, IVT, etc).

standardize_track_dataset(TrackedFeatures, Mask)

Combine the feature mask returned by tobac.segmentation with the feature data table into a common dataset.

transform_feature_points(features, new_dataset)

Function to transform input feature dataset horizontal grid points to a different grid.