tobac.segmentation.segmentation
- tobac.segmentation.segmentation(features, field, dxy, threshold=0.003, target='maximum', level=None, method='watershed', max_distance=None, vertical_coord=None, PBC_flag='none', seed_3D_flag='column', seed_3D_size=5, segment_number_below_threshold=0, segment_number_unassigned=0, statistic=None, time_padding=datetime.timedelta(microseconds=500000))
Use watershedding to determine the region above a threshold value around each initial seeding position, for all time steps of the input data. Works in both 2D (based on a single seeding point) and 3D, and returns a mask that is zero everywhere outside the identified regions and holds the feature id inside them.
Calls segmentation_timestep at each individual timestep of the input data.
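The mask convention described above (0 outside a region, the feature id inside) can be illustrated with a toy example. This is only a sketch: it uses a simple breadth-first flood fill over 4-connected grid points rather than the watershed algorithm, and the helper name toy_segment, the strict "greater than threshold" comparison, and the input layout are all assumptions made for illustration, not part of tobac.

```python
from collections import deque


def toy_segment(field, seeds, threshold):
    """Grow each seed over 4-connected points whose value exceeds threshold.

    field: list of rows of floats (a single 2D timestep).
    seeds: dict mapping feature id -> (row, col) seeding position.
    Returns a mask with 0 outside all regions and the feature id inside.
    NOTE: hypothetical helper; a flood fill, not tobac's watershedding.
    """
    n_rows, n_cols = len(field), len(field[0])
    mask = [[0] * n_cols for _ in range(n_rows)]
    queue = deque()

    # Place the markers; a seed below the threshold grows no region.
    for feature_id, (r, c) in seeds.items():
        if field[r][c] > threshold:
            mask[r][c] = feature_id
            queue.append((r, c))

    # Spread each marker to unlabelled neighbours above the threshold.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n_rows and 0 <= cc < n_cols:
                if mask[rr][cc] == 0 and field[rr][cc] > threshold:
                    mask[rr][cc] = mask[r][c]
                    queue.append((rr, cc))
    return mask


field = [
    [0.0, 0.5, 0.0, 0.0, 0.0],
    [0.5, 0.9, 0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.7, 0.8],
]
mask = toy_segment(field, {1: (1, 1), 2: (2, 4)}, threshold=0.3)
for row in mask:
    print(row)
```

In the real function the same idea is applied per timestep, with the seeds taken from the features dataframe and the regions grown by watershedding on the field.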
- Parameters:
features (pandas.DataFrame) – Output from trackpy/maketrack.
field (iris.cube.Cube or xarray.DataArray) – Containing the field to perform the watershedding on.
dxy (float) – Grid spacing of the input data in meters.
threshold (float, optional) – Threshold for the watershedding field to be used for the mask. Default is 3e-3.
target ({'maximum', 'minimum'}, optional) – Flag to determine if tracking is targeting minima or maxima in the data. Default is ‘maximum’.
level (slice of iris.cube.Cube, optional) – Levels at which to seed the cells for the watershedding algorithm. Default is None.
method ({'watershed'}, optional) – Flag determining the algorithm to use. Currently only watershedding is implemented; a ‘random_walk’ option is present in the source but commented out. Default is ‘watershed’.
max_distance (float, optional) – Maximum distance from a marker allowed to be classified as belonging to that cell in meters. Default is None.
vertical_coord ({'auto', 'z', 'model_level_number', 'altitude', 'geopotential_height'}, optional) – Name of the vertical coordinate for use in the 3D segmentation case. Default is None.
PBC_flag ({'none', 'hdim_1', 'hdim_2', 'both'}) – Sets whether to use periodic boundaries, and if so in which directions. ‘none’: no periodic boundaries; ‘hdim_1’: periodic along hdim_1; ‘hdim_2’: periodic along hdim_2; ‘both’: periodic along both horizontal dimensions. Default is ‘none’.
seed_3D_flag ({'column', 'box'}) – Seed the 3D field at feature positions with either the full column (default) or a box of user-set size.
seed_3D_size (int or tuple (dimensions equal to dimensions of field)) – This sets the size of the seed box when seed_3D_flag is ‘box’. If it’s an integer (units of number of pixels), the seed box is identical in all dimensions. If it’s a tuple, it specifies the seed area for each dimension separately, in units of pixels. Note: we strongly recommend the use of odd numbers for this. If you give an even number, your seed box will be biased and not centered around the feature. Note: if two seed boxes overlap, the feature that is seeded will be the closer feature.
segment_number_below_threshold (int) – The marker used to indicate that a segmentation point is below the threshold. Default is 0.
segment_number_unassigned (int) – The marker used to indicate that a segmentation point is above the threshold but unsegmented. Default is 0.
statistic (dict, optional) – Default is None. Optional parameter to calculate bulk statistics within feature detection. Dictionary with callable function(s) to apply over the region of each detected feature and the name of the statistics to appear in the feature output dataframe. The functions should be the values and the names of the metrics the keys (e.g. {'mean': np.mean}).
time_padding (timedelta, optional) – If set, allows a segmentation timestep to be associated with a feature timestep that differs from it by up to time_padding. Extremely useful when converting between micro- and nanoseconds, as is common when using Pandas dataframes. Default is 0.5 seconds.
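To make the time_padding tolerance concrete, the following sketch compares a feature timestamp with a field timestamp that differs by a few microseconds, as can happen after a precision conversion. The helper name times_match and the inclusive comparison are assumptions for illustration; tobac's internal matching may differ in detail.

```python
import datetime

# Same value as the default in the signature above (0.5 seconds).
DEFAULT_PADDING = datetime.timedelta(microseconds=500000)


def times_match(feature_time, field_time, time_padding=DEFAULT_PADDING):
    """Return True if the two timestamps differ by at most time_padding.

    NOTE: hypothetical helper, not a tobac function.
    """
    return abs(feature_time - field_time) <= time_padding


field_time = datetime.datetime(2021, 6, 1, 12, 0, 0)

# A feature time that picked up a 3-microsecond offset still matches.
drifted = datetime.datetime(2021, 6, 1, 12, 0, 0, 3)
print(times_match(drifted, field_time))

# The next timestep, five minutes later, does not match.
next_step = datetime.datetime(2021, 6, 1, 12, 5, 0)
print(times_match(next_step, field_time))
```

A padding well below the output interval of the data keeps neighbouring timesteps from being confused with one another while still absorbing rounding differences.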
- Returns:
segmentation_out (iris.cube.Cube) – Mask, 0 outside and integer numbers according to track inside the area/volume of the feature.
features_out (pandas.DataFrame) – Feature dataframe including the number of grid cells (2D or 3D) in the segmented area/volume of the feature at the timestep.
- Raises:
ValueError – If field_in.ndim is neither 3 nor 4 and ‘time’ is not included in coords.
- Return type:
tuple[xarray.DataArray, pandas.DataFrame]