Idealized Case 2: Two crossing blobs

This tutorial explores the different methods of the linking process using an example of two crossing blobs. The following chapters will be covered:

  1. Data generation

  2. Feature detection

  3. Influence of tracking method

  4. Analysis

1. Data generation

We start by importing the usual libraries and adjusting some settings:

[1]:
import tobac
import numpy as np
import matplotlib.pyplot as plt
import datetime
import xarray as xr
import seaborn as sns

sns.set_context("talk")
%matplotlib inline

We will need to generate our own dataset for this tutorial. For this reason we define some bounds for our system:

[2]:
(
    x_min,
    y_min,
    x_max,
    y_max,
) = (
    0,
    0,
    1e5,
    1e5,
)
t_min, t_max = 0, 10000

We use these to create a mesh:

[3]:
def create_mesh(x_min, y_min, x_max, y_max, t_min, t_max, N_x=200, N_y=200, dt=520):
    x = np.linspace(x_min, x_max, N_x)
    y = np.linspace(y_min, y_max, N_y)
    t = np.arange(t_min, t_max, dt)
    mesh = np.meshgrid(t, y, x, indexing="ij")

    return mesh


mesh = create_mesh(x_min, y_min, x_max, y_max, t_min, t_max)

Additionally, we need to set velocities for our blobs:

[4]:
v_x = 10
v_y = 10

The dataset is created by using two functions. The first creates a wandering Gaussian blob as numpy-Array on our grid and the second transforms it into an xarray-DataArray with an arbitrary datetime.

[5]:
def create_wandering_blob(mesh, x_0, y_0, v_x, v_y, t_create, t_vanish, sigma=1e7):
    tt, yy, xx = mesh
    exponent = (xx - x_0 - v_x * (tt - t_create)) ** 2 + (
        yy - y_0 - v_y * (tt - t_create)
    ) ** 2
    blob = np.exp(-exponent / sigma)
    blob = np.where(np.logical_and(tt >= t_create, tt <= t_vanish), blob, 0)

    return blob


def create_xarray(array, mesh, starting_time="2022-04-01T00:00"):
    tt, yy, xx = mesh
    t = np.unique(tt)
    y = np.unique(yy)
    x = np.unique(xx)

    N_t = len(t)
    dt = np.diff(t)[0]

    t_0 = np.datetime64(starting_time)
    t_delta = np.timedelta64(dt, "s")

    time = np.array([t_0 + i * t_delta for i in range(len(array))])

    dims = ("time", "projection_x_coordinate", "projection_y_coordinate")
    coords = {"time": time, "projection_x_coordinate": x, "projection_y_coordinate": y}
    attributes = {"units": ("m s-1")}

    data = xr.DataArray(data=array, dims=dims, coords=coords, attrs=attributes)
    # data = data.projection_x_coordinate.assign_attrs({"units": ("m")})
    # data = data.projection_y_coordinate.assign_attrs({"units": ("m")})
    data = data.assign_coords(latitude=("projection_x_coordinate", x / x.max() * 90))
    data = data.assign_coords(longitude=("projection_y_coordinate", y / y.max() * 90))

    return data

We use the first function to create two blobs whose paths will cross. To keep them detectable as seperate features in the dataset we don’t want to just add them together. Instead we are going to use the highest value of each pixel by applying boolean masking and the resulting field is transformed into the xarray format.

[6]:
blob_1 = create_wandering_blob(mesh, x_min, y_min, v_x, v_y, t_min, t_max)
blob_2 = create_wandering_blob(mesh, x_max, y_min, -v_x, v_y, t_min, t_max)
blob_mask = blob_1 > blob_2
blob = np.where(blob_mask, blob_1, blob_2)

data = create_xarray(blob, mesh)
/var/folders/40/kfr98p0j7n30fjp2n4ljjqbh0000gr/T/ipykernel_50349/1703872627.py:30: UserWarning: Converting non-nanosecond precision datetime values to nanosecond precision. This behavior can eventually be relaxed in xarray, as it is an artifact from pandas which is now beginning to support non-nanosecond precision values. This warning is caused by passing non-nanosecond np.datetime64 or np.timedelta64 values to the DataArray or Variable constructor; it can be silenced by converting the values to nanosecond precision ahead of time.
  data = xr.DataArray(data=array, dims=dims, coords=coords, attrs=attributes)

Let’s check if we achived what we wanted by plotting the result:

[7]:
data.plot(
    cmap="viridis",
    col="time",
    col_wrap=5,
    x="projection_x_coordinate",
    y="projection_y_coordinate",
    size=5,
)
[7]:
<xarray.plot.facetgrid.FacetGrid at 0x13a7ad010>
../../_images/examples4doc_Basics_Idealized-Case-2_Two_crossing_Blobs_15_1.png

Looks good! We see two features crossing each other, and they are clearly separable in every frame.

2. Feature detection

Before we can perform the tracking we need to detect the features with the usual function. The grid spacing is deduced from the generated field. We still need to find a reasonable threshold value. Let’s try a really high one:

[8]:
%%capture

spacing = np.diff(np.unique(mesh[1]))[0]

dxy, dt = tobac.get_spacings(data, grid_spacing=spacing)
features = tobac.feature_detection_multithreshold(data, dxy, threshold=0.99)
[9]:
plt.figure(figsize=(10, 6))
features.plot.scatter(x="hdim_1", y="hdim_2")
[9]:
<Axes: xlabel='hdim_1', ylabel='hdim_2'>
<Figure size 1000x600 with 0 Axes>
../../_images/examples4doc_Basics_Idealized-Case-2_Two_crossing_Blobs_19_2.png

As you can see almost no features are detected. This means our threshold is too high and neglects many datapoints. Therefore it is a good idea to try a low threshold value:

[10]:
%%capture
features = tobac.feature_detection_multithreshold(data, dxy, threshold=0.3)
[11]:
plt.figure(figsize=(10, 6))
features.plot.scatter(x="hdim_1", y="hdim_2")
[11]:
<Axes: xlabel='hdim_1', ylabel='hdim_2'>
<Figure size 1000x600 with 0 Axes>
../../_images/examples4doc_Basics_Idealized-Case-2_Two_crossing_Blobs_22_2.png

Here the paths of the blobs are clearly visible, but there is an area in the middle where both merge into one feature. This should be avoided. Therefore we try another value for the threshold somewhere in the middle of the available range:

[12]:
%%capture
features = tobac.feature_detection_multithreshold(data, dxy, threshold=0.8)
[13]:
plt.figure(figsize=(10, 6))
features.plot.scatter(x="hdim_1", y="hdim_2")
[13]:
<Axes: xlabel='hdim_1', ylabel='hdim_2'>
<Figure size 1000x600 with 0 Axes>
../../_images/examples4doc_Basics_Idealized-Case-2_Two_crossing_Blobs_25_2.png

This is the picture we wanted to see. This means we can continue working with this set of features.

3. Influence of the tracking method

Now the tracking can be performed. We will create two outputs, one with method = 'random', and the other one with method = 'predict'. Since we know what the velocities of our features are beforehand, we can select a reasonable value for v_max. Normally this would need to be finetuned.

[14]:
%matplotlib inline
v_max = 20

track_1 = tobac.linking_trackpy(
    features, data, dt, dxy, v_max=v_max, method_linking="random"
)

track_2 = tobac.linking_trackpy(
    features, data, dt, dxy, v_max=v_max, method_linking="predict"
)
Frame 19: 2 trajectories present.
[15]:
track_1
[15]:
frame idx hdim_1 hdim_2 num threshold_value feature time timestr projection_x_coordinate projection_y_coordinate latitude longitude cell time_cell
0 0 1 1.000000 1.000000 9 0.8 1 2022-04-01 00:00:00 2022-04-01 00:00:00 502.512563 502.512563 0.452261 0.452261 1 0 days 00:00:00
1 0 2 1.000000 198.000000 9 0.8 2 2022-04-01 00:00:00 2022-04-01 00:00:00 502.512563 99497.487437 0.452261 89.547739 2 0 days 00:00:00
2 1 1 10.321429 10.321429 28 0.8 3 2022-04-01 00:08:40 2022-04-01 00:08:40 5186.647523 5186.647523 4.667983 4.667983 1 0 days 00:08:40
3 1 2 10.321429 188.678571 28 0.8 4 2022-04-01 00:08:40 2022-04-01 00:08:40 5186.647523 94813.352477 4.667983 85.332017 2 0 days 00:08:40
4 2 1 20.678571 20.678571 28 0.8 5 2022-04-01 00:17:20 2022-04-01 00:17:20 10391.241924 10391.241924 9.352118 9.352118 1 0 days 00:17:20
5 2 2 20.678571 178.321429 28 0.8 6 2022-04-01 00:17:20 2022-04-01 00:17:20 10391.241924 89608.758076 9.352118 80.647882 2 0 days 00:17:20
6 3 1 31.000000 31.000000 25 0.8 7 2022-04-01 00:26:00 2022-04-01 00:26:00 15577.889447 15577.889447 14.020101 14.020101 1 0 days 00:26:00
7 3 2 31.000000 168.000000 25 0.8 8 2022-04-01 00:26:00 2022-04-01 00:26:00 15577.889447 84422.110553 14.020101 75.979899 2 0 days 00:26:00
9 4 2 41.321429 157.678571 28 0.8 10 2022-04-01 00:34:40 2022-04-01 00:34:40 20764.536971 79235.463029 18.688083 71.311917 2 0 days 00:34:40
8 4 1 41.321429 41.321429 28 0.8 9 2022-04-01 00:34:40 2022-04-01 00:34:40 20764.536971 20764.536971 18.688083 18.688083 1 0 days 00:34:40
10 5 1 51.678571 51.678571 28 0.8 11 2022-04-01 00:43:20 2022-04-01 00:43:20 25969.131371 25969.131371 23.372218 23.372218 1 0 days 00:43:20
11 5 2 51.678571 147.321429 28 0.8 12 2022-04-01 00:43:20 2022-04-01 00:43:20 25969.131371 74030.868629 23.372218 66.627782 2 0 days 00:43:20
12 6 1 62.192308 62.192308 26 0.8 13 2022-04-01 00:52:00 2022-04-01 00:52:00 31252.415926 31252.415926 28.127174 28.127174 1 0 days 00:52:00
13 6 2 62.192308 136.807692 26 0.8 14 2022-04-01 00:52:00 2022-04-01 00:52:00 31252.415926 68747.584074 28.127174 61.872826 2 0 days 00:52:00
14 7 1 72.321429 72.321429 28 0.8 15 2022-04-01 01:00:40 2022-04-01 01:00:40 36342.426418 36342.426418 32.708184 32.708184 1 0 days 01:00:40
15 7 2 72.321429 126.678571 28 0.8 16 2022-04-01 01:00:40 2022-04-01 01:00:40 36342.426418 63657.573582 32.708184 57.291816 2 0 days 01:00:40
16 8 1 82.678571 82.678571 28 0.8 17 2022-04-01 01:09:20 2022-04-01 01:09:20 41547.020818 41547.020818 37.392319 37.392319 1 0 days 01:09:20
17 8 2 82.678571 116.321429 28 0.8 18 2022-04-01 01:09:20 2022-04-01 01:09:20 41547.020818 58452.979182 37.392319 52.607681 2 0 days 01:09:20
19 9 2 93.192308 105.807692 26 0.8 20 2022-04-01 01:18:00 2022-04-01 01:18:00 46830.305373 53169.694627 42.147275 47.852725 2 0 days 01:18:00
18 9 1 93.192308 93.192308 26 0.8 19 2022-04-01 01:18:00 2022-04-01 01:18:00 46830.305373 46830.305373 42.147275 42.147275 1 0 days 01:18:00
20 10 1 103.321429 95.678571 28 0.8 21 2022-04-01 01:26:40 2022-04-01 01:26:40 51920.315865 48079.684135 46.728284 43.271716 1 0 days 01:26:40
21 10 2 103.321429 103.321429 28 0.8 22 2022-04-01 01:26:40 2022-04-01 01:26:40 51920.315865 51920.315865 46.728284 46.728284 2 0 days 01:26:40
22 11 1 113.807692 85.192308 26 0.8 23 2022-04-01 01:35:20 2022-04-01 01:35:20 57189.795129 42810.204871 51.470816 38.529184 1 0 days 01:35:20
23 11 2 113.807692 113.807692 26 0.8 24 2022-04-01 01:35:20 2022-04-01 01:35:20 57189.795129 57189.795129 51.470816 51.470816 2 0 days 01:35:20
24 12 1 124.192308 74.807692 26 0.8 25 2022-04-01 01:44:00 2022-04-01 01:44:00 62408.194820 37591.805180 56.167375 33.832625 1 0 days 01:44:00
25 12 2 124.192308 124.192308 26 0.8 26 2022-04-01 01:44:00 2022-04-01 01:44:00 62408.194820 62408.194820 56.167375 56.167375 2 0 days 01:44:00
26 13 1 134.678571 64.321429 28 0.8 27 2022-04-01 01:52:40 2022-04-01 01:52:40 67677.674085 32322.325915 60.909907 29.090093 1 0 days 01:52:40
27 13 2 134.678571 134.678571 28 0.8 28 2022-04-01 01:52:40 2022-04-01 01:52:40 67677.674085 67677.674085 60.909907 60.909907 2 0 days 01:52:40
29 14 2 144.807692 144.807692 26 0.8 30 2022-04-01 02:01:20 2022-04-01 02:01:20 72767.684577 72767.684577 65.490916 65.490916 2 0 days 02:01:20
28 14 1 144.807692 54.192308 26 0.8 29 2022-04-01 02:01:20 2022-04-01 02:01:20 72767.684577 27232.315423 65.490916 24.509084 1 0 days 02:01:20
30 15 1 155.321429 43.678571 28 0.8 31 2022-04-01 02:10:00 2022-04-01 02:10:00 78050.969131 21949.030869 70.245872 19.754128 1 0 days 02:10:00
31 15 2 155.321429 155.321429 28 0.8 32 2022-04-01 02:10:00 2022-04-01 02:10:00 78050.969131 78050.969131 70.245872 70.245872 2 0 days 02:10:00
32 16 1 165.678571 33.321429 28 0.8 33 2022-04-01 02:18:40 2022-04-01 02:18:40 83255.563532 16744.436468 74.930007 15.069993 1 0 days 02:18:40
33 16 2 165.678571 165.678571 28 0.8 34 2022-04-01 02:18:40 2022-04-01 02:18:40 83255.563532 83255.563532 74.930007 74.930007 2 0 days 02:18:40
34 17 1 175.916667 23.083333 24 0.8 35 2022-04-01 02:27:20 2022-04-01 02:27:20 88400.335008 11599.664992 79.560302 10.439698 1 0 days 02:27:20
35 17 2 175.916667 175.916667 24 0.8 36 2022-04-01 02:27:20 2022-04-01 02:27:20 88400.335008 88400.335008 79.560302 79.560302 2 0 days 02:27:20
36 18 1 186.321429 12.678571 28 0.8 37 2022-04-01 02:36:00 2022-04-01 02:36:00 93628.858579 6371.141421 84.265973 5.734027 1 0 days 02:36:00
37 18 2 186.321429 186.321429 28 0.8 38 2022-04-01 02:36:00 2022-04-01 02:36:00 93628.858579 93628.858579 84.265973 84.265973 2 0 days 02:36:00
38 19 1 196.678571 2.321429 28 0.8 39 2022-04-01 02:44:40 2022-04-01 02:44:40 98833.452979 1166.547021 88.950108 1.049892 1 0 days 02:44:40
39 19 2 196.678571 196.678571 28 0.8 40 2022-04-01 02:44:40 2022-04-01 02:44:40 98833.452979 98833.452979 88.950108 88.950108 2 0 days 02:44:40

Let’s have a look at the resulting tracks:

[16]:
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(20, 10))

cell_color = [None, "C0", "C1", "C2", "C3"]
for i, cell_track in track_1.groupby("cell"):
    cell_track.plot(
        x="projection_x_coordinate",
        y="projection_y_coordinate",
        ax=ax1,
        marker="o",
        color=cell_color[i],
        label="cell {0}".format(int(i)),
    )
ax1.legend()
ax1.set_title("random")

for i, cell_track in track_2.groupby("cell"):
    cell_track.plot(
        x="projection_x_coordinate",
        y="projection_y_coordinate",
        ax=ax2,
        marker="o",
        color=cell_color[i],
        label="cell {0}".format(int(i)),
    )
ax2.legend()
ax2.set_title("predict")
[16]:
Text(0.5, 1.0, 'predict')
../../_images/examples4doc_Basics_Idealized-Case-2_Two_crossing_Blobs_32_1.png

As you can see, there is a clear difference. While in the first link output the feature positions in the top half of the graph are linked into one cell, in the second output the path of the cell follows the actual way of the Gaussian blobs we created. This is possible because method = "predict" uses an extraploation to infer the next position from the previous timeframes.

4. Analysis

We know that the second option is “correct”, because we created the data. But can we also decude this by analyzing our tracks?

Let’s calculate the values for the velocities:

[17]:
track_1 = tobac.analysis.calculate_velocity(track_1)
track_2 = tobac.analysis.calculate_velocity(track_2)

v1 = track_1.where(track_1["cell"] == 1).dropna().v.values
v2 = track_2.where(track_1["cell"] == 1).dropna().v.values

Visualizing these can help us with our investigation:

[18]:
fig, (ax) = plt.subplots(figsize=(14, 6))

ax.set_title("Cell 1")

mask_1 = track_1["cell"] == 1
mask_2 = track_2["cell"] == 1

track_1.where(mask_1).dropna().plot(
    x="time",
    y="v",
    ax=ax,
    label='method = "random"',
    marker="^",
    linestyle="",
    alpha=0.5,
)
track_2.where(mask_2).dropna().plot(
    x="time",
    y="v",
    ax=ax,
    label='method = "predict"',
    marker="v",
    linestyle="",
    alpha=0.5,
)

ticks = ax.get_xticks()

plt.hlines(
    [np.sqrt(v_x**2 + v_y**2)],
    ticks.min(),
    ticks.max(),
    color="black",
    label="expected velocity",
    linestyle="--",
)


ax.set_ylabel("$v$ [m/s]")
plt.legend()
plt.tight_layout()
../../_images/examples4doc_Basics_Idealized-Case-2_Two_crossing_Blobs_37_0.png

The expected velocity is just added for reference. But also without looking at this, we can see that the values for method = "random" have an outlier, that deviates far from the other values. This is a clear sign, that this method is not suited well for this case.