Idealized Case 2: Two crossing blobs
This tutorial explores the different methods of the linking process using an example of two crossing blobs. The following chapters will be covered:
1. Data generation
We start by importing the usual libraries and adjusting some settings:
[1]:
import tobac
import numpy as np
import matplotlib.pyplot as plt
import datetime
import xarray as xr
import seaborn as sns
sns.set_context("talk")
%matplotlib inline
We will need to generate our own dataset for this tutorial. For this reason we define some bounds for our system:
[2]:
(
x_min,
y_min,
x_max,
y_max,
) = (
0,
0,
1e5,
1e5,
)
t_min, t_max = 0, 10000
We use these to create a mesh:
[3]:
def create_mesh(x_min, y_min, x_max, y_max, t_min, t_max, N_x=200, N_y=200, dt=520):
x = np.linspace(x_min, x_max, N_x)
y = np.linspace(y_min, y_max, N_y)
t = np.arange(t_min, t_max, dt)
mesh = np.meshgrid(t, y, x, indexing="ij")
return mesh
mesh = create_mesh(x_min, y_min, x_max, y_max, t_min, t_max)
Additionally, we need to set velocities for our blobs:
[4]:
v_x = 10
v_y = 10
The dataset is created by using two functions. The first creates a wandering Gaussian blob as numpy
-Array on our grid and the second transforms it into an xarray
-DataArray with an arbitrary datetime
.
[5]:
def create_wandering_blob(mesh, x_0, y_0, v_x, v_y, t_create, t_vanish, sigma=1e7):
tt, yy, xx = mesh
exponent = (xx - x_0 - v_x * (tt - t_create)) ** 2 + (
yy - y_0 - v_y * (tt - t_create)
) ** 2
blob = np.exp(-exponent / sigma)
blob = np.where(np.logical_and(tt >= t_create, tt <= t_vanish), blob, 0)
return blob
def create_xarray(array, mesh, starting_time="2022-04-01T00:00"):
tt, yy, xx = mesh
t = np.unique(tt)
y = np.unique(yy)
x = np.unique(xx)
N_t = len(t)
dt = np.diff(t)[0]
t_0 = np.datetime64(starting_time)
t_delta = np.timedelta64(dt, "s")
time = np.array([t_0 + i * t_delta for i in range(len(array))])
dims = ("time", "projection_x_coordinate", "projection_y_coordinate")
coords = {"time": time, "projection_x_coordinate": x, "projection_y_coordinate": y}
attributes = {"units": ("m s-1")}
data = xr.DataArray(data=array, dims=dims, coords=coords, attrs=attributes)
# data = data.projection_x_coordinate.assign_attrs({"units": ("m")})
# data = data.projection_y_coordinate.assign_attrs({"units": ("m")})
data = data.assign_coords(latitude=("projection_x_coordinate", x / x.max() * 90))
data = data.assign_coords(longitude=("projection_y_coordinate", y / y.max() * 90))
return data
We use the first function to create two blobs whose paths will cross. To keep them detectable as seperate features in the dataset we don’t want to just add them together. Instead we are going to use the highest value of each pixel by applying boolean masking and the resulting field is transformed into the xarray
format.
[6]:
blob_1 = create_wandering_blob(mesh, x_min, y_min, v_x, v_y, t_min, t_max)
blob_2 = create_wandering_blob(mesh, x_max, y_min, -v_x, v_y, t_min, t_max)
blob_mask = blob_1 > blob_2
blob = np.where(blob_mask, blob_1, blob_2)
data = create_xarray(blob, mesh)
/var/folders/40/kfr98p0j7n30fjp2n4ljjqbh0000gr/T/ipykernel_50349/1703872627.py:30: UserWarning: Converting non-nanosecond precision datetime values to nanosecond precision. This behavior can eventually be relaxed in xarray, as it is an artifact from pandas which is now beginning to support non-nanosecond precision values. This warning is caused by passing non-nanosecond np.datetime64 or np.timedelta64 values to the DataArray or Variable constructor; it can be silenced by converting the values to nanosecond precision ahead of time.
data = xr.DataArray(data=array, dims=dims, coords=coords, attrs=attributes)
Let’s check if we achived what we wanted by plotting the result:
[7]:
data.plot(
cmap="viridis",
col="time",
col_wrap=5,
x="projection_x_coordinate",
y="projection_y_coordinate",
size=5,
)
[7]:
<xarray.plot.facetgrid.FacetGrid at 0x13a7ad010>
Looks good! We see two features crossing each other, and they are clearly separable in every frame.
2. Feature detection
Before we can perform the tracking we need to detect the features with the usual function. The grid spacing is deduced from the generated field. We still need to find a reasonable threshold value. Let’s try a really high one:
[8]:
%%capture
spacing = np.diff(np.unique(mesh[1]))[0]
dxy, dt = tobac.get_spacings(data, grid_spacing=spacing)
features = tobac.feature_detection_multithreshold(data, dxy, threshold=0.99)
[9]:
plt.figure(figsize=(10, 6))
features.plot.scatter(x="hdim_1", y="hdim_2")
[9]:
<Axes: xlabel='hdim_1', ylabel='hdim_2'>
<Figure size 1000x600 with 0 Axes>
As you can see almost no features are detected. This means our threshold is too high and neglects many datapoints. Therefore it is a good idea to try a low threshold value:
[10]:
%%capture
features = tobac.feature_detection_multithreshold(data, dxy, threshold=0.3)
[11]:
plt.figure(figsize=(10, 6))
features.plot.scatter(x="hdim_1", y="hdim_2")
[11]:
<Axes: xlabel='hdim_1', ylabel='hdim_2'>
<Figure size 1000x600 with 0 Axes>
Here the paths of the blobs are clearly visible, but there is an area in the middle where both merge into one feature. This should be avoided. Therefore we try another value for the threshold somewhere in the middle of the available range:
[12]:
%%capture
features = tobac.feature_detection_multithreshold(data, dxy, threshold=0.8)
[13]:
plt.figure(figsize=(10, 6))
features.plot.scatter(x="hdim_1", y="hdim_2")
[13]:
<Axes: xlabel='hdim_1', ylabel='hdim_2'>
<Figure size 1000x600 with 0 Axes>
This is the picture we wanted to see. This means we can continue working with this set of features.
3. Influence of the tracking method
Now the tracking can be performed. We will create two outputs, one with method = 'random'
, and the other one with method = 'predict'
. Since we know what the velocities of our features are beforehand, we can select a reasonable value for v_max
. Normally this would need to be finetuned.
[14]:
%matplotlib inline
v_max = 20
track_1 = tobac.linking_trackpy(
features, data, dt, dxy, v_max=v_max, method_linking="random"
)
track_2 = tobac.linking_trackpy(
features, data, dt, dxy, v_max=v_max, method_linking="predict"
)
Frame 19: 2 trajectories present.
[15]:
track_1
[15]:
frame | idx | hdim_1 | hdim_2 | num | threshold_value | feature | time | timestr | projection_x_coordinate | projection_y_coordinate | latitude | longitude | cell | time_cell | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 1 | 1.000000 | 1.000000 | 9 | 0.8 | 1 | 2022-04-01 00:00:00 | 2022-04-01 00:00:00 | 502.512563 | 502.512563 | 0.452261 | 0.452261 | 1 | 0 days 00:00:00 |
1 | 0 | 2 | 1.000000 | 198.000000 | 9 | 0.8 | 2 | 2022-04-01 00:00:00 | 2022-04-01 00:00:00 | 502.512563 | 99497.487437 | 0.452261 | 89.547739 | 2 | 0 days 00:00:00 |
2 | 1 | 1 | 10.321429 | 10.321429 | 28 | 0.8 | 3 | 2022-04-01 00:08:40 | 2022-04-01 00:08:40 | 5186.647523 | 5186.647523 | 4.667983 | 4.667983 | 1 | 0 days 00:08:40 |
3 | 1 | 2 | 10.321429 | 188.678571 | 28 | 0.8 | 4 | 2022-04-01 00:08:40 | 2022-04-01 00:08:40 | 5186.647523 | 94813.352477 | 4.667983 | 85.332017 | 2 | 0 days 00:08:40 |
4 | 2 | 1 | 20.678571 | 20.678571 | 28 | 0.8 | 5 | 2022-04-01 00:17:20 | 2022-04-01 00:17:20 | 10391.241924 | 10391.241924 | 9.352118 | 9.352118 | 1 | 0 days 00:17:20 |
5 | 2 | 2 | 20.678571 | 178.321429 | 28 | 0.8 | 6 | 2022-04-01 00:17:20 | 2022-04-01 00:17:20 | 10391.241924 | 89608.758076 | 9.352118 | 80.647882 | 2 | 0 days 00:17:20 |
6 | 3 | 1 | 31.000000 | 31.000000 | 25 | 0.8 | 7 | 2022-04-01 00:26:00 | 2022-04-01 00:26:00 | 15577.889447 | 15577.889447 | 14.020101 | 14.020101 | 1 | 0 days 00:26:00 |
7 | 3 | 2 | 31.000000 | 168.000000 | 25 | 0.8 | 8 | 2022-04-01 00:26:00 | 2022-04-01 00:26:00 | 15577.889447 | 84422.110553 | 14.020101 | 75.979899 | 2 | 0 days 00:26:00 |
9 | 4 | 2 | 41.321429 | 157.678571 | 28 | 0.8 | 10 | 2022-04-01 00:34:40 | 2022-04-01 00:34:40 | 20764.536971 | 79235.463029 | 18.688083 | 71.311917 | 2 | 0 days 00:34:40 |
8 | 4 | 1 | 41.321429 | 41.321429 | 28 | 0.8 | 9 | 2022-04-01 00:34:40 | 2022-04-01 00:34:40 | 20764.536971 | 20764.536971 | 18.688083 | 18.688083 | 1 | 0 days 00:34:40 |
10 | 5 | 1 | 51.678571 | 51.678571 | 28 | 0.8 | 11 | 2022-04-01 00:43:20 | 2022-04-01 00:43:20 | 25969.131371 | 25969.131371 | 23.372218 | 23.372218 | 1 | 0 days 00:43:20 |
11 | 5 | 2 | 51.678571 | 147.321429 | 28 | 0.8 | 12 | 2022-04-01 00:43:20 | 2022-04-01 00:43:20 | 25969.131371 | 74030.868629 | 23.372218 | 66.627782 | 2 | 0 days 00:43:20 |
12 | 6 | 1 | 62.192308 | 62.192308 | 26 | 0.8 | 13 | 2022-04-01 00:52:00 | 2022-04-01 00:52:00 | 31252.415926 | 31252.415926 | 28.127174 | 28.127174 | 1 | 0 days 00:52:00 |
13 | 6 | 2 | 62.192308 | 136.807692 | 26 | 0.8 | 14 | 2022-04-01 00:52:00 | 2022-04-01 00:52:00 | 31252.415926 | 68747.584074 | 28.127174 | 61.872826 | 2 | 0 days 00:52:00 |
14 | 7 | 1 | 72.321429 | 72.321429 | 28 | 0.8 | 15 | 2022-04-01 01:00:40 | 2022-04-01 01:00:40 | 36342.426418 | 36342.426418 | 32.708184 | 32.708184 | 1 | 0 days 01:00:40 |
15 | 7 | 2 | 72.321429 | 126.678571 | 28 | 0.8 | 16 | 2022-04-01 01:00:40 | 2022-04-01 01:00:40 | 36342.426418 | 63657.573582 | 32.708184 | 57.291816 | 2 | 0 days 01:00:40 |
16 | 8 | 1 | 82.678571 | 82.678571 | 28 | 0.8 | 17 | 2022-04-01 01:09:20 | 2022-04-01 01:09:20 | 41547.020818 | 41547.020818 | 37.392319 | 37.392319 | 1 | 0 days 01:09:20 |
17 | 8 | 2 | 82.678571 | 116.321429 | 28 | 0.8 | 18 | 2022-04-01 01:09:20 | 2022-04-01 01:09:20 | 41547.020818 | 58452.979182 | 37.392319 | 52.607681 | 2 | 0 days 01:09:20 |
19 | 9 | 2 | 93.192308 | 105.807692 | 26 | 0.8 | 20 | 2022-04-01 01:18:00 | 2022-04-01 01:18:00 | 46830.305373 | 53169.694627 | 42.147275 | 47.852725 | 2 | 0 days 01:18:00 |
18 | 9 | 1 | 93.192308 | 93.192308 | 26 | 0.8 | 19 | 2022-04-01 01:18:00 | 2022-04-01 01:18:00 | 46830.305373 | 46830.305373 | 42.147275 | 42.147275 | 1 | 0 days 01:18:00 |
20 | 10 | 1 | 103.321429 | 95.678571 | 28 | 0.8 | 21 | 2022-04-01 01:26:40 | 2022-04-01 01:26:40 | 51920.315865 | 48079.684135 | 46.728284 | 43.271716 | 1 | 0 days 01:26:40 |
21 | 10 | 2 | 103.321429 | 103.321429 | 28 | 0.8 | 22 | 2022-04-01 01:26:40 | 2022-04-01 01:26:40 | 51920.315865 | 51920.315865 | 46.728284 | 46.728284 | 2 | 0 days 01:26:40 |
22 | 11 | 1 | 113.807692 | 85.192308 | 26 | 0.8 | 23 | 2022-04-01 01:35:20 | 2022-04-01 01:35:20 | 57189.795129 | 42810.204871 | 51.470816 | 38.529184 | 1 | 0 days 01:35:20 |
23 | 11 | 2 | 113.807692 | 113.807692 | 26 | 0.8 | 24 | 2022-04-01 01:35:20 | 2022-04-01 01:35:20 | 57189.795129 | 57189.795129 | 51.470816 | 51.470816 | 2 | 0 days 01:35:20 |
24 | 12 | 1 | 124.192308 | 74.807692 | 26 | 0.8 | 25 | 2022-04-01 01:44:00 | 2022-04-01 01:44:00 | 62408.194820 | 37591.805180 | 56.167375 | 33.832625 | 1 | 0 days 01:44:00 |
25 | 12 | 2 | 124.192308 | 124.192308 | 26 | 0.8 | 26 | 2022-04-01 01:44:00 | 2022-04-01 01:44:00 | 62408.194820 | 62408.194820 | 56.167375 | 56.167375 | 2 | 0 days 01:44:00 |
26 | 13 | 1 | 134.678571 | 64.321429 | 28 | 0.8 | 27 | 2022-04-01 01:52:40 | 2022-04-01 01:52:40 | 67677.674085 | 32322.325915 | 60.909907 | 29.090093 | 1 | 0 days 01:52:40 |
27 | 13 | 2 | 134.678571 | 134.678571 | 28 | 0.8 | 28 | 2022-04-01 01:52:40 | 2022-04-01 01:52:40 | 67677.674085 | 67677.674085 | 60.909907 | 60.909907 | 2 | 0 days 01:52:40 |
29 | 14 | 2 | 144.807692 | 144.807692 | 26 | 0.8 | 30 | 2022-04-01 02:01:20 | 2022-04-01 02:01:20 | 72767.684577 | 72767.684577 | 65.490916 | 65.490916 | 2 | 0 days 02:01:20 |
28 | 14 | 1 | 144.807692 | 54.192308 | 26 | 0.8 | 29 | 2022-04-01 02:01:20 | 2022-04-01 02:01:20 | 72767.684577 | 27232.315423 | 65.490916 | 24.509084 | 1 | 0 days 02:01:20 |
30 | 15 | 1 | 155.321429 | 43.678571 | 28 | 0.8 | 31 | 2022-04-01 02:10:00 | 2022-04-01 02:10:00 | 78050.969131 | 21949.030869 | 70.245872 | 19.754128 | 1 | 0 days 02:10:00 |
31 | 15 | 2 | 155.321429 | 155.321429 | 28 | 0.8 | 32 | 2022-04-01 02:10:00 | 2022-04-01 02:10:00 | 78050.969131 | 78050.969131 | 70.245872 | 70.245872 | 2 | 0 days 02:10:00 |
32 | 16 | 1 | 165.678571 | 33.321429 | 28 | 0.8 | 33 | 2022-04-01 02:18:40 | 2022-04-01 02:18:40 | 83255.563532 | 16744.436468 | 74.930007 | 15.069993 | 1 | 0 days 02:18:40 |
33 | 16 | 2 | 165.678571 | 165.678571 | 28 | 0.8 | 34 | 2022-04-01 02:18:40 | 2022-04-01 02:18:40 | 83255.563532 | 83255.563532 | 74.930007 | 74.930007 | 2 | 0 days 02:18:40 |
34 | 17 | 1 | 175.916667 | 23.083333 | 24 | 0.8 | 35 | 2022-04-01 02:27:20 | 2022-04-01 02:27:20 | 88400.335008 | 11599.664992 | 79.560302 | 10.439698 | 1 | 0 days 02:27:20 |
35 | 17 | 2 | 175.916667 | 175.916667 | 24 | 0.8 | 36 | 2022-04-01 02:27:20 | 2022-04-01 02:27:20 | 88400.335008 | 88400.335008 | 79.560302 | 79.560302 | 2 | 0 days 02:27:20 |
36 | 18 | 1 | 186.321429 | 12.678571 | 28 | 0.8 | 37 | 2022-04-01 02:36:00 | 2022-04-01 02:36:00 | 93628.858579 | 6371.141421 | 84.265973 | 5.734027 | 1 | 0 days 02:36:00 |
37 | 18 | 2 | 186.321429 | 186.321429 | 28 | 0.8 | 38 | 2022-04-01 02:36:00 | 2022-04-01 02:36:00 | 93628.858579 | 93628.858579 | 84.265973 | 84.265973 | 2 | 0 days 02:36:00 |
38 | 19 | 1 | 196.678571 | 2.321429 | 28 | 0.8 | 39 | 2022-04-01 02:44:40 | 2022-04-01 02:44:40 | 98833.452979 | 1166.547021 | 88.950108 | 1.049892 | 1 | 0 days 02:44:40 |
39 | 19 | 2 | 196.678571 | 196.678571 | 28 | 0.8 | 40 | 2022-04-01 02:44:40 | 2022-04-01 02:44:40 | 98833.452979 | 98833.452979 | 88.950108 | 88.950108 | 2 | 0 days 02:44:40 |
Let’s have a look at the resulting tracks:
[16]:
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(20, 10))
cell_color = [None, "C0", "C1", "C2", "C3"]
for i, cell_track in track_1.groupby("cell"):
cell_track.plot(
x="projection_x_coordinate",
y="projection_y_coordinate",
ax=ax1,
marker="o",
color=cell_color[i],
label="cell {0}".format(int(i)),
)
ax1.legend()
ax1.set_title("random")
for i, cell_track in track_2.groupby("cell"):
cell_track.plot(
x="projection_x_coordinate",
y="projection_y_coordinate",
ax=ax2,
marker="o",
color=cell_color[i],
label="cell {0}".format(int(i)),
)
ax2.legend()
ax2.set_title("predict")
[16]:
Text(0.5, 1.0, 'predict')
As you can see, there is a clear difference. While in the first link output the feature positions in the top half of the graph are linked into one cell, in the second output the path of the cell follows the actual way of the Gaussian blobs we created. This is possible because method = "predict"
uses an extraploation to infer the next position from the previous timeframes.
4. Analysis
We know that the second option is “correct”, because we created the data. But can we also decude this by analyzing our tracks?
Let’s calculate the values for the velocities:
[17]:
track_1 = tobac.analysis.calculate_velocity(track_1)
track_2 = tobac.analysis.calculate_velocity(track_2)
v1 = track_1.where(track_1["cell"] == 1).dropna().v.values
v2 = track_2.where(track_1["cell"] == 1).dropna().v.values
Visualizing these can help us with our investigation:
[18]:
fig, (ax) = plt.subplots(figsize=(14, 6))
ax.set_title("Cell 1")
mask_1 = track_1["cell"] == 1
mask_2 = track_2["cell"] == 1
track_1.where(mask_1).dropna().plot(
x="time",
y="v",
ax=ax,
label='method = "random"',
marker="^",
linestyle="",
alpha=0.5,
)
track_2.where(mask_2).dropna().plot(
x="time",
y="v",
ax=ax,
label='method = "predict"',
marker="v",
linestyle="",
alpha=0.5,
)
ticks = ax.get_xticks()
plt.hlines(
[np.sqrt(v_x**2 + v_y**2)],
ticks.min(),
ticks.max(),
color="black",
label="expected velocity",
linestyle="--",
)
ax.set_ylabel("$v$ [m/s]")
plt.legend()
plt.tight_layout()
The expected velocity is just added for reference. But also without looking at this, we can see that the values for method = "random"
have an outlier, that deviates far from the other values. This is a clear sign, that this method is not suited well for this case.