# Strava: Calculating the similarity of two runs

I go running several times a week and wanted to compare my runs against each other to see how similar they are.

I record my runs with the Strava app and it has an API that returns lat/long coordinates for each run in the Google encoded polyline algorithm format.

We can use the polyline library to decode these values into a list of lat/long tuples. For example:

```
>>> import polyline
>>> polyline.decode('u{~vFvyys@fS]')
[(40.63179, -8.65708), (40.62855, -8.65693)]
```
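To show what the library is doing under the hood, here is a minimal sketch of a decoder for the encoded polyline format. It is not the polyline library's own code; the function name `decode_polyline` and the `precision` parameter are my own choices for illustration, and it assumes well-formed input.

```python
def decode_polyline(encoded, precision=5):
    """Decode an encoded polyline string into a list of (lat, long) tuples."""
    factor = 10 ** precision
    coords = []
    index = 0
    lat = lng = 0
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # each point is a latitude delta, then a longitude delta
            result = 0
            shift = 0
            while True:
                chunk = ord(encoded[index]) - 63  # every character is offset by 63
                index += 1
                result |= (chunk & 0x1F) << shift  # low 5 bits carry the data
                shift += 5
                if chunk < 0x20:  # no continuation bit: this value is complete
                    break
            # The lowest bit marks a negative value (stored inverted)
            deltas.append(~(result >> 1) if result & 1 else result >> 1)
        lat += deltas[0]
        lng += deltas[1]
        coords.append((lat / factor, lng / factor))
    return coords


print(decode_polyline('u{~vFvyys@fS]'))
```

Running this on the string above gives the same two points as `polyline.decode`. Each point after the first is stored as an offset from the previous one, which is why the encoded string stays short for routes whose points are close together.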

Once we've got each route defined as a list of coordinates, we need to compare them. My Googling led me to an algorithm called Dynamic Time Warping (DTW).

DTW is a method that calculates an optimal match between two given sequences (e.g. time series) with certain restrictions. The sequences are "warped" non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension.

The fastdtw library implements an approximation of this algorithm and returns the distance between two sequences of points, along with the warp path used to align them.
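To make the idea more concrete, here is a minimal sketch of the exact (non-approximate) version of DTW using dynamic programming. This is not how fastdtw is implemented; the function name `dtw_distance` is hypothetical, and it assumes each point is a tuple of numbers compared with straight-line distance (`math.dist`, Python 3.8+).

```python
from math import dist, inf


def dtw_distance(a, b):
    """Exact DTW distance between two sequences of points, in O(len(a) * len(b))."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i points of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = dist(a[i - 1], b[j - 1])
            # Extend the cheapest of: advance in a only, in b only, or in both
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# The second route skips the middle point, so one point has to be matched twice
print(dtw_distance([(0, 0), (1, 0), (2, 0)], [(0, 0), (2, 0)]))  # 1.0
```

The table grows with the product of the two sequence lengths, which gets expensive for long GPS traces. That cost is the reason to reach for an approximation like fastdtw.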

We can see how to apply fastdtw and polyline against Strava data in the following example:

```
import os

import polyline
import requests
from fastdtw import fastdtw

# Strava API access token, read from the environment
token = os.environ["TOKEN"]
headers = {'Authorization': "Bearer {0}".format(token)}


def find_points(activity_id):
    # Fetch the activity and decode its encoded polyline into (lat, long) tuples
    r = requests.get("https://www.strava.com/api/v3/activities/{0}".format(activity_id), headers=headers)
    response = r.json()
    line = response["map"]["polyline"]
    return polyline.decode(line)
```

Now let's try it out on two runs, 1361109741 and 1346460542:

```
>>> from scipy.spatial.distance import euclidean
>>> activity1_id = 1361109741
>>> activity2_id = 1346460542
>>> distance, path = fastdtw(find_points(activity1_id), find_points(activity2_id), dist=euclidean)
>>> print(distance)
2.91985018100644
```

These two runs are both near my house so the value is small. Let's change the second route to be from my trip to New York:

```
>>> activity1_id = 1361109741
>>> activity2_id = 1246017379
>>> distance, path = fastdtw(find_points(activity1_id), find_points(activity2_id), dist=euclidean)
>>> print(distance)
29383.492965394034
```

Much bigger!

I'm not really interested in the actual value returned, but I am interested in the relative values. I'm building a little application to generate routes that I should run, and I want it to come up with routes that are different to recent ones that I've run. This score can now form part of the criteria.
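As a rough sketch of how that could work, the snippet below scores each candidate route by its distance to the closest recent run and picks the candidate with the highest score. Everything here is hypothetical: the function names `novelty` and `most_novel`, the route names, and the toy coordinates are made up for illustration, and `start_gap` is only a stand-in distance so that the example runs without calling Strava.

```python
from math import dist


def start_gap(route_a, route_b):
    # Stand-in distance for this sketch only: the gap between the two start points
    return dist(route_a[0], route_b[0])


def novelty(candidate, recent_routes, route_distance):
    # A candidate is only as novel as its closest match among the recent runs
    return min(route_distance(candidate, recent) for recent in recent_routes)


def most_novel(candidates, recent_routes, route_distance):
    # Return the name of the candidate least similar to anything run recently
    return max(
        candidates,
        key=lambda name: novelty(candidates[name], recent_routes, route_distance),
    )


# Toy data, not real runs
recent_routes = [[(0.0, 0.0), (0.0, 1.0)], [(1.0, 1.0), (2.0, 1.0)]]
candidates = {
    "loop_near_home": [(0.0, 0.1), (0.5, 0.5)],
    "across_town": [(5.0, 5.0), (6.0, 5.0)],
}
print(most_novel(candidates, recent_routes, start_gap))  # across_town
```

With real data, the stand-in could be swapped for the DTW score by passing `lambda a, b: fastdtw(a, b, dist=euclidean)[0]` as `route_distance`.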

##### About the author

Mark Needham is a Developer Relations Engineer for Neo4j, the world's leading graph database.