Felix Estay-Foix/

# Code Along: Analyzing Top Runner Performance from A to Z with AI using Workspace


### Analyzing Top Runner Performance from A to Z with AI using Workspace

In this code along, we'll be analyzing Strava data! More specifically, we'll look at the total kilometers run, compare years and average speeds, and discover personal bests.

There's a Strava `activities.csv` file available in the workspace, but you can also follow these instructions to get your own Strava data. This requires logging into Strava and requesting a bulk export of your data through the settings page. Once your data is ready (it may take up to a few hours), you will get an email with a link to download a big folder. You do not need all of this data! Simply unzip it, find a file called `activities.csv`, and upload this file into your workspace (overwriting the placeholder file).

```python
# import packages
import plotly.express as px
import pandas as pd
```

### Importing and prepping the data 🏋️

With the `activities.csv` file in place, let's import the CSV file.

``pd.read_csv('activities.csv')``

There's a bunch of data that we don't need here; let's zoom in on what we need.

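```python
# Positions of relevant columns
usecols = [0, 1, 2, 3, 6, 16, 20]

# English column names
names = [
    "activity_id",
    "activity_date",
    "activity_name",
    "activity_type",
    "distance_km",
    "moving_time_s",
    "elevation_gain",
]

# Reading the raw data with preprocessing
df = pd.read_csv(
    "activities.csv",
    usecols=usecols,  # keep only the columns listed above
    names=names,      # rename them to the English names
    header=0,         # assumes the first row is the original header, which is replaced
)

df
```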
```python
# Filter the dataframe to include only runs
df_runs = df[df['activity_type'] == 'Run']

df_runs
```
```python
# Convert distance_km to a float, and calculate average speed
```
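
The cell above is left empty in the code along. One possible way to fill it in is sketched below. The small `df_runs` frame is hypothetical sample data standing in for the real export, and the helper name `add_average_speed` and the `average_speed_kmh` column are assumptions introduced for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for df_runs; real Strava values will differ
df_runs = pd.DataFrame({
    "activity_name": ["Morning Run", "Long Run", "Tempo Run"],
    "distance_km": ["5.0", "21.1", "10.0"],      # distances read in as text
    "moving_time_s": [1500.0, 7596.0, 2700.0],   # moving time in seconds
})


def add_average_speed(runs: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with a numeric distance and an average speed in km/h."""
    out = runs.copy()
    # Values that cannot be parsed become NaN instead of raising an error
    out["distance_km"] = pd.to_numeric(out["distance_km"], errors="coerce")
    # Kilometers divided by hours gives km/h; moving_time_s is in seconds
    out["average_speed_kmh"] = out["distance_km"] / (out["moving_time_s"] / 3600)
    return out


df_runs = add_average_speed(df_runs)
print(df_runs)
```

Working on a copy avoids modifying a filtered slice of `df` in place, which pandas would otherwise warn about.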

### Analyzing distances

```python
import plotly.express as px
import pandas as pd

# Work on a copy of the runs only, so other activity types are excluded
runs = df_runs.copy()

# Group the data by year and calculate the total distance run per year
runs['activity_date'] = pd.to_datetime(runs['activity_date'])
runs['year'] = runs['activity_date'].dt.year
total_distance_per_year = runs.groupby('year')['distance_km'].sum().reset_index()

# Create the bar chart
fig = px.bar(total_distance_per_year, x='year', y='distance_km', title='Total Distance Run per Year')
fig.show()
```
```python
# Create a cumulative area plot showing total distance run
```
```python
# Show total distance per month in the year 2022
```

### Analyzing speed

```python
# Show average speed for activities with a distance between 13k and 20k
```
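
The prompt above could be answered along these lines. This is only a sketch: the `runs` frame is hypothetical sample data, `speed_for_distance_range` is an illustrative helper, and both bounds are treated as inclusive, which is an assumption about what "between" means here.

```python
import pandas as pd

# Hypothetical sample of runs with numeric distance and moving time
runs = pd.DataFrame({
    "activity_name": ["Short Run", "Medium Run", "Steady Run", "Half Marathon"],
    "distance_km": [10.0, 15.0, 18.0, 21.1],
    "moving_time_s": [3000.0, 5400.0, 5400.0, 7596.0],
})


def speed_for_distance_range(
    runs: pd.DataFrame, min_km: float, max_km: float
) -> pd.DataFrame:
    """Keep runs within [min_km, max_km] and add their average speed in km/h."""
    # Series.between includes both endpoints by default
    out = runs[runs["distance_km"].between(min_km, max_km)].copy()
    out["average_speed_kmh"] = out["distance_km"] / (out["moving_time_s"] / 3600)
    return out


mid_runs = speed_for_distance_range(runs, 13, 20)
print(mid_runs[["activity_name", "distance_km", "average_speed_kmh"]])
print("Mean speed (km/h):", mid_runs["average_speed_kmh"].mean())
```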

### Records

```python
# Find the activity with the highest average speed
```
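
A possible solution for this cell, sketched on hypothetical sample data. The `average_speed_kmh` column is computed inline here so that the example runs on its own; in the workspace it may already exist from an earlier step.

```python
import pandas as pd

# Hypothetical sample of runs; names and values are made up
runs = pd.DataFrame({
    "activity_name": ["Easy Run", "Track Session", "Long Run"],
    "distance_km": [5.0, 8.0, 20.0],
    "moving_time_s": [1800.0, 2000.0, 7200.0],
})

# Average speed in km/h: kilometers divided by hours of moving time
runs["average_speed_kmh"] = runs["distance_km"] / (runs["moving_time_s"] / 3600)

# idxmax returns the index label of the largest value, skipping NaN
fastest = runs.loc[runs["average_speed_kmh"].idxmax()]
print(fastest)
```

On real data, very short activities can produce unrealistic speeds, so filtering on a minimum distance first may give a more meaningful record.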
```python
# Find the activity with the most elevation gain
```
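
For the elevation record, sorting works just as well as taking a maximum, and it also shows the runners-up. The sketch below uses hypothetical sample data and assumes that some activities may have no elevation recorded.

```python
import pandas as pd

# Hypothetical sample; one run has no elevation recorded
runs = pd.DataFrame({
    "activity_name": ["Flat Run", "Hill Repeats", "Trail Run", "City Loop"],
    "elevation_gain": [12.0, 310.0, None, 85.0],
})

# Sort from most to least climbing; missing values are placed last by default
top_climbs = runs.sort_values("elevation_gain", ascending=False).head(3)
print(top_climbs)

# The first row is the activity with the most elevation gain
most_elevation = top_climbs.iloc[0]
```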

### Identifying patterns

```python
# Build a graph to see if the time of day that runs were started has changed over time
```