
Analyze Multiple Time Series

This template provides a playbook for analyzing multiple time series simultaneously. You will take an in-depth look at your time series data by:

  1. Loading and visualizing your data
  2. Inspecting the distribution
  3. Analyzing subsets of your data
  4. Decomposing time series into seasonality, trend and noise
  5. Visualizing correlations with a clustermap
# Load packages
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics import tsaplots
import statsmodels.api as sm
import seaborn as sns

1. Load and visualize your data

# Upload your data as CSV and load as a data frame
df = pd.read_csv(
    "data.csv",
    parse_dates=["datestamp"],  # Tell pandas which column(s) to parse as dates
    index_col="datestamp",  # Use a date column as your index
)
df.head()
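
To see what `parse_dates` and `index_col` do before uploading a real file, the same call can be tried on a small in-memory CSV. This is only a sketch: the three rows and the `Construction` column are made-up illustration values, not taken from any dataset.

```python
import io

import pandas as pd

# Hypothetical sample with the same layout as data.csv (values are made up)
csv_text = """datestamp,Agriculture,Construction
2000-01-01,10.3,9.7
2000-02-01,11.5,10.6
2000-03-01,10.4,8.7
"""

sample = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["datestamp"],  # Parse the column as dates instead of strings
    index_col="datestamp",  # Use the parsed dates as the index
)

# The index should now be a DatetimeIndex, which time series plots rely on
print(type(sample.index).__name__)

# Count missing values per column before analyzing further
print(sample.isna().sum())
```

If the index is not a `DatetimeIndex`, the date column name or format probably does not match the file.
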
# Plot settings
%config InlineBackend.figure_format='retina'
plt.rcParams["figure.figsize"] = (18, 10)
plt.style.use('ggplot')

# Plot all time series in the df DataFrame
ax = df.plot(
    colormap="Spectral",  # Set a colormap to avoid overlapping colors
    fontsize=10,  # Set fontsize
    linewidth=0.8, # Set width of lines
)

# Set labels and legend
ax.set_xlabel("Date", fontsize=12)  # X axis text
ax.set_ylabel("Unemployment Rate", fontsize=12)  # Y axis text
ax.set_title("Unemployment rate of U.S. workers by industry", fontsize=15)
ax.legend(
    loc="center left",  # Set location of legend within bounding box
    bbox_to_anchor=(1.0, 0.5),  # Set location of bounding box
)

# Annotate your plots with vertical lines
ax.axvline(
    "2001-07-01",  # Position of vertical line
    color="red",  # Color of line
    linestyle="--",  # Style of line
    linewidth=2, # Thickness of line
)
ax.axvline("2008-09-01", color="red", linestyle="--", linewidth=2)

# Show plot
plt.show()

2. Inspect the distribution

df.describe()
# Generate a boxplot
ax = df.boxplot(fontsize=10, vert=False)  # vert=False draws the boxes horizontally
ax.set_xlabel("Unemployment Percentage")
ax.set_title("Distribution of Unemployment by industry")
plt.show()

3. Analyze subsets of your data

a) Visualize (partial) autocorrelation

Autocorrelation is the correlation between a time series and a lagged copy of itself. For a given lag k, it measures how strongly the values of a variable are related to its own values k time steps earlier. Partial autocorrelation measures the same relationship after removing the effect of the shorter lags in between.
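
As a rough illustration of the idea, the correlation at a single lag can be computed by hand by shifting the series and correlating it with itself. The series below is synthetic (a 12-month cycle plus noise) and `lag_correlation` is a hypothetical helper name, so treat this as a sketch rather than part of the template.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series: a 12-month cycle plus a little noise (assumed data)
rng = np.random.default_rng(0)
idx = pd.date_range("2000-01-01", periods=120, freq="MS")
cycle = np.sin(2 * np.pi * np.arange(120) / 12)
series = pd.Series(cycle + rng.normal(0, 0.1, 120), index=idx)


def lag_correlation(s, lag):
    """Pearson correlation between s and a copy of s shifted by `lag` periods."""
    return s.corr(s.shift(lag))


r12 = lag_correlation(series, 12)  # One full cycle apart: strongly positive
r6 = lag_correlation(series, 6)  # Half a cycle apart: strongly negative

print(f"lag 12: {r12:.2f}, lag 6: {r6:.2f}")

# pandas offers the same calculation as a built-in method
print(np.isclose(r12, series.autocorr(lag=12)))
```

The values drawn by `plot_acf` can differ slightly from these, because it normalizes with the mean and variance of the whole series rather than of each overlapping pair.
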

# Display the autocorrelation plot of your time series
fig = tsaplots.plot_acf(
    df["Agriculture"],  # Change column to inspect
    lags=24,  # Set lag period
)

# Show plot
plt.show()
# Display the partial autocorrelation plot of your time series
fig = tsaplots.plot_pacf(
    df["Agriculture"],  # Change column to inspect
    lags=24,  # Set lag period
)

# Show plot
plt.show()
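
To make the partial autocorrelation plot easier to read, it may help to see one way the value at lag k can be estimated: regress the series on its first k lags and take the coefficient on the k-th lag. The sketch below does this with NumPy on a simulated AR(1) series. `pacf_ols` is a hypothetical helper and the simulated data is an assumption; `plot_pacf` may use a different estimator by default.

```python
import numpy as np


def pacf_ols(x, k):
    """Partial autocorrelation at lag k: the coefficient on x[t-k] when
    x[t] is regressed on x[t-1], ..., x[t-k] plus an intercept."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Column j holds the series lagged by j + 1 periods
    lags = np.column_stack([x[k - j - 1 : n - j - 1] for j in range(k)])
    design = np.column_stack([np.ones(n - k), lags])
    coefs, *_ = np.linalg.lstsq(design, x[k:], rcond=None)
    return coefs[-1]


# Simulated AR(1) process: each value is 0.7 times the previous one plus noise
rng = np.random.default_rng(42)
noise = rng.normal(size=2000)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.7 * x[t - 1] + noise[t]

p1 = pacf_ols(x, 1)  # Should be close to 0.7
p2 = pacf_ols(x, 2)  # Should be close to 0: lag 2 adds nothing beyond lag 1

print(f"lag 1: {p1:.2f}, lag 2: {p2:.2f}")
```

For an AR(1) series, the ordinary autocorrelation decays gradually across lags while the partial autocorrelation drops to roughly zero after lag 1, which is why the two plots are usually read together.
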