Analyze Multiple Time Series

This template provides a playbook for analyzing multiple time series simultaneously. You will take an in-depth look at your time series data by:

Loading and visualizing your data
Inspecting the distribution
Analyzing subsets of your data
Decomposing time series into seasonality, trend and noise
Visualizing correlations with a clustermap

# Load packages
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.graphics import tsaplots
import statsmodels.api as sm
import seaborn as sns

-- SQL cell: the query result is available as a DataFrame in the variable `data`
SELECT *
FROM cinema.films

# set_index returns a new DataFrame, so chain the plot call onto its result
data.set_index('release_year')['duration'].plot()

1. Load and visualize your data

# Upload your data as CSV and load as a data frame
df = pd.read_csv(
    "data.csv",
    parse_dates=["datestamp"],  # Tell pandas which column(s) to parse as dates
    index_col="datestamp",  # Use a date column as your index
)
df.head()

# Plot settings
%config InlineBackend.figure_format='retina'
plt.rcParams["figure.figsize"] = (18, 10)
plt.style.use('ggplot')

# Plot all time series in the df DataFrame
ax = df.plot(
    colormap="Spectral",  # Set a colormap to avoid overlapping colors
    fontsize=10,  # Set fontsize
    linewidth=0.8, # Set width of lines
)

# Set labels and legend
ax.set_xlabel("Date", fontsize=12)  # X axis label
ax.set_ylabel("Unemployment Rate", fontsize=12)  # Y axis label
ax.set_title("Unemployment rate of U.S. workers by industry", fontsize=15)
ax.legend(
    loc="center left",  # Set location of legend within bounding box
    bbox_to_anchor=(1.0, 0.5),  # Set location of bounding box
)

# Annotate your plots with vertical lines
ax.axvline(
    "2001-07-01",  # Position of vertical line
    color="red",  # Color of line
    linestyle="--",  # Style of line
    linewidth=2, # Thickness of line
)
ax.axvline("2008-09-01", color="red", linestyle="--", linewidth=2)

# Show plot
plt.show()

2. Inspect the distribution

df.describe()

# Generate a boxplot
ax = df.boxplot(fontsize=10, vert=False)  # vert=False draws the boxes horizontally
ax.set_xlabel("Unemployment Percentage")
ax.set_title("Distribution of Unemployment by industry")
plt.show()

3. Analyze subsets of your data

a) Visualize (partial) autocorrelation

Autocorrelation is the correlation between a time series and a lagged copy of itself. It measures how strongly the value at one time step is related to the value a given number of steps (the lag) earlier. Partial autocorrelation measures the same relationship at a given lag after removing the effect of the shorter lags in between.
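To make the definition concrete, the sketch below computes the correlation between a series and its shifted copy at a few lags. It runs on a synthetic monthly series with a 12-month cycle, which is an assumption made for illustration and not the template's data; the helper name `lag_correlation` is also hypothetical. The values plotted by `plot_acf` can differ slightly, because statsmodels uses a different estimator (full-sample mean and variance).

```python
import numpy as np
import pandas as pd

# Synthetic monthly series with a 12-month cycle (illustrative data only)
index = pd.date_range("2000-01-01", periods=120, freq="MS")
rng = np.random.default_rng(0)
values = np.sin(2 * np.pi * np.arange(120) / 12) + rng.normal(scale=0.1, size=120)
series = pd.Series(values, index=index, name="demo")


def lag_correlation(s: pd.Series, lag: int) -> float:
    """Pearson correlation between a series and itself shifted by `lag` steps."""
    return s.corr(s.shift(lag))


# Compare a short lag, half a cycle, and a full cycle
acf_by_lag = {lag: lag_correlation(series, lag) for lag in (1, 6, 12)}
for lag, value in acf_by_lag.items():
    print(f"lag {lag:>2}: {value:+.2f}")
```

On a seasonal series like this one, the correlation is strongly positive at the full cycle length and strongly negative at half a cycle. pandas exposes the same calculation as `Series.autocorr(lag)`.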

# Display the autocorrelation plot of your time series
fig = tsaplots.plot_acf(
    df["Agriculture"],  # Change column to inspect
    lags=24,  # Set lag period
)

# Show plot
plt.show()

# Display the partial autocorrelation plot of your time series
fig = tsaplots.plot_pacf(
    df["Agriculture"],  # Change column to inspect
    lags=24,  # Set lag period
)

# Show plot
plt.show()

