Analyze Multiple Time Series

    This template provides a playbook for analyzing multiple time series simultaneously. You will take an in-depth look at your time series data by:

    1. Loading and visualizing your data
    2. Inspecting the distribution
    3. Analyzing subsets of your data
    4. Decomposing time series into seasonality, trend and noise
    5. Visualizing correlations with a clustermap

    # Load packages
    import pandas as pd
    import matplotlib.pyplot as plt
    from statsmodels.graphics import tsaplots
    import statsmodels.api as sm
    import seaborn as sns
    

    1. Load and visualize your data

    # Upload your data as CSV and load as a data frame
    df = pd.read_csv(
        "data.csv",
        parse_dates=["datestamp"],  # Tell pandas which column(s) to parse as dates
        index_col="datestamp",  # Use a date column as your index
    )
    df.head()
    
    # Plot settings
    %config InlineBackend.figure_format='retina'
    plt.rcParams["figure.figsize"] = (18, 10)
    plt.style.use('ggplot')
    
    # Plot all time series in the df DataFrame
    ax = df.plot(
        colormap="Spectral",  # Use a colormap so each series gets a distinct color
        fontsize=10,  # Font size of the tick labels
        linewidth=0.8,  # Width of the lines
    )
    
    # Set labels and legend
    ax.set_xlabel("Date", fontsize=12)  # X-axis label
    ax.set_ylabel("Unemployment Rate", fontsize=12)  # Y-axis label
    ax.set_title("Unemployment rate of U.S. workers by industry", fontsize=15)
    ax.legend(
        loc="center left",  # Set location of legend within bounding box
        bbox_to_anchor=(1.0, 0.5),  # Set location of bounding box
    )
    
    # Annotate your plots with vertical lines
    ax.axvline(
        "2001-07-01",  # Position of vertical line
        color="red",  # Color of line
        linestyle="--",  # Style of line
        linewidth=2, # Thickness of line
    )
    ax.axvline("2008-09-01", color="red", linestyle="--", linewidth=2)
    
    # Show plot
    plt.show()
    

    2. Inspect the distribution

    df.describe()
    
    # Generate a boxplot
    ax = df.boxplot(fontsize=10, vert=False)  # vert=False draws the boxes horizontally
    ax.set_xlabel("Unemployment Percentage")
    ax.set_title("Distribution of Unemployment by industry")
    plt.show()
    

    3. Analyze subsets of your data

    a) Visualize (partial) autocorrelation

    Autocorrelation is the correlation between a time series and a lagged copy of itself. It measures how strongly the value at one time step is related to the value a given number of steps (the lag) earlier. Partial autocorrelation measures the same relationship at a given lag after removing the effect of the shorter lags in between.

    # Display the autocorrelation plot of your time series
    fig = tsaplots.plot_acf(
        df["Agriculture"],  # Change column to inspect
        lags=24,  # Set lag period
    )
    
    # Show plot
    plt.show()
    
    # Display the partial autocorrelation plot of your time series
    fig = tsaplots.plot_pacf(
        df["Agriculture"],  # Change column to inspect
        lags=24,  # Set lag period
    )
    
    # Show plot
    plt.show()
    

    b) Group data by different time periods

    Uncover patterns by grouping your data by different time periods, e.g. by year, month or day.

    # Extract time period of interest
    index_year = df.index.year  # Choose year, month, day etc.
    
    # Compute mean for each time period
    df_by_year = df.groupby(index_year).mean()  # Replace .mean() with aggregation function
    
    # Plot the mean for each time period
    ax = df_by_year.plot(fontsize=10, linewidth=1)
    
    # Set axis labels and legend
    ax.set_xlabel("Year", fontsize=12)
    ax.set_ylabel("Mean unemployment rate", fontsize=12)
    ax.axvline(
        2008,  # Position of vertical line
        color="red",  # Color of line
        linestyle="--",  # Style of line
        linewidth=2,  # Thickness of line
    )
    
    ax.legend(
        loc="center left",  # Placement of legend within the bounding box (bbox)
        bbox_to_anchor=(1.0, 0.5),  # Location of the bounding box
    )
    plt.show()
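
Grouping by month instead of year averages each calendar month across all years, which can make a seasonal pattern visible. The sketch below uses a synthetic DataFrame with a hypothetical "Example" column built to peak in January. With the template's data, you would apply the same `groupby` call to `df`.

```python
import numpy as np
import pandas as pd

# Synthetic monthly data with higher values in winter (illustrative only)
rng = np.random.default_rng(1)
index = pd.date_range("2000-01-01", periods=120, freq="MS")
month = index.month.to_numpy()
seasonal = 2 * np.cos(2 * np.pi * (month - 1) / 12)
example = pd.DataFrame(
    {"Example": 8 + seasonal + rng.normal(0, 0.1, 120)},
    index=index,
)

# Average each calendar month over all years
df_by_month = example.groupby(example.index.month).mean()
df_by_month.index.name = "month"

# Calendar month with the highest average value
peak_month = df_by_month["Example"].idxmax()
print(df_by_month.round(2))
print("Peak month:", peak_month)
```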
    

    4. Decompose time series into seasonality, trend and noise

    Most time series can be described as a combination of three components: trend, seasonality and noise. You can interpret them as follows:

    • Trend is the long-term increase or decrease in the level of the series.
    • Seasonality is the short-term cycle that repeats at a fixed interval in the series.
    • Noise is the random variation left in the series once trend and seasonality are accounted for.