Analyze Multiple Time Series

    This template provides a playbook for analyzing multiple time series simultaneously. You will take an in-depth look at your time series data by:

    1. Loading and visualizing your data
    2. Inspecting the distribution
    3. Analyzing subsets of your data
    4. Decomposing time series into seasonality, trend and noise
    5. Visualizing correlations with a clustermap
    # Load packages
    import pandas as pd
    import matplotlib.pyplot as plt
    from statsmodels.graphics import tsaplots
    import statsmodels.api as sm
    import seaborn as sns
    
    -- Query result is available as the DataFrame `data`
    SELECT *
    FROM cinema.films
    # Plot film duration against release year
    data.set_index('release_year')['duration'].plot()

    1. Load and visualize your data

    # Upload your data as CSV and load as a data frame
    df = pd.read_csv(
        "data.csv",
        parse_dates=["datestamp"],  # Tell pandas which column(s) to parse as dates
        index_col="datestamp",  # Use a date column as your index
    )
    df.head()
    
    # Plot settings
    %config InlineBackend.figure_format='retina'
    plt.rcParams["figure.figsize"] = (18, 10)
    plt.style.use('ggplot')
    
    # Plot all time series in the df DataFrame
    ax = df.plot(
        colormap="Spectral",  # Set a colormap to avoid overlapping colors
        fontsize=10,  # Set fontsize
        linewidth=0.8, # Set width of lines
    )
    
    # Set labels and legend
    ax.set_xlabel("Date", fontsize=12)  # X axis text
    ax.set_ylabel("Unemployment Rate", fontsize=12)  # Y axis text
    ax.set_title("Unemployment rate of U.S. workers by industry", fontsize=15)
    ax.legend(
        loc="center left",  # Set location of legend within bounding box
        bbox_to_anchor=(1.0, 0.5),  # Set location of bounding box
    )
    
    # Annotate your plots with vertical lines
    ax.axvline(
        "2001-07-01",  # Position of vertical line
        color="red",  # Color of line
        linestyle="--",  # Style of line
        linewidth=2, # Thickness of line
    )
    ax.axvline("2008-09-01", color="red", linestyle="--", linewidth=2)
    
    # Show plot
    plt.show()
    

    2. Inspect the distribution

    df.describe()
    
    # Generate a boxplot
    ax = df.boxplot(fontsize=10, vert=False)  # vert=False draws the boxes horizontally
    ax.set_xlabel("Unemployment Percentage")
    ax.set_title("Distribution of Unemployment by industry")
    plt.show()
    

    3. Analyze subsets of your data

    a) Visualize (partial) autocorrelation

    Autocorrelation is the correlation between a time series and a lagged copy of itself. At lag k, it measures how strongly each value is related to the value k time steps earlier. Partial autocorrelation at lag k measures the same relationship after removing the effect of the intermediate lags 1 to k-1.
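
To make the definition concrete, here is a minimal sketch that computes the sample autocorrelation at a given lag by hand. It uses a common estimator (lagged covariance divided by the overall variance); the function name `autocorrelation` and the synthetic 12-month cycle are assumptions made for illustration and are not part of this template or its dataset.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at `lag`.

    Uses the lagged sum of products of deviations from the mean,
    divided by the total sum of squared deviations.
    """
    n = len(values)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(values)")
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    variance_sum = sum(d * d for d in deviations)
    if variance_sum == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    covariance_sum = sum(
        deviations[t] * deviations[t + lag] for t in range(n - lag)
    )
    return covariance_sum / variance_sum


# Hypothetical monthly series: a pure 12-month cycle over 10 years
series = [math.sin(2 * math.pi * t / 12) for t in range(120)]

for lag in (1, 6, 12):
    print(f"lag {lag:>2}: {autocorrelation(series, lag):.2f}")
```

For a series with a yearly cycle like this one, the value is strongly positive at lag 12 and strongly negative at lag 6, which is the kind of pattern to look for in the autocorrelation plot of a seasonal series.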

    # Display the autocorrelation plot of your time series
    fig = tsaplots.plot_acf(
        df["Agriculture"],  # Change column to inspect
        lags=24,  # Set lag period
    )
    
    # Show plot
    plt.show()
    
    # Display the partial autocorrelation plot of your time series
    fig = tsaplots.plot_pacf(
        df["Agriculture"],  # Change column to inspect
        lags=24,  # Set lag period
    )
    
    # Show plot
    plt.show()
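
For intuition about what a partial autocorrelation plot displays, the sketch below converts a list of autocorrelations into partial autocorrelations with the Durbin-Levinson recursion. The partial autocorrelation at lag k is the correlation that remains at that lag once lags 1 to k-1 are accounted for. This is an illustrative assumption about one way to compute the values, not a description of what `plot_pacf` does internally; the function name and the AR(1)-style input are hypothetical.

```python
def partial_autocorrelations(acf_values):
    """Convert autocorrelations [r_0, r_1, ..., r_K] into partial
    autocorrelations [1.0, pacf_1, ..., pacf_K] (Durbin-Levinson recursion).
    """
    max_lag = len(acf_values) - 1
    pacf = [1.0]
    previous = []  # Coefficients of the order k-1 autoregressive fit
    for k in range(1, max_lag + 1):
        numerator = acf_values[k] - sum(
            previous[j] * acf_values[k - 1 - j] for j in range(k - 1)
        )
        denominator = 1.0 - sum(
            previous[j] * acf_values[j + 1] for j in range(k - 1)
        )
        phi_kk = numerator / denominator
        # Update the lower-order coefficients, then append the new one
        current = [
            previous[j] - phi_kk * previous[k - 2 - j] for j in range(k - 1)
        ]
        current.append(phi_kk)
        pacf.append(phi_kk)
        previous = current
    return pacf


# Hypothetical input: autocorrelations that halve at each lag (r_k = 0.5 ** k),
# the pattern of a series where each value depends only on the previous one
acf_values = [0.5 ** k for k in range(5)]
print(partial_autocorrelations(acf_values))
```

In this example the autocorrelations decay slowly, but only the lag-1 partial autocorrelation is nonzero, because the dependence at longer lags is fully explained by lag 1. Comparing the two plots in the same way helps separate direct dependence from dependence that is passed along through shorter lags.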