    Soccer Through the Ages

    This dataset contains information on international soccer games throughout the years. It includes results of soccer games and information about the players who scored the goals. The dataset contains data from 1872 up to 2023.

    💾 The data

    • data/results.csv - CSV with results of soccer games between 1872 and 2023
      • home_score - The score of the home team, excluding penalty shootouts
      • away_score - The score of the away team, excluding penalty shootouts
      • tournament - The name of the tournament
      • city - The name of the city where the game was played
      • country - The name of the country where the game was played
      • neutral - Whether the game was played at a neutral venue or not
    • data/shootouts.csv - CSV with results of penalty shootouts in the soccer games
      • winner - The team that won the penalty shootout
    • data/goalscorers.csv - CSV with information on goal scorers of some of the soccer games in the results CSV
      • team - The team that scored the goal
      • scorer - The player who scored the goal
      • minute - The minute in the game when the goal was scored
      • own_goal - Whether it was an own goal or not
      • penalty - Whether the goal was scored as a penalty or not

    The following columns can be found in all datasets:

    • date - The date of the soccer game
    • home_team - The team that played at home
    • away_team - The team that played away

    These shared columns fully identify the game that was played and can be used to join data between the different CSV files.
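
    As a minimal sketch of that join, the snippet below loads a few placeholder rows into an in-memory SQLite database and matches each goal to its game on date, home_team and away_team. The team and player names are invented for illustration, the function name join_goals_to_results is hypothetical, and SQLite is only a stand-in here: the notebook's own queries are hidden, so this is not necessarily how the author joined the files.

```python
import sqlite3

# Placeholder rows standing in for data/results.csv and data/goalscorers.csv.
# Teams, players and scores are invented for illustration only.
results = [
    ("2000-01-01", "Team A", "Team B", 2, 1),
    ("2000-01-02", "Team C", "Team A", 0, 0),
]
goalscorers = [
    ("2000-01-01", "Team A", "Team B", "Team A", "Player 1", 10),
    ("2000-01-01", "Team A", "Team B", "Team B", "Player 2", 55),
    ("2000-01-01", "Team A", "Team B", "Team A", "Player 1", 80),
]


def join_goals_to_results(results, goalscorers):
    """Attach every goal to its game using the three shared key columns."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE results (date TEXT, home_team TEXT, away_team TEXT, "
        "home_score INTEGER, away_score INTEGER)"
    )
    con.execute(
        "CREATE TABLE goalscorers (date TEXT, home_team TEXT, away_team TEXT, "
        "team TEXT, scorer TEXT, minute INTEGER)"
    )
    con.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", results)
    con.executemany("INSERT INTO goalscorers VALUES (?, ?, ?, ?, ?, ?)", goalscorers)
    rows = con.execute(
        """
        SELECT r.date, r.home_team, r.away_team,
               r.home_score, r.away_score, g.scorer, g.minute
        FROM results AS r
        JOIN goalscorers AS g
          ON  r.date = g.date
          AND r.home_team = g.home_team
          AND r.away_team = g.away_team
        ORDER BY g.minute
        """
    ).fetchall()
    con.close()
    return rows


joined = join_goals_to_results(results, goalscorers)
```

    Because this is an inner join, games without any recorded goal scorer (such as the goalless second game above) drop out of the result. A LEFT JOIN from results would keep them.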

    Source: GitHub

    Introduction

    In this analysis of international football results dating back to 1872, I set out to show the depth and range of my analytical skills. I identify the most successful nations since 1960, examine how total goals are distributed across the minutes of a game, rank the top hat-trick scorers, and compare the proportions of home and away wins. The focus then shifts to a closer look at potential Euro 2024 winners, using a range of visualizations of their historical results to give an indication of future success in the tournament.

    Preliminary analysis

    The data spans four types: TIMESTAMP WITH TIME ZONE, VARCHAR, BOOLEAN and BIGINT. I found no NULL values. The results cover 146 tournaments.
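
    Checks of this kind can be sketched in a few lines of plain Python. The example below counts missing values per column and the number of distinct tournaments in a tiny placeholder CSV. The rows are invented, the function name profile is hypothetical, and treating empty strings and the literal "NA" as missing is an assumption of the sketch, not something the notebook states.

```python
import csv
import io

# Placeholder rows shaped like data/results.csv; the values are invented.
SAMPLE_CSV = """date,home_team,away_team,home_score,away_score,tournament,city,country,neutral
2000-01-01,Team A,Team B,2,1,Friendly,City X,Country X,FALSE
2000-01-02,Team C,Team A,0,0,Example Cup,City Y,Country Y,TRUE
2000-01-03,Team B,Team C,1,3,Friendly,City X,Country X,FALSE
"""

# Assumption: these markers count as "missing" in the CSV text.
MISSING_MARKERS = {"", "NA"}


def profile(csv_text):
    """Return (missing-value count per column, number of distinct tournaments)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}, 0
    missing = {
        col: sum(1 for row in rows if row[col] is None or row[col] in MISSING_MARKERS)
        for col in rows[0]
    }
    tournaments = {row["tournament"] for row in rows}
    return missing, len(tournaments)


missing_counts, tournament_count = profile(SAMPLE_CSV)
```

    On the real file, the same function could be called on the contents of data/results.csv to compare against the figures reported above.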

    Initial analysis of results

    I focused on 4 areas:

    • the top 15 winning countries since 1960
    • total goals scored per minute
    • the top 10 hat-trick scorers
    • the difference in proportion between home and away wins
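
    Two of these measures can be illustrated with a short sketch. The functions below compute the share of home wins, away wins and draws from score pairs, and count hat-tricks as three or more goals by one player in one game. The function names and the sample rows are hypothetical, and excluding own goals from hat-tricks is an assumption of this sketch. The notebook's own queries are hidden and may define these measures differently.

```python
from collections import Counter


def win_proportions(scores):
    """scores: iterable of (home_score, away_score) pairs.

    Returns the share of home wins, away wins and draws.
    """
    outcomes = Counter()
    for home, away in scores:
        if home > away:
            outcomes["home"] += 1
        elif away > home:
            outcomes["away"] += 1
        else:
            outcomes["draw"] += 1
    total = sum(outcomes.values())
    if total == 0:
        return {}
    return {key: outcomes[key] / total for key in ("home", "away", "draw")}


def hat_tricks(goals):
    """goals: iterable of (date, home_team, away_team, scorer, own_goal).

    Counts, per scorer, the games in which they scored three or more goals.
    Assumption: own goals do not count towards a hat-trick.
    """
    per_game = Counter(
        (date, home, away, scorer)
        for date, home, away, scorer, own_goal in goals
        if not own_goal
    )
    return Counter(
        scorer for (_, _, _, scorer), n in per_game.items() if n >= 3
    )


# Invented sample rows for illustration only.
sample_scores = [(2, 1), (0, 0), (1, 3), (4, 0)]
sample_goals = [
    ("2000-01-01", "Team A", "Team B", "Player 1", False),
    ("2000-01-01", "Team A", "Team B", "Player 1", False),
    ("2000-01-01", "Team A", "Team B", "Player 1", False),
    ("2000-01-01", "Team A", "Team B", "Player 2", False),
    ("2000-01-01", "Team A", "Team B", "Player 2", False),
    ("2000-01-01", "Team A", "Team B", "Player 3", True),
]

proportions = win_proportions(sample_scores)
hat_trick_counts = hat_tricks(sample_goals)
```

    Games flagged as neutral have no true home side, so a stricter version of win_proportions might filter them out first. Whether the original analysis did so is not stated.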