Duplicate of Blog | Fifa World Cup 2022 | Arne Warnke
    install.packages("elo")

    Load packages and set plotting options.

    library(readr)
    library(dplyr)
    library(lubridate)
    library(tidyr)
    library(ggplot2)
    library(elo)
    options(repr.plot.width = 15, repr.plot.height = 8)
    theme_set(theme_bw(24))

    Import the dataset, keeping only games played in 1950 or later. The original data is from https://www.kaggle.com/datasets/martj42/international-football-results-from-1872-to-2017

    fifa <- read_csv(
        "results.csv",
        col_types = cols(date = col_date("%Y-%m-%d"))
    ) %>%
        filter(year(date) >= 1950)
    fifa

    The score is missing for a single game between Solomon Islands and Fiji, which Fiji won 5-4 as the away team. Fill in 4 for the home score and 5 for the away score.

    fifa <- fifa %>% 
      replace_na(list(home_score = 4, away_score = 5))
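Because `replace_na` as written fills every missing value in those two columns, it can be worth confirming first that only one match lacks a score. A minimal sketch on made-up data (the `toy` data frame and its rows are hypothetical stand-ins for `fifa`, not taken from the dataset):

```r
library(dplyr)

# Hypothetical toy data standing in for `fifa`; one row has no score.
toy <- data.frame(
  home_team  = c("Solomon Islands", "Brazil"),
  away_team  = c("Fiji", "Chile"),
  home_score = c(NA, 2),
  away_score = c(NA, 1)
)

# Rows with a missing score; ideally exactly one before patching.
missing_scores <- toy %>%
  filter(is.na(home_score) | is.na(away_score))
missing_scores
```

If the same check on the real data returned more than one row, a blanket `replace_na` would assign the 4-5 score to all of them.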

    Create a points column from the home team's perspective: 1 for a home win, 0.5 for a tie, and 0 for a home loss.

    fifa <- fifa %>%
      mutate(points = score(home_score, away_score))
    fifa
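To make the scoring rule concrete, here is a base-R sketch of the same mapping. The helper name `home_points` is hypothetical, and the assumption is that it mirrors what `score()` from the elo package returns for two score vectors:

```r
# Base-R version of the rule described above (assumed to match elo::score()):
# 1 if the home team scored more, 0.5 if level, 0 otherwise.
home_points <- function(home_score, away_score) {
  (home_score > away_score) + 0.5 * (home_score == away_score)
}

# Three made-up results: a home win, a tie, and a home loss.
home_points(c(3, 1, 0), c(1, 1, 2))
```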

    Count the number of games played by each team (home and away combined), find the teams that have played at least 200 games, and keep only matches involving at least one of those teams.

    count_home_games <- fifa %>% 
        rename(team = home_team) %>%
        count(team)
    count_away_games <- fifa %>% 
        rename(team = away_team) %>%
        count(team)
    count_games_gt200 <- bind_rows(count_home_games, count_away_games) %>%
        group_by(team) %>%
        summarize(n = sum(n)) %>%
        filter(n >= 200) %>%
        arrange(desc(n))
    count_games_gt200
    fifa <- fifa %>%
        filter(home_team %in% count_games_gt200$team | away_team %in% count_games_gt200$team)
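The home and away counts can also be combined in one step by stacking both team columns before counting. A minimal sketch on made-up matches (the `toy` data frame and team names are hypothetical; this is an alternative formulation, not the code used above):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy matches, not from the dataset.
toy <- data.frame(
  home_team = c("A", "B", "A", "C"),
  away_team = c("B", "C", "C", "A")
)

# Stack home and away teams into a single `team` column, then count.
games_per_team <- toy %>%
  pivot_longer(c(home_team, away_team), values_to = "team") %>%
  count(team, sort = TRUE)
games_per_team
```

Note that the filter above uses `|`, so a match is kept when either side meets the threshold. Replacing it with `&` would keep only matches in which both teams do.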

    Get the running Elo rating after each match, using an update factor of k = 20.

    model <- elo.run(points ~ home_team + away_team, data = fifa, k = 20)
    summary(model)
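For intuition about what a single update does, the standard Elo formula can be written out by hand. This is a sketch of the textbook rule, not the internals of `elo.run`. The helper names are hypothetical, and the example assumes both teams start level at 1500 with no home-advantage adjustment:

```r
# Expected score of team A against team B under the standard Elo formula.
elo_expected <- function(rating_a, rating_b) {
  1 / (1 + 10^((rating_b - rating_a) / 400))
}

# One rating update; `result` is 1 (A wins), 0.5 (tie), or 0 (A loses).
elo_update <- function(rating_a, rating_b, result, k = 20) {
  change <- k * (result - elo_expected(rating_a, rating_b))
  c(a = rating_a + change, b = rating_b - change)
}

# Two teams level at 1500 (an assumption); team A wins.
elo_update(1500, 1500, result = 1)
```

With equal ratings the expected score is 0.5, so a win moves the winner up by k / 2 = 10 points and the loser down by the same amount.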

    Calculate the mean Elo rating for each team in each year.