Internet data from around the world

    How Much of the World Has Access to the Internet?

    1. Background

    You work for a policy consulting firm. One of the firm's principals is preparing to give a presentation on the state of internet access in the world. She needs your help answering some questions about internet accessibility across the world.

    2. The data dictionary

    The research team compiled the following tables (source):

    2.1 internet table

    • "Entity" - The name of the country, region, or group.
    • "Code" - Unique id for the country (null for other entities).
    • "Year" - Year from 1990 to 2019.
    • "Internet_usage" - The share of the entity's population who have used the internet in the last three months.

    2.2 people table

    • "Entity" - The name of the country, region, or group.
    • "Code" - Unique id for the country (null for other entities).
    • "Year" - Year from 1990 to 2020.
    • "Users" - The number of people who have used the internet in the last three months for that country, region, or group.

    2.3 broadband table

    • "Entity" - The name of the country, region, or group.
    • "Code" - Unique id for the country (null for other entities).
    • "Year" - Year from 1998 to 2020.
    • "Broadband_Subscriptions" - The number of fixed broadband subscriptions per 100 people (high-speed internet at downstream speeds >= 256 kbit/s) for that country, region, or group. It is obtained by dividing the number of fixed broadband Internet subscribers by the population and then multiplying by 100.
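
    The calculation described for "Broadband_Subscriptions" can be sketched as follows. This is only an illustration of the stated formula: the function name `subscriptions_per_100` and the input numbers are hypothetical and are not taken from the dataset.

```python
def subscriptions_per_100(subscribers: int, population: int) -> float:
    """Fixed broadband subscriptions per 100 people.

    Follows the formula in the data dictionary:
    subscribers / population * 100.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return subscribers / population * 100


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
rate = subscriptions_per_100(subscribers=250_000, population=1_000_000)
print(rate)  # 25.0
```

    A value of 25.0 would mean 25 fixed broadband subscriptions for every 100 inhabitants, so the column is a rate and not a raw count.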

    Acknowledgments: Max Roser, Hannah Ritchie, and Esteban Ortiz-Ospina (2015) - "Internet." OurWorldInData.org.

    3. Challenge

    Create a report to answer the principal's questions. Include:

    1. What are the top 5 countries with the highest internet use (by population share)?
    2. How many people had internet access in those countries in 2019?
    3. What are the top 5 countries with the highest internet use for each of the following regions: 'Africa Eastern and Southern', 'Africa Western and Central', 'Latin America & Caribbean', 'East Asia & Pacific', 'South Asia', 'North America', 'European Union'?
    4. Create a visualization for those regions' internet usage over time.
    5. What are the 5 countries with the most internet users?
    6. What is the correlation between internet usage (population share) and broadband subscriptions for 2019?
    7. Summarize your findings.
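
    As a rough sketch of how question 6 could be approached, the two tables can be joined on their shared keys, filtered to 2019, and then passed to a Pearson correlation. The rows below are toy values invented for illustration (not real data), and joining on "Entity", "Code", and "Year" is an assumption based on the data dictionary above.

```python
import pandas as pd

# Toy stand-ins for the internet and broadband tables (made-up values).
internet = pd.DataFrame({
    "Entity": ["A", "B", "C", "A"],
    "Code": ["AAA", "BBB", "CCC", "AAA"],
    "Year": [2019, 2019, 2019, 2018],
    "Internet_usage": [10.0, 20.0, 30.0, 5.0],
})
broadband = pd.DataFrame({
    "Entity": ["A", "B", "C"],
    "Code": ["AAA", "BBB", "CCC"],
    "Year": [2019, 2019, 2019],
    "Broadband_Subscriptions": [1.0, 2.0, 3.0],
})

# Keep only entity-year pairs that appear in both tables.
merged = internet.merge(broadband, on=["Entity", "Code", "Year"], how="inner")

# Restrict to 2019 and compute the Pearson correlation (the pandas default).
merged_2019 = merged[merged["Year"] == 2019]
corr = merged_2019["Internet_usage"].corr(merged_2019["Broadband_Subscriptions"])
print(round(corr, 3))
```

    With the real tables, it may also be worth dropping rows where "Code" is null first, so that regions and income groups do not get mixed in with individual countries.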

    4. Load packages and DataFrames

    4.1 Load packages

    Let's start by loading all the necessary Python packages.

    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    sns.set_style('whitegrid') # Set the style for all the figures

    4.2 Load DataFrames

    internet = pd.read_csv('data/internet.csv')
    internet.tail()
    people = pd.read_csv('data/people.csv')
    people.tail()
    broadband = pd.read_csv('data/broadband.csv')
    broadband.tail()

    5. Data wrangling

    5.1 The internet DataFrame

    print(f"The internet DataFrame has {internet.shape[0]} rows and {internet.shape[1]} columns")
    • The Code column is the only column in the DataFrame that has null values.
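
    One way to check a claim like this is to count missing values per column. The sketch below uses a small hypothetical DataFrame with the same column names as the internet table (the rows are invented placeholders); in the notebook itself, `internet` would be the DataFrame loaded from data/internet.csv.

```python
import pandas as pd

# Hypothetical placeholder rows; only the column names match the real table.
internet = pd.DataFrame({
    "Entity": ["Country A", "Region X", "Country B"],
    "Code": ["AAA", None, "BBB"],  # regions and groups have no country code
    "Year": [2019, 2019, 2019],
    "Internet_usage": [10.0, 50.0, 20.0],
})

# Number of missing values in each column.
null_counts = internet.isna().sum()

# Names of the columns that contain at least one missing value.
cols_with_nulls = null_counts[null_counts > 0].index.tolist()

print(null_counts)
print(cols_with_nulls)
```

    If "Code" is the only name in `cols_with_nulls`, the null values are confined to non-country entities such as regions and groups, as the data dictionary suggests.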