Samvel Kocharyan

How Much of the World Has Access to the Internet?

Author

Samvel Kocharyan, [email protected]
https://www.linkedin.com/in/samvelkoch/
2022

💾 The data

The research team compiled the following tables (source):
internet
  • "Entity" - The name of the country, region, or group.
  • "Code" - Unique id for the country (null for other entities).
  • "Year" - Year from 1990 to 2019.
  • "Internet_Usage" - The share of the entity's population who have used the internet in the last three months.
people
  • "Entity" - The name of the country, region, or group.
  • "Code" - Unique id for the country (null for other entities).
  • "Year" - Year from 1990 to 2020.
  • "Users" - The number of people who have used the internet in the last three months for that country, region, or group.
broadband
  • "Entity" - The name of the country, region, or group.
  • "Code" - Unique id for the country (null for other entities).
  • "Year" - Year from 1998 to 2020.
  • "Broadband_Subscriptions" - The number of fixed subscriptions to high-speed internet at downstream speeds >= 256 kbit/s for that country, region, or group.

Acknowledgments: Max Roser, Hannah Ritchie, and Esteban Ortiz-Ospina (2015) - "Internet." OurWorldInData.org.

# Import the libraries used below
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
# Read the internet table
internet = pd.read_csv('data/internet.csv')

1. What are the top 5 countries with the highest internet use (by population share)?

# Let's explore the internet table
internet.describe()
# Look at the entries for the latest year, 2019

top5_internet_2019 = internet[internet['Year'] == 2019]
top5_internet_2019_sorted = top5_internet_2019.sort_values(by=['Internet_Usage'], ascending=False)
top5_internet_2019_sorted.head(5)

TOP 5 countries with the highest internet use (by population share in 2019)

    1. 🇧🇭 Bahrain
    2. 🇶🇦 Qatar
    3. 🇰🇼 Kuwait
    4. 🇦🇪 UAE
    5. 🇩🇰 Denmark
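
The 2019 ranking above sorts every entity in the table. Since the data description says "Code" is null for regions and groups, one possible refinement is to drop those rows before ranking. The sketch below is only an illustration: the `toy` frame, its entity names and values, and the helper `top_countries` are made up and are not part of the original notebook.

```python
import pandas as pd

# Toy stand-in for the `internet` table; names and values are made up.
toy = pd.DataFrame({
    "Entity": ["Country A", "Country B", "Region X", "Country C"],
    "Code": ["AAA", "BBB", None, "CCC"],
    "Year": [2019, 2019, 2019, 2018],
    "Internet_Usage": [91.0, 97.5, 99.0, 95.0],
})

def top_countries(df, year, n=5):
    """Return the n countries with the highest Internet_Usage in `year`."""
    in_year = df[df["Year"] == year]
    # Assumption: rows with a null Code are regions or groups, not countries.
    countries = in_year[in_year["Code"].notna()]
    return countries.nlargest(n, "Internet_Usage")[["Entity", "Internet_Usage"]]

top_2019 = top_countries(toy, 2019, n=2)
print(top_2019)
```

With the real table, the call would be `top_countries(internet, 2019)`, assuming the null-Code convention holds for every aggregate row.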
# Over the dataset's full coverage (1990-2019), however,
# the "TOP 5 Internet Usage" ranking can look very different.

# Use the median of Internet_Usage as the indicator instead of
# the mean or sum, so that outliers do not distort the ranking.

top5_internet = internet.groupby('Entity')['Internet_Usage'].agg(['sum', 'mean', 'median']).sort_values(by="median", ascending=False).head(6)
top5_internet
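
To illustrate why the median is less sensitive to outliers than the mean or sum, here is a minimal sketch on a made-up series for one hypothetical entity (the values are not taken from the dataset). A single large value pulls the mean and sum up, while the median stays with the bulk of the observations.

```python
import pandas as pd

# Made-up usage shares for one hypothetical entity:
# mostly near zero, with one late outlier.
usage = pd.Series([0.0, 0.0, 1.0, 2.0, 90.0])

# Same three statistics as in the ranking cell, computed on the toy series.
stats = usage.agg(["sum", "mean", "median"])
print(stats)
```

The flip side is that an entity with only a few recent observations gets a high median, which is worth keeping in mind when reading the ranking.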
# Hmm... Kosovo is at the top. But this country became independent only in 2008,
# so it is not a fair entry in this ranking.
# Let's check the year when Kosovo got its first data
# in the 'internet' table.

kosovo = internet[internet['Entity'] == 'Kosovo']
kosovo
# OK. Kosovo's first data is from 2017. That doesn't work for a TOP 5 ranking
# that aggregates Internet Usage stats from 1990.
# Let's set Kosovo aside for now and explore the next leading countries.

leaders = ['Iceland', 'Sweden', 'Denmark', 'Norway', 'Netherlands']
internet[(internet['Entity'].isin(leaders)) & (internet['Year'] == 1990)].head(6)
# Iceland, the leader of the ranking, has no data for 1990.
# But in 1991 its Internet Usage was 0.5%.
# And for the next 28 years Iceland was at the top. Fair enough for a leader.

internet[internet['Code'] == 'ISL']['Year'].agg('min')
# Select only the numeric column: taking the median of the 'Entity' strings fails in recent pandas
top5_internet_leaders = internet[internet['Entity'] != 'Kosovo'].groupby('Entity')[['Internet_Usage']].median().sort_values(by='Internet_Usage', ascending=False).head(5)
top5_internet_leaders
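
Excluding Kosovo by name works here, but a more general option is to require a minimum number of observations per entity before ranking by median. The sketch below is one way to do that under stated assumptions: the `toy` frame is made up, and the `MIN_YEARS` threshold is a hypothetical choice, not something the original analysis uses.

```python
import pandas as pd

# Toy stand-in for the `internet` table; names and values are made up.
toy = pd.DataFrame({
    "Entity": ["Country A"] * 4 + ["Country B"] * 4 + ["Late Entrant"],
    "Year": [1990, 2000, 2010, 2019] * 2 + [2017],
    "Internet_Usage": [0.0, 30.0, 80.0, 95.0, 0.0, 10.0, 40.0, 70.0, 89.0],
})

MIN_YEARS = 3  # hypothetical threshold; tune it to the coverage you need

# Count observations and take the median per entity in one pass.
summary = toy.groupby("Entity")["Internet_Usage"].agg(["count", "median"])

# Keep only entities with enough data points, then rank by median.
ranked = (summary[summary["count"] >= MIN_YEARS]
          .sort_values("median", ascending=False))
print(ranked)
```

On the real data the same pattern would start from `internet.groupby('Entity')['Internet_Usage']`, and any entity with too few years drops out without being named explicitly.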