Netflix Top 10 Charts (An Independent Review)

📖 Background

The Netflix Top 10 charts represent the most popular movies and TV series, with millions of viewers around the globe. Understanding what makes the biggest hits is crucial to making more hits.

💪 Challenge

Explore the dataset to understand the most common attributes of popular Netflix content. Your published notebook should contain a short report on the popular content, including summary statistics, visualizations, statistical models, and text describing any insights you found.

💾 The data

There are three datasets taken from Netflix Top 10.

Each dataset is stored as a table in a PostgreSQL database.

  • all_weeks_global: This contains the weekly top 10 list for movies (films) and TV series at a global level.
  • all_weeks_countries: This contains the weekly top 10 list for movies (films) and TV series by country.
  • most_popular: All-time most popular content by number of hours viewed in the first 28 days from launch.

The data source page describes the methodology for data collection in detail. In particular:

  • Content is categorized as Film (English), TV (English), Film (Non-English), and TV (Non-English).
  • Each season of a TV series is considered separately.
  • Popularity is measured as the total number of hours that Netflix members around the world watched each title from Monday to Sunday of the previous week.
  • Weekly reporting is rounded to the nearest 10 000 hours viewed.
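
The rounding rule in the last bullet can be illustrated with a short sketch. It assumes the rounding applies to the weekly hours figure and that exact halves round up; the function name `round_to_10k` is hypothetical and is not part of the dataset or of any Netflix tooling.

```python
def round_to_10k(hours: int) -> int:
    """Round a non-negative figure to the nearest 10,000 (halves round up)."""
    # Integer arithmetic avoids the round-half-to-even behaviour of round().
    return (hours + 5_000) // 10_000 * 10_000


# Made-up raw figure, for illustration only.
raw_hours = 12_345_678
print(round_to_10k(raw_hours))
```

One consequence is that two titles whose raw figures differ by less than 10 000 can show identical reported values.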

Database integration

To access the data, use the sample integration named "Competition Netflix Top 10".

Top Weekly Global Movies on Netflix

SELECT *
FROM all_weeks_global;
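
Selecting every row can be slow on a large table, so a LIMIT clause is a cheap way to preview the result first. The sketch below uses an in-memory SQLite table as a stand-in for the PostgreSQL integration, which is not reachable outside the workspace. The rows are made up, and the `weekly_rank` and `show_title` columns are assumptions about the table layout; only `week`, `category`, and `weekly_hours_viewed` are used elsewhere in this notebook.

```python
import sqlite3

# In-memory SQLite database standing in for the PostgreSQL integration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE all_weeks_global (
        week TEXT,
        category TEXT,
        weekly_rank INTEGER,         -- assumed column
        show_title TEXT,             -- assumed column
        weekly_hours_viewed INTEGER
    )
    """
)

# Made-up rows, for illustration only.
toy_rows = [
    ("2021-07-04", "Films (English)", 1, "Title A", 30_000_000),
    ("2021-07-04", "Films (English)", 2, "Title B", 20_000_000),
    ("2021-07-11", "TV (English)", 1, "Title C", 50_000_000),
]
conn.executemany("INSERT INTO all_weeks_global VALUES (?, ?, ?, ?, ?)", toy_rows)

# Preview only the first two rows instead of pulling the whole table.
preview = conn.execute(
    "SELECT * FROM all_weeks_global ORDER BY week, weekly_rank LIMIT 2"
).fetchall()

for row in preview:
    print(row)

conn.close()
```

The same `LIMIT` syntax is valid in PostgreSQL, so the query text should carry over to the real table unchanged.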
   
    
# First and last chart weeks in the global table
min_week = world['week'].min()
max_week = world['week'].max()

min_week, max_week, max_week - min_week

# Number of distinct chart weeks
world['week'].nunique()

The data covers 75 weekly charts spanning 518 days, from 2021-07-04 to 2022-12-04.
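
The two figures can be cross-checked from the endpoint dates alone. This is a minimal sketch that assumes one chart per week with no gaps; the variable names are illustrative.

```python
from datetime import date

first_week = date(2021, 7, 4)
last_week = date(2022, 12, 4)

# Days between the first and last chart dates.
span_days = (last_week - first_week).days

# Both endpoints carry a chart, so the count is one more than the
# number of 7-day steps between them.
weekly_charts = span_days // 7 + 1

print(span_days, weekly_charts)  # 518 75
```

The span is 74 full weeks, and counting the first chart as well gives the 75 distinct weeks found in the data.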

Most-watched categories worldwide, in descending order of mean weekly hours viewed

# Mean weekly hours viewed per category, highest first
world.groupby('category')['weekly_hours_viewed'].mean().round().sort_values(ascending=False).reset_index()

import matplotlib.pyplot as plt
import seaborn as sns

# Weekly hours viewed over time, one line per category
plt.figure(figsize=(10, 8))
sns.lineplot(x='week', y='weekly_hours_viewed', hue='category', data=world)
plt.ylabel('Weekly hours viewed (per 100 million)')
plt.xlabel('Week')
plt.xticks(rotation=60)
sns.despine()


