# Importing the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Importing the course datasets
country_data = pd.read_csv('datasets/countries-of-the-world.csv', decimal=",")
mpg = pd.read_csv('datasets/mpg.csv')
student_data = pd.read_csv('datasets/student-alcohol-consumption.csv', index_col=0)
survey = pd.read_csv('datasets/young-people-survey-responses.csv', index_col=0)
Explore Datasets
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- From country_data, create a scatter plot to look at the relationship between GDP and Literacy. Use color to segment the data points by region.
- Use mpg to create a line plot with model_year on the x-axis and weight on the y-axis. Create differentiating lines for each country of origin (origin).
- Create a box plot from student_data to explore the relationship between the number of failures (failures) and the average final grade (G3).
- Create a bar plot from survey to compare how Loneliness differs across values for Internet usage. Format it to have two subplots for gender.
- Make sure to add titles and labels to your plots and adjust their format for readability!
INTRODUCTION TO SEABORN
Basic scatter plots with seaborn
Student data
student_data
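# Scatter plot of G1 against G3, with point color set by sex
sns.relplot(x="G1",
            y="G3",
            data=student_data,
            kind="scatter",
            hue="sex")
plt.show()

# Same plot, split into one column of subplots per schoolsup value
sns.relplot(x="G1",
            y="G3",
            data=student_data,
            kind="scatter",
            col="schoolsup",
            hue="sex",
            palette={"F": "red", "M": "blue"})
plt.show()

# Grid of subplots: schoolsup in columns, famsup in rows, color by location
# (kind defaults to "scatter" when omitted)
sns.relplot(data=student_data,
            x="G1",
            y="G3",
            col="schoolsup",
            row="famsup",
            hue="location",
            palette={"Rural": "green", "Urban": "blue"})
plt.show()

# Horizontal count plot of study_time, with one bar per sex
sns.countplot(data=student_data,
              y="study_time",
              hue="sex",
              palette={"F": "red", "M": "blue"})
plt.show()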
RELATIONAL PLOTS
How to visualize two quantitative variables
MPG
mpg
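# Scatter plot of horsepower against mpg; point size and color both encode cylinders
sns.relplot(x="horsepower",
            y="mpg",
            data=mpg,
            kind="scatter",
            size="cylinders",
            hue="cylinders")
plt.show()

# Same axes; point color and marker style both encode origin
sns.relplot(x="horsepower",
            y="mpg",
            data=mpg,
            kind="scatter",
            hue="origin",
            style="origin")
plt.show()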