Data Visualization with Seaborn

    Introduction to Data Visualization with Seaborn

    👋 Welcome to your workspace! Here, you can write and run Python code and add text in Markdown. Below, we've imported the datasets from the course Introduction to Data Visualization with Seaborn as DataFrames as well as the packages used in the course. This is your sandbox environment: analyze the course datasets further, take notes, or experiment with code!

    This notebook serves as a reference for data visualization with Seaborn.

    # Importing course packages; you can add more too!
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    # Importing course datasets as DataFrames
    country_data = pd.read_csv('datasets/countries-of-the-world.csv', decimal=",")
    mpg = pd.read_csv('datasets/mpg.csv')
    student_data = pd.read_csv('datasets/student-alcohol-consumption.csv', index_col=0)
    survey = pd.read_csv('datasets/young-people-survey-responses.csv', index_col=0)
    
    country_data.head() # Display the first five rows of this DataFrame
    # Begin writing your own code here!

    Don't know where to start?

    Try completing these tasks:

    • From country_data, create a scatter plot to look at the relationship between GDP and Literacy. Use color to segment the data points by region.
    • Use mpg to create a line plot with model_year on the x-axis and weight on the y-axis. Create differentiating lines for each country of origin (origin).
    • Create a box plot from student_data to explore the relationship between the number of failures (failures) and the average final grade (G3).
    • Create a bar plot from survey to compare how Loneliness differs across values for Internet usage. Format it to have two subplots for gender.
    • Make sure to add titles and labels to your plots and adjust their format for readability!
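As a starting point for the box plot and bar plot tasks above, here is a minimal sketch using `sns.catplot`. The small `toy_students` and `toy_survey` DataFrames are hypothetical stand-ins so that the cell runs on its own; their values are made up for illustration, and only the column names (`failures`, `G3`, `Internet usage`, `Loneliness`, `Gender`) come from the tasks and code in this notebook. In the workspace, pass `student_data` and `survey` instead.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical stand-in rows (illustrative values only); use student_data and survey in the workspace
toy_students = pd.DataFrame({
    "failures": [0, 0, 0, 1, 1, 2, 2, 3],
    "G3": [15, 12, 14, 10, 11, 8, 9, 6],
})
toy_survey = pd.DataFrame({
    "Internet usage": ["few hours a day", "most of the day"] * 4,
    "Loneliness": [2, 4, 3, 5, 1, 3, 2, 4],
    "Gender": ["female"] * 4 + ["male"] * 4,
})

# Box plot: distribution of final grade (G3) for each number of failures
box_grid = sns.catplot(x="failures", y="G3", data=toy_students, kind="box")
box_grid.set_axis_labels("Number of past failures", "Final grade (G3)")
box_grid.fig.suptitle("Final grade by number of failures", y=1.03)

# Bar plot: mean Loneliness per Internet usage level, one subplot per gender
bar_grid = sns.catplot(x="Internet usage", y="Loneliness", data=toy_survey,
                       kind="bar", col="Gender")
bar_grid.set_axis_labels("Internet usage", "Mean loneliness score")
bar_grid.set_titles("Gender: {col_name}")
bar_grid.fig.suptitle("Loneliness by Internet usage", y=1.03)

# Show plots
plt.show()
```

`catplot` returns a `FacetGrid`, which is why the labels and titles are set through `set_axis_labels`, `set_titles`, and `fig.suptitle` rather than on a single Axes object. The `y=1.03` offset is one way to keep the figure title from overlapping the subplot titles.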
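    # Count the countries in each region
    sns.countplot(y=country_data['Region'])
    survey.head()
    # Count Mathematics responses, split by gender
    sns.countplot(x='Mathematics', hue='Gender', data=survey)
    country_data.head()
    # Scatter plot of GDP vs. literacy, colored by region and sized by population
    sns.scatterplot(x='GDP ($ per capita)', y='Literacy (%)', hue='Region', size='Population', data=country_data)
    plt.show()
    # Scatter plot of absences vs. final grade, colored by location (Rural first)
    sns.scatterplot(x="absences", y="G3", hue="location", hue_order=["Rural", "Urban"], data=student_data)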
    # Create a dictionary mapping subgroup values to colors
    palette_colors = {"Rural": "green", "Urban": "blue"}
    
    # Create a count plot of school with location subgroups
    sns.countplot(x="school", hue="location", palette=palette_colors, data=student_data)

    # Display plot
    plt.show()
    # Arrange the scatter plot subplots in rows, one per study_time value
    sns.relplot(x="absences", y="G3", 
                data=student_data,
                kind="scatter", 
                row="study_time")
    
    # Show plot
    plt.show()
    # Facet by schoolsup in columns and by family support (famsup) in rows
    sns.relplot(x="G1", y="G3", 
                data=student_data,
                kind="scatter", 
                col="schoolsup",
                col_order=["yes", "no"], 
                row="famsup",
                row_order=["yes","no"])
    
    # Show plot
    plt.show()
    # Create scatter plot of horsepower vs. mpg
    sns.relplot(x="horsepower", y="mpg", 
                data=mpg, kind="scatter", 
                size="cylinders",
                hue="cylinders")
    
    # Show plot
    plt.show()
    # Create a scatter plot of acceleration vs. mpg
    sns.relplot(x="acceleration", y="mpg",
                data=mpg, kind="scatter",
                style="origin", hue="origin")
    
    # Show plot
    plt.show()