Data Manipulation with pandas

    Run the hidden code cell below to import the data used in this course.

    # Import the course packages
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    
    # Import the four datasets
    avocado = pd.read_csv("datasets/avocado.csv")
    homelessness = pd.read_csv("datasets/homelessness.csv")
    temperatures = pd.read_csv("datasets/temperatures.csv")
    walmart = pd.read_csv("datasets/walmart.csv")

    Take Notes

    Add notes about the concepts you've learned and code cells with code you want to keep.

    pandas makes it quick to calculate basic summary statistics, such as the mean and the median of a column.

    # Add your code snippets here
    # Print the mean of weekly_sales
    print(walmart['weekly_sales'].mean())
    
    # Print the median of weekly_sales
    print(walmart['weekly_sales'].median())
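As a small illustration of why it helps to look at both statistics, the sketch below uses a made-up Series (the values are invented and do not come from walmart.csv). A single large value pulls the mean upward while the median barely moves.

```python
import pandas as pd

# Hypothetical weekly sales figures, invented for illustration only
demo_sales = pd.Series([10.0, 12.0, 11.0, 13.0, 200.0], name="weekly_sales")

# The mean is sensitive to the one very large value
mean_sales = demo_sales.mean()

# The median is the middle value once the data are sorted
median_sales = demo_sales.median()

print(mean_sales)
print(median_sales)
```

The same two calls work unchanged on a DataFrame column such as walmart['weekly_sales'].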

    The .agg() method allows you to apply your own custom functions to a DataFrame, as well as apply functions to more than one column of a DataFrame at once, making your aggregations super-efficient. For example,

    df['column'].agg(function)

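    To apply the function to several columns at once, select them with a list of column names before calling .agg(). For example, to print the IQR of temperature_c, fuel_price_usd_per_l, and unemployment (this assumes that sales is a DataFrame containing these columns and that iqr is a custom function you have already defined):

    print(sales[["temperature_c", "fuel_price_usd_per_l", "unemployment"]].agg(iqr))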
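The following is a minimal, self-contained sketch of a custom function used with .agg(). The iqr helper is one plausible definition of the interquartile range (75th percentile minus 25th percentile), and the small sales DataFrame is a hypothetical stand-in with invented values, not the course data.

```python
import pandas as pd

# Hypothetical stand-in for the course data; all values are invented
sales = pd.DataFrame({
    "temperature_c": [5.0, 10.0, 15.0, 20.0, 25.0],
    "fuel_price_usd_per_l": [0.6, 0.7, 0.8, 0.9, 1.0],
    "unemployment": [8.0, 8.0, 8.0, 8.0, 8.0],
})

def iqr(column):
    """Interquartile range: 75th percentile minus 25th percentile."""
    return column.quantile(0.75) - column.quantile(0.25)

# Apply the custom function to a single column (returns one number)
single = sales["temperature_c"].agg(iqr)

# Apply it to several columns at once (returns one value per column)
multi = sales[["temperature_c", "fuel_price_usd_per_l", "unemployment"]].agg(iqr)

print(single)
print(multi)
```

Passing the function object itself (iqr, without parentheses) is what lets .agg() call it once per selected column.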

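    Another important feature is .groupby(), which groups the rows of a DataFrame by the values in a column so that a summary statistic can be calculated for each group. It is normally followed by a column selection and an aggregation such as .mean().

    df.groupby('groupname')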
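A short sketch of grouped aggregation, assuming a DataFrame with a department column and a weekly_sales column. The walmart_demo data below is invented for illustration and is much smaller than the real dataset.

```python
import pandas as pd

# Hypothetical miniature dataset; values are invented
walmart_demo = pd.DataFrame({
    "department": [1, 1, 2, 2, 3],
    "weekly_sales": [100.0, 300.0, 50.0, 70.0, 500.0],
})

# One summary statistic per group: mean weekly sales for each department
mean_by_dept = walmart_demo.groupby("department")["weekly_sales"].mean()

# Several statistics per group by passing a list of names to .agg()
summary = walmart_demo.groupby("department")["weekly_sales"].agg(["min", "max", "sum"])

print(mean_by_dept)
print(summary)
```

The result is indexed by the grouping column, so individual groups can be looked up with .loc.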

    Explore Datasets

    Use the DataFrames imported in the first cell to explore the data and practice your skills!

    • Print the highest weekly sales for each department in the walmart DataFrame. Limit your results to the top five departments, in descending order. If you're stuck, try reviewing this video.
    • What was the total nb_sold of organic avocados in 2017 in the avocado DataFrame? If you're stuck, try reviewing this video.
    • Create a bar plot of the total number of homeless people by region in the homelessness DataFrame. Order the bars in descending order. Bonus: create a horizontal bar chart. If you're stuck, try reviewing this video.
    • Create a line plot with two lines representing the temperatures in Toronto and Rome. Make sure to properly label your plot. Bonus: add a legend for the two lines. If you're stuck, try reviewing this video.
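As a starting point for the first two prompts, here is a hedged sketch of the general patterns on invented data: group, aggregate, sort, and take the top rows; then filter rows with a boolean condition and sum a column. The column names type and year are assumptions about avocado.csv, so check them with avocado.columns before adapting this to the real DataFrames.

```python
import pandas as pd

# Hypothetical sales data; values are invented
demo_sales = pd.DataFrame({
    "department": [1, 1, 2, 3, 4, 5, 6, 6],
    "weekly_sales": [10.0, 40.0, 25.0, 5.0, 60.0, 30.0, 15.0, 20.0],
})

# Highest weekly sales per department, sorted descending, top five only
top_five = (
    demo_sales.groupby("department")["weekly_sales"]
    .max()
    .sort_values(ascending=False)
    .head(5)
)

# Hypothetical avocado data; "type" and "year" are assumed column names
demo_avocado = pd.DataFrame({
    "type": ["organic", "organic", "conventional", "organic"],
    "year": [2017, 2017, 2017, 2016],
    "nb_sold": [100.0, 50.0, 900.0, 70.0],
})

# Keep only organic rows from 2017, then total the nb_sold column
is_organic_2017 = (demo_avocado["type"] == "organic") & (demo_avocado["year"] == 2017)
total_organic_2017 = demo_avocado.loc[is_organic_2017, "nb_sold"].sum()

print(top_five)
print(total_organic_2017)
```

The parentheses around each condition are required because & binds more tightly than == in Python.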