Aditya Raghunandan Kodavanti/

Data Manipulation with pandas

Run the hidden code cell below to import the data used in this course.

# Import the course packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Import the four datasets
avocado = pd.read_csv("datasets/avocado.csv")
homelessness = pd.read_csv("datasets/homelessness.csv")
temperatures = pd.read_csv("datasets/temperatures.csv")
walmart = pd.read_csv("datasets/walmart.csv")

Take Notes

Add notes about the concepts you've learned and code cells with code you want to keep.

Add your notes here

# Add your code snippets here

Explore Datasets

Use the DataFrames imported in the first cell to explore the data and practice your skills!

  • Print the highest weekly sales for each department in the walmart DataFrame. Limit your results to the top five departments, in descending order. If you're stuck, try reviewing this video.
  • What was the total nb_sold of organic avocados in 2017 in the avocado DataFrame? If you're stuck, try reviewing this video.
  • Create a bar plot of the total number of homeless people by region in the homelessness DataFrame. Order the bars in descending order. Bonus: create a horizontal bar chart. If you're stuck, try reviewing this video.
  • Create a line plot with two lines representing the temperatures in Toronto and Rome. Make sure to properly label your plot. Bonus: add a legend for the two lines. If you're stuck, try reviewing this video.
walmart.head()
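#1
# Highest weekly sales per department, top five in descending order
walmart1=walmart.groupby('department')['weekly_sales'].max()
walmart1.sort_values(ascending=False).head()
#or
# Equivalent, sorting a one-column DataFrame instead of a Series
#walmart1=walmart1.to_frame()
#walmart1.sort_values('weekly_sales',ascending=False).head()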
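As a cross-check for task 1, the sort-then-head step can probably be collapsed into a single `nlargest` call. The sketch below uses a few made-up rows rather than the course's walmart data, and keeps the top two departments instead of five so the result is easy to verify by eye.

```python
import pandas as pd

# Made-up sample rows (hypothetical values, not the course dataset)
sample = pd.DataFrame({
    "department": [1, 1, 2, 2, 3, 3],
    "weekly_sales": [100.0, 250.0, 80.0, 60.0, 300.0, 120.0],
})

# Maximum weekly sales per department, then the two largest of those maxima.
# nlargest returns the values already sorted in descending order.
top_departments = sample.groupby("department")["weekly_sales"].max().nlargest(2)
print(top_departments)
```

On the real DataFrame, `nlargest(5)` should give the same rows as `sort_values(ascending=False).head()`.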
avocado.head()
#2
# Total nb_sold for organic avocados sold in 2017
avocado[(avocado['type']=='organic') & (avocado['year']==2017)]['nb_sold'].sum()
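The filter in task 2 combines two conditions with `&`. Each comparison needs its own parentheses because `&` binds more tightly than `==` in Python. A minimal sketch on made-up rows (not the course's avocado data) that names the mask and selects the column with `.loc`:

```python
import pandas as pd

# Made-up sample rows (hypothetical values, not the course dataset)
sample = pd.DataFrame({
    "type": ["organic", "conventional", "organic", "organic"],
    "year": [2017, 2017, 2016, 2017],
    "nb_sold": [10.0, 50.0, 20.0, 5.0],
})

# Boolean mask: True only where both conditions hold
is_organic_2017 = (sample["type"] == "organic") & (sample["year"] == 2017)

# Select the matching rows and the nb_sold column in one step, then sum
total_sold = sample.loc[is_organic_2017, "nb_sold"].sum()
print(total_sold)
```

Naming the mask is a readability choice; the one-line version in the cell above computes the same kind of total.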
homelessness.head()
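#3
# Total homeless people per region = individuals + family_members
# (assumes those two columns exist, as in the course dataset; state_pop is the
# state's whole population, not its homeless count)
homelessness1=homelessness.assign(total_homeless=homelessness['individuals']+homelessness['family_members'])
homelessness1=homelessness1.groupby('region')['total_homeless'].sum().sort_values(ascending=False)
homelessness1=homelessness1.reset_index()
homelessness1.head()
# Horizontal bars (bonus); invert the y-axis so the largest region is on top
plt.barh(homelessness1['region'],homelessness1['total_homeless'])
plt.gca().invert_yaxis()
plt.xlabel('Total homeless people')
plt.tight_layout()
plt.show()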
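Task 3 asks for totals by region, ordered. One way to sketch that is to aggregate to a Series and plot it with pandas directly. The rows and the column names `individuals` and `family_members` below are assumptions used for illustration, not values taken from the course's homelessness data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows; column names are assumed for illustration
sample = pd.DataFrame({
    "region": ["Pacific", "Pacific", "Mountain", "New England"],
    "individuals": [100, 40, 30, 20],
    "family_members": [50, 10, 5, 25],
})

# Homeless people per row, then summed per region
sample["total"] = sample["individuals"] + sample["family_members"]

# barh draws the first row at the bottom, so an ascending sort
# puts the largest region at the top of the chart
totals = sample.groupby("region")["total"].sum().sort_values(ascending=True)

ax = totals.plot(kind="barh")
ax.set_xlabel("Homeless people (made-up numbers)")
ax.set_ylabel("Region")
plt.tight_layout()
```

In a notebook the figure renders when the cell finishes; in a script, add `plt.show()` at the end.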
temperatures.head()
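# Parse the date strings so matplotlib draws a real time axis
temperatures['date']=pd.to_datetime(temperatures['date'])
toronto=temperatures[temperatures['city']=='Toronto']
rome=temperatures[temperatures['city']=='Rome']
rome['date']
#4
# One line per city, with axis labels, a title, and a legend
plt.plot(toronto['date'],toronto['avg_temp_c'],label='Toronto')
plt.plot(rome['date'],rome['avg_temp_c'],label='Rome')
plt.xlabel('Date')
plt.ylabel('Average temperature (°C)')
plt.title('Average temperature in Toronto and Rome')
plt.legend()
plt.show()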
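For task 4, an alternative to filtering each city separately is to pivot the data so each city becomes a column; pandas then draws one line per column and builds the legend from the column names. A minimal sketch on made-up readings (not the course's temperatures data):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample rows (hypothetical values, not the course dataset)
sample = pd.DataFrame({
    "date": ["2000-01-01", "2000-02-01", "2000-01-01", "2000-02-01"],
    "city": ["Toronto", "Toronto", "Rome", "Rome"],
    "avg_temp_c": [-5.0, -4.0, 8.0, 9.0],
})

# Convert the date strings to datetimes so the x-axis is a time axis
sample["date"] = pd.to_datetime(sample["date"])

# Long to wide: one row per date, one column per city
wide = sample.pivot(index="date", columns="city", values="avg_temp_c")

# DataFrame.plot draws one line per column and adds a legend
ax = wide.plot()
ax.set_xlabel("Date")
ax.set_ylabel("Average temperature (°C)")
ax.set_title("Toronto vs. Rome (made-up numbers)")
```

`pivot` raises an error if a date and city pair appears more than once, so this shape assumes one reading per city per date.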