Workspace
Marius-Dumitru Birsan/

Explore a DataFrame

Welcome to your workspace! In this walkthrough, you will learn the basics of Workspace as you load data and explore it with Python!

Keep an eye out for 💪  icons throughout the notebook. These will indicate opportunities for you to try out Workspace for yourself!

🏃  Import the data

If you click on the file browser icon, you can see that you have access to event_details.csv, a file that contains ticket sales for different events. The cell below uses pandas to import the data and preview it.

Go ahead and try to run the cell now to import and inspect the data!

To run a cell, click inside it and click "Run" or the ► icon. You can also use Shift-Enter to run a selected cell and automatically switch to the next cell.

# Import pandas
import pandas as pd

# Import the data as a DataFrame
event_details = pd.read_csv("event_details.csv")

# Preview the DataFrame
event_details

💪  Browse through the interactive table to see if you can already learn anything from the data!
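
Outside the interactive table, head() and shape give a quick preview in code. The sketch below is self-contained, so it reads a few made-up rows from an in-memory string rather than event_details.csv. The category values and numbers are hypothetical, and only four of the real file's eight columns are represented.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for event_details.csv
sample_csv = io.StringIO(
    "category_name,month,total_sold,total_sales\n"
    "Concerts,1,120,3600.0\n"
    "Shows,1,45,900.0\n"
    "Concerts,2,80,2400.0\n"
    "Sports,2,60,1500.0\n"
)
sample = pd.read_csv(sample_csv)

# First two rows of the DataFrame
preview = sample.head(2)

# Number of rows and columns
n_rows, n_cols = sample.shape

print(preview)
print(n_rows, n_cols)
```

With the real file, the same two calls would be event_details.head() and event_details.shape.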

Next, we can use the .info() method to print a summary of the DataFrame. It lists each column's name, its data type, and its number of non-null values.

event_details.info()

We see there are no missing values in any of the eight columns, and we have three numeric variables (month, total_sold, and total_sales).
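
The same two observations can also be checked directly in code. This is a minimal sketch on a small made-up DataFrame (the values are hypothetical and only four columns are shown); on the real data, replace sample with event_details.

```python
import pandas as pd

# Hypothetical sample standing in for event_details
sample = pd.DataFrame(
    {
        "category_name": ["Concerts", "Shows", "Concerts", "Sports"],
        "month": [1, 1, 2, 2],
        "total_sold": [120, 45, 80, 60],
        "total_sales": [3600.0, 900.0, 2400.0, 1500.0],
    }
)

# Count the missing values in each column
missing_per_column = sample.isna().sum()

# List the names of the numeric columns
numeric_columns = sample.select_dtypes(include="number").columns.tolist()

print(missing_per_column)
print(numeric_columns)
```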

🎨  Visualize the data

An essential skill in exploratory analysis is data visualization. Let's look at the total number of tickets sold by event category. To do so, we will group the DataFrame by the category_name column and sum the tickets sold in each category.

# Group the DataFrame by the category_name column
category_totals = event_details.groupby("category_name", as_index=False)["total_sold"].sum()

# Sort the DataFrame by the total tickets sold
category_totals = category_totals.sort_values(by="total_sold", ascending=False)

# Preview the new DataFrame
category_totals
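
To see what the group-and-sum step produces, here is a small sketch on made-up rows (the category names and counts are hypothetical). It chains the same groupby, sum, and sort_values calls used above, and adds reset_index only so the row labels run from 0 after sorting.

```python
import pandas as pd

# Hypothetical ticket rows; two of them share the "Concerts" category
sample = pd.DataFrame(
    {
        "category_name": ["Concerts", "Shows", "Concerts", "Sports"],
        "total_sold": [120, 45, 80, 60],
    }
)

# as_index=False keeps category_name as a regular column instead of the index
totals = (
    sample.groupby("category_name", as_index=False)["total_sold"]
    .sum()
    .sort_values(by="total_sold", ascending=False)
    .reset_index(drop=True)
)

print(totals)
```

The two "Concerts" rows collapse into a single row with 200 tickets, which sorts to the top.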

Workspace has a handy chart cell that allows you to quickly generate and customize different chart types. Let's use a bar chart to visualize the DataFrame we created above.

Select the cell below and click "Refresh" to generate the chart!

💪  Be sure to try out other data visualizations by adjusting the chart type, the x-axis, y-axis, and grouping options!

[Chart cell: bar chart titled "Total tickets sold by event category"; x-axis: total_sold, y-axis: category_name, color: none]

🔬  Go forth and analyze!

Well done! You have successfully used Python to load data and explore the resulting DataFrame. Feel free to continue to explore the data and expand on this workspace.

When you're finished, make sure to publish your work. Published work can be shared with peers and featured on your DataCamp profile.

After you have finished preparing your report, consider the following options:

  • Try out our ready-to-use datasets. These cover a variety of topics and include flat files such as CSVs, as well as databases for you to test out your SQL skills!
  • Kickstart your next project by using one of our templates. These provide the code and instructions on various data science topics, ranging from machine learning to visualization.
  • Want to go at it on your own? Open a blank workspace and get coding!