Measles
This dataset contains overall and measles, mumps, and rubella (MMR) immunization rates for schools across the United States. Each row corresponds to one school and includes variables such as the latitude, longitude, name, and vaccination rates.
Not sure where to begin? Scroll to the bottom to find challenges!
import pandas as pd
df = pd.read_csv("data/measles.csv")
df.head()
Data Dictionary
Column | Explanation |
---|---|
index | Index ID |
state | School's state |
year | School academic year |
name | School name |
type | Whether a school is public, private, or charter |
city | City |
county | County |
district | School district |
enroll | Enrollment |
mmr | School's Measles, Mumps, and Rubella (MMR) vaccination rate |
overall | School's overall vaccination rate |
xrel | Percentage of students exempted from vaccination for religious reasons |
xmed | Percentage of students exempted from vaccination for medical reasons |
xper | Percentage of students exempted from vaccination for personal reasons |
Don't know where to start?
Challenges are brief tasks designed to help you practice specific skills:
- Explore: What types of schools have the highest overall and mmr vaccination rates?
- Visualize: Create a plot that visualizes the overall and mmr vaccination rates for the ten states with the highest number of schools.
- Analyze: Does location affect the vaccination percentage of a school?
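As a starting point for the Explore and Visualize challenges, here is a minimal sketch of the grouping involved. It runs on a small made-up frame rather than `data/measles.csv`, and the helper names `mean_rates_by` and `top_states_by_school_count` are hypothetical. It also assumes every rate is a valid percentage, so missing or placeholder values in the real file would need filtering first.

```python
import pandas as pd

# Toy stand-in for data/measles.csv; all values are made up for illustration.
toy = pd.DataFrame({
    "state": ["Ohio", "Ohio", "Utah", "Utah", "Ohio", "Ohio"],
    "type": ["Public", "Private", "Public", "Charter", "Public", "Private"],
    "mmr": [96.0, 88.0, 94.0, 90.0, 98.0, 86.0],
    "overall": [95.0, 85.0, 93.0, 89.0, 97.0, 84.0],
})

def mean_rates_by(frame, key):
    """Average overall and mmr rates per group, highest overall first."""
    return (
        frame.groupby(key)[["overall", "mmr"]]
        .mean()
        .sort_values("overall", ascending=False)
    )

def top_states_by_school_count(frame, n=10):
    """Mean rates for the n states with the most rows (one row per school)."""
    top = frame["state"].value_counts().head(n).index
    return mean_rates_by(frame[frame["state"].isin(top)], "state")

by_type = mean_rates_by(toy, "type")
by_state = top_states_by_school_count(toy, n=1)
print(by_type)
print(by_state)
```

With the real data, `mean_rates_by(df, "type")` would address the Explore question, and the table returned by `top_states_by_school_count(df)` could be passed to `DataFrame.plot.bar()` for the Visualize one.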
Scenarios are broader questions to help you develop an end-to-end project for your portfolio:
You are working for a public health organization. The organization has a problem: this year, the overall vaccination rate information for schools is not yet available. To gain an initial idea of the rates, your manager has asked you whether it is possible to use other data to predict the overall vaccination rate of a school. This includes such information as the mmr vaccination rate, the location, and the type of school. Your manager also wants to know how reliable your predictions are.
You will need to prepare a report that is accessible to a broad audience. It should outline your motivation, steps, findings, and conclusions.
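One possible way to approach the scenario is a simple linear baseline with a hold-out set, so that the reliability question can be answered with an error estimate. The sketch below is only an illustration: it uses synthetic data in which `overall` is constructed from `mmr` plus noise, the helper `design_matrix` is hypothetical, and ordinary least squares is an assumed choice of model, not a recommendation from the dataset authors.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real file: overall is built from mmr plus noise,
# so the numbers below say nothing about actual schools.
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    "mmr": rng.uniform(70, 100, n),
    "type": rng.choice(["Public", "Private", "Charter"], n),
})
toy["overall"] = 0.9 * toy["mmr"] + 5 + rng.normal(0, 1.5, n)

def design_matrix(frame):
    """Intercept, mmr, and one-hot school type (first level dropped)."""
    dummies = pd.get_dummies(frame["type"], drop_first=True, dtype=float)
    X = pd.concat([frame[["mmr"]], dummies], axis=1)
    X.insert(0, "intercept", 1.0)
    return X

X = design_matrix(toy)
y = toy["overall"]

# Hold out the last 25% of rows to estimate out-of-sample error.
# The toy rows are already in random order; real data should be shuffled first.
split = int(len(toy) * 0.75)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]

# Ordinary least squares fit on the training rows only.
coef, *_ = np.linalg.lstsq(X_train.to_numpy(), y_train.to_numpy(), rcond=None)

pred = X_test.to_numpy() @ coef
mae = float(np.mean(np.abs(pred - y_test.to_numpy())))
print(f"hold-out mean absolute error: {mae:.2f} percentage points")
```

Reporting the hold-out error in percentage points gives a broad audience a direct sense of how far a typical prediction might be from the true rate.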
df.head()
Check the number of rows and columns
df.shape
View the last rows of df
df.tail()
Check whether any values are missing
df.isna().values.any()
Count the non-null values to help decide how to handle the data
df.notnull().sum().sum()
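A single boolean or a grand total does not show where the gaps are. A per-column breakdown may be more useful for deciding what to drop or fill. The following is a minimal sketch on a made-up frame, and the helper `missing_summary` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Toy frame with a few gaps; not taken from the real file.
toy = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "mmr": [95.0, np.nan, 88.0, 91.0],
    "xrel": [np.nan, np.nan, 1.5, np.nan],
})

def missing_summary(frame):
    """Per-column count and percentage of missing values, worst first."""
    counts = frame.isna().sum()
    summary = pd.DataFrame({
        "missing": counts,
        "percent": (counts / len(frame) * 100).round(1),
    })
    return summary.sort_values("missing", ascending=False)

summary = missing_summary(toy)
print(summary)
```

Calling `missing_summary(df)` on the real data would list each column with its share of missing values. Note that `isna()` only detects true nulls, so any placeholder values the file may use for missing rates would need to be converted to `NaN` first.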