Cleaning Data in Python
Run the hidden code cell below to import the data used in this course.
Add notes about the concepts you've learned and code cells with code you want to keep.
Add your notes here
# Add your code snippets here
Use the DataFrames imported in the first cell to explore the data and practice your skills!
- For each DataFrame, inspect the data types of each column and, where needed, clean and convert columns into the correct data type. You should also rename any columns to have more descriptive titles.
- Identify and remove all the duplicate rows in
- Inspect the unique values of all the columns in `airlines` and clean any inconsistencies.
- For the `airlines` DataFrame, create a new column derived from `dest_region`, where values representing US regions map to `False` and all other regions map to `True`.
- The `banking` DataFrame contains out-of-date ages. Update the `Age` column using today's date and the birth date column.
- Clean the `restaurants_new` DataFrame so that it better matches the categories in the `type` column of the `restaurants` DataFrame. Afterward, given typos in restaurant names, use record linkage to generate possible pairs of rows between `restaurants` and `restaurants_new` using the criteria you think are best.
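
If you want a starting point, the tasks above can be sketched on toy data as follows. This is a minimal illustration and not the course solution. The DataFrames are small invented stand-ins, and the names `wait_min`, `is_international`, `birth_date`, `name`, `us_regions`, `age_on`, and `closest_type` are assumptions made for this example. The pairing step substitutes the standard-library `difflib` for a dedicated record linkage package, so treat it as a rough approximation of that technique.

```python
import difflib
from datetime import date

import pandas as pd

# Toy stand-ins for the course DataFrames; every value here is invented.
airlines = pd.DataFrame({
    "dest_region": ["West US", "west us ", "Europe", "Europe", "EAST US"],
    "wait_min": ["12", "12", "30", "30", "45"],  # hypothetical column
})

# Convert a numeric column that arrived as strings.
airlines["wait_min"] = airlines["wait_min"].astype(int)

# Normalise whitespace and case so equivalent categories collapse together.
airlines["dest_region"] = airlines["dest_region"].str.strip().str.lower()

# Drop rows that are exact duplicates after cleaning.
airlines = airlines.drop_duplicates().reset_index(drop=True)

# Flag non-US destinations; the set of US regions is an assumption.
us_regions = {"west us", "east us"}
airlines["is_international"] = ~airlines["dest_region"].isin(us_regions)


def age_on(birth_date, today):
    """Return the age in whole years on the given day."""
    had_birthday = (today.month, today.day) >= (birth_date.month, birth_date.day)
    return today.year - birth_date.year - (not had_birthday)


banking = pd.DataFrame({
    "birth_date": pd.to_datetime(["1990-06-15", "2000-01-01"]),  # hypothetical name
    "Age": [25, 18],  # stale values to be overwritten
})
today = date.today()
banking["Age"] = banking["birth_date"].apply(lambda d: age_on(d, today))

restaurants = pd.DataFrame({
    "name": ["Blue Door Cafe", "Chez Lumiere"],
    "type": ["american", "french"],
})
restaurants_new = pd.DataFrame({
    "name": ["blue door caffe", "Red Barn Grill", "chez lumier", "Sakura House"],
    "type": ["American ", "american", "frnch", "japanese"],
})

valid_types = restaurants["type"].unique().tolist()


def closest_type(value, cutoff=0.8):
    """Map a category to its closest valid spelling, if one is close enough."""
    cleaned = value.strip().lower()
    match = difflib.get_close_matches(cleaned, valid_types, n=1, cutoff=cutoff)
    return match[0] if match else cleaned


restaurants_new["type"] = restaurants_new["type"].apply(closest_type)

# Block on type (only compare rows with the same category),
# then score name similarity for each candidate pair.
pairs = restaurants.merge(restaurants_new, on="type", suffixes=("", "_new"))
pairs["name_score"] = [
    difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()
    for a, b in zip(pairs["name"], pairs["name_new"])
]
possible_matches = pairs[pairs["name_score"] >= 0.8]
print(possible_matches[["name", "name_new", "name_score"]])
```

The 0.8 cutoffs are arbitrary choices for the toy data. On the real DataFrames, inspect `unique()` output and a sample of scored pairs before settling on thresholds.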