    Read Excel Files

    The first step in any data science project is importing your data, which often arrives as an Excel spreadsheet with multiple sheets. Use this template to import Excel files into DataFrames efficiently and reduce data cleaning later. It shows how to choose which sheets to import, set the header rows and index columns, and handle null values, file comments, and date columns.

    Begin by uploading your Excel files to this workspace!

    import pandas as pd
    
    df = pd.read_excel(
        "data/employee_information.xlsx",  # Replace with your Excel file path
        # The following arguments are optional and can be removed:
        # By default pandas will only read the first sheet
        # You can change that by specifying the sheet name(s) or zero-indexed sheet position(s)
        sheet_name="Employee Addresses",
        # Indicate which zero-indexed row number(s) have the column names
        header=1,
        # If not all columns are needed, indicate which you need (useful for lower memory usage)
        usecols=[0, 1, 3, 4, 5, 6],
        # List of column names to use (useful for renaming columns)
        names=["id", "lastname", "country", "city", "street", "number"],
        # Indicate which column(s) to use as row labels
        index_col="id",
        # Anything from this string to the end of a line is ignored (useful if there are file comments)
        comment="Last updated:",
        # Indicate the number of lines to skip at the start of the file (also useful for file comments)
        skiprows=1,
        # Indicate string(s) that should be recognized as NaN/NA
        na_values=["---", "unknown", "no info"],
        # Indicate which column(s) to parse as dates, e.g. a list of column names or positions (False parses none)
        parse_dates=False,
        # Indicate number of rows to read (useful for large files)
        nrows=500,
    )
    
    df.head()  # Preview the DataFrame
    # Start analyzing your DataFrame!
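
The template above reads a single named sheet. As a minimal sketch of reading several sheets at once, the example below passes `sheet_name=None`, which makes `read_excel` return a dictionary of DataFrames keyed by sheet name. The workbook is built in memory so the snippet runs without an uploaded file. The sheet names and values are made up for illustration, and the `openpyxl` package is assumed to be installed.

```python
import io

import pandas as pd

# Build a small two-sheet workbook in memory (hypothetical data, for illustration only)
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"id": [1, 2], "lastname": ["Smith", "Jones"]}).to_excel(
        writer, sheet_name="Employees", index=False
    )
    pd.DataFrame({"id": [1, 2], "city": ["Oslo", "Lima"]}).to_excel(
        writer, sheet_name="Addresses", index=False
    )
buffer.seek(0)  # Rewind so read_excel starts at the beginning of the workbook

# sheet_name=None reads every sheet and returns a dict of {sheet name: DataFrame}
sheets = pd.read_excel(buffer, sheet_name=None, index_col="id")

for name, frame in sheets.items():
    print(name, frame.shape)  # Each sheet is an ordinary DataFrame
```

Passing a list such as `sheet_name=["Employees", "Addresses"]` or `sheet_name=[0, 1]` also returns a dictionary, limited to the sheets you list.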

    For more information on arguments, visit pandas' read_excel() documentation.
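
To see what `na_values` and `parse_dates` do in practice, here is a small sketch under stated assumptions: the column names, the `---` placeholder, and the dates are hypothetical, the workbook is created in memory, and `openpyxl` is assumed to be installed.

```python
import io

import pandas as pd

# Hypothetical data: "---" marks a missing value and dates are stored as text
raw = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "start_date": ["2021-01-15", "2021-02-01", "2021-03-20"],
        "city": ["Oslo", "---", "Lima"],
    }
)
buffer = io.BytesIO()
raw.to_excel(buffer, index=False, engine="openpyxl")
buffer.seek(0)  # Rewind before reading the workbook back

df = pd.read_excel(
    buffer,
    index_col="id",
    na_values=["---"],            # Treat "---" as missing, in addition to pandas' defaults
    parse_dates=["start_date"],   # Convert this column to a datetime dtype
)

print(df.dtypes)                  # start_date should now be a datetime column
print(df["city"].isna().sum())    # Number of values recognized as missing
```

Declaring missing-value markers and date columns at import time means the DataFrame arrives with usable dtypes, so you can skip a separate cleaning pass.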