Read CSV Files

    The first step in most data science projects is importing your data, which is often stored in Comma-Separated Values (CSV) format. Use this template to import CSV files efficiently and reduce the data cleaning needed later in your notebook. For example, you can select columns, handle null values, and parse dates, all in a single function call!

    Begin by uploading your CSVs to this workspace!

    import pandas as pd
    
    df = pd.read_csv(
        "data/sheffield_weather_station.csv",  # Replace with your CSV file path
        # The following arguments are optional and can be removed:
        # If columns aren't separated by commas, indicate the delimiter here
        # (a raw string avoids Python's invalid-escape warning for the regex \s+)
        sep=r"\s+",
        # Indicate which zero-indexed row number(s) have the column names
        header=0,
        # List of column names to use (with header=0, these replace the names in the file)
        names=["year", "month", "max_c", "min_c", "af", "rain", "sun"],
        # If not all columns are needed, indicate which you need (useful for lower memory usage)
        usecols=["year", "month", "max_c", "min_c", "rain", "sun"],
        # Indicate which column(s) to use as row labels
        index_col=["year", "month"],
        # Single character marking a comment: the rest of the line after it is ignored,
        # and lines that start with it are skipped entirely
        comment="#",
        # Indicate the number of lines to skip at the start of the file (also useful for file comments)
        skiprows=None,
        # Indicate string(s) that should be recognized as NaN/NA
        na_values=["---", "unknown", "no info"],
        # Indicate which column(s) are date column(s), e.g. a list of column names (False disables date parsing)
        parse_dates=False,
        # Indicate number of rows to read (useful for large files)
        nrows=500,
        # Encoding to use when reading file
        encoding="utf-8",
    )
    
    df.head(10)  # Preview the first 10 rows
    # Start analyzing your DataFrame!
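
As a minimal sketch of how a few of these arguments work together, the snippet below reads a small in-memory CSV instead of a file. The sample data, its column names, and the variable names `csv_text` and `df_sample` are made up for illustration and are not part of the template.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for this illustration only
csv_text = """date,station,max_c,rain
2020-01-01,A,7.5,---
2020-01-02,A,unknown,1.2
2020-01-03,B,6.1,0.0
"""

df_sample = pd.read_csv(
    io.StringIO(csv_text),            # Any file-like object works in place of a path
    usecols=["date", "max_c", "rain"],  # Drop the "station" column at read time
    na_values=["---", "unknown"],     # Treat these placeholder strings as missing values
    parse_dates=["date"],             # Convert the "date" column to datetimes
    index_col="date",                 # Use the parsed dates as row labels
)

print(df_sample.dtypes)        # Column types after placeholder strings become NaN
print(df_sample.isna().sum())  # Number of missing values per column
```

Handling placeholders such as "---" at read time lets pandas infer numeric types directly, which should avoid a separate conversion step later in the notebook.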

    For more information on arguments, visit pandas' read_csv() documentation.