EDA: CO2 Emissions and Bicycle Market Analysis

    1️⃣ Python 🐍 - CO2 Emissions

    📖 Background

    You volunteer for a public policy advocacy organization in Canada, and your colleague asked you to help her draft recommendations for guidelines on CO2 emissions rules.

    After researching emissions data for a wide range of Canadian vehicles, she would like you to investigate which vehicles produce lower emissions.

    💾 The data I

    You have access to seven years of CO2 emissions data for Canadian vehicles (source):

    • "Make" - The company that manufactures the vehicle.
    • "Model" - The vehicle's model.
    • "Vehicle Class" - Vehicle class by utility, capacity, and weight.
    • "Engine Size(L)" - The engine's displacement in liters.
    • "Cylinders" - The number of cylinders.
    • "Transmission" - The transmission type: A = Automatic, AM = Automated manual, AS = Automatic with select shift, AV = Continuously variable, M = Manual, 3-10 = Number of gears.
    • "Fuel Type" - The fuel type: X = Regular gasoline, Z = Premium gasoline, D = Diesel, E = Ethanol (E85), N = Natural gas.
    • "Fuel Consumption Comb (L/100 km)" - Combined city/highway (55%/45%) fuel consumption in liters per 100 km (L/100 km).
    • "CO2 Emissions(g/km)" - The tailpipe carbon dioxide emissions in grams per kilometer for combined city and highway driving.

    The data comes from the Government of Canada's open data website.
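
    The combined figure described above is a 55%/45% blend of city and highway consumption. A minimal sketch of that weighting, assuming it is a plain weighted average; the city and highway values and the helper name `combined_consumption` are hypothetical and do not come from the dataset:

```python
# Hypothetical city and highway ratings in L/100 km (not taken from the dataset)
city_l_per_100km = 10.0
highway_l_per_100km = 7.0


def combined_consumption(city, highway, city_share=0.55):
    """Weighted average of city and highway fuel consumption (L/100 km).

    Assumes the 55%/45% split is applied as a simple weighted mean.
    """
    return city_share * city + (1 - city_share) * highway


combined = combined_consumption(city_l_per_100km, highway_l_per_100km)
print(round(combined, 2))
```

    With these made-up inputs, the combined value sits closer to the city rating because city driving carries the larger weight.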

    # Import the packages used for data handling and plotting
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns
    import matplotlib.lines as mlines
    
    # Load the data
    cars = pd.read_csv('data/co2_emissions_canada.csv')
    
    # Create a NumPy array from each column
    cars_makes = cars['Make'].to_numpy()
    cars_models = cars['Model'].to_numpy()
    cars_classes = cars['Vehicle Class'].to_numpy()
    cars_engine_sizes = cars['Engine Size(L)'].to_numpy()
    cars_cylinders = cars['Cylinders'].to_numpy()
    cars_transmissions = cars['Transmission'].to_numpy()
    cars_fuel_types = cars['Fuel Type'].to_numpy()
    cars_fuel_consumption = cars['Fuel Consumption Comb (L/100 km)'].to_numpy()
    cars_co2_emissions = cars['CO2 Emissions(g/km)'].to_numpy()
    
    # Preview the dataframe
    cars

    # Look at the first ten items in the CO2 emissions array
    cars_co2_emissions[:10]

    💪 Challenge I

    Help your colleague gain insights on the type of vehicles that have lower CO2 emissions. Include:

    1. What is the median engine size in liters?
    2. What is the average fuel consumption for regular gasoline (Fuel Type = X), premium gasoline (Z), ethanol (E), and diesel (D)?
    3. What is the correlation between fuel consumption and CO2 emissions?
    4. Which vehicle class has lower average CO2 emissions, 'SUV - SMALL' or 'MID-SIZE'?
    5. What are the average CO2 emissions for all vehicles? For vehicles with an engine size of 2.0 liters or smaller?
    6. Any other insights you found during your analysis?

    1. What is the median engine size in liters?

    The median engine size is 3.0 liters.

    # Checking the median engine size
    median_engine = cars['Engine Size(L)'].median()
    median_engine
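
    The same statistic can also be computed from a NumPy array like the ones created when the data was loaded. A small sketch on made-up engine sizes (the values below are illustrative, not the dataset), showing that `np.median` and the pandas `Series.median` method agree:

```python
import numpy as np
import pandas as pd

# Made-up engine sizes in liters, for illustration only
engine_sizes = np.array([1.6, 2.0, 2.4, 3.0, 3.5, 5.0, 6.2])

# Median from the NumPy array
median_np = np.median(engine_sizes)

# Median from the equivalent pandas Series
median_pd = pd.Series(engine_sizes).median()

print(median_np, median_pd)
```

    With an odd number of values the median is the middle element after sorting; with an even number, both functions average the two middle elements.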

    2. What is the average fuel consumption for regular gasoline (Fuel Type = X), premium gasoline (Z), ethanol (E), and diesel (D)?

    The average fuel consumption, in liters per 100 km, for each of these fuel types is:

    • Diesel (D) = 8.84 L/100 km
    • Regular gasoline (X) = 10.08 L/100 km
    • Premium gasoline (Z) = 11.42 L/100 km
    • Ethanol (E) = 16.86 L/100 km

    # Define a dictionary to map the fuel type codes to their names
    fuel_type_names = {'X': 'Regular Gasoline', 'Z': 'Premium Gasoline', 'E': 'Ethanol', 'D': 'Diesel', 'N': 'Natural Gas'}
    
    # Create a color palette
    palette = {'X': '#77AC30', 'Z': '#D9AF3B', 'E': '#BA3A0A', 'D': '#A6A6A6'}
    
    # Checking average fuel consumption by type, excluding natural gas (N).
    avg_fuel_consumption = cars.groupby('Fuel Type')['Fuel Consumption Comb (L/100 km)'].mean().drop('N').sort_values()
    
    # Plotting the average fuel consumption
    sns.set(style="whitegrid")
    ax = sns.barplot(x=avg_fuel_consumption.index, y=avg_fuel_consumption, palette=palette)
    
    # Add a value label (two decimals) above each bar
    for i, v in enumerate(avg_fuel_consumption):
        ax.text(i, v, "{:.2f}".format(v), ha='center', fontweight='light')
    
    # Title and labels
    plt.xlabel('Fuel Type')
    plt.ylabel('Fuel Consumption')
    plt.title('Average fuel consumption by type')
    plt.xticks(range(len(avg_fuel_consumption.index)), [fuel_type_names.get(fuel_type, 'Unknown') for fuel_type in avg_fuel_consumption.index], rotation=45)
    
    plt.tight_layout()
    plt.show()
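
    The group-and-average step can be checked without the plot. A minimal sketch on a toy dataframe (the rows are invented for illustration and reuse only the column names and fuel codes from the data description):

```python
import pandas as pd

# Toy rows, invented for illustration; column names match the dataset
toy = pd.DataFrame({
    'Fuel Type': ['X', 'X', 'Z', 'Z', 'D', 'E'],
    'Fuel Consumption Comb (L/100 km)': [9.0, 11.0, 12.0, 10.0, 8.0, 17.0],
})

fuel_type_names = {'X': 'Regular Gasoline', 'Z': 'Premium Gasoline',
                   'E': 'Ethanol', 'D': 'Diesel'}

# Mean consumption per fuel type, relabeled with readable names and sorted
avg = (
    toy.groupby('Fuel Type')['Fuel Consumption Comb (L/100 km)']
    .mean()
    .rename(index=fuel_type_names)
    .sort_values()
)

print(avg.round(2))
```

    Renaming the index before plotting would also remove the need to remap tick labels afterwards, though that is a design choice rather than a requirement.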

    3. What is the correlation between fuel consumption and CO2 emissions?

    The correlation between fuel consumption and CO2 emissions is strong and positive, with a Pearson correlation coefficient of approximately 0.918. Because the coefficient is close to +1, the relationship is close to linear: as fuel consumption increases, CO2 emissions increase as well.

    # Checking correlation between the two variables
    corr = cars['Fuel Consumption Comb (L/100 km)'].corr(cars['CO2 Emissions(g/km)'])
    corr
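
    By default, `Series.corr` computes the Pearson coefficient, the covariance of the two variables divided by the product of their standard deviations. A short sketch on made-up numbers (not the dataset) that computes it by hand and compares the result with pandas:

```python
import numpy as np
import pandas as pd

# Made-up paired observations, for illustration only
fuel = np.array([6.0, 8.0, 10.0, 12.0, 15.0])       # L/100 km
co2 = np.array([140.0, 190.0, 230.0, 280.0, 350.0])  # g/km

# Deviations from the mean of each variable
fuel_dev = fuel - fuel.mean()
co2_dev = co2 - co2.mean()

# Pearson r: sum of cross-products over the root of the product of squared sums
r_manual = (fuel_dev * co2_dev).sum() / np.sqrt(
    (fuel_dev ** 2).sum() * (co2_dev ** 2).sum()
)

# The same coefficient via pandas (method='pearson' is the default)
r_pandas = pd.Series(fuel).corr(pd.Series(co2))

print(r_manual, r_pandas)
```

    The two values should match up to floating-point rounding.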