Bikeshare Insights: Summer in the Windy City

    Source: Divvy Bikes

    Introduction: Initial Summary

    The Bikeshare Insights: Summer in the Windy City data for July 2023 is made up of 767,650 observations across 13 columns. The data allows analysis of ride IDs, start and end stations (identified by name, longitude and latitude), the types of bikes being used, start and end times, and the member or casual status of the bike users. In my initial analysis I identified the following columns and their data types:

    • ride_id - VARCHAR
    • rideable_type - VARCHAR
    • started_at - VARCHAR
    • ended_at - VARCHAR
    • start_station_name - VARCHAR
    • start_station_id - VARCHAR
    • end_station_name - VARCHAR
    • end_station_id - VARCHAR
    • start_lat - DOUBLE
    • start_lng - DOUBLE
    • end_lat - DOUBLE
    • end_lng - DOUBLE
    • member_casual - VARCHAR

    I found NULL values in these columns:

    • start_station_name
    • start_station_id
    • end_station_name
    • end_station_id
    • end_lat
    • end_lng

    The data in the started_at and ended_at columns should be converted to the TIMESTAMP format. The NULL values in the other columns are more problematic, as there are many unique journeys which occur only once: start_station_name and end_station_name are 16% and 17% NULL respectively, while end_lat and end_lng are each less than 1% NULL. Calculating a median of journeys where there are no NULL values is possible. The resulting 1,389 journeys which share the median of 2 could be distributed among the observations that have NULL values. This would raise further complications and the end result would be of questionable use. I recommend that NULL values be replaced with the term Not Available.
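The cleaning steps recommended above can be sketched in plain Python. This is a minimal illustration, not the query used in the report: the sample rows and station names are invented, the timestamp text layout (`YYYY-MM-DD HH:MM:SS`) is an assumption, and the helper names `null_share` and `clean` are hypothetical.

```python
from datetime import datetime

# Hypothetical sample rows; the values are invented for illustration only.
rows = [
    {"started_at": "2023-07-01 08:00:00", "ended_at": "2023-07-01 08:15:30",
     "start_station_name": "Station A", "end_station_name": None},
    {"started_at": "2023-07-02 17:45:10", "ended_at": "2023-07-02 18:05:00",
     "start_station_name": None, "end_station_name": "Station B"},
    {"started_at": "2023-07-03 12:00:00", "ended_at": "2023-07-03 12:30:00",
     "start_station_name": "Station A", "end_station_name": "Station B"},
    {"started_at": "2023-07-04 09:10:00", "ended_at": "2023-07-04 09:20:00",
     "start_station_name": "Station C", "end_station_name": "Station A"},
]

TIME_COLUMNS = ("started_at", "ended_at")
STATION_COLUMNS = ("start_station_name", "end_station_name")
TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed text layout of the timestamps


def null_share(records, column):
    """Return the percentage of records whose value in `column` is None."""
    missing = sum(1 for r in records if r[column] is None)
    return 100 * missing / len(records)


def clean(record):
    """Return a copy with parsed timestamps and labelled missing stations."""
    cleaned = dict(record)
    for col in TIME_COLUMNS:
        cleaned[col] = datetime.strptime(record[col], TIME_FORMAT)
    for col in STATION_COLUMNS:
        if cleaned[col] is None:
            cleaned[col] = "Not Available"
    return cleaned


# Share of missing values per station column, measured before cleaning.
shares = {col: null_share(rows, col) for col in STATION_COLUMNS}
cleaned_rows = [clean(r) for r in rows]
```

Once the two time columns hold real timestamps, a ride duration is simply `cleaned["ended_at"] - cleaned["started_at"]`, which is not possible while they are stored as text.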

    1.1 Bike use by rideable type

    Electric bikes and classic bikes differ by 24,966 hires. Given that the percentage difference between the use of these two types of bikes is only 6.45%, it would be difficult to draw any worthwhile conclusion from this data. However, it is clear that the use of docked bikes is a great deal lower (18,424 hires), representing only 2.46% of the total electric and classic bike use. As of 2021, docked bikes had to be taken from and returned to Divvy bike docks, whereas the new generation of e-bikes can be left attached to a range of urban fixtures such as a light pole. This reduced range of parking choices may explain their low use.
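The percentages quoted above can be checked from the figures already given in the report. The sketch below assumes that the 767,650 rides consist of only the three rideable types mentioned (electric, classic and docked); the per-type counts it derives are a consequence of that assumption, not values read from the data. Under it, the 6.45% figure appears to be the difference divided by the larger of the two counts.

```python
# Figures stated in the report.
total_rides = 767_650
docked_rides = 18_424
electric_classic_gap = 24_966

# Assumption: every ride is electric, classic or docked, so the
# remaining rides are split between electric and classic bikes.
electric_plus_classic = total_rides - docked_rides

# Two counts with a known sum and a known difference.
larger_count = (electric_plus_classic + electric_classic_gap) // 2
smaller_count = electric_plus_classic - larger_count

# Gap relative to the more heavily used of the two types.
gap_pct = 100 * electric_classic_gap / larger_count

# Docked hires relative to combined electric and classic hires.
docked_pct = 100 * docked_rides / electric_plus_classic

print(f"electric + classic hires: {electric_plus_classic:,}")
print(f"gap relative to larger count: {gap_pct:.2f}%")
print(f"docked relative to electric + classic: {docked_pct:.2f}%")
```

The report does not say which of the two types is the more heavily used, so the sketch labels the derived counts only as larger and smaller.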