Let's dive into the Lego world
  • AI Chat
  • Code
  • Report
  • Beta
    Spinner

    Let's dive into the Lego world

    Table of contents

    1. Executive summary
    2. Data and Method
    3. Analysis and results
    4. Summary of main findings
    5. Annex

    1. Executive Summary

    Itroduction

    Lego is a Danish company founded in 1932 by Ole Kirk Christiansen. Originally, the company made wooden toys, but by 1934, it had begun creating plastic toys, including the Lego bricks that would become its most famous product. Lego has grown to be one of the world's most popular toys, noted for its versatility and creativity.

    Lego has always had a strong appeal to both children and adults. Lego offers a fun and imaginative play experience for children, allowing them to build and design their own worlds and characters. For adults, Lego offers a nostalgic connection to their childhood and a creative opportunity for self-expression. Lego has recently expanded into video games, movies, and theme parks, further strengthening its appeal to a wider audience.

    The task

    We recently applied for a position as a data analyst intern at Denmark's well-known Lego Group. We were given the following take-home project as part of the job interview process:

    We were tasked with analyzing the popularity of various Lego sets and themes using the dataset provided. The goal is to become comfortable with the data to be prepared for an interview with a business stakeholder. In doing so we will answer several questions:

    1. What is the average number of Lego sets released per year?
    2. What is the average number of Lego parts per year?
    3. Create a visualization for item 2.
    4. What are the 5 most popular colors used in Lego parts?
    5. What proportion of Lego parts are transparent?
    6. What are the 5 rarest lego bricks?
    7. What are the top 10 Lego themes?

    The report starts with a concise introduction to the data and methodology utilized. This is followed by a comprehensive data analysis phase, where we meticulously examine all relevant information to achieve the project objectives. The report ends with a summarization of the most important findings. The aim of the report is to provide a comprehensive and in-depth examination of the data, leading to a better understanding of the results and their implications.

    Results

    The results of the analysis showed that:

    • The company has released an average of 176.82 sets per year since its establishment.
    • With an average of 162.26 pieces in each set.
    • The most commonly used colors in Lego parts are:
      1. Black
      2. White
      3. Light Bluish Gray
      4. Red
      5. Dark Bluish Gray
    • The analysis also revealed that 6.29% of the 23114 total parts were observed to be transparent with 1455 pieces, while 21695 were noted to be of solid color.
    • The rarest Lego bricks were identified as id_part:
      • 30603pb04
      • 25123
      • 30601
      • 3001b
      • 5306bc017.
    • The top 10 lego themes are:
      1. Supplemental
      2. Technic
      3. City
      4. Friends
      5. Basic Set
      6. Creator
      7. Gear
      8. Service Packs
      9. Duplo
      10. Star Wars.
    • These results provide valuable insights into the company's production and inventory, enabling informed decisions to be made regarding future production goals and inventory management.

    2. Data and Methods

    The database consists of eight tables having information regarding the inventories, parts, part color, sets and themes, spanning the entire history of the company up until the year 2017. Below there is a visualization of how the tables are related to each other. (source) About the method, the database has been querying with PostgreSQL using Common Table Expression, subqueries, aggregate functions, window functions and filtering. The visualizations have been created using the Python library Plotly.

    inventory_parts

    • "inventory_id" - id of the inventory the part is in (as in the inventories table)
    • "part_num" - unique id for the part (as in the parts table)
    • "color_id" - id of the color
    • "quantity" - the number of copies of the part included in the set
    • "is_spare" - whether or not it is a spare part

    parts

    • "part_num" - unique id for the part (as in the inventory_parts table)
    • "name" - name of the part
    • "part_cat_id" - part category id (as in part_catagories table)

    part_categories

    • "id" - part category id (as in parts table)
    • "name" - name of the category the part belongs to

    colors

    • "id" - id of the color (as in inventory_parts table)
    • "name" - color name
    • "rgb" - rgb code of the color
    • "is_trans" - whether or not the part is transparent/translucent

    inventories

    • "id" - id of the inventory the part is in (as in the inventory_sets and inventory_parts tables)
    • "version" - version number
    • "set_num" - set number (as in sets table)

    inventory_sets

    • "inventory_id" - id of the inventory the part is in (as in the inventories table)
    • "set_num" - set number (as in sets table)
    • "quantity" - the quantity of sets included

    sets

    • "set_num" - unique set id (as in inventory_sets and inventories tables)
    • "name" - the name of the set
    • "year" - the year the set was published
    • "theme_id" - the id of the theme the set belongs to (as in themes table)
    • num_parts - the number of parts in the set

    themes

    • "id" - the id of the theme (as in the sets table)
    • "name" - the name of the theme
    • "parent_id" - the id of the larger theme, if there is one

    Acknowledgments: Rebrickable.com

    3. Analysis and results

    3.1 What is the average number of Lego sets released per year?

    In order to gain a comprehensive understanding of the number of sets released by the company over the years, it is necessary to execute a query on the 'sets' table consisting of 11,673 rows and 5 columns. By performing the query, we will be able to determine the annual number of sets released. This information will give us a clear picture of the company's production over time, allowing us to track its growth and development. Furthermore, by analyzing the data contained within the table, we will have the ability to calculate the average number of sets released by the company since its establishment.

    Hidden code df
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    The average has been estimated to be 176.82, providing us with a baseline for understanding the company's production output over time and enabling us to make informed decisions about the company's future production goals.

    3.2 What is the average number of Lego parts per year?

    Through a more in-depth analysis of the 'sets' table, we can gain a deeper understanding of the number of pieces that the company has produced over time. This analysis not only provides us with the overall average number of pieces across all sets, but also offers us an opportunity to identify trends and patterns in the data.

    Hidden code df
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.