Lego Freak's Wet Dream: An Interview at Denmark's Jewel
  • AI Chat
  • Code
  • Report
  • Beta
    Spinner

    Lego Freak's Wet Dream: An Interview at Denmark's Jewel

    Iā€™m a Lego freak. This goes back as far as I can remember. My mom would tell me that I would sit with the Lego sets, following instructions, in complete silence, for hours on end. No food, no bathroom break, no reaction to questions. I remember the feeling of being entirely engrossed in the process, tunnel-zoned in on building. In fact, I believe that it is the last time I felt a true flow without even knowing it.

    Fig. 1: Me building Lego in Oct 1994.

    So I recently applied to work as a data analyst intern at the famous Lego Group in Denmark. As part of the job interview process, I received the following take-home assignment:

    You are asked to use the provided dataset and database schema to understand the popularity of different Lego sets and themes. The idea is to become familiarized with the data to be ready for an interview with a business stakeholder.

    Create a report to summarize your findings. Include:

    1. What is the average number of Lego sets released per year?
    2. What is the average number of Lego parts per year?
    3. Create a visualization for item 2.
    4. What are the 5 most popular colors used in Lego parts?
    5. [Optional] What proportion of Lego parts are transparent?
    6. [Optional] What are the 5 rarest lego bricks?
    7. Summarize your findings.

    Database Schema

    Data

    inventory_parts

    • "inventory_id" - id of the inventory the part is in (as in the inventories table)
    • "part_num" - unique id for the part (as in the parts table)
    • "color_id" - id of the color
    • "quantity" - the number of copies of the part included in the set
    • "is_spare" - whether or not it is a spare part

    parts

    • "part_num" - unique id for the part (as in the inventory_parts table)
    • "name" - name of the part
    • "part_cat_id" - part category id (as in part_catagories table)

    part_categories

    • "id" - part category id (as in parts table)
    • "name" - name of the category the part belongs to

    colors

    • "id" - id of the color (as in inventory_parts table)
    • "name" - color name
    • "rgb" - rgb code of the color
    • "is_trans" - whether or not the part is transparent/translucent

    inventories

    • "id" - id of the inventory the part is in (as in the inventory_sets and inventory_parts tables)
    • "version" - version number
    • "set_num" - set number (as in sets table)

    inventory_sets

    • "inventory_id" - id of the inventory the part is in (as in the inventories table)
    • "set_num" - set number (as in sets table)
    • "quantity" - the quantity of sets included

    sets

    • "set_num" - unique set id (as in inventory_sets and inventories tables)
    • "name" - the name of the set
    • "year" - the year the set was published
    • "theme_id" - the id of the theme the set belongs to (as in themes table)
    • num-parts - the number of parts in the set

    themes

    • "id" - the id of the theme (as in the sets table)
    • "name" - the name of the theme
    • "parent_id" - the id of the larger theme, if there is one

    Acknowledgments: Rebrickable.com

    Key Takeaways from the Data Exploration

    • There are 11,673 different sets; 11,684 when we include different versions of sets; 11,723 when we include theme (according to set_num). Thus, there are different versions of the same set and some sets belong to more than one theme, of which there are 614. Although having a unique ID, some sets share names.
    • The time period of 2050-2017 spans 68 years but 2 are missing.
    • While average number of parts is approx. 162, median is around 45. Smaller sets are favored.
    • There are only 2,848 sets in the inventory. Each set only once.
    • There are 133 different colors. Colors differ in the level of darkness, presence of glitters, speckles, glow, transparency, chromatic effect, metallicity, and perl-like tint.
    • There are 25,779 unique parts with non-null part number.

    What is the average number of Lego sets released per year?

    The number of sets per year shows the slow start of the Lego Group. Until 1975, 25 years after its first sets, the number of sets per year had hovered around 50. I nbetween 1975 and 1995, the sales increased to around 100 sets a year. Then came the explosion into high hundreds. The most prolific year appears to be 2014, with 713 sets released.

    Hidden code df
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    Hidden code df
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    For the average number of sets per year, sets were counted by their unique set number and sets where the number of parts was 0 or lower were excluded (mostly, these were booklets and manuals).

    Fig. 3: Average Number of Sets Released per Year.

    What is the average number of Lego parts per year?

    The number of lego parts steadily rises to approx. 35,000 parts per year in 2000. Since then the increase has become more rapid. It is important to note that all parts were included, irrespective of whether they are unique or spare. Therefore, it is a good indicator of how large the sets were throughout the years. For example, the most prolific year appears to be 2016 with 155,072 parts per 596 sets, which means there were fewer sets but larger.

    Hidden code df
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
    ā€Œ
    ā€Œ
    ā€Œ