Workspace
Justin Arcilla/

Personal Project: Michelin Star Data Exploration

0
Beta
Spinner

Michelin Star Data Exploration

Project GOALS:

  1. SQL Skills Demonstration- Explore data from a CSV using SQL queries. Showcase queries that will manipulate the dataset. (DataCamp treats the CSV as a dataframe and converts SQL queries to Python code)
  2. Tableau Skills Demonstration- Export the data from certain queries (on this platform, export dataframes to CSV sheets) and load it into Tableau to create visualizations.

I found this dataset on Kaggle:

https://www.kaggle.com/datasets/ngshiheng/michelin-guide-restaurants-2021

The dataset comes in CSV format.

The different Michelin Restaurant Awards (From the Michelin website)

  • 1 Michelin Star (The lowest rank)
  • 2 Michelin Stars
  • 3 Michelin Stars (The highest rank)
  • Green Star (Special)
  • Bib Gourmand (Special)
  1. A Michelin Star is awarded to restaurants offering outstanding cooking. We take into account five universal criteria: the quality of the ingredients, the harmony of flavours, the mastery of techniques, the personality of the chef as expressed through their cuisine and, just as importantly, consistency both across the entire menu and over time. Along with seeking out new Stars, we continually reassess existing Star restaurants to ensure the same high standard of cooking is being offered to guests.
  2. Green star: The MICHELIN Green Star is an annual award which highlights restaurants at the forefront of the industry when it comes to their sustainable practices.
  3. Bib Gourmand: Best value for money restaurants – offering a three course meal at a reasonable price – have been pointed out using a special award: the Bib Gourmand. The price limit for Bib Gourmand consideration varies from country to country, depending on the cost of living, but the inspectors are searching for the same high quality, wherever they happen to be in the world.

Tableau Dashboard Link:

https://public.tableau.com/views/MichelinStar2024/MichelinStar2024?:language=en-US&publish=yes&:sid=&:display_count=n&:origin=viz_share_link

How To Use:

This dashboard has 5 parts:

  1. Map
  2. Awards Bar
  3. Region Bar
  4. Cuisine Tree Map
  5. Price Range Toggle

Use any of the 2-5 charts to filter the map or the other charts

Dashboard can answer the question:

How many [award type] awards does [region] have for [cuisine] type of food at [price] price?

SQL Queries Demonstration

--1. What are the columns names and column data types? (This won't work in DataCamp, a hypothetical query)

SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = michelin_maps;

--2. A second way: Select all columns and get a quick glance at the "head" of the data.

Unknown integration
DataFrameavailable as
df1
variable
SELECT *
FROM 'michelin_maps.csv'
LIMIT 5
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

--3. How many restaurants are there?

Unknown integration
DataFrameavailable as
df
variable
SELECT COUNT (Name) AS total_restaurants
FROM 'michelin_maps.csv'
WHERE Name IS NOT NULL;
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

--4. Do we have restaurants with the same name in different locations?

Unknown integration
DataFrameavailable as
df4
variable
SELECT 
	COUNT(*) AS count_names,
	Name
FROM michelin_maps.csv
GROUP BY Name
ORDER BY COUNT(*) DESC
LIMIT 20;
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

--5. How many restaurants won each type of award?

Unknown integration
DataFrameavailable as
df3
variable
SELECT
	COUNT (name) AS total_restaurants,
	Award
FROM 'michelin_maps.csv'
GROUP BY Award;
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

The Award column uses a comma serparated value for restaurants with more than one award, let's separate these categories using the comma as a delimiter.

Unknown integration
DataFrameavailable as
df6
variable
SELECT
	COUNT (name) AS total_restaurants,
	UNNEST(STRING_TO_ARRAY(Award, ',')) AS Award
FROM 'michelin_maps.csv'
GROUP BY Award;
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.



  • AI Chat
  • Code