Motorcycle Parts Sales
  • AI Chat
  • Code
  • Report
  • Beta
    Spinner

    Welcome!

    This project originally began as a guided challenge called "Analyzing Motorcycle Sales"; which can be found on Datacamp's Project section in the Learn tab. Following this introduction are two chunks, text containing the prompt and table schema as well as code containing my submission, respectively.

    Motivations

    I was inspired to experiment and expand on this challenge as I have an affinity towards motorcycles. I come from a family full of motorcyclist and enthusiasts; I ride and learned from my dad. Some of my favorite memories from childhood involve feeling the wind or hearing the roar of the engine sitting on the back of my old man's cruiser after he picked me up from school.

    Features

    The purpose of this project is demonstrate my knowledge of SQL. More specifically, this project features skills such as data wrangling and data aggregation. In addition, it has provided an opportunity to improve my markdown skills.

    You're working for a company that sells motorcycle parts, and they've asked for some help in analyzing their sales data!

    They operate three warehouses in the area, selling both retail and wholesale. They offer a variety of parts and accept credit cards, cash, and bank transfer as payment methods. However, each payment type incurs a different fee.

    The board of directors wants to gain a better understanding of wholesale revenue by product line, and how this varies month-to-month and across warehouses. You have been tasked with calculating net revenue for each product line and grouping results by month and warehouse. The results should be filtered so that only "Wholesale" orders are included.

    They have provided you with access to their database, which contains the following table called sales:

    Sales

    ColumnData typeDescription
    order_numberVARCHARUnique order number.
    dateDATEDate of the order, from June to August 2021.
    warehouseVARCHARThe warehouse that the order was made from— North, Central, or West.
    client_typeVARCHARWhether the order was Retail or Wholesale.
    product_lineVARCHARType of product ordered.
    quantityINTNumber of products ordered.
    unit_priceFLOATPrice per product (dollars).
    totalFLOATTotal price of the order (dollars).
    paymentVARCHARPayment method—Credit card, Transfer, or Cash.
    payment_feeFLOATPercentage of total charged as a result of the payment method.
    Unknown integration
    DataFrameavailable as
    revenue_by_product_line
    variable
    -- Solution to prompt
    
    SELECT product_line,
    	CASE 
    		WHEN EXTRACT('month' FROM date) = 6 THEN 'June'
    		WHEN EXTRACT('month' FROM date) = 7 THEN 'July'
    		WHEN EXTRACT('month' FROM date) = 8 THEN 'August'
    		ELSE 'Check'
    	END AS month,
    	warehouse,
    	ROUND(SUM(total * (1 - payment_fee))::numeric, 2) AS net_revenue
    FROM sales
    WHERE client_type = 'Wholesale'
    GROUP BY product_line, warehouse, month
    ORDER BY product_line, month, net_revenue DESC;
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    From this point on this project will be unguided

    Here, we'll continue to examine our sales performance on a month-to-month basis arcoss warehouses.

    Insights
    • Disregarding client type, our North warehouse on average produces the most sales.
    • The highest value sales are found to come from the Central.
    Unknown integration
    DataFrameavailable as
    df5
    variable
    WITH cte AS (
    	SELECT DATE_PART('month', date) AS month
    		,*
    	FROM Sales
    	)
    	
    SELECT	month,
    	warehouse,
    	COUNT(*) AS total_orders,
    	ROUND(AVG(total), 2) AS avg_sales,
    	MAX(total) AS highest_sale,
    	MIN(total) AS lowest_sale
    FROM cte
    GROUP BY month, warehouse
    ORDER BY month, avg_sales DESC
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    In the following query we'll take a look at some key metrics per product line performance by payment type.

    Insights
    • The top revenue generating product line in the lineup is Frame & Body coming from Transfer payments, with a total of $39477.89
    • In all payment types, the product line with the fewest number of transactions on average was the most exspensive; they're all the same product, engines.
      • Transfer, Engine, 60.92
      • Credit card, Engine, 59.71
      • Cash, Engine, 61
    • The majority of transactions were purchased on credit, with a total of 659 transactions.
    Unknown integration
    DataFrameavailable as
    df
    variable
    SELECT payment,
    	product_line,
    	SUM(total) AS total_price,
    	ROUND(AVG(unit_price::int), 2) AS avg_unit_price,
    	COUNT(*) AS num_transactions,
    	SUM(COUNT(*)) OVER (PARTITION BY payment) AS total_transx_payment
    FROM Sales
    GROUP BY payment, product_line
    ORDER BY payment DESC, total_price DESC
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    Now, we'll compare warehouse performance per client type.

    Insights
    • Of the three warehouses, regardless of client type, Central produced the most revenue. This warehouse also shipped out the majority of items sold.
    • The lowest performing seems to be West.
    Unknown integration
    DataFrameavailable as
    df2
    variable
    -- Take a look at sales broken down by sensible categorical data.
    SELECT client_type,
    	warehouse,
    	SUM(quantity) AS total_items_sold,
    	SUM(total) AS grand_total
    FROM Sales 
    GROUP BY 1,2
    ORDER BY client_type, grand_total DESC
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    Lastly, we'll zoom out and look at our overall product line performance.

    Insights
    • The items most frequently come from Suspenion & traction and Breaking system, with a lead of 500 units.
    • Aside from Miscellaneous products, all other product lines produce more revenue than Engine; the top producer was Suspenion & traction.
    Unknown integration
    DataFrameavailable as
    df3
    variable
    -- Identify the highest value transactions per product_line
    SELECT DISTINCT product_line,
    	SUM(quantity) AS total_quantity,
    	SUM(total) AS total_sales,
    	MAX(unit_price) AS highest_price,
    	MIN(unit_price) AS lowest_price
    FROM Sales
    GROUP BY 1
    ORDER BY total_sales DESC, total_quantity DESC;
    This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

    Outro

    Thank you for joining me on this exploration of summary statistics. Now, I invite you to explore my portfolio!

    Please, message me; I look forward to connecting, as well as your feedback.