this is the nav!
Workspace
Marc Molina/

# Supermarket Transaction Analysis in SQL

0
Beta

## .mfe-app-workspace-kj242g{position:absolute;top:-8px;}.mfe-app-workspace-11ezf91{display:inline-block;}.mfe-app-workspace-11ezf91:hover .Anchor__copyLink{visibility:visible;}Live training | 2023-03-14 11am EST | Supermarket Transaction Analysis in SQL | Martina Chiari

The checkout line is a familiar part of the retail experience. Any retailer needs to understand their checkout data to answer important questions like "do people spend more with card or cash?", "how many items do people buy at different times?", and "are self-service checkouts quicker than those with a cashier?".

In this live training, you'll use SQL to analyze transaction data from a Polish Supermarket to answer financial and logistical questions.

A financial analysis of point of sale transactions can have many purposes. A non-exhaustive list is:

• increase sales by identifying what matters to the customer, is it promotions? on what items? is it opening hours? prices of particular items? and so on;
• increasing productivity of stores and operators by identifying factors that result in Lower processing time per item or basket size, accurate predictions for labour scheduling, adequate level of inventory, fewer scanning errors at checkout, and so on;
• predict sales in the short, medium and long term to inform decisions on investments in fixed assets and human resources, financing, go-to-market invesments, etc.

### Task 1: Exploring the data

A good first step in exploring a database is to select the first few rows from each table.

#### Instructions

Select all columns and the first 100 rows from `pos_operator_logs` and `pos_transactions`.

Unknown integration
DataFrameavailable as
df
variable
```.mfe-app-workspace-11z5vno{font-family:JetBrainsMonoNL,Menlo,Monaco,'Courier New',monospace;font-size:13px;line-height:20px;}```SELECT *
FROM 'pos_operator_logs.csv'
LIMIT 100``````
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
Unknown integration
DataFrameavailable as
df
variable
``````SELECT *
FROM 'pos_transactions.csv'
LIMIT 100``````
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

### Task 2: Do more people make transactions by card or by cash?

Understanding how people pay is crucial for shop logistics like deciding what kind of checkout equipment to buy.

#### Instructions

Count how many transactions were processed by cash (the case when `t_cash` is true) and by card (the case when `t_card` is true). Use the `pos_transactions` table.

Unknown integration
DataFrameavailable as
df
variable
``````SELECT
COUNT(CASE WHEN t_cash = 'True' THEN 1 END) AS cash_transaction,
COUNT(CASE WHEN t_card = 'True' THEN 1 END) AS card_transaction
FROM 'pos_transactions.csv'``````
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

### Task 3: Do people spend more per transaction when using cash or card?

Similarly, knowing how much people spend per transaction using different payment methods is helpful for deciding on security arrangments, and marketing to encourage shoppers to use different payment methods.

Find the average per transaction type when a transaction is only card, or only cash.

#### Instructions

Calculate the mean (average) amount spent per transaction for cash and for card.

Unknown integration
DataFrameavailable as
df
variable
``````SELECT
AVG(CASE WHEN t_cash AND NOT t_card THEN amount END) AS avg_cash_transaction,
AVG(CASE WHEN t_card AND NOT t_cash THEN amount END) AS avg_card_transaction
FROM 'pos_transactions.csv'
``````
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.