Online Ticket Sales Database
👋 Welcome to your workspace! Here, you can run SQL queries, write Python code, and add text in Markdown. This workspace is automatically connected to an Amazon Redshift database containing tables about online ticket sales (source).
You can click the "Browse tables" button in the upper righthand corner of the cell below to view the available tables.
There is a short query and a visualization of the number of tickets sold and the price per ticket over time rendered in Plotly to get you started.
SELECT
caldate,
eventname,
catgroup,
priceperticket,
SUM(qtysold) AS qtysold
FROM event
INNER JOIN category USING(catid)
INNER JOIN date USING(dateid)
INNER JOIN sales USING(eventid)
INNER JOIN listing USING(eventid)
GROUP BY caldate, eventname, catgroup, priceperticket
ORDER BY qtysold DESC
LIMIT 100
# Import libraries
import pandas as pd
import plotly.express as px
# Create scatter plot
fig = px.scatter(
ticket_prices,
x="caldate",
y="qtysold",
color="catgroup",
size="priceperticket",
hover_data=["eventname"],
)
# Create labels and show plot
fig.update_layout(
title="Biggest Concerts and Shows over Time<br><sup>By Quantity Sold and Price of Tickets</sup>",
title_x=0.5,
xaxis_title="Date",
yaxis_title="Quantity Sold",
legend_title="Category",
template="plotly_dark",
)
fig.show()
This is an interactive plot! Hover over different points to learn the details of each event.
💪 Now it's your turn to construct your own queries and analyze the data! Remember, you can review the tables in the database at any point using the "Browse tables" button.
SELECT *
FROM category;
SELECT *
FROM date;
SELECT *
FROM event;
SELECT *
FROM listing;
SELECT *
FROM sales;
SELECT *
FROM users;
SELECT *
FROM venue;
SELECT e.eventname,
d.caldate AS date,
v.venuename AS venue,
v.venuecity AS location,
SUM(l.numtickets) OVER
(PARTITION BY e.eventname) AS qty_tix_for_sale
FROM event AS e
INNER JOIN date AS d
USING(dateid)
INNER JOIN venue AS v
USING(venueid)
INNER JOIN listing AS l
USING(eventid)
LEFT JOIN sales AS s
USING(listid)
ORDER BY qty_tix_for_sale
LIMIT 100;
Percentage of listing tickets sold for each event
WITH event_sales AS(
SELECT d.caldate AS date,
e.eventname,
v.venuename AS venue,
v.venuecity || ', ' || v.venuestate AS location,
SUM(l.numtickets) AS qty_tix_for_sale,
SUM(s.qtysold) AS qty_tix_sold
FROM date AS d
INNER JOIN event AS e
USING(dateid)
INNER JOIN venue AS v
USING(venueid)
LEFT JOIN listing AS l
-- ON l.eventid = e.eventid
USING(eventid)
LEFT JOIN sales AS s
ON l.eventid = s.eventid
WHERE l.numtickets IS NOT NULL AND s.qtysold IS NOT NULL
GROUP BY 1,2,3,4
ORDER BY qty_tix_for_sale DESC
)
SELECT *,
ROUND((qty_tix_sold::decimal / qty_tix_for_sale::decimal)*100, 2) AS pct_sold
FROM event_sales
ORDER BY pct_sold DESC;
SELECT firstname,
lastname,
username,
pricepaid
FROM users, sales
WHERE userid = buyerid
LIMIT 5;