Workspace
Matt Crabtree/

Duplicate of 2022 12 SQL Live Training Workspace

0
Beta
Spinner

Free Trial Performance Analysis in SQL

How much is a free trial worth? In this SQL live training, you'll learn how to use SQL to analyze marketing data for a free trial of a product. You’ll summarize and aggregate data to calculate the sorts of metrics that are the bread and butter of any marketing analyst role.

This live training will require only SQL and Workspace, it will not involve R or Python.

The Data

Data Access

To access the data, create a PostgreSQL integration with the following details:

host: workspacedemodb.datacamp.com

port: 5432

database: free_trial_performance

username: competition_free_trial

password: workspace

Data Deals

We'll be using synthetic data that was created for this training. This data represents a product that is sold via a 1-month free trial. The Free Trials table records instances of customers beginning a free trial. 1 month after the free trial period starts, the customer may choose to pay, and if so we will have a Purchase record.

There are four tables:

Free Trials

A list of instances of free trials starts.

  • Trial ID - An ID unique to the Free Trial.
  • Free Trial Start Date - The date when the customer began their free trial.
  • Region - The world region where the customer is located.

Purchases

A list of instances of customers paying, following their free trial.

  • Trial ID - The ID of the free trial, from the Free Trials table. This ID is unique as each trial may have a maximum of 1 purchase asociated with it.
  • Purchase Date - The date when the customer made their purchase, 1 month after they began their free trial.
  • Purchase Value - The USD value of the Customer's purchase.

Dates

A list of dates, provided for convenience.

  • Date - A sequential list of dates.
  • Month - The first of the month for each date.

Prices

Optional - a list of prices of the product by region over time. This table will not be used in the live training, and is for optional follow-up activity. Prices are set on a Monthly basis, but the price for each customer is set at the beginning of their free trial, so subsequent price changes will not affect a customer.

  • Free Trial Start Month - the month of free trials that the price applies to.
  • Region - the customer's world region, as in the Free Trials table.
  • The price that will be locked in at the beginning of the customer's Free Trial, based on their Free Trial Start Month & Region.

Unknown integration
Data frameavailable as
df
variable
select * from trials
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.
Unknown integration
Data frameavailable as
df
variable
select * from purchases
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

1. Getting Familiar with the Data

We will query the Free Trials & Purchases tables, and produce graphs showing the volume of both of these over time.

a) Group trials by the month of free_trial_start_date (and order by the same).

Count the rows as num_free_trials.

Unknown integration
Data frameavailable as
df
variable
select
        date_trunc('month', free_trial_start_date) as month
    ,	count(*) as num_free_trials
from trials
    group by 1
    order by 1
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

b) Group purchases by the month of purchase_date (and order by the same).

Count the rows as num_purchases, and sum purchase_value as usd_value.

Call the output purchases_by_month.

Unknown integration
Data frameavailable as
purchases_by_month
variable
select
        date_trunc('month', purchase_date) as month
    ,	count(*) as num_purchases
    ,	sum(purchase_value) as usd_value
from purchases
    group by 1
    order by 1
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

c) Create a line graph of num_purchases by month.

Current Type: Line
Current X-axis: month
Current Y-axis: num_purchases
Current Color: None

Purchases per Day

2. Data Aggregation 1 - Velocity Metrics by Month

We will pull metrics for Free Trial Starts, Purchases, and Gross Merchandise Value by month from the Free Trials & Purchases tables.

a) Now that we can aggregate the data by month, create both summaries as Common Table Expressions (CTEs), and left join our purchases per month against the free trials per month to get the results into a combined results table.

Unknown integration
Data frameavailable as
df
variable
with free_trials_per_month as (
    select
            date_trunc('month', free_trial_start_date) as month
        ,	count(*) as num_free_trials
    from trials
        group by 1
        order by 1
)

,	purchases_per_month as (
    select
            date_trunc('month', purchase_date) as month
        ,	count(*) as num_purchases
        ,	sum(purchase_value) as usd_value
    from purchases
        group by 1
        order by 1
)

select
		free_trials_per_month.month
    ,	free_trials_per_month.num_free_trials
    ,	purchases_per_month.num_purchases
    ,	purchases_per_month.usd_value
from free_trials_per_month
	left join purchases_per_month
    	on purchases_per_month.month = free_trials_per_month.month
This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

Do you notice that there's some data missing? When we left join purchases_per_month, we only match against months that exist in free_trials_per_month. There are several ways to solve this.