MotherDuck
MotherDuck is a serverless analytics platform that you can query just like any SQL database. It uses DuckDB under the hood to make its queries fast. Let's try it out with a sample dataset!
Result DataFrame available as variable df
SELECT * FROM apps_data
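When a table is large, previewing a handful of rows is usually enough to get a feel for it. The query below is a minimal sketch of that idea; the row count of 10 is an arbitrary choice, not something the dataset requires.

```sql
-- Preview only the first 10 rows of the remote table
-- instead of pulling the whole table into the DataFrame.
SELECT *
FROM apps_data
LIMIT 10;
```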
Exploring the data
As you can see, this MotherDuck database contains app data.
We also have some related data stored locally in this workspace, in the review_data.parquet file.
Let's explore it below:
Result DataFrame available as variable df1
SELECT * FROM 'review_data.parquet'
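Before joining, it can help to check which columns the local file contains and how the reviews are distributed. The sketch below assumes the file has a Sentiment column, as the join query in this notebook suggests; the alias number_of_reviews is chosen here for illustration.

```sql
-- List the column names and types DuckDB infers from the Parquet file.
DESCRIBE SELECT * FROM 'review_data.parquet';

-- Count reviews per sentiment value (assumes a Sentiment column exists).
SELECT
    Sentiment,
    COUNT(*) AS number_of_reviews
FROM 'review_data.parquet'
GROUP BY Sentiment
ORDER BY number_of_reviews DESC;
```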
Joining local and remote data
Did you know that you can join local and remote data with MotherDuck? Let's try it!
In the following cell, we join the remote apps_data table with our local review_data.parquet file to count the reviews for each app by sentiment, while also showing the app rating.
Result DataFrame available as variable df2
SELECT
    apps_data.App,
    apps_data.Rating,
    review_data.Sentiment,
    COUNT(*) AS number_of_reviews_with_sentiment
FROM apps_data
JOIN 'review_data.parquet' AS review_data ON apps_data.App = review_data.app
GROUP BY 1, 2, 3
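The grouped result can contain many rows, so one possible refinement is to sort it and keep only the largest groups. This is a sketch rather than part of the original walkthrough: it reuses the same tables and columns, and the cutoff of 20 rows is an arbitrary assumption.

```sql
-- Same join as above, sorted so the app and sentiment pairs
-- with the most reviews come first, truncated to 20 rows.
SELECT
    apps_data.App,
    apps_data.Rating,
    review_data.Sentiment,
    COUNT(*) AS number_of_reviews_with_sentiment
FROM apps_data
JOIN 'review_data.parquet' AS review_data ON apps_data.App = review_data.app
GROUP BY 1, 2, 3
ORDER BY number_of_reviews_with_sentiment DESC
LIMIT 20;
```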