Course notes: Intermediate postgreSQL

Set the source to 'Course Databases' to query databases used in SQL courses.
Set the source to 'DataFrames and CSVs' to query this course's CSV files (if any) with SQL
Write SQL! Note that you cannot run queries that modify the database.

The content

Explore your data use SELECT.
Summary statistic.
Filter summery data with WHERE
HAVING with GROIP BY.
Finding and resolving missing data..
Subseting missing values use COLEDSCE.
CASE WHEN statment.
Math with dates.
Rounding and Truncating numbers.
Common Table Expression.
Windows functions

Here we will content with this much and we will not explain the statistical functions such as STDEV() AND Mode maybe next.

Here we use films table from database cinema

Explor your data use SELECT column from specify table

Unknown integration

DataFrameavailable as

df

variable

-- Select all fields from table
SELECT *
FROM cinema.films;

This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

Unknown integration

DataFrameavailable as

df1

variable

-- Specify some column from table
SELECT id, country, duration ,language 
FROM cinema.films
-- detriment numbers of rows lilit 5
LIMIT 5;

This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

Summery statistics include MAX(), MIN(), SUM(), AVG() and COUNT() functions

Why did we use summery statistics in SQL

Summary statistics are used in SQL Server to provide a statistical summary of the entire table or view. They can be used to get a quick glance at the shape and distribution of the data before taking a deep dive³. The postgreSQL query optimizer uses distribution statistics when creating a query plan. The statistics provide information about the distribution of column values across participating rows, helping the optimizer better estimate the number of rows, or cardinality, of the query results⁴.

If you don't use groupby with aggregate function you must take an error

Unknown integration

DataFrameavailable as

df2

variable

-- count the total numbers of gross by country and duration 
SELECT country, duration,
SUM(gross) AS total_gross
FROM cinema.films
-- Grouped by country and duration
GROUP BY country, duration
-- order by country
ORDER BY country;

This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

Unknown integration

DataFrameavailable as

df3

variable

-- Selrct min, max and avg for gross
SELECT
MIN(gross) AS mini_gross,
MAX(gross) AS maxi_gross,
AVG(gross) AS main_gross
FROM cinema.films;

This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

Unknown integration

DataFrameavailable as

df4

variable

-- Selrct min, max and avg for gross
SELECT
MIN(gross) AS mini_gross,
MAX(gross) AS maxi_gross,
AVG(gross) AS main_gross
FROM cinema.films
-- Filter by country 'Argentina'
WHERE country = 'Argentina';

This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

Cant did use WHERE with GROUP BY it's give you an error also you can use HAVING with GROUP BY

The reason why we need to use GROUP BY with HAVING but not with WHERE is because the WHERE statement is evaluated before any aggregations take place.

The alternate HAVING is placed after the GROUP BY and allows you to filter the returned data by an aggregated column. Using HAVING, you can return the aggregate filtered results!

Here we use payment table from dvdrentals database

Unknown integration

DataFrameavailable as

df6

variable

-- Explore the table
SELECT * FROM dvdrentals.payment;

This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

Unknown integration

DataFrameavailable as

df5

variable

-- Select payment
SELECT payment_id, 
-- Calculate sum alias by total amount
SUM(amount) AS total_amount,
-- calculate average of amount
AVG(amount) AS average_amount
FROM dvdrentals.payment
-- Grouped by payment id
GROUP BY payment_id
-- Filter by aggregate column
HAVING SUM(amount) < 8.99;

This query is taking long to finish...Consider adding a LIMIT clause or switching to Query mode to preview the result.

‌
‌
‌

Course notes: Intermediate postgreSQL

.mfe-app-workspace-kj242g{position:absolute;top:-8px;}.mfe-app-workspace-11ezf91{display:inline-block;}.mfe-app-workspace-11ezf91:hover .Anchor__copyLink{visibility:visible;}The content

Here we will content with this much and we will not explain the statistical functions such as STDEV() AND Mode maybe next.

Here we use films table from database cinema

Explor your data use SELECT column from specify table

Summery statistics include MAX(), MIN(), SUM(), AVG() and COUNT() functions

Why did we use summery statistics in SQL

If you don't use groupby with aggregate function you must take an error

Cant did use WHERE with GROUP BY it's give you an error also you can use HAVING with GROUP BY

The reason why we need to use GROUP BY with HAVING but not with WHERE is because the WHERE statement is evaluated before any aggregations take place.

The alternate HAVING is placed after the GROUP BY and allows you to filter the returned data by an aggregated column. Using HAVING, you can return the aggregate filtered results!

Here we use payment table from dvdrentals database

The content