Workspace
Khanh Nguyen/

Sleep Health and Lifestyle (Python + SQL)

Sleep Health and Lifestyle

This synthetic dataset contains sleep and cardiovascular metrics, as well as lifestyle factors, for close to 400 fictitious people.

The workspace is set up with one CSV file, data.csv, with the following columns:

  • Person ID
  • Gender
  • Age
  • Occupation
  • Sleep Duration: Average number of hours of sleep per day
  • Quality of Sleep: A subjective rating on a 1-10 scale
  • Physical Activity Level: Average number of minutes the person engages in physical activity daily
  • Stress Level: A subjective rating on a 1-10 scale
  • BMI Category
  • Blood Pressure: Indicated as systolic pressure over diastolic pressure
  • Heart Rate: In beats per minute
  • Daily Steps
  • Sleep Disorder: One of None, Insomnia or Sleep Apnea
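
Because Blood Pressure is stored as text in systolic-over-diastolic form, it usually has to be split before any numeric analysis. A minimal sketch, assuming the values look like 126/83 (the helper name split_blood_pressure is made up for illustration and is not part of the workspace):

```python
def split_blood_pressure(reading: str) -> tuple[int, int]:
    """Split a 'systolic/diastolic' string into two integers."""
    systolic, diastolic = reading.strip().split("/")
    return int(systolic), int(diastolic)


# Illustrative value, not taken from data.csv
print(split_blood_pressure("126/83"))
```

In a notebook, a helper like this could be applied to the Blood Pressure column with pandas' Series.map to obtain two numeric columns.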

Check out the guiding questions or the scenario described below to get started with this dataset! Feel free to make this workspace yours by adding and removing cells, or editing any of the existing cells.

Source: Kaggle

SQL cell (result available as the DataFrame variable df):
SELECT *
FROM 'data.csv' AS df1
LIMIT 10
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the CSV file into a DataFrame
df = pd.read_csv("data.csv")

# Rename the columns
df.rename(columns={
    'Sleep Disorder': 'Sleep_Disorder',
    'Stress Level': 'Stress_Level',
    'Sleep Duration': 'Sleep_Duration',
    'Physical Activity Level': 'PA_level',
    'BMI Category': 'BMI',
    'Blood Pressure': 'Blood_Pressure',
    'Quality of Sleep': 'Sleep_Quality'
}, inplace=True)

# Save the modified DataFrame back to a CSV file
df.to_csv("data.csv", index=False)
df.head()
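
One thing to watch when loading: depending on the version, pandas may treat the literal string None as a missing value by default, so the None category of Sleep Disorder can arrive as NaN. A small sketch with made-up inline data (not data.csv), showing how keep_default_na=False keeps the label as text:

```python
import io

import pandas as pd

# Made-up two-row sample standing in for data.csv
sample = io.StringIO("Person ID,Sleep Disorder\n1,None\n2,Insomnia\n")

# keep_default_na=False stops pandas from converting strings such as
# 'None' or 'NA' into NaN; empty cells then stay as empty strings
df_sample = pd.read_csv(sample, keep_default_na=False)
print(df_sample["Sleep Disorder"].tolist())
```

Whether this is needed depends on how later cells filter on the column, so treat it as an option to check rather than a required change.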

1. Factors that contribute to sleep disorders

  • Exploring the factors
SQL cell (result available as the DataFrame variable df1):
SELECT Gender, Occupation, Sleep_Disorder, Stress_Level
FROM 'data.csv'
WHERE Sleep_Disorder != 'None'
GROUP BY Gender, Occupation, Sleep_Disorder, Stress_Level
LIMIT 10;

Based on the chart

  • We can see the top occupations related to sleep disorders:
  • Sales, Software Engineer, Doctor, Lawyer, Scientist, Accountant and Teacher. In particular, lawyers show both disorders (sleep apnea and insomnia).
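
The occupation ranking above can also be checked with a count table instead of reading it off a chart. A minimal sketch using pandas.crosstab on made-up rows (the toy DataFrame is illustrative only; in the notebook, df with the renamed columns would take its place):

```python
import pandas as pd

# Made-up rows for illustration only; the notebook would use df instead
toy = pd.DataFrame({
    "Occupation": ["Lawyer", "Lawyer", "Teacher", "Doctor", "Doctor"],
    "Sleep_Disorder": ["Insomnia", "Sleep Apnea", "Insomnia", "None", "Sleep Apnea"],
})

# Keep only people with a disorder, then count rows per occupation and disorder
with_disorder = toy[toy["Sleep_Disorder"] != "None"]
counts = pd.crosstab(with_disorder["Occupation"], with_disorder["Sleep_Disorder"])
print(counts)
```

An occupation with non-zero counts in both columns has cases of both disorders, which is the pattern described for lawyers.
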
# Distribution of sleep quality by occupation
sns.boxplot(x='Sleep_Quality', y='Occupation', data=df)

plt.show()
SQL cell (result available as the DataFrame variable df2):
SELECT Sleep_Disorder, Stress_Level, PA_level
FROM 'data.csv'
WHERE Sleep_Disorder != 'None'
GROUP BY Sleep_Disorder, Stress_Level, PA_level
LIMIT 10;
# Average stress level per BMI category, split by sleep disorder
sns.barplot(x='BMI', y="Stress_Level", data=df,
            hue='Sleep_Disorder', color="b")
plt.show()
SQL cell (result available as the DataFrame variable df5):
SELECT Age, Sleep_Duration, Sleep_Disorder
FROM 'data.csv'
WHERE Sleep_Disorder != 'None'
GROUP BY Age, Sleep_Duration, Sleep_Disorder;

Based on the chart

  • BMI is unlikely to contribute to sleep disorders, as the graph shows that people of normal weight also have sleep disorders.
  • A physical activity level (PA_level) of 90 minutes or less per day likely contributes to sleep disorders.
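
Seeing that a BMI category contains some cases says little on its own; comparing the share of people with a disorder within each category is usually more informative. A hedged sketch on made-up rows (the toy data and the has_disorder column are assumptions for illustration, not results from data.csv):

```python
import pandas as pd

# Made-up rows for illustration only
toy = pd.DataFrame({
    "BMI": ["Normal", "Normal", "Overweight", "Overweight", "Obese"],
    "Sleep_Disorder": ["None", "Insomnia", "Sleep Apnea", "Insomnia", "Sleep Apnea"],
})

# True where a disorder is recorded; the mean of a boolean column is a share
toy["has_disorder"] = toy["Sleep_Disorder"] != "None"
share_by_bmi = toy.groupby("BMI")["has_disorder"].mean()
print(share_by_bmi)
```

The same grouping could be applied to binned PA_level values to examine the physical activity claim.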

2. Does an increased physical activity level result in a better quality of sleep?
SQL cell (result available as the DataFrame variable df3):
SELECT Sleep_Quality, PA_level
FROM 'data.csv'
LIMIT 10;

The chart shows that sleep quality tends to increase as the physical activity level increases.
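
A rank correlation gives a number to go with the visual impression. The sketch below computes a Spearman-style coefficient as the Pearson correlation of the ranks, using made-up values (the toy numbers are assumptions, not rows from data.csv):

```python
import pandas as pd

# Made-up values for illustration only
toy = pd.DataFrame({
    "PA_level": [30, 45, 60, 75, 90],
    "Sleep_Quality": [5, 6, 6, 8, 9],
})

# Spearman correlation computed as the Pearson correlation of the ranks,
# which suits an ordinal 1-10 rating better than the raw values
rho = toy["PA_level"].rank().corr(toy["Sleep_Quality"].rank())
print(round(rho, 3))
```

A coefficient close to 1 indicates that the two columns rise together; it does not by itself show that more activity causes better sleep.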

3. Does the presence of a sleep disorder affect the subjective sleep quality metric?
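
One possible way to approach this question is to summarize Sleep_Quality per Sleep_Disorder group. A minimal sketch on made-up rows (the toy values are assumptions for illustration and say nothing about the real dataset):

```python
import pandas as pd

# Made-up rows for illustration only
toy = pd.DataFrame({
    "Sleep_Disorder": ["None", "None", "Insomnia", "Sleep Apnea", "Insomnia"],
    "Sleep_Quality": [8, 7, 6, 6, 5],
})

# Group size plus mean and median rating for each disorder category
quality_by_disorder = (
    toy.groupby("Sleep_Disorder")["Sleep_Quality"]
    .agg(["count", "mean", "median"])
)
print(quality_by_disorder)
```

Reporting the group size next to the averages helps avoid reading too much into categories with few people.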



