SleepInc: Helping you find better sleep 😴


    📖 Background

    Your client is SleepInc, a sleep health company that recently launched a sleep-tracking app called SleepScope. The app monitors sleep patterns and collects users' self-reported data on lifestyle habits. SleepInc wants to identify lifestyle, health, and demographic factors that strongly correlate with poor sleep quality. They need your help to produce visualizations and a summary of findings for their next board meeting! They need these to be easily digestible for a non-technical audience!

    📂 Preview of data

    import pandas as pd
    raw_data = pd.read_csv('sleep_health_data.csv')
    raw_data

    📄 Executive Summary

    🎯 Aim:

    To identify which lifestyle, health, and demographic factors affect sleep quality.

    🛠 Method:

    1. Validate the data
    2. Build a machine learning model
    3. Assess how well it predicts
    4. Tune it to improve its predictions
    5. Look at the impact of each column (as calculated by the model) on the predictions
    6. Chart the columns that have a significant impact
    7. Double-check whether the machine learning model was right
    8. Form a conclusion
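
Steps 4 and 5 of the method can be sketched as follows. This is a minimal illustration on synthetic data, not the SleepScope data: the column names `signal` and `noise`, the hyperparameter grid, and the random seeds are all assumptions chosen only to show the mechanics of tuning a model and reading its feature importances.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (NOT the SleepScope data): the target depends
# strongly on "signal" and not at all on "noise".
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "signal": rng.normal(size=300),
    "noise": rng.normal(size=300),
})
demo["target"] = 3 * demo["signal"] + rng.normal(scale=0.1, size=300)

X_demo = demo[["signal", "noise"]]
y_demo = demo["target"]
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

# Step 4: tune a small, illustrative hyperparameter grid with cross-validation.
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 3]},
    scoring="neg_mean_absolute_error",
    cv=3,
)
search.fit(X_tr, y_tr)
tuned_mae = mean_absolute_error(y_te, search.predict(X_te))

# Step 5: read the impact of each column from the tuned model.
importances = pd.Series(
    search.best_estimator_.feature_importances_, index=X_demo.columns
).sort_values(ascending=False)

print(f"tuned MAE: {tuned_mae:.3f}")
print(importances)
```

On this toy data the `signal` column should dominate the importances, which is the kind of ranking step 6 then turns into charts.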

    🏁 Results:

    According to the machine learning model and the statistics, to get better sleep you should:

    • Sleep longer (or go to bed earlier)
    • Reduce stress (try to relax)
    • Be physically fit: exercise to keep your resting heart rate low
    • Take as many steps as possible each day

    Older people also tend to sleep better (possibly because they are retired and no longer have to work).

    🤖 Answering the challenge 📊

    📕 Part 1

    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import train_test_split
    
    from sklearn.metrics import mean_absolute_error, accuracy_score
    from sklearn.model_selection import GridSearchCV
    First,

    we need to encode the text columns as numbers so that the model can use them.

    # One-hot encode each text column and append the resulting indicator columns to raw_data.
    for col in ['BMI Category', 'Sleep Disorder', 'BP_category']:
        encoded = pd.get_dummies(raw_data[col])
        raw_data[encoded.columns.to_list()] = encoded.values
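
One thing to watch with this kind of encoding: if two text columns happened to share a category label, their indicator columns would get the same name and one would overwrite the other. The sketch below shows one way to guard against that by letting `pd.get_dummies` prefix each indicator with its source column. The toy values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy frame with made-up values; both columns contain the label "Normal".
toy = pd.DataFrame({
    "BMI Category": ["Normal", "Overweight", "Normal"],
    "BP_category": ["Normal", "High", "High"],
})

# Passing columns= encodes several columns at once; each new column is
# named "<source column>_<category>", so shared labels cannot collide.
encoded_toy = pd.get_dummies(
    toy, columns=["BMI Category", "BP_category"], dtype=int
)

print(encoded_toy.columns.to_list())
```

The trade-off is longer column names, which would need to be tidied up before they appear on a chart for a non-technical audience.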
    Second,

    we need to create training and testing sets. The training set is the data the ML model learns from, and the testing set is held-back data that the model is evaluated on.

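    # Features: every non-text column except the target.
    X = raw_data.select_dtypes(exclude='object').drop('Quality of Sleep', axis=1)
    # Target: the 'Quality of Sleep' column.
    y = raw_data['Quality of Sleep'].values
    # Hold out 20% of the rows for testing.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)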
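
Because `train_test_split` shuffles the rows at random, the split above will differ from run to run, and so will any scores computed from it. A small sketch of how a fixed `random_state` makes the split repeatable, using toy arrays; the seed value 42 is an arbitrary assumption, not something the original analysis specifies.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Toy data: 10 rows, 2 feature columns.
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.arange(10)

# With test_size=0.2, 8 rows go to training and 2 rows to testing.
X_a, X_b, y_a, y_b = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42
)

# Repeating the call with the same seed reproduces the same split.
X_a2, X_b2, y_a2, y_b2 = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=42
)

print(X_a.shape, X_b.shape)
print(np.array_equal(X_b, X_b2))
```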
    Third,

    we need to create a baseline model with default settings and measure how accurate its predictions are, so that we have something to improve on later.
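
An error score is easier to judge next to a naive reference point. The sketch below, on synthetic data rather than the SleepScope data, compares an untuned `GradientBoostingRegressor` with scikit-learn's `DummyRegressor`, which always predicts the training mean. The data, seeds, and variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

# Synthetic data: the target depends on the first of three features.
rng = np.random.default_rng(1)
X_syn = rng.normal(size=(200, 3))
y_syn = 2 * X_syn[:, 0] + rng.normal(scale=0.1, size=200)

X_fit, X_eval, y_fit, y_eval = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=1
)

# Naive reference: always predict the mean of the training targets.
dummy = DummyRegressor(strategy="mean").fit(X_fit, y_fit)
dummy_mae = mean_absolute_error(y_eval, dummy.predict(X_eval))

# Untuned model with default hyperparameters.
gbr = GradientBoostingRegressor(random_state=1).fit(X_fit, y_fit)
gbr_mae = mean_absolute_error(y_eval, gbr.predict(X_eval))

print(f"dummy MAE: {dummy_mae:.3f}, untuned model MAE: {gbr_mae:.3f}")
```

If the untuned model does not clearly beat the dummy, tuning is unlikely to rescue it, and the features themselves deserve a second look.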