Sleep Health and Lifestyle
This synthetic dataset contains sleep and cardiovascular metrics, as well as lifestyle factors, for close to 400 fictitious people.
The workspace is set up with one CSV file, data.csv, with the following columns:
Person ID
Gender
Age
Occupation
Sleep Duration: Average number of hours of sleep per day
Quality of Sleep: A subjective rating on a 1-10 scale
Physical Activity Level: Average number of minutes the person engages in physical activity daily
Stress Level: A subjective rating on a 1-10 scale
BMI Category
Blood Pressure: Indicated as systolic pressure over diastolic pressure
Heart Rate: In beats per minute
Daily Steps
Sleep Disorder: One of None, Insomnia or Sleep Apnea
Source: Kaggle
In general, the data shows few outliers across the different variables, with the exception of Heart Rate, which has some. Even so, it does not seem necessary to drop these values, considering the limited amount of data and the fact that these values are probably authentic.
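One common way to flag outliers like those in Heart Rate is the interquartile range (IQR) rule. The sketch below is only an illustration: the helper name iqr_outliers, the 1.5 multiplier and the sample values are assumptions, and the original analysis may have relied on box plots or another method.

```python
import pandas as pd


def iqr_outliers(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    mask = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    return series[mask]


# Hypothetical stand-in for the "Heart Rate" column (not the real data)
heart_rate = pd.Series([65, 68, 70, 70, 72, 72, 74, 75, 86], name="Heart Rate")

outliers = iqr_outliers(heart_rate)
print(outliers.tolist())
```

On the real dataset the same helper could be applied to sleep_data['Heart Rate'] to see how many rows are affected before deciding whether to keep them.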
There are no null values, and each column's data type matches its contents, so no adjustment is needed.
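A check of this kind can be sketched with pandas as follows. The small frame named sample is a hypothetical stand-in for sleep_data, using only a few of the columns listed earlier.

```python
import pandas as pd

# Hypothetical miniature of sleep_data (values are made up for illustration)
sample = pd.DataFrame({
    "Age": [27, 28, 35],
    "Sleep Duration": [6.1, 6.2, 7.5],
    "Occupation": ["Engineer", "Doctor", "Nurse"],
})

# Count missing values per column
null_counts = sample.isna().sum()
print(null_counts)

# Inspect the inferred data type of each column
print(sample.dtypes)
```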
First, we need to split the data into the target variable and the feature variables. We also need to split the data into training and testing sets.
from sklearn.model_selection import train_test_split
# Split the data into features and target variable
X = sleep_data.drop('Sleep Disorder', axis=1)
y = sleep_data['Sleep Disorder']
# Split the data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
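The scikit-learn estimators used in the following sections expect numeric inputs, so string columns such as Gender, Occupation, BMI Category and Blood Pressure would need to be encoded before fitting, if that has not already been done. The following is a minimal sketch of one possible approach on hypothetical rows. The new column names Systolic and Diastolic, and the choice of one-hot encoding, are assumptions rather than the author's method.

```python
import pandas as pd

# Hypothetical rows mimicking a few sleep_data columns
features = pd.DataFrame({
    "Gender": ["Male", "Female", "Male"],
    "Age": [27, 30, 45],
    "Blood Pressure": ["126/83", "120/80", "140/90"],
})

# Split "systolic/diastolic" strings into two integer columns
bp = features["Blood Pressure"].str.split("/", expand=True).astype(int)
features["Systolic"] = bp[0]
features["Diastolic"] = bp[1]
features = features.drop(columns="Blood Pressure")

# One-hot encode the remaining string column
encoded = pd.get_dummies(features, columns=["Gender"])
print(encoded.columns.tolist())
```

An identifier column such as Person ID carries no predictive information and could reasonably be dropped at the same stage.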
Logistic Regression
To perform logistic regression on the sleep_data dataset, we can use the LogisticRegression class from the sklearn.linear_model module.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
# Create an instance of the LogisticRegression model
model = LogisticRegression()
# Fit the model to the training data
model.fit(X_train, y_train)
# Predict the target variable for the test data
y_pred_log = model.predict(X_test)
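Once predictions are available, they can be compared against the held-out labels. The sketch below shows one way to do this with accuracy_score and confusion_matrix. It runs on a synthetic three-class dataset from make_classification as a stand-in for the encoded sleep_data features, so the variable names and the resulting numbers are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic 3-class data standing in for the encoded features (assumption)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# A higher max_iter gives the solver more room to converge
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
pred = clf.predict(X_te)

# Share of correct predictions and per-class breakdown
acc = accuracy_score(y_te, pred)
cm = confusion_matrix(y_te, pred)
print(f"accuracy: {acc:.3f}")
print(cm)
```

With the real data, the same two calls would take y_test and y_pred_log as arguments.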
Decision Tree
To build a decision tree model for the sleep_data dataset, we can use the DecisionTreeClassifier class from the sklearn.tree module.
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
# Create an instance of the DecisionTreeClassifier model
model = DecisionTreeClassifier()
# Fit the model to the training data
model.fit(X_train, y_train)
# Predict the target variable for the test data
y_pred_tree = model.predict(X_test)
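A fitted decision tree also exposes feature_importances_, which can hint at which inputs drive the splits. The sketch below illustrates this on synthetic data. The feature names, the max_depth value and the random_state are assumptions chosen for the example, not settings taken from the analysis above.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic 3-class data standing in for the encoded features (assumption)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=42
)
feature_names = [f"feature_{i}" for i in range(X_demo.shape[1])]

# A shallow tree with a fixed seed so the result is reproducible
tree = DecisionTreeClassifier(max_depth=4, random_state=42).fit(X_demo, y_demo)

# Rank features from most to least important
ranked = sorted(
    zip(feature_names, tree.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```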
Neural Network
To build a neural network model for the sleep_data dataset, we can use the MLPClassifier class from the sklearn.neural_network module.
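A minimal sketch of such a model follows, again on synthetic data in place of the encoded sleep_data features. The hidden layer size, max_iter and the StandardScaler step are assumptions. Scaling is included because multilayer perceptrons tend to train more reliably on standardized inputs.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic 3-class data standing in for the encoded features (assumption)
X_demo, y_demo = make_classification(
    n_samples=300, n_features=8, n_informative=5, n_classes=3, random_state=42
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Standardize the inputs, then fit a small one-hidden-layer network
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=42),
)
mlp.fit(X_tr, y_tr)

# Predict the target variable for the test data
y_pred_nn = mlp.predict(X_te)
print(y_pred_nn[:10])
```

On the real data, the pipeline would be fitted on X_train and y_train and used to predict X_test, mirroring the two models above.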