Machine Learning Scientist with R: Courses 8 - 10

    Course 8: Hyperparameter Tuning in R

    Chapter 1. Introduction to hyperparameters

    1. Parameters vs hyperparameters

    1.1 Parameters (model parameters)

1. Definition = model parameters are fit DURING training; they are the result of model fitting or training (see the sketch after this list)
    2. Examples in linear model: coefficients (found during fitting)
    3. Examples in machine learning model: weights, biases of neural nets that are optimized during training (model parameters)
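As a quick illustration, a minimal sketch using R's built-in mtcars data (the data choice is an illustrative assumption, not from the course):
    • fit <- lm(mpg ~ wt, data = mtcars) # Fit a linear model
    • coef(fit) # The model parameters (intercept and slope), found DURING fitting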

    1.2 Hyperparameters

1. Definition = parameters set BEFORE training; they specify HOW the training should happen (most are optional)
    2. Examples in linear model: method (an option to set before fitting)
    3. Examples in machine learning model: learning rates, weight decay, number of trees in random forest
4. Why tune? = To find the best combination of hyperparameters (see the sketch below)
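For contrast, a minimal sketch (assuming the randomForest package and the built-in mtcars data, both illustrative):
    • library(randomForest)
    • rf <- randomForest(mpg ~ wt, data = mtcars, ntree = 500) # ntree is a hyperparameter chosen BEFORE training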

    2. Recap of machine learning basics

    2.1 Splitting data (ML with caret)

    1. Load
• library(caret)
• set.seed(42)
2. Set partition index (proportion for the training dataset)
• INDEX <- createDataPartition(FULL_DF$RESP_VAR, p = 0.70, list = FALSE) # Set partition, e.g. 70:30
3. Split data into training and testing datasets
    • TRAINING_DATA <- FULL_DF[INDEX,]
• TESTING_DATA <- FULL_DF[-INDEX,]
Note: the training dataset should have enough power, and the test set should be representative

    2.2 Train ML with caret

    1. Set up cross-validation
• library(caret)
• library(tictoc) # Used to estimate how long the code takes to run
• fitControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5) # Set cross-validation (number of folds and number of repeats)
Note: valid resampling methods in caret (a usage sketch follows this list):
      • LGOCV (leave-group-out cross-validation)
• boot (adaboost is not a resampling scheme)
      • cv
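For instance, a sketch of one alternative scheme (method, p, and number are real trainControl() arguments; the values here are illustrative assumptions):
    • fitControl_lgocv <- trainControl(method = "LGOCV", p = 0.8, number = 10) # 10 repeats of an 80:20 leave-group-out split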
2. Train a random forest model with caret (automatic tuning)
    • tic()
    • set.seed(INT)
• rf_model <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "rf",
  trControl = fitControl, # Specify validation scheme
  verbose = FALSE)
    • toc()

2.3 List of model methods in caret

    • "gbm" = Stochastic Gradient Boosting model
    • "nnet" = Regulat neural network
    • "rf" = Random forest
    • "svmPoly" = Support Vector Machines

    3. Hyperparameter tuning

    3.1 Specific hyperparameters for model algorithm

    • modelLookup("gbm") # Specify model
    • https://topepo.github.io/caret/available-models.html

    3.2 Hyperparameter tuning in caret (e.g. gbm)

    1. Define hyperparameter grid (Cartesian grid)
• HYPERPARAMETERS <- expand.grid(n.trees = 200, # Number of trees
  interaction.depth = 1, # Tree complexity
  shrinkage = 0.1, # Learning rate
  n.minobsinnode = 10) # Minimum observations in a node for splitting
2. Apply hyperparameter grid to train().
    • set.seed(INT)
• gbm_model <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "gbm",
  trControl = trainControl(method = "repeatedcv", number = 5, repeats = 3),
  verbose = FALSE,
  tuneGrid = HYPERPARAMETERS)

    3.3 Hyperparameter tuning in Support Vector Machines (SVM)

    1. tuneLength
    • tic()
    • set.seed(INT)
• SVM <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "svmPoly",
  trControl = fitControl,
  verbose = FALSE,
  tuneLength = 5) # Specify the maximum number of tuning parameter combinations
    • toc()
2. tuneGrid and expand.grid (manual method)
• HYPERPARAMETERS <- expand.grid(degree = 4, scale = 1, C = 1) # Specify the grid of hyperparameter(s) to tune
• tic()
• set.seed(INT)
• SVM <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "svmPoly",
  trControl = fitControl,
  verbose = FALSE,
  tuneGrid = HYPERPARAMETERS) # Specify tuneGrid
• toc()

    Chapter 2. Hyperparameter tuning with caret

    1. Hyperparameter tuning in caret

    1.1 Method 1: Cartesian grid search with caret

1. Concept: define a grid of candidate hyperparameter values and evaluate every combination; cons = can take a long time to run (a grid-size check follows the command below)
    2. Commands
• MANUAL_GRID <- expand.grid(n.trees = c(100, 200, 250),
  interaction.depth = c(1, 4, 6),
  shrinkage = 0.1,
  n.minobsinnode = 10)
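A quick sanity check (sketch): the grid above spans 3 n.trees values × 3 interaction.depth values × 1 × 1 = 9 combinations, each of which gets trained and resampled:
    • nrow(MANUAL_GRID) # 9 rows = 9 hyperparameter combinations to evaluate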

    1.2 Method 2: Plot hyperparameter models

• plot(GBM_MODEL) # Plot a graph with one line per value of a hyperparameter
• plot(GBM_MODEL, metric = "Kappa", plotType = "level") # Use a level plot
Note: Kappa is used to evaluate the performance of a classification model (comparing expected vs. observed accuracy; higher Kappa = better; see the formula below)
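For reference, Cohen's kappa is kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy and p_e is the accuracy expected from chance agreement alone.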

    2. Grid Search (Method 1) vs. Random Search (Method 3)

    2.1 Grid search (Cartesian grid)

1. Concept: you must specify the candidate values; cons = gets slow and computationally expensive (especially with a big grid)
2. Commands
2.1) Define a grid
• BIG_GRID <- expand.grid(n.trees = seq(from = 10, to = 300, by = 50),
  interaction.depth = seq(from = 1, to = 10, length.out = 6),
  shrinkage = 0.1,
  n.minobsinnode = 10)
• fitControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5, search = "grid") # Specify the argument search = "grid"
2.2) Set the "tuneGrid" argument
• gbm_model <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "gbm",
  trControl = fitControl,
  verbose = FALSE,
  tuneGrid = BIG_GRID)

    2.2 Method 3: Random search

1. Concept: sample random hyperparameter values; pros = faster than grid search
2. Note: in caret, random search cannot be combined with grid search
3. Commands
3.1) Define random search in the trainControl() function
• library(caret)
• fitControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5, search = "random") # Specify the argument search = "random"
3.2) Set the "tuneLength" argument
• tic()
• set.seed(INT)
• GBM_MODEL <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "gbm",
  trControl = fitControl,
  verbose = FALSE,
  tuneLength = 5) # tuneLength = maximum number of tuning parameter combinations
    • toc()

    3. Method 4: Adaptive resampling

    3.1 Definition and concept

    1. Grid search (method 1) = all hyperparameters are computed.
    2. Random search (method 3) = random subsets of hyperparameters are computed.
3. Adaptive resampling = hyperparameter combinations are resampled, favoring values near combinations that performed well (clearly suboptimal combinations are not fully tested). Pros = faster and more efficient than a full grid, though the final result is not necessarily better than methods 1 or 3

    3.2 Steps and commands

    1. Set adaptive resampling in trainControl()
    • fitControl <- trainControl(method = "adaptive_cv", adaptive = list(min = 2, # min no of resamples per hyperparam alpha = 0.05, # confidence level for removing hyperparam method = "gls", # "gls" = linear; "BT = Bradley Terry complete = TRUE), # TRUE = generate full resampling set search = "random")
2. Apply trainControl() and set the tuneLength argument
    • tic()
    • set.seed(INT)
• GBM_MODEL <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "gbm",
  trControl = fitControl,
  verbose = FALSE,
  tuneLength = 7) # tuneLength = maximum number of tuning parameter combinations (default = 3; in practice maybe 100)
    • toc()

    Chapter 3. Hyperparameter tuning with mlr

    1. Machine learning with mlr

    1.1 Basics

    1. mlr = another framework for ML in R
2. Model training follows 3 steps:
2.1) Define the task
2.2) Define the learner
2.3) Fit the model

    1.2 Tasks in mlr for supervised learning

1. RegrTask(): for regression
2. ClassifTask(): for binary and multi-class classification (see the sketch after this list)
    3. MultilabelTask(): for multi-label classification problems
    4. CostSensTask(): for general cost-sensitive classification
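A minimal sketch of creating a task (using the built-in iris data as an illustrative assumption):
    • library(mlr)
    • TASK <- makeClassifTask(data = iris, target = "Species") # Multi-class classification task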

    1.3 Learners in mlr

    • listLearners() # Return available learners: classif. = classification learner, regr. = regression learner, multilabel. = multi-label classification

1.4 Steps and commands

    1. Define the task
    • TASK <- makeClassifTask(data = DF, target = "RESP_VAR")
2. Define the learner
• LRN <- makeLearner("classif.h2o.deeplearning", # Enter a learner
  fix.factors.prediction = TRUE, # Handle factor levels missing at prediction time
  predict.type = "prob")
3. Fit the model
    • MODEL <- train(LRN, TASK)

    2. Hyperparameter tuning with mlr (grid and random search)

    2.1 Requirements for tuning in mlr

    1. Search space for every hyperparameter
    2. Tuning method (e.g. grid or random)
    3. Resampling strategy

    2.2 Steps and commands

    1. Define the search space
• getParamSet("classif.h2o.deeplearning") # Returns the hyperparameters of the model
• PARAM_SET <- makeParamSet(
  makeDiscreteParam("hidden", values = list(one = 10, two = c(10, 5, 10))),
  makeDiscreteParam("activation", values = c("Rectifier", "Tanh")),
  makeNumericParam("l1", lower = 0.0001, upper = 1),
  makeNumericParam("l2", lower = 0.0001, upper = 1))
2. Define the tuning method
2.1) Grid search: can ONLY deal with discrete hyperparameter sets
• CTRL_GRID <- makeTuneControlGrid()
2.2) Random search: CAN deal with all kinds of hyperparameters
• CTRL_RANDOM <- makeTuneControlRandom(maxit = INT) # Specify the maximum number of iterations
3. Define the resampling strategy
• CROSS_VAL <- makeResampleDesc("RepCV", # Can also be "Holdout"
  predict = "both", # Predict on training AND validation data
  folds = 5 * 3)
4. Apply the 3 steps of model training
• TASK <- makeClassifTask(data = DF, target = "RESP_VAR")
• LRN <- makeLearner("classif.h2o.deeplearning", # Enter a learner
  fix.factors.prediction = TRUE, # Handle factor levels missing at prediction time
  predict.type = "prob")
    • MODEL <- train(LRN, TASK)
5. Tune the model
• LRN_TUNE <- tuneParams(LRN, TASK, resampling = CROSS_VAL, control = CTRL_GRID, par.set = PARAM_SET)

    3. Evaluating hyperparameters with mlr

    3.1 Benefits of hyperparameter evaluation

1. How different hyperparameters affect the performance of our model
    2. Which hyperparameters have a strong or weak impact on our model performance
3. Whether our hyperparameter search converged / whether we can be confident that we found the optimal hyperparameter combination

    3.2 Steps and commands

1. See tuning results
• HYPERPAR_EFFECTS <- generateHyperParsEffectData(LRN_TUNE, partial.dep = TRUE) # Set to TRUE when tuning more than 2 parameters
2. Plot hyperparameter tuning results
• plotHyperParsEffect(HYPERPAR_EFFECTS,
  partial.dep.learn = "regr.randomForest",
  x = "l1", y = "mmce.test.mean", z = "hidden",
  plot.type = "line")

    4. Advanced tuning with mlr

    4.1 List of advanced tuning controls

1. makeTuneControlCMAES: CMA Evolution Strategy
2. makeTuneControlDesign: Predefined data frame of hyperparameters
3. makeTuneControlGenSA: Generalized simulated annealing (non-linear)
4. makeTuneControlIrace: Tuning with iterated F-racing (configuration)
5. makeTuneControlMBO: Model-based / Bayesian optimization (uses Bayesian statistics to optimize)

    4.2 Steps and commands (examples with makeTuneControlGenSA)

    1. Generalized simulated annealing
    • CTRL_GENSA <- makeTuneControlGenSA()
2. Create bootstrap resampling
• BOOTSTRAP <- makeResampleDesc("Bootstrap", predict = "both")
3. Perform tuning
• LRN_TUNE <- tuneParams(learner = LRN,
  task = TASK,
  resampling = BOOTSTRAP,
  control = CTRL_GENSA,
  par.set = PARAM_SET,
  measures = list(acc, setAggregation(acc, train.mean), # Customize measures
  mmce, setAggregation(mmce, train.mean)))
4. Nested cross-validation & nested resampling
4.1) Train directly
• MODEL_NESTED <- train(LRN, TASK)
• getTuneResult(MODEL_NESTED)
4.2) Add 2x nested validation
• CV2 <- makeResampleDesc("CV", iters = 2)
• RES <- resample(LRN, TASK, resampling = CV2, extract = getTuneResult)
• generateHyperParsEffectData(RES)
5. Choose hyperparameters from a tuning set
• LRN_BEST <- setHyperPars(LRN, par.vals = list(minsplit = 4, minbucket = 3, maxdepth = 6))
    • MODEL_BEST <- train(LRN_BEST, TASK)
    • predict(MODEL_BEST, newdata = DF)

    Chapter 4. Hyperparameter tuning with h2o

    1. Machine learning with h2o

    1.1 Basics

1. h2o = an open-source framework for ML in R
2. Difference from caret and mlr = scalability (see the sketch below)
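A minimal sketch of that scalability (nthreads and max_mem_size are real h2o.init() arguments; the values are illustrative assumptions):
    • library(h2o)
    • h2o.init(nthreads = -1, max_mem_size = "8G") # Use all available cores and up to 8 GB RAM for the local cluster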

1.2 Steps and commands

    1. Initiate the package
    • library(h2o)
    • h2o.init() # Local initiate
2. Prepare the data
2.1) Define the H2O frame
• DF_HF <- as.h2o(DF)
2.2) Define features and target variable
    • TARGET <- "VAR"
• FEATURE <- setdiff(colnames(DF_HF), TARGET)
Note: for a classification target variable:
    • DF_HF[, TARGET] <- as.factor(DF_HF[, TARGET])
3. Split into training, validation and test sets
3.1) Split data
• SFRAME <- h2o.splitFrame(data = DF_HF,
  ratios = c(0.7, 0.15), # Default = 0.75, 0.25
  seed = INT) # Set seed
    • TRAIN_DATA <- SFRAME[[1]]
    • VALIDATE_DATA <- SFRAME[[2]]
• TEST_DATA <- SFRAME[[3]]
3.2) Check the ratio of the target variable
    • summary(TRAIN_DATA$VAR, exact_quantiles = TRUE)
4. Train the model
• GBM_MODEL <- h2o.gbm(x = FEATURE, y = TARGET, training_frame = TRAIN_DATA, validation_frame = VALIDATE_DATA)
Note: available models in h2o:
• h2o.gbm() & h2o.xgboost(): Gradient boosted models
• h2o.glm(): Generalized linear models
• h2o.randomForest(): Random forest models
• h2o.deeplearning(): Neural networks
5. Predict new data
    • h2o.predict(GBM_MODEL, TEST_DATA)
6. Evaluate the model performance
    • PERFORMANCE <- h2o.performance(GBM_MODEL, TEST_DATA)
    • h2o.confusionMatrix(PERFORMANCE)
    • h2o.logloss(PERFORMANCE)

    2. Hyperparameter tuning with h2o (grid and random search)

    2.1 Hyperparameters in H2O models

    • ?h2o.gbm # Search hyperparameters for gradient boosting

2.2 Steps and commands [the search itself is set up in step 5]

Steps 1-4 (initiate the package, prepare the data, split into training/validation/test sets, and train a baseline model) are identical to section 1.2 above.
5. Manage the hyperparameter grid
5.1) Define hyperparameters (see the parameters of a model with ?MODEL)
• GBM_PARAMS <- list(ntrees = c(100, 150, 200),
  max_depth = c(3, 5, 7),
  learn_rate = c(0.001, 0.01, 0.1))
5.2) Define the search
5.2.1) Grid search
• GBM_GRID <- h2o.grid("gbm",
  grid_id = "gbm_grid",
  x = FEATURE, y = TARGET,
  training_frame = TRAIN_DATA,
  validation_frame = VALIDATE_DATA,
  seed = INT,
  hyper_params = GBM_PARAMS)

5.2.2) Random search
[By runtime]
• SEARCH_CRITERIA <- list(strategy = "RandomDiscrete", max_runtime_secs = 60, seed = INT)
[By stopping metric]

    • SEARCH_CRITERIA <- list(strategy = "RandomDiscrete", stopping_metric = "mean_per_class_error", stopping_tolerance = 0.0001, stopping_rounds = 6)

    • GBM_GRID <- h2o.grid("gbm", grid_id = "gbm_grid", x = FEATURE, y = TARGET, training_frame = TRAIN_DATA, validation_frame = VALIDATE_DATA, seed = INT, hyper_params = GBM_PARAMS search_criteria = SEARCH_CRITERIA) # Add search criteria 5.3) Examine a grid object

• GBM_GRID_PERF <- h2o.getGrid(grid_id = "gbm_grid", sort_by = "accuracy", decreasing = TRUE)
5.4) Extract the best model from the grid

    • BEST_GBM <- h2o.getModel(GBM_GRID_PERF@model_ids[[1]]) # First model = highest accuracy

• h2o.performance(BEST_GBM, TEST_DATA)

6. Predict new data
    • h2o.predict(GBM_MODEL, TEST_DATA)
7. Evaluate the model performance
    • PERFORMANCE <- h2o.performance(GBM_MODEL, TEST_DATA)
    • h2o.confusionMatrix(PERFORMANCE)
    • h2o.logloss(PERFORMANCE)

    3. Automatic machine learning with H2O

    3.1 Automatic tuning of algorithm (AutoML)

    1. Makes model tuning and optimization much faster and easier
2. Only needs a dataset, a target variable, and a time limit or a limit on the number of models to train

    3.2 AutoML in H2O:

    AutoML trains a number of different algorithms during a default classification run in this specific order:

    1. Generalized Linear Model (GLM)
    2. (Distributed) Random Forest (DRF)
    3. Extremely Randomized Trees (XRT)
    4. Extreme Gradient Boosting (XGBoost)
    5. Gradient Boosting Machines (GBM)
6. Deep Learning (fully-connected multi-layer artificial neural network)
    7. Stacked Ensembles (of all models & of best of family)

    3.3 Steps and commands

1. Use the h2o.automl() function (same workflow as the regular h2o model functions)
• AUTOML_MODEL <- h2o.automl(x = FEATURE,
  y = TARGET,
  training_frame = TRAIN_DATA,
  validation_frame = VALIDATE_DATA, # The leaderboard is calculated on the validation data, not on cross-validation results
  max_runtime_secs = 60, # Time budget for the search
  sort_metric = "logloss", # Metric to rank by (e.g. AUC for binary classification, mean per-class error for multinomial classification)
  seed = INT,
  nfolds = INT)
Note: the command above returns a leaderboard with models ranked by the chosen metric.
2. View the AutoML leaderboard
• LB <- AUTOML_MODEL@leaderboard
Note: if no leaderboard dataset is specified, metrics are calculated on 5-fold cross-validation results.
3. Extract models from the AutoML leaderboard
3.1) List all models by model id
• MODEL_ID <- as.data.frame(LB)$model_id
3.2) Get the best model
    • AML_LEADER <- AUTOML_MODEL@leader

    Exercise (Course 8: Hyperparameter Tuning in R)

    1. Model parameters vs. hyperparameters

# Fit a linear model on the breast_cancer_data
• linear_model <- lm(concavity_mean ~ symmetry_mean, data = breast_cancer_data)
# Look at the summary of the linear_model
• summary(linear_model)
# Extract the coefficients
• linear_model$coefficients

    2. What are the coefficients?

• library(ggplot2)
# Plot the linear relationship
• ggplot(data = breast_cancer_data, aes(x = symmetry_mean, y = concavity_mean)) +
  geom_point(color = "grey") +
  geom_abline(slope = linear_model$coefficients[2], intercept = linear_model$coefficients[1])

    3. Machine learning with caret

# Create partition index
• index <- createDataPartition(breast_cancer_data$diagnosis, p = 0.7, list = FALSE)
# Subset breast_cancer_data with index
• bc_train_data <- breast_cancer_data[index, ]
• bc_test_data <- breast_cancer_data[-index, ]
# Define 3x5 folds repeated cross-validation
• fitControl <- trainControl(method = "repeatedcv", number = 5, repeats = 3)
# Run the train() function
• gbm_model <- train(diagnosis ~ ., data = bc_train_data, method = "gbm", trControl = fitControl, verbose = FALSE)

    4. Changing the number of hyperparameters to tune

# Set seed
• set.seed(42)
# Start timer
• tic()
# Train model
• gbm_model <- train(diagnosis ~ ., data = bc_train_data, method = "gbm", trControl = trainControl(method = "repeatedcv", number = 5, repeats = 3), verbose = FALSE, tuneLength = 4)
# Stop timer
    • toc()

    5. Tune hyperparameters manually

    #Define hyperparameter grid.
    • hyperparams <- expand.grid(n.trees = 200, interaction.depth = 1, shrinkage = 0.1, n.minobsinnode = 10)

      #Apply hyperparameter grid to train().

    • set.seed(42)

    • gbm_model <- train(diagnosis ~ ., data = bc_train_data, method = "gbm", trControl = trainControl(method = "repeatedcv", number = 5, repeats = 3), verbose = FALSE, tuneGrid = hyperparams)

    6. Cartesian grid search in caret

# Define Cartesian grid
• man_grid <- expand.grid(degree = c(1, 2, 3), scale = c(0.1, 0.01, 0.001), C = 0.5)
# Start timer, set seed & train model
• tic()
• set.seed(42)
• svm_model_voters_grid <- train(turnout16_2016 ~ ., data = voters_train_data, method = "svmPoly", trControl = fitControl, verbose = FALSE, tuneGrid = man_grid)
    • toc()

    7. Plot hyperparameter model output

# Plot default
• plot(svm_model_voters_grid)
# Plot Kappa level-plot
    • plot(svm_model_voters_grid, metric = "Kappa", plotType = "level")

    8. Grid search with range of hyperparameters

# Define the grid with hyperparameter ranges
• big_grid <- expand.grid(size = seq(from = 1, to = 5, by = 1), decay = c(0, 1))
# Train control with grid search
• fitControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5, search = "grid")
# Train neural net
    • tic()
    • set.seed(42)
    • nn_model_voters_big_grid <- train(turnout16_2016 ~ ., data = voters_train_data, method = "nnet", trControl = fitControl, verbose = FALSE, tuneGrid = big_grid)
    • toc()

    9. Adaptive Resampling with caret

# Define trainControl function
• fitControl <- trainControl(method = "adaptive_cv", number = 3, repeats = 3, adaptive = list(min = 3, alpha = 0.05, method = "BT", complete = FALSE), search = "random")
# Start timer & train model
    • tic()
    • svm_model_voters_ar <- train(turnout16_2016 ~ ., data = voters_train_data, method = "nnet", trControl = fitControl, verbose = FALSE, tuneLength = 6)
    • toc()

    10. Modeling with mlr

# Create classification task
• task <- makeClassifTask(data = knowledge_train_data, target = "UNS")
# Call the list of learners
• listLearners() %>% as.data.frame() %>% select(class, short.name, package) %>% filter(grepl("classif.", class))
# Create learner
    • lrn <- makeLearner("classif.randomForest", predict.type = "prob", fix.factors.prediction = TRUE)

    11. Random search with mlr

# Get the parameter set for neural networks of the nnet package
• getParamSet("classif.nnet")
# Define set of parameters
• param_set <- makeParamSet(makeDiscreteParam("size", values = c(2, 3, 5)), makeNumericParam("decay", lower = 0.0001, upper = 0.1))
# Print parameter set
• print(param_set)
# Define a random search tuning method
• ctrl_random <- makeTuneControlRandom()

    12. Perform hyperparameter tuning with mlr

# Define a random search tuning method
• ctrl_random <- makeTuneControlRandom(maxit = 6)
# Define a 3 x 3 repeated cross-validation scheme
• cross_val <- makeResampleDesc("RepCV", folds = 3 * 3)
# Tune hyperparameters
    • tic()
    • lrn_tune <- tuneParams(lrn, task, resampling = cross_val, control = ctrl_random, par.set = param_set)
    • toc()

    13. Evaluating hyperparameter tuning results

# Create holdout sampling
• holdout <- makeResampleDesc("Holdout")
# Perform tuning
• lrn_tune <- tuneParams(learner = lrn, task = task, resampling = holdout, control = ctrl_random, par.set = param_set)
# Generate hyperparameter effect data
• hyperpar_effects <- generateHyperParsEffectData(lrn_tune, partial.dep = TRUE)
# Plot hyperparameter effects
    • plotHyperParsEffect(hyperpar_effects, partial.dep.learn = "regr.glm", x = "minsplit", y = "mmce.test.mean", z = "maxdepth", plot.type = "line")

    14. Define aggregated measures

# Create holdout sampling
• holdout <- makeResampleDesc("Holdout", predict = "both")
# Perform tuning
    • lrn_tune <- tuneParams(learner = lrn, task = task, resampling = holdout, control = ctrl_random, par.set = param_set, measures = list(acc, setAggregation(acc, train.mean), mmce, setAggregation(mmce, train.mean)))

    15. Setting hyperparameters

# Set hyperparameters
• lrn_best <- setHyperPars(lrn, par.vals = list(size = 1, maxit = 150, decay = 0))
# Train model
    • model_best <- train(lrn_best, task)

    16. Prepare data for modelling with h2o

# Initialise h2o cluster
• h2o.init()
# Convert data to h2o frame
• seeds_train_data_hf <- as.h2o(seeds_train_data)
# Identify target and features
• y <- "seed_type"
• x <- setdiff(colnames(seeds_train_data_hf), y)
# Split data into train & validation sets
• sframe <- h2o.splitFrame(seeds_train_data_hf, seed = 42)
• train <- sframe[[1]]
• valid <- sframe[[2]]
# Calculate ratio of the target variable in the training set
    • summary(train$seed_type, exact_quantiles = TRUE)

    17. Modeling with h2o

# Train random forest model
• rf_model <- h2o.randomForest(x = x, y = y, training_frame = train, validation_frame = valid)
# Calculate model performance
• perf <- h2o.performance(rf_model, valid = TRUE)
# Extract confusion matrix
• h2o.confusionMatrix(perf)
# Extract logloss
    • h2o.logloss(perf)

    18. Grid search with h2o

    #Define hyperparameters
    • dl_params <- list(hidden = list(c(50,50), c(100,100)), epochs = c(5, 10, 15), rate = c(0.001, 0.005, 0.01))
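A quick count (sketch): this grid spans 2 hidden configurations × 3 epochs values × 3 rate values = 18 candidate models, which the random search below samples from.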

    19. Random search with h2o

# Define search criteria
• search_criteria <- list(strategy = "RandomDiscrete", max_runtime_secs = 10, seed = 42)
# Train with random search
    • dl_grid <- h2o.grid("deeplearning", grid_id = "dl_grid", x = x, y = y, training_frame = train, validation_frame = valid, seed = 42, hyper_params = dl_params, search_criteria = search_criteria)

    20. Stopping criteria

    #Define early stopping
    • stopping_params <- list(strategy = "RandomDiscrete", stopping_metric = "misclassification", stopping_rounds = 2, stopping_tolerance = 0.1, seed = 42)

    21. AutoML in h2o

# Run automatic machine learning with 3-fold cross-validation
• automl_model <- h2o.automl(x = x, y = y, training_frame = train, max_runtime_secs = 10, sort_metric = "mean_per_class_error", nfolds = 3, seed = 42)
# Run automatic machine learning with a validation frame
• automl_model <- h2o.automl(x = x, y = y, training_frame = train, max_runtime_secs = 10, sort_metric = "mean_per_class_error", validation_frame = valid, seed = 42)

    22. Extract h2o models and evaluate performance

# Extract the leaderboard
• lb <- automl_model@leaderboard
• head(lb)
# Assign best model a new object name
• aml_leader <- automl_model@leader
# Look at best model
• summary(aml_leader)

    Course 9: Bayesian Regression Modeling with rstanarm

    Chapter 1. Introduction to Bayesian Linear Models

    1. Non-Bayesian Linear Regression

    1.1 Review of frequentist regression

1. Uses ordinary least squares
2. Comparing frequentist (linear model) and Bayesian probabilities, mammogram example:
• P(M+|C) = 0.9 (a woman with breast cancer has a 90% chance of a positive mammogram; analogous to a p-value: the probability of the data given the hypothesis)
• P(C) = 0.004 (0.4% of women in the US have breast cancer; the prior, what we believe about the parameter before looking at the data)
• P(M+) = (0.9 × 0.004) + (0.1 × 0.996) = 0.1032
• P(C|M+) = (0.9 × 0.004) / 0.1032 ≈ 0.035
The posterior (about 3.5%) is very different from 0.9, which shows the importance of making inferences about the parameter we are interested in (the probability of cancer) rather than about the data (the probability of a positive mammogram); see the R check below.
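A minimal sketch checking these numbers in R (variable names are illustrative):
    • p_pos_given_cancer <- 0.9 # P(M+|C)
    • p_cancer <- 0.004 # P(C), the prior
    • p_pos_given_healthy <- 0.1 # P(M+|no C), the false-positive rate used above
    • p_pos <- p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer) # 0.1032
    • p_pos_given_cancer * p_cancer / p_pos # P(C|M+) ≈ 0.035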

    1.2 Steps and commands

1. Fit a linear model
    • LM_MODEL <- lm(Y ~ X, data = DF)
    • summary(LM_MODEL)
    1. Examine model coefficients
    • library(broom)
    • tidy(LM_MODEL)

    2. Bayesian Linear Regression

    2.1

    3.

    3.1

    Chapter 2. Modifying a Bayesian Model

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 3. Assessing Model Fit

    1.

    1.1

    Chapter 4. Presenting and Using a Bayesian Regression

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Exercise (Course 9: Bayesian Regression Modeling with rstanarm)

    1. Exploring the data

# Print the first 6 rows
• head(songs)
# Print the structure
    • str(songs)

    2. Fitting a frequentist linear regression

# Create the model here
• lm_model <- lm(popularity ~ song_age, data = songs)
# Produce the summary
• summary(lm_model)
# Print a tidy summary of the coefficients
    • tidy(lm_model)

    3.

    4.

    5.

    6.

    7.

    8.

    9.

    10.

    11.

    12.

    13.

    14.

    15.

    16.

    17.

    18.

    19.

    20.

    21.

    22.

    23.

    24.

    25.

    26.

    27.

    28.

    29.

    30.

    [Template

    Course 10:

    Chapter 1.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 2.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 3.

    1.

    1.1

    Chapter 4.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Exercise (Course 10: xxxx)

    1.

    2.

    3.

    4.

    5.

    6.

    7.

    8.

    9.

    10.

    11.

    12.

    13.

    14.

    15.

    16.

    17.

    18.

    19.

    20.

    21.

    22.

    23.

    24.

    25.

    26.

    27.

    28.

    29.

    30.]

    [Template

    Course 11:

    Chapter 1.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 2.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 3.

    1.

    1.1

    Chapter 4.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Exercise (Course 11: xxxx)

    1.

    2.

    3.

    4.

    5.

    6.

    7.

    8.

    9.

    10.

    11.

    12.

    13.

    14.

    15.

    16.

    17.

    18.

    19.

    20.

    21.

    22.

    23.

    24.

    25.

    26.

    27.

    28.

    29.

    30.]

    Resource center

    1. H2o automatic tuning

    https://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html