Machine Learning Scientist with R: Courses 8 - 10

    Course 8: Hyperparameter Tuning in R

    Chapter 1. Introduction to hyperparameters

    1. Parameters vs hyperparameters

    1.1 Parameters (model parameters)

1. Definition = model parameters are fit DURING training; they are the result of model fitting or training (see the sketch after this list)
    2. Examples in linear model: coefficients (found during fitting)
    3. Examples in machine learning model: weights, biases of neural nets that are optimized during training (model parameters)
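As a quick illustration, a minimal sketch using R's built-in mtcars data (the data choice is an illustrative assumption, not from the course):
    • fit <- lm(mpg ~ wt, data = mtcars) # Fit a linear model
    • coef(fit) # The model parameters (intercept and slope), found DURING fitting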

    1.2 Hyperparameters

1. Definition = parameters set BEFORE training; they specify HOW the training should happen (most are optional)
    2. Examples in linear model: method (an option to set before fitting)
    3. Examples in machine learning model: learning rates, weight decay, number of trees in random forest
4. Why tune? = To find the best combination of hyperparameters (see the sketch below)
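For contrast, a minimal sketch (assuming the randomForest package and the built-in mtcars data, both illustrative):
    • library(randomForest)
    • rf <- randomForest(mpg ~ wt, data = mtcars, ntree = 500) # ntree is a hyperparameter chosen BEFORE training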

    2. Recap of machine learning basics

    2.1 Splitting data (ML with caret)

    1. Load
• library(caret)
• set.seed(42)
2. Set partition index (proportion for the training dataset)
• INDEX <- createDataPartition(FULL_DF$RESP_VAR, p = 0.70, list = FALSE) # Set partition, e.g. 70:30
3. Split data into training and testing datasets
    • TRAINING_DATA <- FULL_DF[INDEX,]
• TESTING_DATA <- FULL_DF[-INDEX,]
Note: the training dataset should have enough power, and the test set should be representative

    2.2 Train ML with caret

    1. Set up cross-validation
• library(caret)
• library(tictoc) # Used to estimate how long the code takes to run
• fitControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5) # Set cross-validation (number of folds and number of repeats)
Note: valid resampling methods in caret (a usage sketch follows this list):
      • LGOCV (leave-group-out cross-validation)
• boot (adaboost is not a resampling scheme)
      • cv
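For instance, a sketch of one alternative scheme (method, p, and number are real trainControl() arguments; the values here are illustrative assumptions):
    • fitControl_lgocv <- trainControl(method = "LGOCV", p = 0.8, number = 10) # 10 repeats of an 80:20 leave-group-out split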
2. Train a random forest model with caret (automatic tuning)
    • tic()
    • set.seed(INT)
• rf_model <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "rf",
  trControl = fitControl, # Specify validation scheme
  verbose = FALSE)
    • toc()

2.3 List of model methods in caret

    • "gbm" = Stochastic Gradient Boosting model
    • "nnet" = Regulat neural network
    • "rf" = Random forest
    • "svmPoly" = Support Vector Machines

    3. Hyperparameter tuning

    3.1 Specific hyperparameters for model algorithm

    • modelLookup("gbm") # Specify model
    • https://topepo.github.io/caret/available-models.html

    3.2 Hyperparameter tuning in caret (e.g. gbm)

    1. Define hyperparameter grid (Cartesian grid)
• HYPERPARAMETERS <- expand.grid(n.trees = 200, # Number of trees
  interaction.depth = 1, # Tree complexity
  shrinkage = 0.1, # Learning rate
  n.minobsinnode = 10) # Minimum observations in a node for splitting
2. Apply hyperparameter grid to train().
    • set.seed(INT)
• gbm_model <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "gbm",
  trControl = trainControl(method = "repeatedcv", number = 5, repeats = 3),
  verbose = FALSE,
  tuneGrid = HYPERPARAMETERS)

    3.3 Hyperparameter tuning in Support Vector Machines (SVM)

    1. tuneLength
    • tic()
    • set.seed(INT)
• SVM <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "svmPoly",
  trControl = fitControl,
  verbose = FALSE,
  tuneLength = 5) # Specify the maximum number of tuning parameter combinations
    • toc()
2. tuneGrid and expand.grid (manual method)
• HYPERPARAMETERS <- expand.grid(degree = 4, scale = 1, C = 1) # Specify the grid of hyperparameter(s) to tune
• tic()
• set.seed(INT)
• SVM <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "svmPoly",
  trControl = fitControl,
  verbose = FALSE,
  tuneGrid = HYPERPARAMETERS) # Specify tuneGrid
• toc()

    Chapter 2. Hyperparameter tuning with caret

    1. Hyperparameter tuning in caret

    1.1 Method 1: Cartesian grid search with caret

1. Concept: define a grid of candidate hyperparameter values and evaluate every combination; cons = can take a long time to run (a grid-size check follows the command below)
    2. Commands
• MANUAL_GRID <- expand.grid(n.trees = c(100, 200, 250),
  interaction.depth = c(1, 4, 6),
  shrinkage = 0.1,
  n.minobsinnode = 10)
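A quick sanity check (sketch): the grid above spans 3 n.trees values × 3 interaction.depth values × 1 × 1 = 9 combinations, each of which gets trained and resampled:
    • nrow(MANUAL_GRID) # 9 rows = 9 hyperparameter combinations to evaluate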

    1.2 Method 2: Plot hyperparameter models

• plot(GBM_MODEL) # Plot a graph with one line per value of a hyperparameter
• plot(GBM_MODEL, metric = "Kappa", plotType = "level") # Use a level plot
Note: Kappa is used to evaluate the performance of a classification model (comparing expected vs. observed accuracy; higher Kappa = better; see the formula below)
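For reference, Cohen's kappa is kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy and p_e is the accuracy expected from chance agreement alone.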

    2. Grid Search (Method 1) vs. Random Search (Method 3)

    2.1 Grid search (Cartesian grid)

1. Concept: you must specify the candidate values; cons = gets slow and computationally expensive (especially with a big grid)
2. Commands
2.1) Define a grid
• BIG_GRID <- expand.grid(n.trees = seq(from = 10, to = 300, by = 50),
  interaction.depth = seq(from = 1, to = 10, length.out = 6),
  shrinkage = 0.1,
  n.minobsinnode = 10)
• fitControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5, search = "grid") # Specify the argument search = "grid"
2.2) Set the "tuneGrid" argument
• gbm_model <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "gbm",
  trControl = fitControl,
  verbose = FALSE,
  tuneGrid = BIG_GRID)

    2.2 Method 3: Random search

1. Concept: sample random hyperparameter values; pros = faster than grid search
2. Note: in caret, random search cannot be combined with grid search
3. Commands
3.1) Define random search in the trainControl() function
• library(caret)
• fitControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5, search = "random") # Specify the argument search = "random"
3.2) Set the "tuneLength" argument
• tic()
• set.seed(INT)
• GBM_MODEL <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "gbm",
  trControl = fitControl,
  verbose = FALSE,
  tuneLength = 5) # tuneLength = maximum number of tuning parameter combinations
    • toc()

    3. Method 4: Adaptive resampling

    3.1 Definition and concept

    1. Grid search (method 1) = all hyperparameters are computed.
    2. Random search (method 3) = random subsets of hyperparameters are computed.
3. Adaptive resampling = hyperparameter combinations are resampled, favoring values near combinations that performed well (clearly suboptimal combinations are not fully tested). Pros = faster and more efficient than a full grid, though the final result is not necessarily better than methods 1 or 3

    3.2 Steps and commands

    1. Set adaptive resampling in trainControl()
    • fitControl <- trainControl(method = "adaptive_cv", adaptive = list(min = 2, # min no of resamples per hyperparam alpha = 0.05, # confidence level for removing hyperparam method = "gls", # "gls" = linear; "BT = Bradley Terry complete = TRUE), # TRUE = generate full resampling set search = "random")
2. Apply trainControl() and set the tuneLength argument
    • tic()
    • set.seed(INT)
• GBM_MODEL <- train(Y ~ X,
  data = TRAINING_DATA,
  method = "gbm",
  trControl = fitControl,
  verbose = FALSE,
  tuneLength = 7) # tuneLength = maximum number of tuning parameter combinations (default = 3; in practice maybe 100)
    • toc()

    Chapter 3. Hyperparameter tuning with mlr

    1. Machine learning with mlr

    1.1 Basics

    1. mlr = another framework for ML in R
2. Model training follows 3 steps:
2.1) Define the task
2.2) Define the learner
2.3) Fit the model

    1.2 Tasks in mlr for supervised learning

1. RegrTask(): for regression
2. ClassifTask(): for binary and multi-class classification (see the sketch after this list)
    3. MultilabelTask(): for multi-label classification problems
    4. CostSensTask(): for general cost-sensitive classification
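A minimal sketch of creating a task (using the built-in iris data as an illustrative assumption):
    • library(mlr)
    • TASK <- makeClassifTask(data = iris, target = "Species") # Multi-class classification task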

    1.3 Learners in mlr

    • listLearners() # Return available learners: classif. = classification learner, regr. = regression learner, multilabel. = multi-label classification

1.4 Steps and commands

    1. Define the task
    • TASK <- makeClassifTask(data = DF, target = "RESP_VAR")
2. Define the learner
• LRN <- makeLearner("classif.h2o.deeplearning", # Enter a learner
  fix.factors.prediction = TRUE, # Handle factor levels missing at prediction time
  predict.type = "prob")
3. Fit the model
    • MODEL <- train(LRN, TASK)

    2. Hyperparameter tuning with mlr (grid and random search)

    2.1 Requirements for tuning in mlr

    1. Search space for every hyperparameter
    2. Tuning method (e.g. grid or random)
    3. Resampling strategy

    2.2 Steps and commands

    1. Define the search space
• getParamSet("classif.h2o.deeplearning") # Returns the hyperparameters of the model
• PARAM_SET <- makeParamSet(
  makeDiscreteParam("hidden", values = list(one = 10, two = c(10, 5, 10))),
  makeDiscreteParam("activation", values = c("Rectifier", "Tanh")),
  makeNumericParam("l1", lower = 0.0001, upper = 1),
  makeNumericParam("l2", lower = 0.0001, upper = 1))
2. Define the tuning method
2.1) Grid search: can ONLY deal with discrete hyperparameter sets
• CTRL_GRID <- makeTuneControlGrid()
2.2) Random search: CAN deal with all kinds of hyperparameters
• CTRL_RANDOM <- makeTuneControlRandom(maxit = INT) # Specify the maximum number of iterations
3. Define the resampling strategy
• CROSS_VAL <- makeResampleDesc("RepCV", # Can also be "Holdout"
  predict = "both", # Predict on training AND validation data
  folds = 5 * 3)
4. Apply the 3 steps of model training
• TASK <- makeClassifTask(data = DF, target = "RESP_VAR")
• LRN <- makeLearner("classif.h2o.deeplearning", # Enter a learner
  fix.factors.prediction = TRUE, # Handle factor levels missing at prediction time
  predict.type = "prob")
    • MODEL <- train(LRN, TASK)
5. Tune the model
• LRN_TUNE <- tuneParams(LRN, TASK, resampling = CROSS_VAL, control = CTRL_GRID, par.set = PARAM_SET)

    3. Evaluating hyperparameters with mlr

    3.1 Benefits of hyperparameter evaluation

1. How different hyperparameters affect the performance of our model
    2. Which hyperparameters have a strong or weak impact on our model performance
3. Whether our hyperparameter search converged / whether we can be confident that we found the optimal hyperparameter combination

    3.2 Steps and commands

1. See tuning results
• HYPERPAR_EFFECTS <- generateHyperParsEffectData(LRN_TUNE, partial.dep = TRUE) # Set to TRUE when tuning more than 2 parameters
2. Plot hyperparameter tuning results
• plotHyperParsEffect(HYPERPAR_EFFECTS,
  partial.dep.learn = "regr.randomForest",
  x = "l1", y = "mmce.test.mean", z = "hidden",
  plot.type = "line")

    4. Advanced tuning with mlr

    4.1 List of advanced tuning controls

1. makeTuneControlCMAES: CMA Evolution Strategy
2. makeTuneControlDesign: Predefined data frame of hyperparameters
3. makeTuneControlGenSA: Generalized simulated annealing (non-linear)
4. makeTuneControlIrace: Tuning with iterated F-racing (configuration)
5. makeTuneControlMBO: Model-based / Bayesian optimization (uses Bayesian statistics to optimize)

    4.2 Steps and commands (examples with makeTuneControlGenSA)

    1. Generalized simulated annealing
    • CTRL_GENSA <- makeTuneControlGenSA()
2. Create bootstrap resampling
• BOOTSTRAP <- makeResampleDesc("Bootstrap", predict = "both")
3. Perform tuning
• LRN_TUNE <- tuneParams(learner = LRN,
  task = TASK,
  resampling = BOOTSTRAP,
  control = CTRL_GENSA,
  par.set = PARAM_SET,
  measures = list(acc, setAggregation(acc, train.mean), # Customize measures
  mmce, setAggregation(mmce, train.mean)))
4. Nested cross-validation & nested resampling
4.1) Train directly
• MODEL_NESTED <- train(LRN, TASK)
• getTuneResult(MODEL_NESTED)
4.2) Add 2x nested validation
• CV2 <- makeResampleDesc("CV", iters = 2)
• RES <- resample(LRN, TASK, resampling = CV2, extract = getTuneResult)
• generateHyperParsEffectData(RES)
5. Choose hyperparameters from a tuning set
• LRN_BEST <- setHyperPars(LRN, par.vals = list(minsplit = 4, minbucket = 3, maxdepth = 6))
    • MODEL_BEST <- train(LRN_BEST, TASK)
    • predict(MODEL_BEST, newdata = DF)

    Chapter 4. Hyperparameter tuning with h2o

    1. Machine learning with h2o

    1.1 Basics

1. h2o = an open-source framework for ML in R
2. Difference from caret and mlr = scalability (see the sketch below)
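A minimal sketch of that scalability (nthreads and max_mem_size are real h2o.init() arguments; the values are illustrative assumptions):
    • library(h2o)
    • h2o.init(nthreads = -1, max_mem_size = "8G") # Use all available cores and up to 8 GB RAM for the local cluster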

1.2 Steps and commands

    1. Initiate the package
    • library(h2o)
    • h2o.init() # Local initiate
2. Prepare the data
2.1) Define the H2O frame
• DF_HF <- as.h2o(DF)
2.2) Define features and target variable
    • TARGET <- "VAR"
• FEATURE <- setdiff(colnames(DF_HF), TARGET)
Note: for a classification target variable:
    • DF_HF[, TARGET] <- as.factor(DF_HF[, TARGET])
3. Split into training, validation and test sets
3.1) Split data
• SFRAME <- h2o.splitFrame(data = DF_HF,
  ratios = c(0.7, 0.15), # Default = 0.75, 0.25
  seed = INT) # Set seed
    • TRAIN_DATA <- SFRAME[[1]]
    • VALIDATE_DATA <- SFRAME[[2]]
• TEST_DATA <- SFRAME[[3]]
3.2) Check the ratio of the target variable
    • summary(TRAIN_DATA$VAR, exact_quantiles = TRUE)
4. Train the model
• GBM_MODEL <- h2o.gbm(x = FEATURE, y = TARGET, training_frame = TRAIN_DATA, validation_frame = VALIDATE_DATA)
Note: available models in h2o:
• h2o.gbm() & h2o.xgboost(): Gradient boosted models
• h2o.glm(): Generalized linear models
• h2o.randomForest(): Random forest models
• h2o.deeplearning(): Neural networks
5. Predict new data
    • h2o.predict(GBM_MODEL, TEST_DATA)
6. Evaluate the model performance
    • PERFORMANCE <- h2o.performance(GBM_MODEL, TEST_DATA)
    • h2o.confusionMatrix(PERFORMANCE)
    • h2o.logloss(PERFORMANCE)

    2. Hyperparameter tuning with h2o (grid and random search)

    2.1 Hyperparameters in H2O models

    • ?h2o.gbm # Search hyperparameters for gradient boosting

2.2 Steps and commands [the search itself is set up in step 5]

Steps 1-4 (initiate the package, prepare the data, split into training/validation/test sets, and train a baseline model) are identical to section 1.2 above.
5. Manage the hyperparameter grid
5.1) Define hyperparameters (see the parameters of a model with ?MODEL)
• GBM_PARAMS <- list(ntrees = c(100, 150, 200),
  max_depth = c(3, 5, 7),
  learn_rate = c(0.001, 0.01, 0.1))
5.2) Define the search
5.2.1) Grid search
• GBM_GRID <- h2o.grid("gbm",
  grid_id = "gbm_grid",
  x = FEATURE, y = TARGET,
  training_frame = TRAIN_DATA,
  validation_frame = VALIDATE_DATA,
  seed = INT,
  hyper_params = GBM_PARAMS)

5.2.2) Random search
[By runtime]
• SEARCH_CRITERIA <- list(strategy = "RandomDiscrete", max_runtime_secs = 60, seed = INT)
[By stopping metric]

    • SEARCH_CRITERIA <- list(strategy = "RandomDiscrete", stopping_metric = "mean_per_class_error", stopping_tolerance = 0.0001, stopping_rounds = 6)

    • GBM_GRID <- h2o.grid("gbm", grid_id = "gbm_grid", x = FEATURE, y = TARGET, training_frame = TRAIN_DATA, validation_frame = VALIDATE_DATA, seed = INT, hyper_params = GBM_PARAMS search_criteria = SEARCH_CRITERIA) # Add search criteria 5.3) Examine a grid object

• GBM_GRID_PERF <- h2o.getGrid(grid_id = "gbm_grid", sort_by = "accuracy", decreasing = TRUE)
5.4) Extract the best model from the grid

    • BEST_GBM <- h2o.getModel(GBM_GRID_PERF@model_ids[[1]]) # First model = highest accuracy

• h2o.performance(BEST_GBM, TEST_DATA)

6. Predict new data
    • h2o.predict(GBM_MODEL, TEST_DATA)
7. Evaluate the model performance
    • PERFORMANCE <- h2o.performance(GBM_MODEL, TEST_DATA)
    • h2o.confusionMatrix(PERFORMANCE)
    • h2o.logloss(PERFORMANCE)

    3. Automatic machine learning with H2O

    3.1 Automatic tuning of algorithm (AutoML)

    1. Makes model tuning and optimization much faster and easier
2. Only needs a dataset, a target variable, and a time limit or a limit on the number of models to train

    3.2 AutoML in H2O:

    AutoML trains a number of different algorithms during a default classification run in this specific order:

    1. Generalized Linear Model (GLM)
    2. (Distributed) Random Forest (DRF)
    3. Extremely Randomized Trees (XRT)
    4. Extreme Gradient Boosting (XGBoost)
    5. Gradient Boosting Machines (GBM)
6. Deep Learning (fully-connected multi-layer artificial neural network)
    7. Stacked Ensembles (of all models & of best of family)

    3.3 Steps and commands

1. Use the h2o.automl() function (same workflow as the regular h2o model functions)
• AUTOML_MODEL <- h2o.automl(x = FEATURE,
  y = TARGET,
  training_frame = TRAIN_DATA,
  validation_frame = VALIDATE_DATA, # The leaderboard is calculated on the validation data, not on cross-validation results
  max_runtime_secs = 60, # Time budget for the search
  sort_metric = "logloss", # Metric to rank by (e.g. AUC for binary classification, mean per-class error for multinomial classification)
  seed = INT,
  nfolds = INT)
Note: the command above returns a leaderboard with models ranked by the chosen metric.
2. View the AutoML leaderboard
• LB <- AUTOML_MODEL@leaderboard
Note: if no leaderboard dataset is specified, metrics are calculated on 5-fold cross-validation results.
3. Extract models from the AutoML leaderboard
3.1) List all models by model id
• MODEL_ID <- as.data.frame(LB)$model_id
3.2) Get the best model
    • AML_LEADER <- AUTOML_MODEL@leader

    Exercise (Course 8: Hyperparameter Tuning in R)

    1. Model parameters vs. hyperparameters

# Fit a linear model on the breast_cancer_data
• linear_model <- lm(concavity_mean ~ symmetry_mean, data = breast_cancer_data)
# Look at the summary of the linear_model
• summary(linear_model)
# Extract the coefficients
• linear_model$coefficients

    2. What are the coefficients?

• library(ggplot2)
# Plot the linear relationship
• ggplot(data = breast_cancer_data, aes(x = symmetry_mean, y = concavity_mean)) +
  geom_point(color = "grey") +
  geom_abline(slope = linear_model$coefficients[2], intercept = linear_model$coefficients[1])

    3. Machine learning with caret

# Create partition index
• index <- createDataPartition(breast_cancer_data$diagnosis, p = 0.7, list = FALSE)
# Subset breast_cancer_data with index
• bc_train_data <- breast_cancer_data[index, ]
• bc_test_data <- breast_cancer_data[-index, ]
# Define 3x5 folds repeated cross-validation
• fitControl <- trainControl(method = "repeatedcv", number = 5, repeats = 3)
# Run the train() function
• gbm_model <- train(diagnosis ~ ., data = bc_train_data, method = "gbm", trControl = fitControl, verbose = FALSE)

    4. Changing the number of hyperparameters to tune

# Set seed
• set.seed(42)
# Start timer
• tic()
# Train model
• gbm_model <- train(diagnosis ~ ., data = bc_train_data, method = "gbm", trControl = trainControl(method = "repeatedcv", number = 5, repeats = 3), verbose = FALSE, tuneLength = 4)
# Stop timer
    • toc()

    5. Tune hyperparameters manually

    #Define hyperparameter grid.
    • hyperparams <- expand.grid(n.trees = 200, interaction.depth = 1, shrinkage = 0.1, n.minobsinnode = 10)

      #Apply hyperparameter grid to train().

    • set.seed(42)

    • gbm_model <- train(diagnosis ~ ., data = bc_train_data, method = "gbm", trControl = trainControl(method = "repeatedcv", number = 5, repeats = 3), verbose = FALSE, tuneGrid = hyperparams)

    6. Cartesian grid search in caret

# Define Cartesian grid
• man_grid <- expand.grid(degree = c(1, 2, 3), scale = c(0.1, 0.01, 0.001), C = 0.5)
# Start timer, set seed & train model
• tic()
• set.seed(42)
• svm_model_voters_grid <- train(turnout16_2016 ~ ., data = voters_train_data, method = "svmPoly", trControl = fitControl, verbose = FALSE, tuneGrid = man_grid)
    • toc()

    7. Plot hyperparameter model output

# Plot default
• plot(svm_model_voters_grid)
# Plot Kappa level-plot
    • plot(svm_model_voters_grid, metric = "Kappa", plotType = "level")

    8. Grid search with range of hyperparameters

# Define the grid with hyperparameter ranges
• big_grid <- expand.grid(size = seq(from = 1, to = 5, by = 1), decay = c(0, 1))
# Train control with grid search
• fitControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5, search = "grid")
# Train neural net
    • tic()
    • set.seed(42)
    • nn_model_voters_big_grid <- train(turnout16_2016 ~ ., data = voters_train_data, method = "nnet", trControl = fitControl, verbose = FALSE, tuneGrid = big_grid)
    • toc()

    9. Adaptive Resampling with caret

# Define trainControl function
• fitControl <- trainControl(method = "adaptive_cv", number = 3, repeats = 3, adaptive = list(min = 3, alpha = 0.05, method = "BT", complete = FALSE), search = "random")
# Start timer & train model
    • tic()
    • svm_model_voters_ar <- train(turnout16_2016 ~ ., data = voters_train_data, method = "nnet", trControl = fitControl, verbose = FALSE, tuneLength = 6)
    • toc()

    10. Modeling with mlr

# Create classification task
• task <- makeClassifTask(data = knowledge_train_data, target = "UNS")
# Call the list of learners
• listLearners() %>% as.data.frame() %>% select(class, short.name, package) %>% filter(grepl("classif.", class))
# Create learner
    • lrn <- makeLearner("classif.randomForest", predict.type = "prob", fix.factors.prediction = TRUE)

    11. Random search with mlr

# Get the parameter set for neural networks of the nnet package
• getParamSet("classif.nnet")
# Define set of parameters
• param_set <- makeParamSet(makeDiscreteParam("size", values = c(2, 3, 5)), makeNumericParam("decay", lower = 0.0001, upper = 0.1))
# Print parameter set
• print(param_set)
# Define a random search tuning method
• ctrl_random <- makeTuneControlRandom()

    12. Perform hyperparameter tuning with mlr

# Define a random search tuning method
• ctrl_random <- makeTuneControlRandom(maxit = 6)
# Define a 3 x 3 repeated cross-validation scheme
• cross_val <- makeResampleDesc("RepCV", folds = 3 * 3)
# Tune hyperparameters
    • tic()
    • lrn_tune <- tuneParams(lrn, task, resampling = cross_val, control = ctrl_random, par.set = param_set)
    • toc()

    13. Evaluating hyperparameter tuning results

# Create holdout sampling
• holdout <- makeResampleDesc("Holdout")
# Perform tuning
• lrn_tune <- tuneParams(learner = lrn, task = task, resampling = holdout, control = ctrl_random, par.set = param_set)
# Generate hyperparameter effect data
• hyperpar_effects <- generateHyperParsEffectData(lrn_tune, partial.dep = TRUE)
# Plot hyperparameter effects
    • plotHyperParsEffect(hyperpar_effects, partial.dep.learn = "regr.glm", x = "minsplit", y = "mmce.test.mean", z = "maxdepth", plot.type = "line")

    14. Define aggregated measures

# Create holdout sampling
• holdout <- makeResampleDesc("Holdout", predict = "both")
# Perform tuning
    • lrn_tune <- tuneParams(learner = lrn, task = task, resampling = holdout, control = ctrl_random, par.set = param_set, measures = list(acc, setAggregation(acc, train.mean), mmce, setAggregation(mmce, train.mean)))

    15. Setting hyperparameters

# Set hyperparameters
• lrn_best <- setHyperPars(lrn, par.vals = list(size = 1, maxit = 150, decay = 0))
# Train model
    • model_best <- train(lrn_best, task)

    16. Prepare data for modelling with h2o

# Initialise h2o cluster
• h2o.init()
# Convert data to h2o frame
• seeds_train_data_hf <- as.h2o(seeds_train_data)
# Identify target and features
• y <- "seed_type"
• x <- setdiff(colnames(seeds_train_data_hf), y)
# Split data into train & validation sets
• sframe <- h2o.splitFrame(seeds_train_data_hf, seed = 42)
• train <- sframe[[1]]
• valid <- sframe[[2]]
# Calculate ratio of the target variable in the training set
    • summary(train$seed_type, exact_quantiles = TRUE)

    17. Modeling with h2o

# Train random forest model
• rf_model <- h2o.randomForest(x = x, y = y, training_frame = train, validation_frame = valid)
# Calculate model performance
• perf <- h2o.performance(rf_model, valid = TRUE)
# Extract confusion matrix
• h2o.confusionMatrix(perf)
# Extract logloss
    • h2o.logloss(perf)

    18. Grid search with h2o

    #Define hyperparameters
    • dl_params <- list(hidden = list(c(50,50), c(100,100)), epochs = c(5, 10, 15), rate = c(0.001, 0.005, 0.01))
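A quick count (sketch): this grid spans 2 hidden configurations × 3 epochs values × 3 rate values = 18 candidate models, which the random search below samples from.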

    19. Random search with h2o

# Define search criteria
• search_criteria <- list(strategy = "RandomDiscrete", max_runtime_secs = 10, seed = 42)
# Train with random search
    • dl_grid <- h2o.grid("deeplearning", grid_id = "dl_grid", x = x, y = y, training_frame = train, validation_frame = valid, seed = 42, hyper_params = dl_params, search_criteria = search_criteria)

    20. Stopping criteria

    #Define early stopping
    • stopping_params <- list(strategy = "RandomDiscrete", stopping_metric = "misclassification", stopping_rounds = 2, stopping_tolerance = 0.1, seed = 42)

    21. AutoML in h2o

# Run automatic machine learning with 3-fold cross-validation
• automl_model <- h2o.automl(x = x, y = y, training_frame = train, max_runtime_secs = 10, sort_metric = "mean_per_class_error", nfolds = 3, seed = 42)
# Run automatic machine learning with a validation frame
• automl_model <- h2o.automl(x = x, y = y, training_frame = train, max_runtime_secs = 10, sort_metric = "mean_per_class_error", validation_frame = valid, seed = 42)

    22. Extract h2o models and evaluate performance

# Extract the leaderboard
• lb <- automl_model@leaderboard
• head(lb)
# Assign best model a new object name
• aml_leader <- automl_model@leader
# Look at best model
• summary(aml_leader)

    Course 9: Bayesian Regression Modeling with rstanarm

    Chapter 1. Introduction to Bayesian Linear Models

    1. Non-Bayesian Linear Regression

    1.1 Review of frequentist regression

1. Uses ordinary least squares
2. Comparing frequentist (linear model) and Bayesian probabilities, mammogram example:
• P(M+|C) = 0.9 (a woman with breast cancer has a 90% chance of a positive mammogram; analogous to a p-value: the probability of the data given the hypothesis)
• P(C) = 0.004 (0.4% of women in the US have breast cancer; the prior, what we believe about the parameter before looking at the data)
• P(M+) = (0.9 × 0.004) + (0.1 × 0.996) = 0.1032
• P(C|M+) = (0.9 × 0.004) / 0.1032 ≈ 0.035
The posterior (about 3.5%) is very different from 0.9, which shows the importance of making inferences about the parameter we are interested in (the probability of cancer) rather than about the data (the probability of a positive mammogram); see the R check below.
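A minimal sketch checking these numbers in R (variable names are illustrative):
    • p_pos_given_cancer <- 0.9 # P(M+|C)
    • p_cancer <- 0.004 # P(C), the prior
    • p_pos_given_healthy <- 0.1 # P(M+|no C), the false-positive rate used above
    • p_pos <- p_pos_given_cancer * p_cancer + p_pos_given_healthy * (1 - p_cancer) # 0.1032
    • p_pos_given_cancer * p_cancer / p_pos # P(C|M+) ≈ 0.035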

    1.2 Steps and commands

1. Fit a linear model
    • LM_MODEL <- lm(Y ~ X, data = DF)
    • summary(LM_MODEL)
    1. Examine model coefficients
    • library(broom)
    • tidy(LM_MODEL)

    2. Bayesian Linear Regression

    2.1

    3.

    3.1

    Chapter 2. Modifying a Bayesian Model

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 3. Assessing Model Fit

    1.

    1.1

    Chapter 4. Presenting and Using a Bayesian Regression

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Exercise (Course 9: Bayesian Regression Modeling with rstanarm)

    1. Exploring the data

# Print the first 6 rows
• head(songs)
# Print the structure
    • str(songs)

    2. Fitting a frequentist linear regression

# Create the model here
• lm_model <- lm(popularity ~ song_age, data = songs)
# Produce the summary
• summary(lm_model)
# Print a tidy summary of the coefficients
    • tidy(lm_model)

    3.

    4.

    5.

    6.

    7.

    8.

    9.

    10.

    11.

    12.

    13.

    14.

    15.

    16.

    17.

    18.

    19.

    20.

    21.

    22.

    23.

    24.

    25.

    26.

    27.

    28.

    29.

    30.

    [Template

    Course 10:

    Chapter 1.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 2.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 3.

    1.

    1.1

    Chapter 4.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Exercise (Course 10: xxxx)

    1.

    2.

    3.

    4.

    5.

    6.

    7.

    8.

    9.

    10.

    11.

    12.

    13.

    14.

    15.

    16.

    17.

    18.

    19.

    20.

    21.

    22.

    23.

    24.

    25.

    26.

    27.

    28.

    29.

    30.]

    [Template

    Course 11:

    Chapter 1.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 2.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Chapter 3.

    1.

    1.1

    Chapter 4.

    1.

    1.1

    2.

    2.1

    3.

    3.1

    Exercise (Course 11: xxxx)

    1.

    2.

    3.

    4.

    5.

    6.

    7.

    8.

    9.

    10.

    11.

    12.

    13.

    14.

    15.

    16.

    17.

    18.

    19.

    20.

    21.

    22.

    23.

    24.

    25.

    26.

    27.

    28.

    29.

    30.]

    Resource center

    1. H2o automatic tuning

    https://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html