trainControl parameters in detail

Source code

# caret::trainControl
trainControl <- function(method = "boot",
                         number = ifelse(grepl("cv", method), 10, 25),
                         repeats = ifelse(grepl("[d_]cv$", method), 1, NA),
                         p = 0.75,
                         search = "grid",
                         initialWindow = NULL,
                         horizon = 1,
                         fixedWindow = TRUE,
                         skip = 0,
                         verboseIter = FALSE,
                         returnData = TRUE,
                         returnResamp = "final",
                         savePredictions = FALSE,
                         classProbs = FALSE,
                         summaryFunction = defaultSummary,
                         selectionFunction = "best",
                         preProcOptions = list(thresh = 0.95, ICAcomp = 3, k = 5,
                                               freqCut = 95/5, uniqueCut = 10, cutoff = 0.9),
                         sampling = NULL,
                         index = NULL,
                         indexOut = NULL,
                         indexFinal = NULL,
                         timingSamps = 0,
                         predictionBounds = rep(FALSE, 2),
                         seeds = NA,
                         adaptive = list(min = 5, alpha = 0.05, method = "gls", complete = TRUE),
                         trim = FALSE,
                         allowParallel = TRUE)
{
  if (is.null(selectionFunction))
    stop("null selectionFunction values not allowed")
  if (!(returnResamp %in% c("all", "final", "none")))
    stop("incorrect value of returnResamp")
  if (length(predictionBounds) > 0 && length(predictionBounds) != 2)
    stop("'predictionBounds' should be a logical or numeric vector of length 2")
  if (any(names(preProcOptions) == "method"))
    stop("'method' cannot be specified here")
  if (any(names(preProcOptions) == "x"))
    stop("'x' cannot be specified here")
  if (!is.na(repeats) & !(method %in% c("repeatedcv", "adaptive_cv")))
    warning("`repeats` has no meaning for this resampling method.", call. = FALSE)
  if (!(adaptive$method %in% c("gls", "BT")))
    stop("incorrect value of adaptive$method")
  if (adaptive$alpha < 1e-07 | adaptive$alpha > 1)
    stop("incorrect value of adaptive$alpha")
  if (grepl("adapt", method)) {
    num <- if (method == "adaptive_cv") number * repeats else number
    if (adaptive$min >= num)
      stop(paste("adaptive$min should be less than", num))
    if (adaptive$min <= 1)
      stop("adaptive$min should be greater than 1")
  }
  if (!(search %in% c("grid", "random")))
    stop("`search` should be either 'grid' or 'random'")
  if (method == "oob" & any(names(match.call()) == "summaryFunction")) {
    warning("Custom summary measures cannot be computed for out-of-bag resampling. ",
            "This value of `summaryFunction` will be ignored.", call. = FALSE)
  }
  list(method = method, number = number, repeats = repeats, search = search, p = p,
       initialWindow = initialWindow, horizon = horizon, fixedWindow = fixedWindow,
       skip = skip, verboseIter = verboseIter, returnData = returnData,
       returnResamp = returnResamp, savePredictions = savePredictions,
       classProbs = classProbs, summaryFunction = summaryFunction,
       selectionFunction = selectionFunction, preProcOptions = preProcOptions,
       sampling = sampling, index = index, indexOut = indexOut, indexFinal = indexFinal,
       timingSamps = timingSamps, predictionBounds = predictionBounds, seeds = seeds,
       adaptive = adaptive, trim = trim, allowParallel = allowParallel)
}
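
The defaults and checks in the source above can be probed directly. The following is a minimal sketch, assuming the installed caret version behaves like the listing shown here; the object names (`ctrl_boot`, `ctrl_cv`, `ctrl_rcv`, `repeats_warning`, `resamp_error`) are made up for illustration.

```r
library(caret)

# `number` and `repeats` take different defaults depending on `method`
ctrl_boot <- trainControl()                      # method = "boot"
ctrl_cv   <- trainControl(method = "cv")
ctrl_rcv  <- trainControl(method = "repeatedcv")

c(boot = ctrl_boot$number, cv = ctrl_cv$number, repeatedcv = ctrl_rcv$number)
c(boot = ctrl_boot$repeats, cv = ctrl_cv$repeats, repeatedcv = ctrl_rcv$repeats)

# Supplying `repeats` for a method other than repeatedcv / adaptive_cv
# raises a warning; capture its message instead of printing it
repeats_warning <- tryCatch(
  trainControl(method = "boot", repeats = 3),
  warning = function(w) conditionMessage(w)
)
repeats_warning

# A returnResamp value outside "all" / "final" / "none" stops with an error
resamp_error <- tryCatch(
  trainControl(returnResamp = "some"),
  error = function(e) conditionMessage(e)
)
resamp_error
```

Note that `trainControl` only validates its arguments and returns them as a list; no resampling happens until the list is passed to `train`.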

Parameter details

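trainControl argument	Description
method	Resampling method: `boot` (bootstrap, i.e. random sampling with replacement), `boot632` (the 0.632 bootstrap, an extension of the bootstrap), `LOOCV` (leave-one-out cross-validation), `LGOCV` (leave-group-out, i.e. Monte Carlo cross-validation), `cv` (k-fold cross-validation), `repeatedcv` (repeated k-fold cross-validation), `optimism_boot` (the optimism bootstrap; see Efron, B., & Tibshirani, R. J. (1994). "An introduction to the bootstrap", pages 249-252. CRC press.), `none` (fit a single model to the whole training set, with no resampling), `oob` (out-of-bag estimates; only for random forest, bagged trees, bagged MARS, bagged flexible discriminant analysis, and conditional tree forest models)
number	Number of folds for k-fold cross-validation, or number of resampling iterations for the bootstrap and LGOCV
repeats	Number of times the k-fold cross-validation is repeated (only meaningful for `repeatedcv`)
p	For LGOCV: the proportion of the data used for training
verboseIter	Logical; whether to print a training log
returnData	Logical; whether to save the training data in the `trainingData` element of the fitted model (inspect it by calling `str()` on the object returned by `train`)
search	How the tuning grid is built: `grid` (grid search) or `random` (random search)
returnResamp	One of the strings `final`, `all`, `none`; controls how many of the resampled performance measures are saved
classProbs	Logical; whether to compute class probabilities for classification models
summaryFunction	Function that computes model performance from the resampled predictions
selectionFunction	Function that selects the optimal tuning parameters
index	Rows to use for training in each resampling iteration (lets you evaluate different algorithms or models on the same resamples)
allowParallel	Logical; whether parallel processing is allowed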
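
To make the `index`, `classProbs` and `summaryFunction` rows more concrete, here is a sketch that compares two models on identical folds. The choice of models (`knn` and `lda`), the seed, the use of 5 folds and all object names are assumptions made for this illustration, not something taken from the original post; `lda` relies on the MASS package being installed.

```r
library(caret)
library(mlbench)   # provides the Sonar data used in the example further down
data(Sonar)

set.seed(1)  # arbitrary seed chosen for this sketch
# One vector of training-row indices per fold (returnTrain = TRUE)
folds <- createFolds(Sonar$Class, k = 5, returnTrain = TRUE)

ctrl <- trainControl(
  method = "cv",
  number = 5,
  index = folds,                      # every model is resampled on these rows
  classProbs = TRUE,                  # twoClassSummary needs class probabilities
  summaryFunction = twoClassSummary,  # reports ROC, Sens and Spec
  verboseIter = FALSE,
  allowParallel = FALSE
)

fit_knn <- train(Class ~ ., data = Sonar, method = "knn",
                 metric = "ROC", trControl = ctrl)
fit_lda <- train(Class ~ ., data = Sonar, method = "lda",
                 metric = "ROC", trControl = ctrl)

# Both fits used the same folds, so their resampled metrics can be compared
summary(resamples(list(knn = fit_knn, lda = fit_lda)))
```

Passing the same `index` list to both calls is what makes the comparison fair: any difference in the resampled metrics then comes from the models rather than from different random splits.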

Example

> library(mlbench) # use the data shipped with this package
Warning message:
package ‘mlbench’ was built under R version 4.1.3
> data(Sonar)
> str(Sonar[, 1:10])
'data.frame':   208 obs. of  10 variables:
 $ V1 : num  0.02 0.0453 0.0262 0.01 0.0762 0.0286 0.0317 0.0519 0.0223 0.0164 ...
 $ V2 : num  0.0371 0.0523 0.0582 0.0171 0.0666 0.0453 0.0956 0.0548 0.0375 0.0173 ...
 $ V3 : num  0.0428 0.0843 0.1099 0.0623 0.0481 ...
 $ V4 : num  0.0207 0.0689 0.1083 0.0205 0.0394 ...
 $ V5 : num  0.0954 0.1183 0.0974 0.0205 0.059 ...
 $ V6 : num  0.0986 0.2583 0.228 0.0368 0.0649 ...
 $ V7 : num  0.154 0.216 0.243 0.11 0.121 ...
 $ V8 : num  0.16 0.348 0.377 0.128 0.247 ...
 $ V9 : num  0.3109 0.3337 0.5598 0.0598 0.3564 ...
 $ V10: num  0.211 0.287 0.619 0.126 0.446 ...

Splitting the data:

library(caret)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining,] # training set
testing  <- Sonar[-inTraining,] # test set
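
A quick way to check the split is to look at the sizes of the two sets and at the class balance, since `createDataPartition` samples within each class. This is only a sketch; the names `balance_all` and `balance_train` are made up here.

```r
library(caret)
library(mlbench)
data(Sonar)

set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining, ]
testing  <- Sonar[-inTraining, ]

# Sizes of the two sets; the 157 training rows match the model summary below
c(training = nrow(training), testing = nrow(testing))

# Class proportions in the full data and in the training set should be close
balance_all   <- prop.table(table(Sonar$Class))
balance_train <- prop.table(table(training$Class))
round(rbind(all = balance_all, training = balance_train), 3)
```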

Fitting the model:

fitControl <- trainControl(## 10-fold cross-validation
                           method = "repeatedcv",
                           number = 10,
                           ## repeated 10 times
                           repeats = 10)

set.seed(825)
gbmFit1 <- train(Class ~ ., data = training,
                 method = "gbm", # boosted trees
                 trControl = fitControl,
                 verbose = FALSE)
gbmFit1
Stochastic Gradient Boosting

157 samples
 60 predictor
  2 classes: 'M', 'R'

No pre-processing
Resampling: Cross-Validated (10 fold, repeated 10 times)
Summary of sample sizes: 141, 142, 141, 142, 141, 142, ...
Resampling results across tuning parameters:

  interaction.depth  n.trees  Accuracy   Kappa
  1                   50      0.7935784  0.5797839
  1                  100      0.8171078  0.6290208
  1                  150      0.8219608  0.6383173
  2                   50      0.8041912  0.6027771
  2                  100      0.8296176  0.6544713
  2                  150      0.8283627  0.6520181
  3                   50      0.8110343  0.6170317
  3                  100      0.8301275  0.6551379
  3                  150      0.8310343  0.6577252

Tuning parameter 'shrinkage' was held constant at a value of 0.1
Tuning parameter 'n.minobsinnode' was held constant at a value of 10
Accuracy was used to select the optimal model using the largest value.
The final values used for the model were n.trees = 150, interaction.depth = 3, shrinkage = 0.1 and n.minobsinnode = 10.
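
As a possible follow-up to the printed summary, the fitted object can be inspected programmatically and checked against the held-out test set. The sketch below is not from the original post: it uses fewer repeats than the run above purely to keep it quick, and the names `quickControl`, `gbmQuick` and `testPred` are introduced here for illustration. Its numbers will therefore differ from the table above.

```r
library(caret)
library(mlbench)
data(Sonar)

set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining, ]
testing  <- Sonar[-inTraining, ]

# Fewer repeats than in the article, only so the sketch runs quickly
quickControl <- trainControl(method = "repeatedcv", number = 10, repeats = 3)

set.seed(825)
gbmQuick <- train(Class ~ ., data = training, method = "gbm",
                  trControl = quickControl, verbose = FALSE)

gbmQuick$bestTune        # the tuning-parameter combination that was selected
head(gbmQuick$results)   # resampled Accuracy and Kappa for each combination

# Accuracy on the rows that took no part in tuning
testPred <- predict(gbmQuick, newdata = testing)
mean(testPred == testing$Class)
```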

Those are some of the details of using the trainControl function. If you have questions, feel free to leave a comment.
