This article looks at how train(tuneLength = ) in {caret} works together with the SVM methods from {kernlab}. The question and answer below should be a useful reference for anyone working through the same problem.

Problem Description


I'm trying to better understand how train(tuneLength = ) works in {caret}. My confusion arose when I was trying to understand some of the differences between the SVM methods from {kernlab}. I've reviewed the documentation (here) and the caret training page (here).

My toy example was creating five models using the iris dataset. Results are here, and reproducible code is here (they're rather long so I didn't copy and paste them into the post).

From the {caret} documentation for train:

"tuneLength: an integer denoting the amount of granularity in the tuning parameter grid. By default, this argument is the number of levels for each tuning parameters that should be generated by train. If trainControl has the option search = "random", this is the maximum number of tuning parameter combinations that will be generated by the random search."

In this example I used trainControl(search = "random") and train(tuneLength = 30), but there appear to be 67 results, not 30 (the maximum number of tuning parameter combinations). I tried playing around to see if maybe there were 30 unique ROC values, or even ydim values, but by my count there aren't.
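
For reference, a minimal reconstruction of that setup (the 5-fold CV and the choice of svmRadialSigma here are my assumptions; the original five models are in the linked code):

    library(caret)

    set.seed(123)
    ctrl <- trainControl(method = "cv", number = 5, search = "random")

    # tuneLength = 30 is documented as the *maximum* number of random
    # tuning-parameter combinations, so $results should have at most 30 rows
    m <- train(Species ~ ., data = iris, method = "svmRadialSigma",
               tuneLength = 30, trControl = ctrl)
    nrow(m$results)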

For the toy example, I created the following table:

Is there a way to see what's going on "under the hood"? For instance, M1 (svmRadial) and M3 (svmRadialSigma) both take, and are given, the same tuning parameters, but based on calling $results they appear to use them differently?
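
One way to peek under the hood (my addition, not part of the original post) is caret's getModelInfo(), which exposes the grid-building function each method uses:

    library(caret)

    # Each method's tuning behaviour lives in its model-info object
    mod <- getModelInfo("svmRadialSigma", regex = FALSE)[[1]]

    mod$parameters  # the tuning parameters (sigma, C) and their labels
    mod$grid        # the function train() calls to build the tuning grid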

My understanding of train(tuneLength = 9) was that both models would produce results for sigma and C, each with 9 values (9^2 combinations in total), since 9 is the number of levels for each tuning parameter (the exception being random search)? Similarly, M4 would be 9^3 since train(tuneLength = 9) and there are 3 tuning parameters?
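
Spelling that reading out as base R (my illustration; the parameter names are placeholders):

    # 9 levels for each of two parameters would give 9^2 = 81 combinations
    nrow(expand.grid(sigma = 1:9, C = 1:9))                # 81

    # and 9 levels for each of three parameters, 9^3 = 729
    nrow(expand.grid(sigma = 1:9, C = 1:9, Weight = 1:9))  # 729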

Michael

Solution

I need to update the package documentation more, but the details are spelled out on the package web page for random search:

"The total number of unique combinations is specified by the tuneLength option to train."

However, this is particularly muddy for SVMs using the RBF kernel. Here is a rundown (a short sketch illustrating the differences follows the list):

  • svmRadial tunes over cost and uses a single value of sigma based on kernlab's sigest function. For grid search, tuneLength is the number of cost values to test; for random search, it is the total number of (cost, sigma) pairs to evaluate.
  • svmRadialCost is the same as svmRadial but sigest is run inside of each resampling loop. For random search, it does not tune over sigma.
  • svmRadialSigma tunes over both cost and sigma with grid search. In a moment of sub-optimal cognitive performance, I set this up to try at most 6 values of sigma during grid search, since I felt that the cost space needed a wider range. For random search it does the same as svmRadial.
  • svmRadialWeights is the same as svmRadial but also considers class weights, and is for 2-class problems only.
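
A minimal sketch of the grid-search behaviour described above, assuming the iris data and 5-fold CV (those particulars are mine, not the answer's):

    library(caret)  # kernlab must also be installed

    set.seed(2)
    ctrl <- trainControl(method = "cv", number = 5)

    # svmRadial: 9 cost values, one sigma chosen via kernlab::sigest()
    m1 <- train(Species ~ ., data = iris, method = "svmRadial",
                tuneLength = 9, trControl = ctrl)

    # svmRadialSigma: grid over both, but at most 6 distinct sigma values
    m3 <- train(Species ~ ., data = iris, method = "svmRadialSigma",
                tuneLength = 9, trControl = ctrl)

    length(unique(m1$results$sigma))  # 1
    length(unique(m3$results$sigma))  # at most 6
    length(unique(m3$results$C))      # 9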

As for the SOM example on the webpage, well, that's a bug. I over-sample the SOM parameter space since there needs to be a filter for xdim <= ydim & xdim*ydim < nrow(x). The bug is from me not keeping the right number of parameter combinations.
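
To make the over-sample-then-filter idea concrete, a hypothetical sketch (the sample size and the 150-row stand-in for nrow(x) are mine):

    set.seed(3)
    # draw more random (xdim, ydim) pairs than requested, then keep
    # only the valid ones; the leftover count need not match the request
    grid  <- data.frame(xdim = sample(1:10, 100, replace = TRUE),
                        ydim = sample(1:10, 100, replace = TRUE))
    valid <- subset(grid, xdim <= ydim & xdim * ydim < 150)
    nrow(valid)  # can differ from the tuneLength that was asked for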
