I have some data in which the Y variable is a factor with two levels, Good and Bad. I am building a support vector machine using the 'train' function from the 'caret' package. With 'train' I was able to finalize the values of the tuning parameters and obtain the final support vector machine. For the test data I can predict the class, but when I try to predict probabilities I get the warning below. For example, my model tells me that the first data point in the test data has y = 'Good', but I want to know the probability of getting 'Good'. As I understand it, when the Y variable has two outcomes the model can estimate a probability for each outcome, and the outcome with the highest probability is taken as the final prediction.

**Warning message: In probFunction(method, modelFit, ppUnk) : kernlab class probability calculations failed; returning NAs**

Sample code is below:

    library(caret)

    trainset <- data.frame(
      class = factor(c("Good", "Bad", "Good", "Good", "Bad", "Good",
                       "Good", "Good", "Good", "Bad", "Bad", "Bad")),
      age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28, 25, 24))

    testset <- data.frame(
      class = factor(c("Good", "Bad", "Good")),
      age = c(64, 23, 50))

    library(kernlab)
    set.seed(231)

    ### finding optimal value of a tuning parameter
    sigDist <- sigest(class ~ ., data = trainset, frac = 1)

    ### creating a grid of two tuning parameters; .sigma comes from the earlier line,
    ### and we are trying to find the best value of .C
    svmTuneGrid <- data.frame(.sigma = sigDist[1], .C = 2^(-2:7))

    set.seed(1056)
    svmFit <- train(class ~ .,
                    data = trainset,
                    method = "svmRadial",
                    preProc = c("center", "scale"),
                    tuneGrid = svmTuneGrid,
                    trControl = trainControl(method = "repeatedcv", repeats = 5))

    ### svmFit finds the optimal values of the tuning parameters and builds the model using the best parameters

    ### to predict class of test data
    predictedClasses <- predict(svmFit, testset)
    str(predictedClasses)

    ### predict probabilities, but this is where I get the warning
    predictedProbs <- predict(svmFit, newdata = testset, type = "prob")
    head(predictedProbs)

New question below this line: according to the output below, there are 9 support vectors. How can I identify which 9 of the 12 training data points they are?

    svmFit$finalModel

    Support Vector Machine object of class "ksvm"

    SV type: C-svc  (classification)
     parameter : cost C = 1

    Gaussian Radial Basis kernel function.
     Hyperparameter : sigma =  0.72640759446315

    Number of Support Vectors : 9

    Objective Function Value : -5.6994
    Training error : 0.083333

Recommended answer

In the train control statement, you have to specify that you want class probabilities returned by setting classProbs = TRUE.

    svmFit <- train(class ~ .,
                    data = trainset,
                    method = "svmRadial",
                    preProc = c("center", "scale"),
                    tuneGrid = svmTuneGrid,
                    trControl = trainControl(method = "repeatedcv", repeats = 5,
                                             classProbs = TRUE))

    predictedClasses <- predict(svmFit, testset)
    predictedProbs <- predict(svmFit, newdata = testset, type = "prob")

This gives the probabilities of being in the Bad or Good class for the test dataset:

    print(predictedProbs)
            Bad      Good
    1 0.2302979 0.7697021
    2 0.7135050 0.2864950
    3 0.2230889 0.7769111

EDIT

To answer your new question, you can access the positions of the support vectors in your original data set with alphaindex(svmFit$finalModel), and the corresponding coefficients with coef(svmFit$finalModel).
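
As a minimal sketch of the support-vector lookup, the block below fits a `ksvm` directly with kernlab instead of going through caret, using the question's training data and the C and sigma values from the printed model. The names `fit`, `sv_idx`, `sv_coef` and `sv_rows` are illustrative, not from the original post. It assumes a two-class C-svc fit, so the lists returned by `alphaindex()` and `coef()` each have a single element. Because `svmFit$finalModel` is also a `ksvm` object, the same accessors should apply to it, though kernlab's built-in scaling is not identical to caret's `preProc` step, so the selected rows may differ slightly.

```r
library(kernlab)

# Training data copied from the question
trainset <- data.frame(
  class = factor(c("Good", "Bad", "Good", "Good", "Bad", "Good",
                   "Good", "Good", "Good", "Bad", "Bad", "Bad")),
  age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28, 25, 24))

# Assumption: fit kernlab directly with the C and sigma reported for the final model
fit <- ksvm(class ~ age, data = trainset,
            kernel = "rbfdot", kpar = list(sigma = 0.72640759446315), C = 1)

# For a binary problem, alphaindex() and coef() return lists of length one
sv_idx  <- alphaindex(fit)[[1]]   # row positions of the support vectors
sv_coef <- coef(fit)[[1]]         # matching coefficients (alpha times label)

# Show the training rows that act as support vectors, with their coefficients
sv_rows <- cbind(row = sv_idx, trainset[sv_idx, ], coef = sv_coef)
print(sv_rows)
```

With the caret fit from the answer, `trainset[alphaindex(svmFit$finalModel)[[1]], ]` would list the corresponding training rows in the same way.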