This article covers how to handle the h2o deeplearning error that appears when nfolds is specified for cross-validation.

Problem description

Has this issue been resolved by now? I am encountering the same error message.

Use case: I am doing binary classification with h2o's deeplearning() function. Below, I provide randomly generated data of the same size as my actual use case. System specs:

# R version 3.3.2 (2016-10-31)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows >= 8 x64 (build 9200)
# h2o version h2o_3.20.0.2

I am currently learning how to use h2o, so I have played with that function quite a bit. Everything runs smoothly until I specify parameters for cross validation.

The problem occurs when I specify the nfolds parameter for cross-validation. Interestingly, low values of nfolds work fine. In my actual use case, even nfolds > 3 produces an error message (see below). In the example below, I was able to use nfolds < 7, although not consistently (sometimes only up to nfolds = 3). Above those values, the REST API gives the error mentioned above: object not found for argument: key.

# It does not matter whether this runs on Debian or Windows, or how many threads are used.
# The error occurs only when cross-validation options are set; otherwise everything works.
# No error occurs with a low nfolds value (in my actual use case, at most 3 folds work).

library(h2o)
h2o.init(nthreads = -1)

# 900 rows x 96 numeric predictors, plus a random binary response
x <- matrix(rnorm(900 * 96, mean = 10, sd = 2), nrow = 900, ncol = 96)
y <- sample(size = 900, x = c(0, 1), replace = TRUE)

sampleData <- cbind(x, y)
sampleData <- data.frame(sampleData)
sampleData[, 97] <- as.factor(sampleData[, 97])  # response must be a factor for classification

m <- h2o.deeplearning(x = 1:96, y = 97,
                      training_frame = as.h2o(sampleData), reproducible = TRUE,
                      activation = "Tanh", hidden = c(64, 16), epochs = 10000, verbose = TRUE,
                      nfolds = 4, balance_classes = TRUE, # cross-validation
                      standardize = TRUE, variable_importances = TRUE, seed = 123,
                      stopping_rounds = 2, stopping_metric = "misclassification", stopping_tolerance = 0.01 # early stopping
)

performance = h2o.performance(model = m)
print(performance)

######### gives error message
# ERROR: Unexpected HTTP Status code: 404 Not Found (url = http://localhost:xxxxx/3/Models/DeepLearning_model_R_1535036938222_489)
# 
# water.exceptions.H2OKeyNotFoundArgumentException
# [1] "water.exceptions.H2OKeyNotFoundArgumentException: Object 'DeepLearning_model_R_1535036938222_489' not found for argument: key"

I cannot understand why it works only for low values of nfolds. Any suggestions? What am I missing here? I've searched most of the remotely related threads on Google Groups and here on Stack Overflow, but without success. If this has to do with a changed API for h2o 3.x, as suggested above (though that post was 18 months ago...), I would highly appreciate some documentation on how to correctly specify the syntax for CV with h2o.deeplearning(). Thanks in advance!

Recommended answer

This is a bug triggered by setting the verbose parameter to TRUE. The workaround is to leave verbose at its default value, FALSE. I've created a JIRA ticket to track the issue here
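
As a minimal sketch of that workaround, the model from the question can be wrapped in a small function that keeps every setting but leaves verbose at FALSE. The helper name fit_cv_model is made up for illustration and is not part of h2o. The sketch also assumes that the diagnosis above applies to your h2o version.

```r
# Random data with the same shape as in the question: 96 numeric
# predictors plus a binary factor response in column 97.
x <- matrix(rnorm(900 * 96, mean = 10, sd = 2), nrow = 900, ncol = 96)
y <- sample(c(0, 1), size = 900, replace = TRUE)
sampleData <- data.frame(x, y = as.factor(y))

# fit_cv_model is a hypothetical helper name, not an h2o function.
# It needs the h2o package and a running cluster only when it is called.
fit_cv_model <- function(data, nfolds = 4) {
  h2o::h2o.deeplearning(
    x = 1:96, y = 97,
    training_frame = h2o::as.h2o(data),
    reproducible = TRUE,
    activation = "Tanh", hidden = c(64, 16), epochs = 10000,
    verbose = FALSE,  # the workaround: keep the default instead of TRUE
    nfolds = nfolds, balance_classes = TRUE,  # cross-validation
    standardize = TRUE, variable_importances = TRUE, seed = 123,
    stopping_rounds = 2, stopping_metric = "misclassification",
    stopping_tolerance = 0.01  # early stopping
  )
}
```

After starting a cluster with h2o.init(nthreads = -1), calling m <- fit_cv_model(sampleData, nfolds = 4) and then h2o.performance(m, xval = TRUE) should return the cross-validated metrics, provided the error really was caused by verbose.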
