R randomForest: handling the error "data (x) has 0 rows"

Problem description

I am using the randomForest function from the randomForest package to find the most important variables. My data frame is called urban, and my response variable, revenue, is numeric.

urban.random.forest <- randomForest(revenue ~ ., y = urban$revenue, data = urban, ntree = 500, keep.forest = FALSE, importance = TRUE, na.action = na.omit)

I get the following error:

Error in randomForest.default(m, y, ...) : data (x) has 0 rows

In the source code, the error is tied to the x variable:

n <- nrow(x)
p <- ncol(x)
if (n == 0)
stop("data (x) has 0 rows")

But I don't understand what x is.
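For context, when randomForest is called with a formula, x is the predictor data taken from the model frame after na.action has been applied. With na.action = na.omit, every row containing at least one NA is dropped, so a single column that is entirely NA is enough to leave zero rows. The following is a minimal sketch of that effect using base R only; the urban_demo data frame and its column names are made up for illustration and are not the original data.

```r
# Toy stand-in for the urban data frame (hypothetical values and columns).
urban_demo <- data.frame(
  revenue    = c(10, 20, 30, 40),
  population = c(1.5, 2.0, 3.5, 4.0),
  empty_col  = c(NA_real_, NA_real_, NA_real_, NA_real_)  # entirely NA
)

# Build the model frame the way a formula interface does, applying na.omit.
# na.omit removes every row that has an NA in any column.
mf <- model.frame(revenue ~ ., data = urban_demo, na.action = na.omit)
rows_left <- nrow(mf)

# Count the NAs per column to spot which column wipes out the rows.
na_per_column <- colSums(is.na(urban_demo))

print(rows_left)
print(na_per_column)
```

Running colSums(is.na(urban)) on the real data frame is a quick way to check whether any column has as many NAs as there are rows.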

Recommended answer

I solved it. Some of my columns contained only NA values, or the same value in every row. I dropped those columns and the call went through. My column classes were character, numeric, and factor.

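# Collect the indices of numeric columns that hold a single distinct value
# (an all-NA numeric column also counts, since unique() returns just NA).
candidatesnodata.index <- c()
for (j in seq_len(ncol(dataframe))) {
  if (is.numeric(dataframe[[j]]) && length(unique(dataframe[[j]])) == 1) {
    candidatesnodata.index <- append(candidatesnodata.index, j)
  }
}

# Drop those columns. Skip the step when none were found, because indexing
# with an empty negative vector would select zero columns.
if (length(candidatesnodata.index) > 0) {
  dataframe <- dataframe[, -candidatesnodata.index, drop = FALSE]
}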
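The loop above only inspects numeric columns, while the answer mentions character and factor columns as well. As a rough sketch of one way to cover every column class, the helpers below flag a column when it is entirely NA or has at most one distinct non-NA value. The names is_uninformative, drop_uninformative_columns, and example_df are hypothetical and introduced only for this illustration. Note one difference from the loop: a column with one repeated value plus some NAs is also dropped here.

```r
# Hypothetical helper: TRUE when a column is entirely NA or has at most one
# distinct non-NA value, whatever its class.
is_uninformative <- function(column) {
  non_missing <- column[!is.na(column)]
  length(unique(non_missing)) <= 1
}

# Hypothetical helper: keep only the informative columns. A logical mask is
# used instead of negative indices, so nothing breaks when no column qualifies.
drop_uninformative_columns <- function(df) {
  keep <- !vapply(df, is_uninformative, logical(1))
  df[, keep, drop = FALSE]
}

# Made-up data with one constant column and one all-NA column.
example_df <- data.frame(
  revenue  = c(10, 20, 30),
  region   = c("north", "south", "north"),
  constant = c("a", "a", "a"),
  all_na   = c(NA, NA, NA),
  stringsAsFactors = TRUE
)

cleaned <- drop_uninformative_columns(example_df)
print(names(cleaned))
```

Columns that survive this filter can still contain some NAs, so na.action = na.omit may still remove rows; the filter only guards against columns that would remove all of them or carry no information.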

