Filtering a data frame by a varying number of logical columns


Problem description

I have a problem I'm sure has an elegant solution I haven't been able to find.

I have a function that creates a dataframe with a varying set of logical vectors. At the end of the function I'd like to combine all the existing logical vectors. The potential names are known, but there are enough that the various permutations with if statements are unworkable.

For example, take the two data frames below. The potential logical vectors are "night", "pet", "rising", and from 1 to 3 of them will exist. I'd like code that can reliably combine whichever of the logical vectors exist.

I got as far as coming up with a list of the column numbers that match the names of the potential columns, but couldn't bring it home.

Hopefully that was clear, thanks for the help.

df1 <- structure(list(hour = structure(c(1123624800, 1123628400, 1123632000, 
1123635600, 1123639200), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
    night = c(FALSE, FALSE, TRUE, TRUE, TRUE), pet = c(TRUE, 
    TRUE, TRUE, TRUE, TRUE)), .Names = c("hour", "night", "pet"
), row.names = c(NA, 5L), class = "data.frame")

df2 <- structure(list(hour = structure(c(1123624800, 1123628400, 1123632000, 
1123635600, 1123639200), class = c("POSIXct", "POSIXt"), tzone = "UTC"), 
    night = c(FALSE, FALSE, TRUE, TRUE, TRUE), pet = c(TRUE, 
    TRUE, TRUE, TRUE, TRUE), rising = c(TRUE, TRUE, FALSE, TRUE, 
    FALSE)), .Names = c("hour", "night", "pet", "rising"), row.names = c(NA, 
5L), class = "data.frame")


filters <- c("rising", "pet", "night")
match(filters, names(df))[!is.na(match(filters, names(df)))]

If I were to write out explicitly what I'd like the code to do:

return(df1[df1$night & df1$pet, ])
return(df2[df2$night & df2$pet & df2$rising, ])

EDIT: I'm going to rewrite this to hopefully be clearer.

I have a data frame with up to three logical vectors containing flags for various data quality filters. For example, the names of the three potential vectors are "night", "pet", "rising". The data frame will have some combination of 1 to 3 of these vectors. Sometimes it will have "pet" and "night", or "night" and "rising", or "pet" and "rising", or all three.

I'd like to return the records where all of the existing logical vectors are true. The problem is that I will not know beforehand which of the vectors exist (this depends on the options in the function call), so I'd like the code to be able to handle all of the various combinations. Something like:

check which logical vectors exist
return(df[(all existing vectors are true), ])

If I just try

return(df[df$rising & df$pet & df$night, ])

The code will fail whenever one of those columns is missing, so I need a more robust way to accomplish this.

Hopefully this is clearer! Generally if I can't articulate the problem it means I'm doing something stupid...

Solution

UPDATE:

df2[Reduce(`&`, df2[sapply(df2, is.logical)]),]

will return rows for which all logical columns are TRUE. You can also use the apply method described at the end.
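
To see what the column selection in that one-liner does, here is a minimal sketch on a small stand-in data frame (the name `d` and the `id` column are illustrative and not part of the original data). It uses `vapply` instead of `sapply` only so that the result is guaranteed to be a logical vector. Note the assumption built into this approach: every logical column is treated as a filter flag, so an unrelated logical column would also take part in the filtering.

```r
# Stand-in data with the same flag values as df1 (illustrative only)
d <- data.frame(id = 1:5,
                night = c(FALSE, FALSE, TRUE, TRUE, TRUE),
                pet   = c(TRUE, TRUE, TRUE, TRUE, TRUE))

# One TRUE/FALSE per column: is this column of type logical?
is_flag <- vapply(d, is.logical, logical(1))
is_flag

# Keep the rows where all logical columns are TRUE
kept <- d[Reduce(`&`, d[is_flag]), ]
kept
```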

You can achieve your objective with Reduce and &:

df1[Reduce(`&`, df1[-1]),]
#                  hour night  pet
# 3 2005-08-10 00:00:00  TRUE TRUE
# 4 2005-08-10 01:00:00  TRUE TRUE
# 5 2005-08-10 02:00:00  TRUE TRUE

Above we excluded the first column with -1. Below we use the list of columns you defined in filters:

df2[Reduce(`&`, df2[filters]),]
#                  hour night  pet rising
# 4 2005-08-10 01:00:00  TRUE TRUE   TRUE
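
One caveat: `df1[filters]` would stop with an "undefined columns selected" error, because df1 has no "rising" column. A minimal sketch of one way around this, assuming a hypothetical helper named `filter_by_flags` (not part of the original answer), is to restrict the filter names to the columns that are actually present with `intersect()` before calling `Reduce`:

```r
# Hypothetical helper: keep the rows where every flag column that is
# actually present in `df` is TRUE.
filter_by_flags <- function(df, flags = c("rising", "pet", "night")) {
  present <- intersect(flags, names(df))   # flag columns that exist in df
  if (length(present) == 0L) return(df)    # assumption: no flags means no filtering
  keep <- Reduce(`&`, df[present])         # element-wise AND across those columns
  df[keep, , drop = FALSE]
}

# Stand-in data frames with the same flag values as df1 and df2
d1 <- data.frame(id = 1:5,
                 night = c(FALSE, FALSE, TRUE, TRUE, TRUE),
                 pet   = c(TRUE, TRUE, TRUE, TRUE, TRUE))
d2 <- cbind(d1, rising = c(TRUE, TRUE, FALSE, TRUE, FALSE))

filter_by_flags(d1)$id   # rows 3, 4 and 5 pass "night" and "pet"
filter_by_flags(d2)$id   # only row 4 passes all three flags
```

Returning the data frame unchanged when no flag column exists is a design choice made for this sketch; raising an error there may suit some callers better.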

Reduce folds & over its second argument (the columns of the data frame): it applies & to the first two columns, then to that result and the next column, and so on until a single logical vector remains.
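
As an illustration of that folding, the sketch below compares `Reduce` with the same operation written out by hand. The three vectors are copied from the flag columns of df2; the variable names `folded`, `explicit` and `steps` are only for this example.

```r
rising <- c(TRUE, TRUE, FALSE, TRUE, FALSE)
pet    <- c(TRUE, TRUE, TRUE, TRUE, TRUE)
night  <- c(FALSE, FALSE, TRUE, TRUE, TRUE)

folded   <- Reduce(`&`, list(rising, pet, night))
explicit <- (rising & pet) & night   # what Reduce computes, written out

identical(folded, explicit)          # TRUE

# accumulate = TRUE keeps every intermediate result of the fold
steps <- Reduce(`&`, list(rising, pet, night), accumulate = TRUE)
length(steps)                        # 3: rising, rising & pet, and the final result
```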

Alternatively, you can use apply:

df2[apply(df2[filters], 1, all),]
df1[apply(df1[-1], 1, all),]
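
As a quick check that the two approaches select the same rows, the sketch below compares them on a small stand-in data frame (the name `d` and the `id` column are illustrative). `apply` converts the selected columns to a logical matrix and calls `all` on each row, whereas `Reduce` combines the columns one at a time.

```r
d <- data.frame(id = 1:5,
                night  = c(FALSE, FALSE, TRUE, TRUE, TRUE),
                pet    = c(TRUE, TRUE, TRUE, TRUE, TRUE),
                rising = c(TRUE, TRUE, FALSE, TRUE, FALSE))
flags <- c("rising", "pet", "night")

by_row <- apply(d[flags], 1, all)   # row-wise: are all flags TRUE?
by_col <- Reduce(`&`, d[flags])     # column-wise fold with &

all(by_row == by_col)               # both give the same selection
d[by_row, ]                         # the single row that passes every flag
```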
