Moving from the deprecated summarise_ to the new summarise in dplyr

Problem description

I have a function that calculates the means of a grouped database for a column which is chosen based on the content of a variable VarName. The current function uses dplyr::summarize_, but now I see this is deprecated, and I want to replace it before it is fully removed.

However, I'm not sure how to use the new unquoting to achieve what I'm trying to do. Here's my current code:

means<-summarize_(group_by(dat,Grade),.dots = setNames(paste0('mean(',VarName,',na.rm=TRUE)'),'means'))

I tried replacing the .dots part with means=mean(!!VarName, na.rm=TRUE), but that just returned the string inside VarName. What I need is for the string in VarName to be evaluated as the column name within dat, so that I'll get a column name "means" with the mean of each group. How can I achieve that with the new summarize?

Sample dataset for reproducibility:

VarName<-"Things"
dat<-data.frame(students=c("a","b","c","d","e"),Grade=c(2,2,2,3,3),varA=c(41:45),Things=c(90,100,80,75,80))

Thanks!

Recommended answer

Turning this into a function and generalizing for arbitrary data, grouping variable, and value variable:

library(tidyverse)

means <- function(data, group, value) {

  group = enquo(group)
  value = enquo(value)
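  # as_label() turns the captured expression into a string such as "Things";
  # indexing paste0() on a quosure with [2] relied on old coercion behaviour
  value_name = paste0("mean_", as_label(value))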

  data %>% group_by(!!group) %>%
    summarise(!!value_name := mean(!!value, na.rm=TRUE))
}

means(dat, Grade, Things)



  Grade mean_Things
  <dbl>       <dbl>
1  2.00        90.0
2  3.00        77.5
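
On more recent versions, the same function can be sketched with the embrace operator {{ }}, which does the work of enquo() and !! in one step. This is a minimal sketch, not part of the original answer. It assumes dplyr 1.0 or later, with an rlang version that supports "{{ }}" inside the output name. The name means_embrace is only illustrative.

```r
library(dplyr)

# Sample data from the question
dat <- data.frame(
  students = c("a", "b", "c", "d", "e"),
  Grade = c(2, 2, 2, 3, 3),
  varA = 41:45,
  Things = c(90, 100, 80, 75, 80)
)

# Hypothetical name; assumes a dplyr/rlang version that supports {{ }}
means_embrace <- function(data, group, value) {
  data %>%
    group_by({{ group }}) %>%
    # "mean_{{ value }}" builds the output column name from the argument
    summarise("mean_{{ value }}" := mean({{ value }}, na.rm = TRUE))
}

res_embrace <- means_embrace(dat, Grade, Things)
print(res_embrace)
```

The call takes bare column names, exactly like the enquo() version above.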


If I understand your comment, how about the function below, which takes a string for the value argument:

means <- function(data, group, value) {

  group = enquo(group)
  value_name = paste0("mean_", value)
  value = sym(value)

  data %>% group_by(!!group) %>%
    summarise(!!value_name := mean(!!value, na.rm=TRUE))
}

VarName = "Things"

means(dat, Grade, VarName)



  Grade mean_Things
  <dbl>       <dbl>
1  2.00        90.0
2  3.00        77.5
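
The sym() call is what makes the difference from the attempt in the question. A small illustration, using expr() only to display what each form of unquoting produces:

```r
library(dplyr)  # re-exports expr() and sym() from rlang

VarName <- "Things"

# Unquoting a string inlines the string itself into the call
with_string <- expr(mean(!!VarName, na.rm = TRUE))

# Unquoting a symbol inlines a name that summarise() can look up as a column
with_symbol <- expr(mean(!!sym(VarName), na.rm = TRUE))

print(with_string)
print(with_symbol)
```

The first expression prints as mean("Things", na.rm = TRUE), so mean() is handed a string. The second prints as mean(Things, na.rm = TRUE), which refers to the column.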


Since the function is generalized, you can do this with any data frame. For example:

means(mtcars, cyl, "mpg")



    cyl mean_mpg
  <dbl>    <dbl>
1  4.00     26.7
2  6.00     19.7
3  8.00     15.1
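
As a possible alternative to sym() for string column names, the .data pronoun can look the column up directly. This is a sketch, not part of the original answer. It assumes a dplyr version that provides .data, and the name means_data is hypothetical.

```r
library(dplyr)

# Hypothetical name; .data[[value]] fetches the column named by the string
means_data <- function(data, group, value) {
  group <- enquo(group)
  value_name <- paste0("mean_", value)

  data %>%
    group_by(!!group) %>%
    summarise(!!value_name := mean(.data[[value]], na.rm = TRUE))
}

res_data <- means_data(mtcars, cyl, "mpg")
print(res_data)
```

A side effect of this form is that a misspelled column name raises an error instead of silently picking up a variable of the same name from the calling environment.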


You can generalize the function still further. For example, this version takes an arbitrary number of grouping columns:

means <- function(data, value, ...) {

  group = quos(...)
  value_name = paste0("mean_", value)
  value = sym(value)

  data %>% group_by(!!!group) %>%
    summarise(!!value_name := mean(!!value, na.rm=TRUE))
}

VarName = "Things"

means(dat, VarName, students, Grade)



  students Grade mean_Things
  <fct>    <dbl>       <dbl>
1 a         2.00        90.0
2 b         2.00       100
3 c         2.00        80.0
4 d         3.00        75.0
5 e         3.00        80.0
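
Because group_by() itself accepts ..., the grouping columns can also be forwarded directly instead of being captured with quos() and spliced with !!!. A minimal sketch of that variant follows. The name means_dots is hypothetical, and the ungroup() at the end is an added choice, not something the original answer does.

```r
library(dplyr)

# Sample data from the question
dat <- data.frame(
  students = c("a", "b", "c", "d", "e"),
  Grade = c(2, 2, 2, 3, 3),
  varA = 41:45,
  Things = c(90, 100, 80, 75, 80)
)

# Hypothetical name; ... is passed to group_by() unchanged
means_dots <- function(data, value, ...) {
  value_name <- paste0("mean_", value)

  data %>%
    group_by(...) %>%
    summarise(!!value_name := mean(!!sym(value), na.rm = TRUE)) %>%
    # summarise() drops only the last grouping level, so remove the rest
    ungroup()
}

res_dots <- means_dots(dat, "Things", students, Grade)
print(res_dots)
```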

