Moving from deprecated summarize_ to new summarize in dplyr


Problem description

I have a function that calculates the means of a grouped data frame for a column chosen based on the content of a variable, VarName. The current function uses dplyr::summarize_, but I see this is now deprecated, and I want to replace it before it is fully removed.

However, I'm not sure how to use the new unquoting to achieve what I'm trying to do. Here's my current code:

means<-summarize_(group_by(dat,Grade),.dots = setNames(paste0('mean(',VarName,',na.rm=TRUE)'),'means'))

I tried replacing the .dots part with means=mean(!!VarName, na.rm=TRUE), but that just returned the string inside VarName. What I need is for the string in VarName to be evaluated as a column name within dat, so that I get a column named "means" holding the mean of each group. How can I achieve that with the new summarize?

Sample dataset for reproducibility:

VarName<-"Things"
dat<-data.frame(students=c("a","b","c","d","e"),Grade=c(2,2,2,3,3),varA=c(41:45),Things=c(90,100,80,75,80))

Thanks!

Recommended answer

Turning this into a function and generalizing it for arbitrary data, grouping variable, and value variable:

library(tidyverse)

means <- function(data, group, value) {

  group = enquo(group)
  value = enquo(value)
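  # as_label() turns the captured expression into a string; indexing the
  # result of paste0() on a quosure is fragile across rlang versions
  value_name = paste0("mean_", as_label(value))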

  data %>% group_by(!!group) %>%
    summarise(!!value_name := mean(!!value, na.rm=TRUE))
}

means(dat, Grade, Things)

  Grade mean_Things
  <dbl>       <dbl>
1  2.00        90.0
2  3.00        77.5
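
As a side note that is not part of the original answer: in more recent releases (assuming dplyr 1.0 or later with rlang 0.4.3 or later) the same idea can be sketched with the embrace operator `{{ }}`, which combines enquo() and !! in one step. The function name means_embrace is a hypothetical name chosen for this illustration.

```r
library(dplyr)

dat <- data.frame(
  students = c("a", "b", "c", "d", "e"),
  Grade    = c(2, 2, 2, 3, 3),
  varA     = 41:45,
  Things   = c(90, 100, 80, 75, 80)
)

# Hypothetical helper: takes bare (unquoted) column names
means_embrace <- function(data, group, value) {
  data %>%
    group_by({{ group }}) %>%
    # "mean_{{ value }}" builds the output column name from the argument
    summarise("mean_{{ value }}" := mean({{ value }}, na.rm = TRUE))
}

res_embrace <- means_embrace(dat, Grade, Things)
```

On older installations, the enquo() and !! version above remains the one to use.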

If I understand your comment, how about the function below, which takes a string for the value argument:

means <- function(data, group, value) {

  group = enquo(group)
  value_name = paste0("mean_", value)
  value = sym(value)

  data %>% group_by(!!group) %>%
    summarise(!!value_name := mean(!!value, na.rm=TRUE))
}

VarName = "Things"

means(dat, Grade, VarName)

  Grade mean_Things
  <dbl>       <dbl>
1  2.00        90.0
2  3.00        77.5
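
For the one-off call in the question, a wrapper function is not strictly needed. The sketch below shows why `!!VarName` alone did not work (it inserts the string itself, not a column reference) and two ways around it: converting the string with sym(), or indexing the `.data` pronoun. The object names means_sym and means_pronoun are made up for this illustration.

```r
library(dplyr)

VarName <- "Things"
dat <- data.frame(
  students = c("a", "b", "c", "d", "e"),
  Grade    = c(2, 2, 2, 3, 3),
  varA     = 41:45,
  Things   = c(90, 100, 80, 75, 80)
)

# sym() turns the string "Things" into a symbol, so after unquoting
# it is looked up as a column of dat rather than treated as text
means_sym <- dat %>%
  group_by(Grade) %>%
  summarise(means = mean(!!sym(VarName), na.rm = TRUE))

# Alternative: subset the .data pronoun with the string directly
means_pronoun <- dat %>%
  group_by(Grade) %>%
  summarise(means = mean(.data[[VarName]], na.rm = TRUE))
```

Both results should contain one row per Grade and a column called means, matching the layout the original summarize_ call produced.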

Since the function is generalized, you can do this with any data frame. For example:

means(mtcars, cyl, "mpg")

    cyl mean_mpg
  <dbl>    <dbl>
1  4.00     26.7
2  6.00     19.7
3  8.00     15.1

You can generalize the function still further. For example, this version takes an arbitrary number of grouping columns:

means <- function(data, value, ...) {

  group = quos(...)
  value_name = paste0("mean_", value)
  value = sym(value)

  data %>% group_by(!!!group) %>%
    summarise(!!value_name := mean(!!value, na.rm=TRUE))
}

VarName = "Things"

means(dat, VarName, students, Grade)

  students Grade mean_Things
  <fct>    <dbl>       <dbl>
1 a         2.00        90.0
2 b         2.00       100
3 c         2.00        80.0
4 d         3.00        75.0
5 e         3.00        80.0
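
If the grouping columns also arrive as strings, the same pattern can be extended with syms(), which converts a character vector into a list of symbols that `!!!` can splice into group_by(). This is a minimal sketch, not part of the original answer; the function name means_chr and its argument order are assumptions made for illustration.

```r
library(dplyr)

# Hypothetical helper: value is a single string, groups is a character vector
means_chr <- function(data, value, groups) {
  value_name <- paste0("mean_", value)

  data %>%
    group_by(!!!syms(groups)) %>%
    summarise(!!value_name := mean(!!sym(value), na.rm = TRUE))
}

res_chr <- means_chr(mtcars, "mpg", c("cyl", "am"))
```

Here res_chr should have one row per combination of cyl and am that occurs in mtcars, plus a mean_mpg column.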
