Problem description
I would like to know how to pass a user-defined function in a data.table.
I created the following code using data.table to calculate % of responses 'b' out of all valid responses ('a' or 'b') by two groups; grp1 and grp2:
The data (with a warning message):
    library(data.table)
    dt = data.table(rep(c("I", "II", "III", "IV")), rep(c("A", "B", "C")), rep(c("a", "a", "b", "b", "b"), 20))
    colnames(dt) = c("grp1", "grp2", "Q1")

The code to calculate % respondents:
    dt[, sum(Q1 %in% "b")/sum(!is.na(Q1))*100, by = grp1:grp2][order(grp1, grp2)]

This produces what I need (thanks @Frank for your help at "Calculate % respondents by more than one group for a survey data"):
        grp1 grp2       V1
     1:    I    A 55.55556
     2:    I    B 62.50000
     3:    I    C 62.50000
     4:   II    A 62.50000
     5:   II    B 55.55556
     6:   II    C 62.50000
     7:  III    A 50.00000
     8:  III    B 62.50000
     9:  III    C 66.66667
    10:   IV    A 66.66667
    11:   IV    B 62.50000
    12:   IV    C 50.00000

What I would like to do is to create a function and use it to calculate the equivalent set of values for 50 other items. I created the following function hoping to minimize the repetitive process:
    test = function(question, groupA, groupB){
      dt[, sum(get(question) %in% "b")/sum(!is.na(get(question)))*100,
         by = eval((c(groupA, groupB)))][order(groupA, groupB)]
    }
    test(question = "Q1", groupA = "grp1", groupB ="grp2")

However, this returns only the top row:
       grp1 grp2       V1
    1:    I    A 55.55556

I've read other items on Stack Overflow (e.g. "Using data.table i and j arguments in functions") and tried other code, but I haven't been able to find a way to get it to work.
I'm new to R and would very much appreciate any feedback you may have.
Solution

The issue is in the way you specify the by argument. We can also use keyby instead of by to do the sorting in one step:
    test = function(question, groupA, groupB){
      dt[, sum(get(question) %in% "b") / sum(!is.na(get(question))) * 100,
         keyby = c(groupA, groupB)]
    }
    ans = test(question = "Q1", groupA = "grp1", groupB ="grp2")
    #     grp1 grp2       V1
    #  1:    I    A 55.55556
    #  2:    I    B 62.50000
    #  3:    I    C 62.50000
    #  4:   II    A 62.50000
    #  5:   II    B 55.55556
    #  6:   II    C 62.50000
    #  7:  III    A 50.00000
    #  8:  III    B 62.50000
    #  9:  III    C 66.66667
    # 10:   IV    A 66.66667
    # 11:   IV    B 62.50000
    # 12:   IV    C 50.00000
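
Since the goal was to repeat this for about 50 items, here is a rough sketch of how the keyby version might be applied to several question columns at once. The helper name pct_b, the column name pct, and the extra column Q2 are made up for illustration and are not part of the original post. The table is passed in as an argument rather than read from the global environment. The sketch also shows a likely reason the original attempt kept only one row: inside the function, groupA and groupB are plain strings, so order() sorts two length-one vectors.

```r
library(data.table)

# Toy data equivalent to the question's; length.out makes the recycling
# explicit so that no warning is raised.
dt <- data.table(
  grp1 = rep(c("I", "II", "III", "IV"), length.out = 100),
  grp2 = rep(c("A", "B", "C"), length.out = 100),
  Q1   = rep(c("a", "a", "b", "b", "b"), 20)
)
dt[, Q2 := rev(Q1)]  # hypothetical second item, only for illustration

# order() applied to the strings themselves, not to the columns they name
order("grp1", "grp2")  # returns 1

# Hypothetical helper: percentage of "b" among non-missing responses,
# grouped and sorted by the columns named in `groups`.
pct_b <- function(data, question, groups) {
  data[, .(pct = sum(get(question) %in% "b") / sum(!is.na(get(question))) * 100),
       keyby = groups]
}

# Apply the helper to each question and stack the results;
# the list names become the `question` column.
questions <- c("Q1", "Q2")
res <- rbindlist(
  lapply(setNames(questions, questions),
         function(q) pct_b(dt, q, c("grp1", "grp2"))),
  idcol = "question"
)
res
```

With 50 real items, questions could be built from the column names instead, for example with grep("^Q", names(dt), value = TRUE), assuming the item columns share a common prefix.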