data.table: group, sum, name the new column, and select columns in one step

Problem description

This seems like it should be easy, but I've never been able to figure out how to do it. Using data.table I want to sum a column, C, by another column A, and just keep those two columns. At the same time, I want to be able to name the new column. My attempts and desired output:

library(data.table)
dt <- data.table(A= c('a', 'b', 'b', 'c', 'c'), B=c('19', '20', '21', '22', '23'),
C=c(150,250,20,220,130))

# Desired Output - is there a way to do this in one step using data.table? #
new.data <- dt[, sum(C), by=A]
setnames(new.data,'V1', 'C.total')
new.data
   A C.total
1: a     150
2: b     270
3: c     350

# Attempt 1: Problem is that columns B and C kept, extra rows kept #
new.data <- dt[, 'C.total' := sum(C), by=A]
new.data
   A  B   C C.total
1: a 19 150     150
2: b 20 250     270
3: b 21  20     270
4: c 22 220     350
5: c 23 130     350

# Attempt 2: Problem is that new column not named #
new.data <- dt[, sum(C), by=A]
new.data
   A  V1
1: a 150
2: b 270
3: c 350

Recommended answer

Use list() (or its alias .()) in j, naming the computed column inside the call:

> dt[, list(C.total = sum(C)), by=A]
   A C.total
1: a     150
2: b     270
3: c     350
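
The same idea can be sketched with the .() shorthand, which data.table treats as an alias for list() inside j. The object names totals and summary.dt and the extra n.rows column below are illustrative assumptions, not part of the original question:

```r
library(data.table)

# Same data as in the question
dt <- data.table(A = c("a", "b", "b", "c", "c"),
                 B = c("19", "20", "21", "22", "23"),
                 C = c(150, 250, 20, 220, 130))

# .() is shorthand for list() in j; the name given inside it becomes
# the column name, so no setnames() call is needed afterwards
totals <- dt[, .(C.total = sum(C)), by = A]
# totals has one row per group: C.total is 150, 270, 350 for a, b, c

# Several named summaries can be returned from the same call;
# .N is the number of rows in each group
summary.dt <- dt[, .(C.total = sum(C), n.rows = .N), by = A]
# n.rows is 1, 2, 2 for a, b, c
```

Returning a list from j yields only the by column plus the named list elements, one row per group. By contrast, := (as in Attempt 1) adds the column to dt by reference and keeps every original row, which is why that attempt did not collapse the table.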

