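The melt/cast functions in the reshape package are great, but I'm not sure whether there is a simple way to apply them when the measured variables are of different types. For example, here is a snippet of data in which each MD provides the gender and weight of three patients: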

ID PT1 WT1 PT2 WT2 PT3 WT3
1  "M" 170 "M" 175 "F" 145
...

The goal is to reshape the data so that each row is one patient:
ID PTNUM GENDER WEIGHT
1    1     "M"    170
1    2     "M"    175
1    3     "F"    145
...

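I know that the reshape function in the stats package is one option, but I'm posting here in the hope that R users more experienced than I am will post other, hopefully better, methods. Many thanks!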
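For reference, the stats::reshape option mentioned above might look roughly like the sketch below. The data frame `wide` and its values are made up to mirror the layout in the question (only the first row comes from the example); the column names GENDER, WEIGHT and PTNUM are taken from the desired output.

```r
# Toy data in the same wide layout as the question (values are illustrative)
wide <- data.frame(
  ID  = 1:2,
  PT1 = c("M", "F"), WT1 = c(170, 130),
  PT2 = c("M", "F"), WT2 = c(175, 160),
  PT3 = c("F", "M"), WT3 = c(145, 180),
  stringsAsFactors = FALSE
)

# One list element per long-format variable: genders first, then weights
long <- reshape(
  wide,
  direction = "long",
  idvar     = "ID",
  varying   = list(c("PT1", "PT2", "PT3"), c("WT1", "WT2", "WT3")),
  v.names   = c("GENDER", "WEIGHT"),
  timevar   = "PTNUM",
  times     = 1:3
)

# Sort so that each MD's patients appear together, and drop the generated row names
long <- long[order(long$ID, long$PTNUM), ]
rownames(long) <- NULL
long
```

Because each variable is stacked into its own column, GENDER stays character and WEIGHT stays numeric.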

--

@Vincent Zoonekynd:

I liked your example a lot, so I generalized it to multiple variables.
# Sample data
n <- 5
d <- data.frame(
  id = 1:n,
  p1 = sample(c("M","F"),n,replace=TRUE),
  q1 = sample(c("Alpha","Beta"),n,replace=TRUE),
  w1 = round(runif(n,100,200)),
  y1 = round(runif(n,100,200)),
  p2 = sample(c("M","F"),n,replace=TRUE),
  q2 = sample(c("Alpha","Beta"),n,replace=TRUE),
  w2 = round(runif(n,100,200)),
  y2 = round(runif(n,100,200)),
  p3 = sample(c("M","F"),n,replace=TRUE),
  q3 = sample(c("Alpha","Beta"),n,replace=TRUE),
  w3 = round(runif(n,100,200)),
  y3 = round(runif(n,100,200))
  )
# Reshape the data.frame, one variable type at a time (character, then numeric)
library(reshape)
d1 <- melt(d, id.vars="id", measure.vars=c("p1","p2","p3","q1","q2","q3"))
d2 <- melt(d, id.vars="id", measure.vars=c("w1","w2","w3","y1","y2","y3"))
d1 = cbind(d1,colsplit(d1$variable,names=c("var","ptnum")))
d2 = cbind(d2,colsplit(d2$variable,names=c("var","ptnum")))
d1$variable = NULL
d2$variable = NULL
d1c = cast(d1,...~var)
d2c = cast(d2,...~var)
# Join the two data.frames
d3 = merge(d1c, d2c, by=c("id","ptnum"), all=TRUE)

--

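Final thoughts: my motivation for this question was to learn about reshaping alternatives to the stats::reshape function. For now, I have reached the following conclusions:
  • Stick with stats::reshape whenever possible. As long as you remember to use a list rather than a plain vector for the "varying" argument, you will stay out of trouble. For smaller data sets (this time I was dealing with a few thousand patients and fewer than 200 variables in total), the lower speed of this function is worth the simpler code.
  • To use the cast/melt approach in Hadley Wickham's reshape (or reshape2) package, you have to split your variables into two sets, one consisting of the numeric variables and the other of the character variables. Once your data set is large enough that you find stats::reshape unbearably slow, I imagine the extra step of splitting the variables into two sets won't seem so bad.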
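As a rough illustration of the first conclusion, here is a sketch of stats::reshape applied to the generalized data (p, q, w, y for three patients), with "varying" given as a list. The helper names `stems` and `varying` are only for this example. Passing a list makes the grouping of columns explicit, whereas with a plain vector reshape has to work the grouping out from the column names or from the order in which the columns are listed, which is easier to get wrong.

```r
# Sample data with the same column naming as the generalized example
set.seed(42)
n <- 5
d <- data.frame(id = 1:n)
for (k in 1:3) {
  d[[paste0("p", k)]] <- sample(c("M", "F"), n, replace = TRUE)
  d[[paste0("q", k)]] <- sample(c("Alpha", "Beta"), n, replace = TRUE)
  d[[paste0("w", k)]] <- round(runif(n, 100, 200))
  d[[paste0("y", k)]] <- round(runif(n, 100, 200))
}

# One list element per variable: c("p1","p2","p3"), c("q1","q2","q3"), ...
stems   <- c("p", "q", "w", "y")
varying <- lapply(stems, function(s) paste0(s, 1:3))

long <- reshape(
  d,
  direction = "long",
  idvar     = "id",
  varying   = varying,   # a list, not a plain character vector
  v.names   = stems,     # same order as the list elements
  timevar   = "ptnum",
  times     = 1:3
)
long <- long[order(long$id, long$ptnum), ]
rownames(long) <- NULL
long
```

No splitting by type is needed here: the character columns (p, q) and the numeric columns (w, y) are reshaped in a single call.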
--

Best answer:

    You can process each variable separately
    and then merge the two resulting data.frames.

    # Sample data
    n <- 5
    d <- data.frame(
      id = 1:n,
      pt1 = sample(c("M","F"),n,replace=TRUE),
      wt1 = round(runif(n,100,200)),
      pt2 = sample(c("M","F"),n,replace=TRUE),
      wt2 = round(runif(n,100,200)),
      pt3 = sample(c("M","F"),n,replace=TRUE),
      wt3 = round(runif(n,100,200))
    )
    # Reshape the data.frame, one variable at a time
    library(reshape2)
    d1 <- melt(d,
      id.vars="id", measure.vars=c("pt1","pt2","pt3"),
      variable.name="patient", value.name="gender"
    )
    d2 <- melt(d,
      id.vars="id", measure.vars=c("wt1","wt2","wt3"),
      variable.name="patient", value.name="weight"
    )
    d1$patient <- as.numeric(gsub("pt", "", d1$patient))
    d2$patient <- as.numeric(gsub("wt", "", d2$patient))
    # Join the two data.frames
    merge(d1, d2, by=c("id","patient"), all=TRUE)
    

    Source: the Stack Overflow question on alternatives to stats::reshape in R: https://stackoverflow.com/questions/9341865/
