Adding identifier (metadata) columns to a long-format data set

Problem description

Given a long-format data set like this:

ID_2 <- c('A','A','A','B','B','C','E','E','F','F','H','H','H')
Type <- c('Blk','Wht','Gre','Blk','Wht','Blk','Blk','Wht','Blk','Wht','Wht','Blk','Gre')
Count <- c(1,2,2,1,2,1,2,1,2,1,2,1,2)
DF2 <- data.frame(ID_2, Type, Count)

I would like to add a specific set of metadata for each unique ID (ID_2). The metadata lives in a separate data frame, like so:

Year <- c(2005,2005,2006,2006,2007,2008,2008,2008)
Location <- c('EAST','EAST','WEST','WEST','NORTH','EAST','EAST','EAST')
Site <- c(1,2,3,4,5,6,7,8)
ID_1 <- c('A','B','C','NAN','E','F','NAN','H')
DF1 <- data.frame(Year, Location, Site, ID_1)

I would like to add the metadata from DF1 to the long format of DF2 (matching ID_1 to ID_2), so that each row of DF2 carries the proper metadata from DF1.

I also need to deal with the blank locations: any unique site number in DF1 that has no corresponding data entry in DF2 should get a flagged entry. The end result would look like this:

Year  Location  Site  ID Type Count
2005     EAST    1    A  Blk     1
2005     EAST    1    A  Wht     2
2005     EAST    1    A  Gre     2
2005     EAST    2    B  Blk     1
2005     EAST    2    B  Wht     2
2006     WEST    3    C  Blk     1
2007    NORTH    5    E  Blk     2
2007    NORTH    5    E  Wht     1
2008     EAST    6    F  Blk     2
2008     EAST    6    F  Wht     1
2008     EAST    8    H  Wht     2
2008     EAST    8    H  Blk     1
2008     EAST    8    H  Gre     2
2006     WEST    4  Flag Flag   -999
2008     EAST    7  Flag Flag   -999

Recommended answer

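This looks like a straightforward merge(). Matching ID_1 to ID_2 with all = TRUE keeps the DF1 rows that have no counterpart in DF2: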

> merge(DF1, DF2, by.x = "ID_1", by.y = "ID_2", all = TRUE)
   ID_1 Year Location Site Type Count
1     A 2005     EAST    1  Blk     1
2     A 2005     EAST    1  Wht     2
3     A 2005     EAST    1  Gre     2
4     B 2005     EAST    2  Blk     1
5     B 2005     EAST    2  Wht     2
6     C 2006     WEST    3  Blk     1
7     E 2007    NORTH    5  Blk     2
8     E 2007    NORTH    5  Wht     1
9     F 2008     EAST    6  Blk     2
10    F 2008     EAST    6  Wht     1
11    H 2008     EAST    8  Wht     2
12    H 2008     EAST    8  Blk     1
13    H 2008     EAST    8  Gre     2
14  NAN 2006     WEST    4 <NA>    NA
15  NAN 2008     EAST    7 <NA>    NA

You'll have to do a little bit of extra work to replace the NA values with whatever you actually want to use.
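
One way that extra work might look, as a minimal sketch rather than the only approach: it assumes the flag values from the question ("Flag" and -999) and that every ID in DF2 also appears in DF1, so an NA in Type can only come from an unmatched DF1 row. The names `res` and `unmatched` are just illustrative, and `stringsAsFactors = FALSE` is set explicitly so the character assignment also works on R versions before 4.0.

```r
# Rebuild the two data frames from the question, keeping strings as character.
DF2 <- data.frame(
  ID_2  = c('A','A','A','B','B','C','E','E','F','F','H','H','H'),
  Type  = c('Blk','Wht','Gre','Blk','Wht','Blk','Blk','Wht','Blk','Wht','Wht','Blk','Gre'),
  Count = c(1,2,2,1,2,1,2,1,2,1,2,1,2),
  stringsAsFactors = FALSE
)
DF1 <- data.frame(
  Year     = c(2005,2005,2006,2006,2007,2008,2008,2008),
  Location = c('EAST','EAST','WEST','WEST','NORTH','EAST','EAST','EAST'),
  Site     = c(1,2,3,4,5,6,7,8),
  ID_1     = c('A','B','C','NAN','E','F','NAN','H'),
  stringsAsFactors = FALSE
)

# Full outer join on the ID columns, exactly as in the answer above.
res <- merge(DF1, DF2, by.x = "ID_1", by.y = "ID_2", all = TRUE)

# DF1 rows with no match in DF2 come back with NA in the DF2 columns.
unmatched <- is.na(res$Type)

# Overwrite the placeholder ID and the NA values with the flag values.
res$ID_1[unmatched]  <- "Flag"
res$Type[unmatched]  <- "Flag"
res$Count[unmatched] <- -999

# Put the columns in the order shown in the question and rename ID_1 to ID.
res <- res[, c("Year", "Location", "Site", "ID_1", "Type", "Count")]
names(res)[names(res) == "ID_1"] <- "ID"

print(res)
```

If DF2 could contain IDs that are missing from DF1, `all.x = TRUE` may be the safer choice, since it keeps every DF1 row without adding DF2-only rows that have no metadata.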

