This article looks at how to change one column relative to another column within a group.

Problem description

I have a data frame with these columns:

 PERNO      TPURP       loop
 1      Loop trip     1
 1      Loop trip     2
 1      home          2
 1      shopping      2
 2      work          1
 2      Loop trip     2
 2      school        2
 3      Looptrip      1
 4      work          1

For each PERNO, if TPURP == "Loop trip", I want to add 1 to loop in every row after that row.

For each PERNO, if a Loop trip is immediately followed by another Loop trip, we don't add 1 after the first one, only after the second one.

Expected output

 PERNO      TPURP       loop
 1      Loop trip     1
 1      Loop trip     2
 1      home          3
 1      shopping      3
 2      work          1
 2      Loop trip     2
 2      school        3
 3      Looptrip      1
 4      work          1

data

structure(list(PERNO = c(1, 1, 1, 1, 1, 1), TPURP = structure(c(8L, 
1L, 22L, 22L, 9L, 2L), .Label = c("(1) Working at home (for pay)", 
"(2) All other home activities", "(3) Work/Job", "(4) All other activities at work", 
"(5) Attending class", "(6) All other activities at school", 
"(7) Change type of transportation/transfer", "(8) Dropped off passenger", 
"(9) Picked up passenger", "(10) Other, specify - transportation", 
"(11) Work/Business related", "(12) Service Private Vehicle", 
"(13) Routine Shopping", "(14) Shopping for major purchases", 
"(15) Household errands", "(16) Personal Business", "(17) Eat meal outside of home", 
"(18) Health care", "(19) Civic/Religious activities", "(20) Recreation/Entertainment", 
"(21) Visit friends/relative", "(24) Loop trip", "(97) Other, specify"
), class = "factor"), loop = c(1, 1, 2, 2, 2, 2)), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -6L))

Solution

Using dplyr, we can group_by PERNO and increment the value of loop after the last occurrence of "Loop trip" in the group.

library(dplyr)

df %>%
  group_by(PERNO) %>%
  mutate(loop1 = ifelse(any(TPURP == "Loop trip") &
                          row_number() > max(which(TPURP == "Loop trip")),
                        loop + 1, loop))

# PERNO TPURP      loop loop1
#  <int> <fct>     <int> <dbl>
#1     1 Loop trip     1     1
#2     1 Loop trip     2     2
#3     1 home          2     3
#4     1 shopping      2     3
#5     2 work          1     1
#6     2 Loop trip     2     2
#7     2 school        2     3
#8     3 Looptrip      1     1
#9     4 work          1     1

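This emits a warning for any group that has no "Loop trip", because max() is then called on an empty vector and returns -Inf. The result is still correct (any() is FALSE for that group, so loop is left unchanged), so the warning can be ignored.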
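
If the warning is a nuisance, one possible way to avoid it is to handle the empty case explicitly. The following is only a minimal sketch, not part of the original answer: `last_loop_row` is a hypothetical helper name introduced here, and the data frame is rebuilt with a plain character TPURP column for brevity.

```r
library(dplyr)

# Same values as the example data, with TPURP as character (assumption for brevity)
df <- data.frame(
  PERNO = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L),
  TPURP = c("Loop trip", "Loop trip", "home", "shopping", "work",
            "Loop trip", "school", "Looptrip", "work"),
  loop  = c(1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L)
)

# Hypothetical helper: position of the last exact match within a group,
# or Inf when there is none, so that no row number can exceed it
last_loop_row <- function(x, target = "Loop trip") {
  idx <- which(x == target)
  if (length(idx) == 0) Inf else max(idx)
}

res <- df %>%
  group_by(PERNO) %>%
  # add 1 only to rows that come after the last "Loop trip" of the group
  mutate(loop1 = loop + as.integer(row_number() > last_loop_row(TPURP))) %>%
  ungroup()

res
```

Here loop1 stays an integer column rather than a double, since 0L or 1L is added to loop instead of going through ifelse with loop + 1.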

data

df <- structure(list(PERNO = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 4L), 
TPURP = structure(c(2L, 2L, 1L, 5L, 6L, 2L, 4L, 3L, 6L), .Label = c("home", 
"Loop trip", "Looptrip", "school", "shopping", "work"), class = "factor"), 
loop = c(1L, 2L, 2L, 2L, 1L, 2L, 2L, 1L, 1L)), class = "data.frame", 
row.names = c(NA, -9L))


Alternatively, as @Sotos mentioned, we can use grepl/grep for partial matching instead of exact matching. On the updated dataset we can do

df %>% 
  group_by(PERNO) %>%
  dplyr::mutate(loop1 = ifelse(any(grepl('Loop', TPURP)) & 
     row_number() > max(grep('Loop', TPURP)), loop + 1, loop))

#   PERNO TPURP                          loop loop1
#   <dbl> <fct>                         <dbl> <dbl>
#1     1 (8) Dropped off passenger         1     1
#2     1 (1) Working at home (for pay)     1     1
#3     1 (24) Loop trip                    2     2
#4     1 (24) Loop trip                    2     2
#5     1 (9) Picked up passenger           2     3
#6     1 (2) All other home activities     2     3
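
To see how exact and partial matching differ, here is a small base R illustration. The vector `labels` is made up for demonstration and is not part of the datasets above. Note that the pattern 'Loop' also matches "Looptrip", which an exact comparison against "Loop trip" does not, so the two approaches can give different results on the first dataset.

```r
# Made-up labels for illustration (assumption, not taken from the data above)
labels <- c("(24) Loop trip", "Loop trip", "Looptrip", "home")

# Exact comparison: only the element that equals "Loop trip" in full
exact <- labels == "Loop trip"
exact          # FALSE  TRUE FALSE FALSE

# Partial match on 'Loop': every element containing that substring
partial <- grepl("Loop", labels)
partial        # TRUE  TRUE  TRUE FALSE

# Partial match on the full phrase, treated as a literal string
partial_fixed <- grepl("Loop trip", labels, fixed = TRUE)
partial_fixed  # TRUE  TRUE FALSE FALSE

# grep returns the matching positions, so max() gives the last match
last_match <- max(grep("Loop", labels))
last_match     # 3
```

If "Looptrip" should not count as a loop trip, matching on the full phrase "Loop trip" may be the safer pattern.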
