Given a data frame like this:

df_in <- data.frame(x = c('x1','x2','x3','x4'),
                     col1 = c('http://youtube.com/something','NA','https://www.yahooexample.com','https://www.yahooexample2.com'),
                     col2 = c('https://google.com', 'http://www.bbcnews2.com?id=321','NA','https://google.com/text'),
                     col3 = c('http://www.bbcnews.com?id=321', 'http://google.com?id=1234','NA','https://bbcnews.com/search'),
                     col4 = c('NA', 'https://www.youtube/com','NA', 'www.youtube.com/searcht'))

In col1, col2 and col3, how can I keep only the cells that contain "google", "youtube" or "bbc", and turn every other cell into NA?

Example of the expected output:
   x                          col1                           col2                          col3                    col4
1 x1  http://youtube.com/something             https://google.com http://www.bbcnews.com?id=321                      NA
2 x2                            NA http://www.bbcnews2.com?id=321     http://google.com?id=1234 https://www.youtube/com
3 x3                            NA                             NA                            NA                      NA
4 x4                            NA        https://google.com/text    https://bbcnews.com/search www.youtube.com/searcht

Best answer

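We can use mutate_at to modify the columns 'col1' through 'col4', checking each value with str_detect for 'google', 'youtube' or 'bbc' and replacing every other element with NA. The literal 'NA' strings in the input match none of these patterns, so they also become real missing values.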

library(dplyr)
library(stringr)
# funs() is deprecated since dplyr 0.8.0, so a formula lambda (~) is used instead
df_in %>%
     mutate_at(vars(col1:col4), ~ ifelse(str_detect(.,
                "google|youtube|bbc"), as.character(.), NA))

-Output
#    x                         col1                           col2                          col3                    col4
#  1 x1 http://youtube.com/something             https://google.com http://www.bbcnews.com?id=321                    <NA>
#  2 x2                         <NA> http://www.bbcnews2.com?id=321     http://google.com?id=1234 https://www.youtube/com
#  3 x3                         <NA>                           <NA>                          <NA>                    <NA>
#  4 x4                         <NA>        https://google.com/text    https://bbcnews.com/search www.youtube.com/searcht
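
For readers on dplyr 1.0.0 or later, where mutate_at is superseded, the same idea can be sketched with across(). This is only an illustrative variant, not part of the original answer: the names `pattern` and `df_out` are introduced here for the example, and if_else with NA_character_ is a choice made so that each column stays character even if nothing in it matches.

```r
library(dplyr)
library(stringr)

# Same sample data as in the question; 'NA' here is a literal string, not a missing value
df_in <- data.frame(
  x    = c("x1", "x2", "x3", "x4"),
  col1 = c("http://youtube.com/something", "NA", "https://www.yahooexample.com", "https://www.yahooexample2.com"),
  col2 = c("https://google.com", "http://www.bbcnews2.com?id=321", "NA", "https://google.com/text"),
  col3 = c("http://www.bbcnews.com?id=321", "http://google.com?id=1234", "NA", "https://bbcnews.com/search"),
  col4 = c("NA", "https://www.youtube/com", "NA", "www.youtube.com/searcht"),
  stringsAsFactors = FALSE
)

# Hypothetical helper name: the regex alternation from the answer above
pattern <- "google|youtube|bbc"

# Keep a cell if it matches the pattern, otherwise replace it with a character NA
df_out <- df_in %>%
  mutate(across(col1:col4, ~ if_else(str_detect(.x, pattern), .x, NA_character_)))

print(df_out)
```

To restrict the replacement to the three columns named in the question, col1:col3 can be used in place of col1:col4.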

Regarding "r - keep only certain values in strings", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48907934/
