Creating columns/variables in a loop by extracting characters from a given column in R

Problem description

My data set looks like this:

                       key      date   census  j
     1: 01_35004_10-14_+_M 11NOV2001 2.934397 01
     2: 01_35004_10-14_+_M 06JAN2002 3.028231 01
     3: 01_35004_10-14_+_M 07APR2002 3.180712 01
     4: 01_35004_10-14_+_M 02JUN2002 3.274546 01
     5: 01_35004_10-14_+_M 28JUL2002 3.368380 01
     6: 01_35004_10-14_+_M 22SEP2002 3.462214 01
     7: 01_35004_10-14_+_M 22DEC2002 3.614694 01
     8: 01_35004_10-14_+_M 16FEB2003 3.708528 01
     9: 01_35004_10-14_+_M 13JUL2003 3.954843 01
    10: 01_35004_10-14_+_M 07SEP2003 4.048677 01

Certain characters within the column "key" correspond to different variables. For instance:

    01 is the State
    35004 is the Zip Code
    10-14 is the Age Group
    + is the Race
    M is the Gender

I want to extract each of these fields into its own variable (i.e. a column for State filled with 01, a column for Zip Code filled with 35004, etc.).

Here is my code:

    Var = c("State", "Zip_Code", "Age_Group", "Race", "Gender")
    for (j in Var) {
      play$j = gsub("_.*$", "", play$key)
    }

Clearly this is not correct. I would like the loop to go through each observation in the "key" column and produce, for each variable, a column holding the field of the key that belongs to it.

Solution

The basic solution (do not expect good performance) uses read.csv:

    # excerpt of your data (only the "key" column)
    x <- c("01_35004_10-14_+_M", "01_35004_10-14_+_M")

    # the simple way is to treat each string as a CSV row :-)
    # colClasses = "character" keeps leading zeros such as "01"
    y <- read.csv(text = x, sep = "_", header = FALSE, colClasses = "character")

    # fix the default column names (V1, V2, ...)
    names(y) <- c("State", "Zip_Code", "Age_Group", "Race", "Gender")

    # now recode one example column by using a translation ("lookup") table
    gender.lookup <- data.frame(
      gender.code = c("M", "F"),
      gender.name = c("Male", "Female"))

    # add the recoded value as a new column
    # note: codes missing from the lookup table yield NA
    y$GenderName <- gender.lookup$gender.name[match(y$Gender, gender.lookup$gender.code)]

Since the question does not include lookup data for the other columns, I am leaving the implementation of the loop to your imagination (e.g. use lapply and a list of lookup tables stored at the same index positions as the columns they recode).
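The loop that the answer leaves open could be sketched roughly as follows. This is only one possible approach, not the answerer's own implementation: it uses strsplit instead of read.csv, and it assigns columns with play[[name]] because play$j in the question always writes to a column literally called "j". The sample data frame, the second key ending in "F", and the lookups list are assumptions made up for illustration.

```r
# Sample keys in the same format as the question's `key` column.
# The second row (ending in "F") is invented to show a second gender code.
play <- data.frame(
  key = c("01_35004_10-14_+_M", "01_35004_10-14_+_F"),
  stringsAsFactors = FALSE
)

# Target column names, in the order the fields appear inside `key`.
vars <- c("State", "Zip_Code", "Age_Group", "Race", "Gender")

# Split every key on "_" once; each list element holds the fields of one row.
parts <- strsplit(play$key, "_", fixed = TRUE)

# Loop over the field positions. `play[[vars[i]]]` creates a column whose
# name comes from the variable, which `play$j` cannot do.
for (i in seq_along(vars)) {
  play[[vars[i]]] <- vapply(parts, function(p) p[i], character(1))
}

# Hypothetical lookup tables: one named vector per column to recode.
# Only Gender is filled in, using the codes from the answer.
lookups <- list(
  Gender = c(M = "Male", F = "Female")
)

# Recode each listed column into a new "<column>Name" column.
# Codes that are missing from a lookup table become NA.
for (col in names(lookups)) {
  play[[paste0(col, "Name")]] <- unname(lookups[[col]][play[[col]]])
}

print(play)
```

Because the fields are never passed through type conversion, values such as "01" keep their leading zero. Further lookup tables can be added to the list under the name of the column they recode.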