How to convert a list consisting of vectors of different lengths into a usable data frame in R?

Problem description

I have a (fairly long) list of vectors. The vectors consist of Russian words that I got by applying the strsplit() function to sentences.

The following is what head() returns:

[[1]]
[1] "модно"     "создавать" "резюме"    "в"         "виде"     

[[2]]
[1] "ты"        "начианешь" "работать"  "с"         "этими"    

[[3]]
[1] "модно"            "называть"         "блогер-рилейшенз" "―"                "начинается"       "задолго"         

[[4]]
[1] "видел" "по"    "сыну," "что"   "он"   

[[5]]
[1] "четырнадцать," "я"             "поселился"     "на"            "улице"        

[[6]]
[1] "широко"     "продолжали" "род."

Note that the vectors are of different lengths.

What I want is to be able to read the first word of each sentence, the second word, the third, and so on.

The desired result would look something like this:

    P1              P2           P3                 P4    P5           P6
[1] "модно"         "создавать"  "резюме"           "в"   "виде"       NA
[2] "ты"            "начианешь"  "работать"         "с"   "этими"      NA
[3] "модно"         "называть"   "блогер-рилейшенз" "―"   "начинается" "задолго"         
[4] "видел"         "по"         "сыну,"            "что" "он"         NA
[5] "четырнадцать," "я"          "поселился"        "на"  "улице"      NA
[6] "широко"        "продолжали" "род."             NA    NA           NA

I tried simply calling data.frame(), but that didn't work because the rows are of different lengths. I also tried rbind.fill() from the plyr package, but that function can only process matrices.

I found some other questions here (that's where I got the plyr hint from), but those were all about combining, for instance, two data frames of different sizes.

Thanks for your help.

Recommended answer

Try this:

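# Example list of character vectors with different lengths
word.list <- list(letters[1:4], letters[1:5], letters[1:2], letters[1:6])
# Length of each vector
n.obs <- sapply(word.list, length)
# Index positions from 1 up to the longest length
seq.max <- seq_len(max(n.obs))
# Index every vector by all positions (short ones are padded with NA), one row per vector
mat <- t(sapply(word.list, "[", i = seq.max))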
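The snippet above yields a character matrix rather than a data frame. As a minimal sketch, assuming the column names P1, P2, ... from the question's desired output, the matrix can be wrapped in a data frame as follows. The two sample sentences are taken from the question, and `word.df` is a name introduced here purely for illustration.

```r
# Sample input: two of the sentences from the question, already split into words
word.list <- list(
  c("модно", "создавать", "резюме", "в", "виде"),
  c("широко", "продолжали", "род.")
)

# Pad every vector with NA up to the longest length, one row per sentence
n.obs <- sapply(word.list, length)
seq.max <- seq_len(max(n.obs))
mat <- t(sapply(word.list, "[", i = seq.max))

# Convert to a data frame; the names P1, P2, ... are an assumption
# taken from the layout shown in the question
word.df <- as.data.frame(mat, stringsAsFactors = FALSE)
names(word.df) <- paste0("P", seq.max)

# First word of every sentence
word.df$P1
```

Each column then holds the words at one position across all sentences, with NA wherever a sentence is too short.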

The trick is that

c(1:2)[1:4]

returns the vector followed by two NAs.
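To see why this pads the shorter vectors, here is a small sketch of that indexing behaviour in isolation: subsetting an atomic vector with positions past its end gives NA for those positions. The vector `words` is made up for illustration.

```r
# A short character vector with only two elements
words <- c("a", "b")

# Positions 3 and 4 do not exist, so R fills them with NA
padded <- words[1:4]
padded

# The same subsetting written as a function call, which is the form
# sapply() relies on when it is given "[" as the function
same <- `[`(words, 1:4)
identical(padded, same)
```

Because every vector is indexed with the same set of positions, all results have the same length and sapply() can stack them into a matrix.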


10-30 04:23