word2vec - what is best? Add, concatenate or average word vectors?

Problem description

I am working on a recurrent language model. To learn word embeddings that can be used to initialize my language model, I am using gensim's word2vec model. After training, the word2vec model holds two vectors for each word in the vocabulary: the word embedding (rows of the input/hidden matrix) and the context embedding (columns of the hidden/output matrix).

As outlined in this post, there are at least three common ways to combine these two embedding vectors:

summing the context and word vector for each word
summing & averaging
concatenating the context and word vector
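
To make the three options concrete, here is a minimal sketch in plain Python. The two vectors are made-up toy values standing in for one word's rows of the trained matrices, and the helper names (`combine_sum`, `combine_average`, `combine_concat`) are hypothetical, not part of gensim. With real embeddings stored as NumPy arrays, the same element-wise operations would apply.

```python
# Toy stand-ins for the two vectors word2vec keeps per word.
# The values are invented for illustration only.
word_vec = [0.25, -0.5, 1.0]     # input ("word") embedding
context_vec = [0.75, 0.5, -1.0]  # output ("context") embedding


def combine_sum(w, c):
    """Element-wise sum; the dimensionality stays the same."""
    return [wi + ci for wi, ci in zip(w, c)]


def combine_average(w, c):
    """Element-wise mean; same direction as the sum, half its length."""
    return [(wi + ci) / 2 for wi, ci in zip(w, c)]


def combine_concat(w, c):
    """Concatenation; the dimensionality doubles."""
    return list(w) + list(c)


summed = combine_sum(word_vec, context_vec)
averaged = combine_average(word_vec, context_vec)
concatenated = combine_concat(word_vec, context_vec)
```

Note that the sum and the average differ only by a constant factor of 0.5, so they point in the same direction and give identical cosine similarities. Concatenation is the only option of the three that changes the embedding size, which in turn changes the input dimension of whatever model consumes the vectors.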

However, I couldn't find any proper papers or reports on the best strategy. So my questions are:

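Is there a common solution for whether to sum, average or concatenate the vectors?
Or does the best way depend entirely on the task in question? If so, which strategy is best for a word-level language model?
Why combine the vectors at all? Why not use the "original" word embeddings for each word, i.e. those contained in the weight matrix between the input and hidden neurons?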

Recommended answer