Paper reading notes: Word Embeddings: A Survey

Takeaways

Definition of Word Embedding

dense, distributed, fixed-length word vectors, built using word co-occurrence statistics as per the distributional hypothesis.

The distributional hypothesis

Words that appear in similar contexts tend to have similar meanings.

Word relatedness in HowNet

The likelihood that two words co-occur in the same context.

In short, this notion of relatedness and the distributional hypothesis express essentially the same idea.
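
A minimal sketch of that idea: count how often two words co-occur within a fixed window, then turn the counts into a pointwise mutual information (PMI) score as a relatedness proxy. The toy corpus, window size, and word pairs below are made-up assumptions for illustration, not from the survey.

```python
import math
from collections import Counter

# Toy corpus (hypothetical example, not from the survey).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat chased a dog".split(),
]

WINDOW = 2  # co-occurrence window size (assumption for illustration)

word_counts = Counter()
pair_counts = Counter()
total = 0
for sent in corpus:
    for i, w in enumerate(sent):
        word_counts[w] += 1
        total += 1
        for j in range(i + 1, min(i + 1 + WINDOW, len(sent))):
            pair_counts[frozenset((w, sent[j]))] += 1

def pmi(w1, w2):
    """PMI(w1, w2) = log[ P(w1, w2) / (P(w1) * P(w2)) ]; higher means more related."""
    p_pair = pair_counts[frozenset((w1, w2))] / total
    if p_pair == 0:
        return float("-inf")
    return math.log(p_pair / ((word_counts[w1] / total) * (word_counts[w2] / total)))

print(pmi("cat", "sat"))   # co-occur within the window -> finite, positive score
print(pmi("cat", "rug"))   # never co-occur here -> -inf
```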

Categories of Word Embedding

Prediction-based

Based on neural network language models.

Trained to predict words from their neighbours (the next word in NNLM; context/center words in word2vec).

E.g. NNLM, word2vec.
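
As a rough illustration of the prediction-based family, the snippet below trains a tiny skip-gram word2vec model with gensim (assuming the gensim 4.x API, where the dimensionality parameter is named vector_size); the toy corpus and hyperparameters are arbitrary placeholders, not settings from the survey.

```python
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of tokens (hypothetical example).
sentences = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat chased a dog".split(),
]

# sg=1 selects skip-gram: the model is trained to predict context words
# from the center word, and the learned weights become the embeddings.
model = Word2Vec(
    sentences,
    vector_size=50,   # embedding dimensionality (gensim >= 4.0 naming)
    window=2,         # context window size
    min_count=1,      # keep every word in this tiny corpus
    sg=1,             # 1 = skip-gram, 0 = CBOW
    epochs=200,       # many passes, since the corpus is tiny
)

print(model.wv["cat"].shape)          # (50,) dense, fixed-length vector
print(model.wv.most_similar("cat"))   # nearest neighbours by cosine similarity
```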

Count-based

Based on a word-context matrix.

Counts word-context co-occurrences across the corpus.

E.g. GloVe.
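
GloVe itself fits a weighted least-squares model over the co-occurrence matrix; the simpler sketch below only illustrates the general count-based recipe with PPMI weighting plus truncated SVD (an LSA-style factorization). The corpus, window size, and dimensionality are illustrative assumptions.

```python
import numpy as np

# Toy corpus (hypothetical example).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the rug".split(),
    "a cat chased a dog".split(),
]

vocab = sorted({w for sent in corpus for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
WINDOW = 2
DIM = 4  # embedding dimensionality, small because the vocabulary is tiny

# 1. Word-context co-occurrence counts within a symmetric window.
C = np.zeros((len(vocab), len(vocab)))
for sent in corpus:
    for i, w in enumerate(sent):
        for j in range(max(0, i - WINDOW), min(len(sent), i + WINDOW + 1)):
            if i != j:
                C[idx[w], idx[sent[j]]] += 1

# 2. Positive PMI weighting: down-weights very frequent words like "the".
total = C.sum()
p_w = C.sum(axis=1, keepdims=True) / total
p_c = C.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore", invalid="ignore"):
    pmi = np.log((C / total) / (p_w * p_c))
ppmi = np.maximum(pmi, 0)
ppmi[~np.isfinite(ppmi)] = 0.0

# 3. Truncated SVD gives dense, fixed-length word vectors.
U, S, _ = np.linalg.svd(ppmi)
vectors = U[:, :DIM] * S[:DIM]

def most_similar(word):
    """Top-3 nearest neighbours by cosine similarity (excluding the word itself)."""
    v = vectors[idx[word]]
    sims = vectors @ v / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(v) + 1e-9)
    return sorted(zip(vocab, sims), key=lambda x: -x[1])[1:4]

print(most_similar("cat"))
```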
