Problem description
I was reading up on the implementation of naive Bayes in Sklearn, and I was not able to understand the predict part of BernoulliNB:
Code borrowed from the source:
def _joint_log_likelihood(self, X):
    # ... some code omitted
    neg_prob = np.log(1 - np.exp(self.feature_log_prob_))
    # Compute neg_prob · (1 - X).T as ∑neg_prob - X · neg_prob
    jll = safe_sparse_dot(X, (self.feature_log_prob_ - neg_prob).T)
    jll += self.class_log_prior_ + neg_prob.sum(axis=1)
    return jll
What is the role of neg_prob in this? Can someone explain this approach?
Everywhere I read online (source), the simple approach is:
For word in document:
    For class in all_classes:
        # add up the log probability of the word given that class
        # (precomputed from the training data)
        class_prob[class] += np.log(word_prob_given_class[class][word])

# finally, add the log probability of the class itself
For class in all_classes:
    class_prob[class] += np.log(prior_prob_of[class])
But this doesn't match what BernoulliNB does. Any information is much appreciated. Please let me know if I should add more detail, thanks.
Answer
It turns out that BernoulliNB is slightly different from MultinomialNB. As explained here: http://blog.datumbox.com/machine-learning-tutorial-the-naive-bayes-text-classifier/
Terms which don't occur within the document also contribute to the likelihood, as: (1 - conditional_probability_of_term_in_class)
Algorithm used in sklearn, source: https://nlp.stanford.edu/IR-book/html/htmledition/the-bernoulli-model-1.html
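Concretely, the Bernoulli joint log-likelihood for class c is log P(c) + Σ_j [x_j·log p_cj + (1 − x_j)·log(1 − p_cj)], and sklearn just rearranges it so that only one dot product with X is needed: log P(c) + X·(log p − log(1 − p)).T + Σ_j log(1 − p_cj). That is exactly where neg_prob = log(1 − p) comes in. A small sketch with made-up toy values (the probabilities and prior here are hypothetical, not from a fitted model) showing that both forms agree:

```python
import numpy as np

# Hypothetical toy setup: 2 classes, 3 binary features.
rng = np.random.default_rng(0)
p = rng.uniform(0.1, 0.9, size=(2, 3))        # P(x_j = 1 | class c)
feature_log_prob = np.log(p)                  # what sklearn stores as feature_log_prob_
class_log_prior = np.log(np.array([0.4, 0.6]))
X = np.array([[1.0, 0.0, 1.0]])               # one binary document

# Direct per-term Bernoulli log-likelihood:
#   log P(c) + sum_j [ x_j*log p_cj + (1 - x_j)*log(1 - p_cj) ]
direct = class_log_prior + X @ feature_log_prob.T + (1 - X) @ np.log(1 - p).T

# sklearn's rearrangement using neg_prob = log(1 - p):
neg_prob = np.log(1 - np.exp(feature_log_prob))
jll = X @ (feature_log_prob - neg_prob).T
jll += class_log_prior + neg_prob.sum(axis=1)

assert np.allclose(direct, jll)  # both formulations give the same result
```

So neg_prob lets absent features (x_j = 0) be accounted for once, as a per-class constant neg_prob.sum(axis=1), instead of multiplying by the dense (1 − X) matrix.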