Why are the Accuracy and Weighted Recall values always the same in Apache Spark?

Problem Description

In my Naive Bayes implementation using Apache Spark, I always get the same values for accuracy and weighted recall.

I implemented the Naive Bayes algorithm from Spark's tutorials, and it works fine except for the issue mentioned above.

    Dataset<Row>[] splits = dataFrame.randomSplit(new double[]
            {mainController.getTrainingDataRate(), mainController.getTestDataRate()});
    Dataset<Row> train = splits[0];
    Dataset<Row> test = splits[1];

    NaiveBayes nb = new NaiveBayes();
    NaiveBayesModel model = nb.fit(train);

    Dataset<Row> predictions = model.transform(test);

    MulticlassClassificationEvaluator evaluator = new MulticlassClassificationEvaluator()
            .setLabelCol("label")
            .setPredictionCol("prediction")
            .setMetricName("weightedPrecision");

    precisionSum += evaluator.evaluate(predictions);

    evaluator.setMetricName("weightedRecall");
    recallSum += evaluator.evaluate(predictions);

    evaluator.setMetricName("accuracy");
    accuracySum += evaluator.evaluate(predictions);

I ran the code above a hundred times, and in every run the accuracy was equal to the weighted recall, even when I tried different data files consisting of hundreds of thousands of rows. Where am I going wrong?
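One way to see what the evaluator is computing is to print the confusion matrix and the per-class recalls. Below is a minimal sketch using MLlib's MulticlassMetrics, assuming the predictions DataFrame from the snippet above, with double-typed prediction and label columns:

    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.mllib.evaluation.MulticlassMetrics;
    import scala.Tuple2;

    // Pair up (prediction, label); both columns are doubles in the snippet above.
    JavaPairRDD<Object, Object> predictionAndLabels = predictions
            .select("prediction", "label")
            .toJavaRDD()
            .mapToPair(row -> new Tuple2<>(row.getDouble(0), row.getDouble(1)));

    MulticlassMetrics metrics = new MulticlassMetrics(predictionAndLabels.rdd());

    System.out.println(metrics.confusionMatrix());
    for (double label : metrics.labels()) {
        // Per-class recall generally differs from class to class...
        System.out.printf("recall(%.0f) = %f%n", label, metrics.recall(label));
    }
    // ...but the support-weighted average matches accuracy on every run.
    System.out.println("weightedRecall = " + metrics.weightedRecall());
    System.out.println("accuracy       = " + metrics.accuracy());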

Answer

For single-label classification (one prediction per example), weighted recall — Spark's weightedRecall, the support-weighted average of per-class recalls, equivalently micro-averaged recall — is always the same as accuracy. Each class c with support n_c contributes (n_c / N) * (TP_c / n_c) = TP_c / N to the weighted recall, so summing over all classes gives (sum of TP_c) / N, which is exactly the accuracy.
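The identity is easy to check with a toy confusion matrix (the numbers below are made up for illustration): every per-class term reduces to TP_c / N, so the terms telescope to the accuracy.

    public class WeightedRecallEqualsAccuracy {
        public static void main(String[] args) {
            // Toy 3-class confusion matrix: rows are true classes,
            // columns are predicted classes (illustrative numbers only).
            long[][] cm = {
                    {50,  2,  3},   // class 0: support 55, 50 correct
                    { 4, 30,  6},   // class 1: support 40, 30 correct
                    { 1,  4, 20},   // class 2: support 25, 20 correct
            };

            long total = 0;
            long correct = 0;
            for (int c = 0; c < cm.length; c++) {
                for (long n : cm[c]) total += n;
                correct += cm[c][c];
            }

            double weightedRecall = 0.0;
            for (int c = 0; c < cm.length; c++) {
                long support = 0;
                for (long n : cm[c]) support += n;
                double recall = (double) cm[c][c] / support;          // TP_c / n_c
                weightedRecall += (double) support / total * recall;  // (n_c/N) * recall_c = TP_c/N
            }

            double accuracy = (double) correct / total;
            System.out.println("weightedRecall = " + weightedRecall); // 0.8333...
            System.out.println("accuracy       = " + accuracy);       // 0.8333...
        }
    }

Both printed values are 100/120: the weighted recall and the accuracy cannot differ, no matter how the off-diagonal errors are distributed.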
