Finding all duplicate documents in a MongoDB collection by a key field

Question

Suppose I have a collection containing a set of documents, something like this:

{ "_id" : ObjectId("4f127fa55e7242718200002d"), "id":1, "name" : "foo"}
{ "_id" : ObjectId("4f127fa55e7242718200002d"), "id":2, "name" : "bar"}
{ "_id" : ObjectId("4f127fa55e7242718200002d"), "id":3, "name" : "baz"}
{ "_id" : ObjectId("4f127fa55e7242718200002d"), "id":4, "name" : "foo"}
{ "_id" : ObjectId("4f127fa55e7242718200002d"), "id":5, "name" : "bar"}
{ "_id" : ObjectId("4f127fa55e7242718200002d"), "id":6, "name" : "bar"}

I want to find all the duplicated entries in this collection by the "name" field. For example, "foo" appears twice and "bar" appears three times.

Recommended answer

Note: this solution is the easiest to understand, but it is not the best one. In particular, map-reduce has been deprecated since MongoDB 5.0 in favor of the aggregation pipeline.
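For comparison, the same count can usually be expressed as an aggregation pipeline. The following is a minimal sketch and not part of the original answer: the names `duplicateNamesPipeline` and `findDuplicateNames` are hypothetical, and the helper assumes it is handed a collection object that exposes `aggregate(pipeline)`, such as `db.collection` in the shell.

```javascript
// Stages use only standard aggregation operators ($match, $group, $sort).
const duplicateNamesPipeline = [
  // Skip documents whose name is missing or null.
  { $match: { name: { $ne: null } } },
  // One output document per distinct name, with its number of occurrences.
  { $group: { _id: "$name", count: { $sum: 1 } } },
  // Keep only names that occur more than once.
  { $match: { count: { $gt: 1 } } },
  // Most frequent names first.
  { $sort: { count: -1 } },
];

// Hypothetical helper: `collection` is anything exposing aggregate(pipeline).
function findDuplicateNames(collection) {
  return collection.aggregate(duplicateNamesPipeline);
}
```

In the shell this would be called as `findDuplicateNames(db.collection)`. On the sample documents above it should report "bar" with a count of 3 and "foo" with a count of 2.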

You can use mapReduce to count how many documents share each value of a field:

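// Emit (name, 1) for every document that has a non-empty name.
var map = function () {
    if (this.name) {
        emit(this.name, 1);
    }
};

// Sum the emitted 1s to get the number of documents per name.
var reduce = function (key, values) {
    return Array.sum(values);
};

// With out: {inline: 1} the output is returned in res.results, not written to a collection.
var res = db.collection.mapReduce(map, reduce, {out: {inline: 1}});

// Keep names that occur more than once, most frequent first.
res.results
    .filter(function (r) { return r.value > 1; })
    .sort(function (a, b) { return b.value - a.value; });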
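To make the map and reduce steps concrete, here is a small plain-JavaScript sketch that imitates them in memory on the sample documents from the question. It is a simplification for illustration only: the `countByName` function is hypothetical, no MongoDB is involved, and a real server may call reduce several times per key or skip it for keys with a single value.

```javascript
// Sample documents from the question (only the fields the job looks at).
const docs = [
  { id: 1, name: "foo" },
  { id: 2, name: "bar" },
  { id: 3, name: "baz" },
  { id: 4, name: "foo" },
  { id: 5, name: "bar" },
  { id: 6, name: "bar" },
];

// Hypothetical in-memory stand-in for the map and reduce phases.
function countByName(documents) {
  // Map phase: collect the value 1 under each document's name.
  const emitted = new Map();
  for (const doc of documents) {
    if (doc.name) {
      if (!emitted.has(doc.name)) emitted.set(doc.name, []);
      emitted.get(doc.name).push(1);
    }
  }
  // Reduce phase: sum the collected values for each name.
  const results = [];
  for (const [key, values] of emitted) {
    results.push({ _id: key, value: values.reduce((a, b) => a + b, 0) });
  }
  return results;
}

// Keep names seen more than once, most frequent first.
const duplicates = countByName(docs)
  .filter((r) => r.value > 1)
  .sort((a, b) => b.value - a.value);

console.log(duplicates);
// [ { _id: 'bar', value: 3 }, { _id: 'foo', value: 2 } ]
```

The result has the same `_id` and `value` fields that mapReduce produces, which is why the final filter checks `value` rather than a field named after the key.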

