Finding all duplicate documents in a MongoDB collection by a key field
Question
Suppose I have a collection containing a set of documents, something like this:
{ "_id" : ObjectId("4f127fa55e7242718200002d"), "id" : 1, "name" : "foo" }
{ "_id" : ObjectId("4f127fa55e7242718200002e"), "id" : 2, "name" : "bar" }
{ "_id" : ObjectId("4f127fa55e7242718200002f"), "id" : 3, "name" : "baz" }
{ "_id" : ObjectId("4f127fa55e72427182000030"), "id" : 4, "name" : "foo" }
{ "_id" : ObjectId("4f127fa55e72427182000031"), "id" : 5, "name" : "bar" }
{ "_id" : ObjectId("4f127fa55e72427182000032"), "id" : 6, "name" : "bar" }
I want to find all the duplicated entries in this collection by the "name" field. For example, "foo" appears twice and "bar" appears three times.
Recommended answer
Note: this solution is the easiest to understand, but not the best. It relies on mapReduce, which is deprecated as of MongoDB 5.0 in favor of the aggregation pipeline.
You can use mapReduce to count how many documents share each value of a field:
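// Emit (name, 1) for every document that has a "name" field.
var map = function() {
    if (this.name) {
        emit(this.name, 1);
    }
};
// Sum the emitted 1s to get the number of documents per name.
var reduce = function(key, values) {
    return Array.sum(values);
};
// Write the counts to a collection so that res.result holds its name.
// "name_counts" is an arbitrary name; with out: {inline: 1} there is no res.result.
var res = db.collection.mapReduce(map, reduce, {out: "name_counts"});
// Keep only the names that occur more than once, most frequent first.
db[res.result].find({value: {$gt: 1}}).sort({value: -1});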
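Since the note above says the mapReduce approach is not the best, here is a minimal sketch of the same count using the aggregation pipeline. It is not part of the original answer. It assumes the collection is reachable as db.collection, as in the code above. The variable name duplicatePipeline and the output fields count and ids are illustrative choices.

```javascript
// Sketch: find duplicated "name" values with the aggregation pipeline.
// Assumption: run in the mongo shell, where `db` is defined.
var duplicatePipeline = [
    // Skip documents that have no "name" field at all.
    { $match: { name: { $exists: true } } },
    // One group per name: count its documents and collect their "id" values.
    { $group: { _id: "$name", count: { $sum: 1 }, ids: { $push: "$id" } } },
    // Keep only names that occur more than once.
    { $match: { count: { $gt: 1 } } },
    // Most frequent names first.
    { $sort: { count: -1 } }
];

// Only touch the database when a shell-provided `db` object exists.
if (typeof db !== "undefined") {
    db.collection.aggregate(duplicatePipeline).forEach(function (doc) {
        printjson(doc);
    });
}
```

For the sample documents in the question, this should report "bar" with a count of 3 and "foo" with a count of 2. The ids array is there to show which documents share each name. Drop the $push if only the counts matter.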