Find the duplicates within each document's array in MongoDB

Question

13 views · May 24, 2023

Anonymous · May 25, 2023

Suppose I have a document with the following structure:

{
   _id: ObjectId('444455'),
   name: 'test',
   email: 'email',
   points: {
      spendable: 23,
      history: [
          {
             comment: 'Points earned by transaction #1234',
             points: 1
          },
          {
             comment: 'Points earned by transaction #456',
             points: 3
          },
          {
             comment: 'Points earned by transaction #456',
             points: 3
          }
      ]
   }
}

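My problem is that some documents contain duplicate objects in the points.history array.

Is there a simple way to find all of these duplicates with a query?

I have already tried the query from "Find duplicate records in MongoDB", but it only shows the total count of each duplicated row across all documents. I need an overview of the duplicates per document, like this: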

{
    _id: ObjectId('444455'), // the document's _id, not the _id of the array item itself
    duplicates: [
       {
        comment: 'Points earned by transaction #456'
       }
    ]
}, 
{
    _id: ObjectId('444456'), // the document's _id, not the _id of the array item itself
    duplicates: [
         {
            comment: 'Points earned by transaction #66234'
         },
         {
            comment: 'Points earned by transaction #7989'
         }
    ]
}

How can I achieve this?

1 Answer

Anonymous · Answer 1 · 2023-09-12T07:40:09+00:00

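When working with MongoDB, you sometimes need to find duplicate entries inside an array. The approach below uses an aggregation pipeline to find the duplicates within each document's array.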

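First, the $unwind stage deconstructs the points.history array so that each array element becomes a separate document. Next, the $group stage groups these documents by the parent document's _id (stored under the key id) together with the comment and points fields, and the $sum accumulator counts the documents in each group. The $match stage then keeps only the groups whose count is greater than 1, which are the duplicates. Finally, the $project stage reshapes the output, setting _id to the parent document's _id and duplicates to the duplicated comment.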

This approach can be implemented with the following aggregation pipeline:

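// collectionName is a placeholder for your collection (in mongosh: db.<name>).
collectionName.aggregate([
  // Emit one document per element of points.history.
  {
    $unwind: "$points.history"
  },
  // Group identical entries within the same parent document and count them.
  {
    $group: {
      _id: {
        id: "$_id",
        comment: "$points.history.comment",
        points: "$points.history.points"
      },
      sum: {
        $sum: 1
      }
    }
  },
  // Keep only entries that occur more than once.
  {
    $match: {
      sum: {
        $gt: 1
      }
    }
  },
  // The parent document's _id is stored under _id.id in the group key.
  {
    $project: {
      _id: "$_id.id",
      duplicates: {
        comment: "$_id.comment"
      }
    }
  }
])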
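
The pipeline above returns one result per duplicated entry rather than one result per document, as the question asks for. A possible variant, sketched here and not part of the original answer, replaces the final $project with a second $group that uses $push to collect the duplicated comments into a duplicates array for each parent document. The constant name duplicatesPerDocument is illustrative, and collectionName remains a placeholder.

```javascript
// Sketch of a variant that returns one result per parent document.
// duplicatesPerDocument is a hypothetical name chosen for this example.
const duplicatesPerDocument = [
  // Emit one document per element of points.history.
  { $unwind: "$points.history" },
  // Count identical entries within the same parent document.
  {
    $group: {
      _id: {
        id: "$_id",
        comment: "$points.history.comment",
        points: "$points.history.points"
      },
      sum: { $sum: 1 }
    }
  },
  // Keep only entries that occur more than once.
  { $match: { sum: { $gt: 1 } } },
  // Regroup by parent document and collect the duplicated comments.
  {
    $group: {
      _id: "$_id.id",
      duplicates: { $push: { comment: "$_id.comment" } }
    }
  }
];

// Usage in mongosh (collectionName is a placeholder):
// db.collectionName.aggregate(duplicatesPerDocument)
```

With this shape, a document that has two distinct duplicated comments should yield a single result whose duplicates array has two elements.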

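The above is a complete solution that finds and outputs the duplicates in each document. If you also want the parent document's _id under a separate field name in the output, you can add a new field for it in the final $project stage.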
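
As a quick sanity check without a MongoDB server, the unwind, group, and match steps can be mimicked on plain JavaScript arrays. This is only an illustration of the logic, not how MongoDB executes the pipeline. The names docs and findDuplicates are hypothetical, and the sample document's _id is shortened to a plain string instead of an ObjectId.

```javascript
// Sample document from the question; the _id is simplified to a string.
const docs = [
  {
    _id: "444455",
    points: {
      spendable: 23,
      history: [
        { comment: "Points earned by transaction #1234", points: 1 },
        { comment: "Points earned by transaction #456", points: 3 },
        { comment: "Points earned by transaction #456", points: 3 }
      ]
    }
  }
];

// Hypothetical helper that mimics $unwind, $group, $match, and $project.
function findDuplicates(documents) {
  const counts = new Map();
  for (const doc of documents) {
    // $unwind: visit every element of points.history separately.
    for (const entry of doc.points.history) {
      // $group: key on parent _id, comment, and points, then count.
      const key = JSON.stringify([doc._id, entry.comment, entry.points]);
      const group = counts.get(key) || { id: doc._id, comment: entry.comment, sum: 0 };
      group.sum += 1;
      counts.set(key, group);
    }
  }
  // $match and $project: keep groups seen more than once and reshape them.
  return [...counts.values()]
    .filter((group) => group.sum > 1)
    .map((group) => ({ _id: group.id, duplicates: { comment: group.comment } }));
}

const result = findDuplicates(docs);
console.log(JSON.stringify(result));
// prints [{"_id":"444455","duplicates":{"comment":"Points earned by transaction #456"}}]
```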

If you run into problems while using this solution, the discussion above may help. Good luck!