MongoDB aggregation framework is very slow when using $group

It is hard to judge the speed without details about your environment. What you could try is to check how MongoDB plans to execute your query by adding:

{
   explain:true
}

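to your aggregation call, for example db.coll.aggregate([pipeline], {explain: true, allowDiskUse: true}). Note that all options go in a single document. Also keep in mind that $unwind emits one document per array element, so every $unwind multiplies the number of documents the later stages have to process.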
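To make the growth caused by $unwind concrete, here is a small sketch in plain JavaScript. It is not MongoDB code: the `unwind` and `setPath` helpers and the sample documents are made up for illustration, and the imitation simply drops documents whose field is not an array, which is a simplification of what the real stage does.

```javascript
// Plain-JavaScript imitation of $unwind, for illustration only.
// Simplification: documents whose field is missing or not an array are dropped.
function setPath(doc, keys, value) {
  const [head, ...rest] = keys;
  return { ...doc, [head]: rest.length ? setPath(doc[head], rest, value) : value };
}

function unwind(docs, path) {
  const keys = path.split(".");
  return docs.flatMap((doc) => {
    const arr = keys.reduce((v, k) => (v == null ? undefined : v[k]), doc);
    if (!Array.isArray(arr)) return [];
    // One output document per array element.
    return arr.map((element) => setPath(doc, keys, element));
  });
}

// Hypothetical sample data shaped like the documents in the question.
const orders = [
  { _id: 1, requested_items: [
      { status: "a", winner: ["x", "y"] },
      { status: "b", winner: ["z"] } ] },
  { _id: 2, requested_items: [ { status: "a", winner: ["w"] } ] },
];

const afterFirst = unwind(orders, "requested_items");
const afterSecond = unwind(afterFirst, "requested_items.winner");

console.log(orders.length, afterFirst.length, afterSecond.length); // 2 3 4
```

With real data, where each array can hold many elements, the same effect means every stage after the second $unwind works on far more documents than the collection contains.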

Since you only want to count documents, it could be faster to skip the second $unwind: just take the size of the winner array (after the first unwind) and sum the sizes.

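db.inventory.aggregate(
   [
      // first unwind: one document per requested item
      { $unwind: "$requested_items" },
      {
         $group: {
            _id: null,
            // add up the winner array sizes instead of unwinding them
            numberOfdocs: { $sum: { $size: "$requested_items.winner" } }
         }
      }
   ]
)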
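The reason this works is plain arithmetic: counting the documents produced by unwinding an array gives the same number as adding up the array lengths. A quick check in plain JavaScript, with invented sample items standing in for the documents after the first unwind:

```javascript
// Hypothetical items, as they would look after the first $unwind.
const items = [
  { status: "a", winner: ["x", "y"] },
  { status: "b", winner: ["z"] },
  { status: "a", winner: [] },
];

// Equivalent of a second $unwind followed by { $sum: 1 }.
const countByUnwind = items.flatMap((item) => item.winner).length;

// Equivalent of { $sum: { $size: "$requested_items.winner" } }.
const countBySize = items.reduce((sum, item) => sum + item.winner.length, 0);

console.log(countByUnwind, countBySize); // 3 3
```

One caveat: in MongoDB, $size raises an error when its argument is not an array, so if some items have no winner field you may need a guard such as $ifNull with an empty array as the fallback.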

EDIT

After playing with this query I was able to reduce its execution time by about 45%. The main point is to drop the second $match, because it scans the full unwound result set. The last $group then contains counts for all status values, and we can filter out what is needed at the end, where the operation runs on a small result set.

db.coll.aggregate([{
            $match : {
                "status" : "Homologado"
            }
        }, {
            $unwind : "$requested_items"
        }, {
            $unwind : "$requested_items.winner"
        }, {
            $project : {
                x : "$requested_items.status",
            }
        }, {
            $group : {
                _id : "$x",
                numberOfdocs : {
                    $sum : 1
                }
            }
        }, {
            $match : {
                "_id" : /acesssito/i
            }
        }
    ], {
        allowDiskUse: true
});
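Moving the filter behind the $group is safe here because the filter only looks at the group key, so filtering and counting can be swapped without changing the result. A plain JavaScript sketch of that equivalence, using invented status values and an invented pattern rather than the real data:

```javascript
// Count occurrences of each value, like $group with { $sum: 1 }.
function countBy(values) {
  const counts = new Map();
  for (const v of values) counts.set(v, (counts.get(v) || 0) + 1);
  return counts;
}

// Hypothetical status values, one per unwound document.
const statuses = ["Accepted", "rejected", "accepted", "Accepted", "pending", "rejected"];
const pattern = /accepted/i; // hypothetical pattern

// Filter first: the regex is tested against every unwound document.
const filterFirst = countBy(statuses.filter((s) => pattern.test(s)));

// Group first: the regex is tested once per distinct status.
const groupFirst = new Map(
  [...countBy(statuses)].filter(([status]) => pattern.test(status))
);

console.log(JSON.stringify([...filterFirst])); // [["Accepted",2],["accepted",1]]
console.log(JSON.stringify([...groupFirst]));  // [["Accepted",2],["accepted",1]]
```

The results match, but the second variant runs the regular expression on a handful of distinct values instead of on every document, which is where the saving comes from.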

I finally solved the problem with my $group query. It was a schema design error: thinking in SQL terms, I designed the collections before thinking about how my app would query them. The result was slow queries.

To resolve it, I had to redesign my collections and put the relevant data at the first level of my documents. In my research, I found that in an aggregation an indexed field has to be used in the first stage of the pipeline. If I use an indexed field after the $unwind stage, the index is not considered.
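As a rough way to see which filters are in a position to benefit from an index, the sketch below walks a pipeline and lists the fields used by $match stages at the very start. The helper `leadingMatchFields` is hypothetical, and the pipeline is an assumed reconstruction of the original slow query, not the exact one.

```javascript
// Hypothetical helper: collect the fields filtered by $match stages that
// appear before any other stage. Per the observation above, only these
// filters were considered for index use.
function leadingMatchFields(pipeline) {
  const fields = [];
  for (const stage of pipeline) {
    if (!("$match" in stage)) break; // stop at the first non-$match stage
    fields.push(...Object.keys(stage.$match));
  }
  return fields;
}

// Assumed shape of the slow pipeline: the second $match comes after $unwind.
const pipeline = [
  { $match: { status: "Homologado" } },
  { $unwind: "$requested_items" },
  { $match: { "requested_items.status": /acesssito/i } },
  { $group: { _id: null, numberOfdocs: { $sum: 1 } } },
];

console.log(leadingMatchFields(pipeline)); // [ 'status' ]
```

Only `status` shows up, so an index on the nested field would not help this pipeline as written. Running the aggregation with the explain option is the reliable way to confirm what the server actually does.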

Besides that, I created an integer hash for my text fields using the package https://github.com/darkskyapp/string-hash, so the text fields can be indexed and matched as integers.
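To illustrate the idea, here is a self-contained sketch of a djb2-style string hash that produces an unsigned 32-bit integer. It is written from scratch for this example and is an assumption about the general approach; the linked package's actual implementation and output values may differ. The `status_hash` field name and the normalization step are also assumptions.

```javascript
// djb2-style (XOR variant) hash; returns an unsigned 32-bit integer.
function stringHash(text) {
  let hash = 5381;
  for (let i = text.length - 1; i >= 0; i--) {
    hash = (hash * 33) ^ text.charCodeAt(i);
  }
  return hash >>> 0; // force a non-negative 32-bit result
}

// Assumption: normalize before hashing so that equality on the hash
// behaves like a case-insensitive, whitespace-tolerant text match.
function hashField(value) {
  return stringHash(String(value).trim().toLowerCase());
}

// Hypothetical document shape: keep the text and store its hash next to it.
const doc = { status: "Homologado", status_hash: hashField("Homologado") };

// A query would then match on the integer field, e.g. { status_hash: hashField(input) }.
console.log(hashField(" homologado ") === doc.status_hash); // true
```

Different strings can produce the same hash, so it is worth keeping the original text in the document and comparing it as well when an exact match matters.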

With these changes, my queries went from 300s to 5s.

