Mongoose Query to filter an array and Populate related content

You need to "project" the match here, since all the MongoDB query does is look for a "document" that has "at least one element" that is "greater than or equal to" the value in the condition you asked for.

So filtering an "array" is not the same as the "query" condition you have.
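
To make the distinction concrete, here is a minimal sketch in plain JavaScript with no database involved. The sample `orders` data and the helper names `matchesQuery` and `filterArticles` are made up for illustration: the first mimics what the query condition tests, the second mimics what you actually want back.

```javascript
// Hypothetical sample data, not from a real collection
var orders = [
    { "_id": 1, "articles": [ { "quantity": 2 }, { "quantity": 7 } ] },
    { "_id": 2, "articles": [ { "quantity": 1 } ] }
];

// What the query condition tests: does ANY element match?
function matchesQuery(order) {
    return order.articles.some(function(article) {
        return article.quantity >= 5;
    });
}

// What "filtering the array" means: keep ONLY the matching elements
function filterArticles(order) {
    return order.articles.filter(function(article) {
        return article.quantity >= 5;
    });
}

// Order 1 is selected, but it still carries BOTH of its articles
var selected = orders.filter(matchesQuery);

// Only now is the non-matching element dropped
var filtered = filterArticles(selected[0]);
```

So selecting the document and trimming its array are two separate steps, and the query only ever does the first.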

A simple "projection" will just return the "first" item matching that condition. So it's probably not what you want, but as an example:

Order.find({ "articles.quantity": { "$gte": 5 } })
    .select({ "articles.$": 1 })
    .populate({
        "path": "articles.article",
        "match": { "price": { "$lte": 500 } }
    }).exec(function(err,orders) {
       // populated and filtered twice
    }
)

That "sort of" does what you want, but the problem is that it will only ever return at most one element within the "articles" array.

To do this properly you need .aggregate() to filter the array content. Ideally this is done with MongoDB 3.2 and $filter. But there is also a special way to .populate() here:

Order.aggregate(
    [
        { "$match": { "articles.quantity": { "$gte": 5 } } },
        { "$project": {
            "orderdate": 1,
            "articles": {
                "$filter": {
                    "input": "$articles",
                    "as": "article",
                    "cond": {
                       "$gte": [ "$$article.quantity", 5 ]
                    }
                }
            },
            "__v": 1
        }}
    ],
    function(err,orders) {
        Order.populate(
            orders.map(function(order) { return new Order(order) }),
            {
                "path": "articles.article",
                "match": { "price": { "$lte": 500 } }
            },
            function(err,orders) {
                // now it's all populated and mongoose documents
            }
        )
    }
)

So the actual "filtering" of the array happens within the .aggregate() statement. The result is no longer a "mongoose document", though: since .aggregate() can "alter" the document structure, mongoose "presumes" that it has and just returns a "plain object".

That's not really a problem here, since the $project stage asks for all of the same fields present in the document according to the defined schema. So even though each result is just a "plain object", there is no problem "casting" it back into a mongoose document.

This is where the .map() comes in, as it returns an array of converted "documents", which is then important for the next stage.

Now you call Model.populate() which can then run the further "population" on the "array of mongoose documents".

The result then is finally what you want.


MongoDB versions older than 3.2.x

The only thing that really changes here is the aggregation pipeline, so that is all that is shown for brevity.

MongoDB 2.6 - Can filter arrays with a combination of $map and $setDifference. The result is a "set", but that is not a problem since mongoose creates an _id field on every sub-document in an array by default, so each element is already unique:

    [
        { "$match": { "articles.quantity": { "$gte": 5 } } },
        { "$project": {
            "orderdate": 1,
            "articles": {
                "$setDifference": [
                   { "$map": {
                      "input": "$articles",
                      "as": "article",
                      "in": {
                         "$cond": [
                             { "$gte": [ "$$article.quantity", 5 ] },
                             "$$article",
                             false
                         ]
                      }
                   }},
                   [false]
                ]
            },
            "__v": 1
        }}
    ],
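
If the $map and $setDifference combination looks opaque, the same idea can be sketched in plain JavaScript. This only illustrates the logic and is not how the server implements it; `filterLikeSetDifference` is a hypothetical name. Every non-matching element is swapped for false, and then false is removed from the result.

```javascript
// Hypothetical helper mirroring the $map + $setDifference logic
function filterLikeSetDifference(articles, minQuantity) {
    // $map with $cond: keep the element or replace it with false
    var mapped = articles.map(function(article) {
        return article.quantity >= minQuantity ? article : false;
    });

    // $setDifference against [false]: drop every false marker
    return mapped.filter(function(item) {
        return item !== false;
    });
}

var result = filterLikeSetDifference(
    [
        { "_id": "a", "quantity": 2 },
        { "_id": "b", "quantity": 5 },
        { "_id": "c", "quantity": 9 }
    ],
    5
);
```

A real set operation would also collapse identical elements and leaves the output order unspecified, which this sketch does not reproduce. The distinct _id values are what make the collapsing harmless.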

Versions older than that must use $unwind:

    [
        { "$match": { "articles.quantity": { "$gte": 5 } }},
        { "$unwind": "$articles" },
        { "$match": { "articles.quantity": { "$gte": 5 } }},
        { "$group": {
          "_id": "$_id",
          "orderdate": { "$first": "$orderdate" },
          "articles": { "$push": "$articles" },
          "__v": { "$first": "$__v" }
        }}
    ],
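
The reason $match appears twice is easier to see when the stages are walked through by hand. The following is a rough plain JavaScript imitation of $unwind, the second $match, and $group, using made-up sample data; it is a sketch of the idea and not of the server's implementation.

```javascript
// Hypothetical sample data
var orders = [
    { "_id": 1, "orderdate": "2016-04-01",
      "articles": [ { "quantity": 2 }, { "quantity": 7 } ] },
    { "_id": 2, "orderdate": "2016-04-02",
      "articles": [ { "quantity": 6 }, { "quantity": 8 } ] }
];

// $unwind: one output document per array element
var unwound = [];
orders.forEach(function(order) {
    order.articles.forEach(function(article) {
        unwound.push({
            "_id": order._id,
            "orderdate": order.orderdate,
            "articles": article
        });
    });
});

// second $match: the condition now tests each single element
var matched = unwound.filter(function(doc) {
    return doc.articles.quantity >= 5;
});

// $group: rebuild one document per _id, pushing elements back
var grouped = {};
matched.forEach(function(doc) {
    if (!grouped[doc._id]) {
        grouped[doc._id] = {
            "_id": doc._id,
            "orderdate": doc.orderdate,
            "articles": []
        };
    }
    grouped[doc._id].articles.push(doc.articles);
});
```

The first $match in the pipeline only narrows down which documents get unwound at all; it is the one after $unwind that actually removes array elements.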

The $lookup Alternative

Another alternative is to do everything on the "server" instead. This is an option with $lookup in MongoDB 3.2 and greater:

Order.aggregate(
    [
        { "$match": { "articles.quantity": { "$gte": 5 } }},
        { "$project": {
            "orderdate": 1,
            "articles": {
                "$filter": {
                    "input": "$articles",
                    "as": "article",
                    "cond": {
                       "$gte": [ "$$article.quantity", 5 ]
                    }
                }
            },
            "__v": 1
        }},
        { "$unwind": "$articles" },
        { "$lookup": {
            "from": "articles",
            "localField": "articles.article",
            "foreignField": "_id",
            "as": "articles.article"
        }},
        { "$unwind": "$articles.article" },
        { "$group": {
          "_id": "$_id",
          "orderdate": { "$first": "$orderdate" },
          "articles": { "$push": "$articles" },
          "__v": { "$first": "$__v" }
        }},
        { "$project": {
            "orderdate": 1,
            "articles": {
                "$filter": {
                    "input": "$articles",
                    "as": "article",
                    "cond": {
                       "$lte": [ "$$article.article.price", 500 ]
                    }
                }
            },
            "__v": 1
        }}
    ],
    function(err,orders) {
        // orders are plain objects with "articles.article" already joined and filtered
    }
)

And though those are just plain objects, the results are the same as what you would have got from the .populate() approach. And of course you can always "cast" them to mongoose documents again if you really must.

The "shortest" Path

This really goes back to the original statement, where you basically just "accept" that the "query" is not meant to "filter" the array content. The .populate() can happily do so because it's just another "query" and is stuffing in "documents" by convenience.

So if you really are not saving "bucketloads" of bandwidth by removing the additional array members from the original document array, then just .filter() them out in post-processing code:

Order.find({ "articles.quantity": { "$gte": 5 } })
    .populate({
        "path": "articles.article",
        "match": { "price": { "$lte": 500 } }
    }).exec(function(err,orders) {
        orders = orders.filter(function(order) {
            order.articles = order.articles.filter(function(article) {
                return (
                    ( article.quantity >= 5 ) &&
                    ( article.article != null )
                )
            });
            return order.articles.length > 0;
        })

        // orders has non-matching entries removed
    }
)
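
If the same post-processing is needed in more than one place, it can be pulled out into a small function. This is only a sketch: `pruneOrders` is a hypothetical name, and it assumes each order has already been through .populate() with "match", so that `article.article` is null wherever the referenced document did not match. Plain objects stand in for the populated results here so the logic can be tried without a database.

```javascript
// Hypothetical helper: works on populated documents or plain objects
function pruneOrders(orders, minQuantity) {
    return orders.filter(function(order) {
        // keep only elements that pass both conditions
        order.articles = order.articles.filter(function(article) {
            return article.quantity >= minQuantity &&
                article.article != null;
        });
        // drop orders left with no articles at all
        return order.articles.length > 0;
    });
}

// Plain objects standing in for populated results
var sample = [
    { "_id": 1, "articles": [
        { "quantity": 7, "article": { "price": 100 } },
        { "quantity": 9, "article": null },
        { "quantity": 1, "article": { "price": 50 } }
    ]},
    { "_id": 2, "articles": [
        { "quantity": 6, "article": null }
    ]}
];

var pruned = pruneOrders(sample, 5);
```

Note that, like the inline version, this modifies the `articles` array on each order in place.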