Cogs and Levers A blog full of technical stuff

MongoDB Aggregation Framework

Introduction

Sometimes having the power of MapReduce at your fingertips and applying this technology to simpler aggregate queries can be more hassle than it needs to be. MongoDB provides a simpler solution (for a simpler class of problems) in the form of the Aggregation framework. This framework allows you to develop queries within the mongo environment that are analogous to GROUP BY, HAVING, COUNT, SUM, etc. that you would normally use in “relational land”.

Today’s post, I want to walk through a couple of simple queries on using this framework to maximise productivity when pivoting data.

Marrying the old with the new

As a bit of a cheat’s reference, the following table provides the some examples of aggregate queries in a relational database and how they transpose over to the Mongo aggregation environment.

The source of this table can be found here.

Technique Relational Aggregation Framework
Criteria matching WHERE $match
Grouping GROUP BY $group
Aggregate criteria filtering HAVING $match
Result projection SELECT $project
Record sorting ORDER BY $sort
Limiting result sets LIMIT or TOP $limit
Accumulation SUM $sum
Counting COUNT $sum
Dropping SKIP $skip

Counting records

Sql example:

	
SELECT COUNT(*) AS count
FROM items

Mongo example:

	
db.items.aggregate( [
{ $group: { _id: null,
            count: { $sum: 1 } } }
] )

Accumulating values

Sql example:

	
SELECT SUM(price) AS total
FROM items

Mongo example:

	
db.items.aggregate( [
{ $group: { _id: null,
            total: { $sum: "$price" } } }
] )

Aggregation with identifier

Sql example:

	
SELECT category_id, SUM(price) AS total
FROM items
GROUP BY category_id

Mongo example:

	
db.items.aggregate( [
{ $group: { _id: "$category_id",
            total: { $sum: "$price" } } }
] )

Aggregation with identifier and sorting

Sql example:

	
SELECT category_id, SUM(price) AS total
FROM items
GROUP BY category_id
ORDER BY total

Mongo example:

	
db.items.aggregate( [
{ $group: { _id: "$category_id",
            total: { $sum: "$price" } } },
{ $sort: { total: 1 } }
] )

Multiple Aggregation

Sql example:

	
SELECT category_id, when SUM(price) AS total
FROM items
GROUP BY category_id, when

Mongo example:

	
db.items.aggregate( [
{ $group: { _id: { category_id: "$category_id",
                   when: "$when" },
            total: { $sum: "$price" } } }
] )

Aggregate Filtration

Sql example:

	
SELECT category_id, count(*)
FROM items
GROUP BY category_id
HAVING count(*) > 1

Mongo example:

	
db.items.aggregate( [
{ $group: { _id: "$category_id",
            count: { $sum: 1 } } },
{ $match: { count: { $gt: 1 } } } 
] )

Multiple Aggregate Filtration

Sql example:

	
SELECT category_id, when, SUM(price) AS total
FROM items
GROUP BY category_id, when
HAVING total > 100

Mongo example:

	
db.items.aggregate( [
{ $group: { _id: { category_id: "$category_id",
                   when: "$when" },
            total: { $sum: "$price" } } },
{ $match: { total: { $gt: 100 } } } 
] )

Aggregate with Source Filtration

Sql example:

	
SELECT category_id, SUM(price) AS total
FROM items
WHERE active = 1
GROUP BY category_id

Mongo example:

	
db.items.aggregate( [
{ $match: { active: 1 } },
{ $group: { _id: "$category_id",
            total: { $sum: "$price" } } }
] )

Aggregate Filtration with Source Filtration

Sql example:

	
SELECT category_id, SUM(price) AS total
FROM items
WHERE active = 1
GROUP BY category_id
HAVING total > 100

Mongo example:

	
db.items.aggregate( [
{ $match: { active: 1 } },
{ $group: { _id: "$category_id",
            total: { $sum: "$price" } } },
{ $match: { total: { $gt: 100 } } }
] )

Aggregated joins

Sql example:

	
SELECT category_id, SUM(co.weight) AS weight
FROM items i, components co
WHERE co.item_id = i.id
GROUP BY category_id

Mongo example:

	
db.items.aggregate( [
{ $unwind: "$components" },
{ $group: { _id: "$category_id",
            weight: { $sum: "$components.weight" } } }
] )

Sub-query Aggregation

Sql example:

	
SELECT COUNT(*)
FROM (SELECT category_id, when
      FROM items
      GROUP BY category_id, when) AS Table1

Mongo example:

	
db.items.aggregate( [
{ $group: { _id: { category_id: "$category_id",
                   when: "$when" } } },
{ $group: { _id: null, count: { $sum: 1 } } }
] )