MongoDB Aggregation Framework
05 Mar 2013
Introduction
Sometimes having the power of MapReduce at your fingertips and applying this technology to simpler aggregate queries can be more hassle than it needs to be. MongoDB provides a simpler solution (for a simpler class of problems) in the form of the Aggregation framework . This framework allows you to develop queries within the mongo environment that are analogous to GROUP BY , HAVING , COUNT , SUM , etc. that you would normally use in “relational land”.
Today’s post, I want to walk through a couple of simple queries on using this framework to maximise productivity when pivoting data.
Marrying the old with the new
As a bit of a cheat’s reference, the following table provides the some examples of aggregate queries in a relational database and how they transpose over to the Mongo aggregation environment.
The source of this table can be found here .
Technique
Relational
Aggregation Framework
Criteria matching
WHERE
$match
Grouping
GROUP BY
$group
Aggregate criteria filtering
HAVING
$match
Result projection
SELECT
$project
Record sorting
ORDER BY
$sort
Limiting result sets
LIMIT or TOP
$limit
Accumulation
SUM
$sum
Counting
COUNT
$sum
Dropping
SKIP
$skip
Counting records
Sql example:
SELECT COUNT ( * ) AS count
FROM items
Mongo example:
db . items . aggregate ( [
{ $group : { _id : null ,
count : { $sum : 1 } } }
] )
Accumulating values
Sql example:
SELECT SUM ( price ) AS total
FROM items
Mongo example:
db . items . aggregate ( [
{ $group : { _id : null ,
total : { $sum : " $price " } } }
] )
Aggregation with identifier
Sql example:
SELECT category_id , SUM ( price ) AS total
FROM items
GROUP BY category_id
Mongo example:
db . items . aggregate ( [
{ $group : { _id : " $category_id " ,
total : { $sum : " $price " } } }
] )
Aggregation with identifier and sorting
Sql example:
SELECT category_id , SUM ( price ) AS total
FROM items
GROUP BY category_id
ORDER BY total
Mongo example:
db . items . aggregate ( [
{ $group : { _id : " $category_id " ,
total : { $sum : " $price " } } },
{ $sort : { total : 1 } }
] )
Multiple Aggregation
Sql example:
SELECT category_id , when SUM ( price ) AS total
FROM items
GROUP BY category_id , when
Mongo example:
db . items . aggregate ( [
{ $group : { _id : { category_id : " $category_id " ,
when : " $when " },
total : { $sum : " $price " } } }
] )
Aggregate Filtration
Sql example:
SELECT category_id , count ( * )
FROM items
GROUP BY category_id
HAVING count ( * ) > 1
Mongo example:
db . items . aggregate ( [
{ $group : { _id : " $category_id " ,
count : { $sum : 1 } } },
{ $match : { count : { $gt : 1 } } }
] )
Multiple Aggregate Filtration
Sql example:
SELECT category_id , when , SUM ( price ) AS total
FROM items
GROUP BY category_id , when
HAVING total > 100
Mongo example:
db . items . aggregate ( [
{ $group : { _id : { category_id : " $category_id " ,
when : " $when " },
total : { $sum : " $price " } } },
{ $match : { total : { $gt : 100 } } }
] )
Aggregate with Source Filtration
Sql example:
SELECT category_id , SUM ( price ) AS total
FROM items
WHERE active = 1
GROUP BY category_id
Mongo example:
db . items . aggregate ( [
{ $match : { active : 1 } },
{ $group : { _id : " $category_id " ,
total : { $sum : " $price " } } }
] )
Aggregate Filtration with Source Filtration
Sql example:
SELECT category_id , SUM ( price ) AS total
FROM items
WHERE active = 1
GROUP BY category_id
HAVING total > 100
Mongo example:
db . items . aggregate ( [
{ $match : { active : 1 } },
{ $group : { _id : " $category_id " ,
total : { $sum : " $price " } } },
{ $match : { total : { $gt : 100 } } }
] )
Aggregated joins
Sql example:
SELECT category_id , SUM ( co . weight ) AS weight
FROM items i , components co
WHERE co . item_id = i . id
GROUP BY category_id
Mongo example:
db . items . aggregate ( [
{ $unwind : " $components " },
{ $group : { _id : " $category_id " ,
weight : { $sum : " $components.weight " } } }
] )
Sub-query Aggregation
Sql example:
SELECT COUNT ( * )
FROM ( SELECT category_id , when
FROM items
GROUP BY category_id , when ) AS Table1
Mongo example:
db . items . aggregate ( [
{ $group : { _id : { category_id : " $category_id " ,
when : " $when " } } },
{ $group : { _id : null , count : { $sum : 1 } } }
] )