Saturday, February 8, 2014

Aggregate in MongoDB

Aggregate are use to perform calculation or operation on a group of values from multiple documents (rows). The outcome of an aggregate function is one computed single value.

In SQL Server 2012 a popular RDBMS system we have following aggregate function available.

-    AVG
-    MIN
-    CHECKSUM_AGG
-    SUM
-    COUNT
-    STDEV
-    COUNT_BIG
-    STDEVP
-    GROUPING
-    VAR
-    GROUPING_ID
-    VARP
-    MAX

In MongoDB we can perform the aggregate function using three different approaches.

1.    Aggregation Pipeline
2.    Map-Reduce
3.    Single Purpose Aggregation

Aggregation Pipeline

Aggregation Pipeline method is a framework for performing aggregate task. Technically, MongoDB passes the documents (rows) of a single collection (table) through a pipeline. The data is processed in each of the pipe and outcome is handed over to next pipe and at the end the computed result is returned. This approach uses “aggregate()” method provided by MongoDB.

The syntax to use Aggregate Pipeline approach is following:

db.collection_name.aggregate(group_by_field_name, [matching_criteria])

For example let us consider we have following employee data.

To add data into MongoDB we can issue following command on MongodB shell. This will create a employee collection (table) and insert all the documents (rows) into it.

db.employee.insert(
[
{EmpName:'Sam Pitroda',Age:35,Salary:3126,Gender:'M',Dept:10},
{EmpName:'Anil Shastri',Age:48,Salary:2724,Gender:'M',Dept:20},
{EmpName:'Bill Rama',Age:48,Salary:2270,Gender:'M',Dept:10},
{EmpName:'John Butler',Age:45,Salary:3622,Gender:'M',Dept:30},
{EmpName:'Srini Arya',Age:30,Salary:3966,Gender:'F',Dept:40},
{EmpName:'Ajay Khanna',Age:49,Salary:3711,Gender:'M',Dept:20},
{EmpName:'Supriay Khanna',Age:32,Salary:3066,Gender:'F',Dept:10},
{EmpName:'Ismail Paun',Age:32,Salary:3608,Gender:'M',Dept:30},
{EmpName:'Akshay Kumar',Age:22,Salary:2651,Gender:'M',Dept:10},
{EmpName:'Steve Allan',Age:29,Salary:4391,Gender:'M',Dept:20},
{EmpName:'Rahul Puri',Age:32,Salary:2111,Gender:'M',Dept:30},
{EmpName:'Jayant Singh',Age:31,Salary:2931,Gender:'M',Dept:20},
{EmpName:'Saurab Khanna',Age:39,Salary:2566,Gender:'M',Dept:20},
{EmpName:'Pappu Kaun',Age:24,Salary:4133,Gender:'M',Dept:10}
]
)

We will use Aggregate Pipeline approach to read total salary of Dept 10.

$sum

To read sum total of salary for each of the department we can write aggregate function like this:

db.employee.aggregate({$group: {_id:"$Dept", TotalSalary:{$sum : "$Salary"}}})

To read sum total of salary for one specific department (Dept 20) we can write aggregate function like this:

db.employee.aggregate({$group:
      {_id:"$Dept",TotalSalary:{$sum : "$Salary"}}},
      {$match:{_id:{$lt:20}}})

$avg

To get the average salary of all the departments we can write following aggregate function on MongoDB shell.

db.employee.aggregate({$group:{_id:"$Dept", TotalSalary:{$avg : "$Salary"}}})

$first, $last

To get the Maximum and Minimum salary from each Dept, we can write following command in MongodB shell.

db.employee.aggregate({$sort:{Salary:1}},{$group:{_id:{Dept: "$Dept"},MaxOfSalary:{$first: "$Salary"},MinOfSalary:{$last: "$Salary"}}})

Popular Posts

Blog Archive

Real Time Web Analytics