* How to Group & Summarize MongoDB Subdocuments

Sometimes we have the requirement of dividing MongoDB documents into multiple segments by specified intervals and performing an aggregation over each segment. Now we have collection scores. We need to group it by intervals of scores and count the students in each interval for both Chinese and English. Below is related information:
The score intervals are [60, 70, 80, 90], which are [0,60), [60,70), [70,80), [80,90), and [90,).

name  

age

province

subject

zhou gao gao

24

guang xi

[Chinese,80],[Math,84][English,84],[Chemic,98]

li chao

16

shan dong

[Chinese,94],[Math,88][English   ,75],[Chemic,73]

yang hao hao

26

guang xi

[Chinese,85],[Math,83][English   ,64],[Chemic,71]

The expected result:

Segment

Chinese

English

0

3

5

1

2

3

4

As MongoDB doesn’t provide related interface, we use MapReduce to traverse each document and count students of each score interval for the two subjects respectively. The solution is complicated.

It would be easy if we could use esProc SPL to group records by subjects and score interval and then concatenate result.
Download esProc installation package
HERE.

Directions:

1. Write SPL script score.dfx in esProc:

A

B

1

[60, 70, 80, 90]

/ Define score intervals

2

=mongo_open("mongodb://127.0.0.1:27017/raqdb")

/ Connect to MongoDB database

3

=mongo_shell(A1,"scores.find(,{_id:0})")

/ Get data of score collection

4

=A3.groups(A1.pseg(subject.(score)(1)):Segment;     count(1): Chinese )

/ Count students of each   score interval for Chinese

5

=A3.groups(A1.pseg(subject.(score)(3)):Segment;     count(1): English)

/ Count students of each   score interval for English

6

=A4.join(Segment,   A5:Segment,   English)

/ A4Concatenate  results of A4 and A5

7

>A1.close()

/ Close database connection

  2. Execute the script to return the final result:  

A6

Segment

Chinese

English

0

1

3

1

2

3

2

8

5

3

6

8

4

3

1

Pseg() function returns the sequence number of interval a score falls in and accumulate the number of students score by score according to the corresponding sequence number of interval.

esProc provides JDBC interface to let you integrate the script into a Java program conveniently:

public static void doWork() {

    Connection con = null;
    java.sql.Statement st;
   
    try{
        Class.forName("com.esproc.jdbc.InternalDriver");
        con = DriverManager.getConnection("jdbc:esproc:local://"); 
        // Call script score.dfx
        st=con.createStatement(); 
        ResultSet rst = st.executeQuery("call score");
        System.out.println(rst);
    } catch(Exception e){
        System.out.println(e);
    } finally{
        // Close database connection
        if (con!= null) {
            try {
                con.close();
            } catch(Exception e) {
                System.out.println(e);
            }
        }  
    }
}

Read How to Call an SPL Script in Java to learn more about integration of esProc SPL script into a Java program.