* How to Make a Crosstab in MongoDB

Crosstab is a commonly used grouping & aggregation table. It shows the aggregates on a specific field and groups them; groups rows by a specific variable, and transposes groups into rows and other variables or variable combinations into columns to form a data table for statistical analysis.
Take collection events for instance. We count events for each user, append the aggregate results by userID at the end of columns and those by EventID at the end of rows. Below is the source table:

UserID  

EventID

Count

Date

1

1

10

2020-01-02

1

1

15

2020-01-02

1

2

12

2020-01-02

1

3

18

2020-01-03

2

1

5

2020-01-02

3

2

10

2020-01-02

 Group rows by UserID and aggregate EvendID and Count respectively to get the following crosstab:

UserID  

Evt1

Evt2

Evt3

Events

1

25

12

18

55

Total

30.0

60.0

60.0

70.0

Here’s how MongoDB handles it. It aligns each document for each event and adds a field to record the current number of events. This is equivalent to data rearrangement. Then it groups documents by UserID and count the frequency of each event in each group. And then it sums the event column by column and appends the results at the end of documents. Finally it filters the data set to display the desired documents. The process is roundabout and complicated.
esProc SPL, however, does it in an easy and effortless way. It groups records by UserID and then performs aggregates.

Download esProc installation package HERE.
Directions:

1. Write SPL script events.dfx in esProc:

A

B

1

=mongo_open("mongodb://127.0.0.1:27017/raqdb")

/ Connect to MongoDB database

2

=mongo_shell(A1,"events.find(,{_id:0})")

/ Get data of events collection

3

=A2.pivot@s(UserID;EventID,sum(Count);     1:"Evt1",2:"Evt2",     3:"Evt3").derive(Evt1+Evt2+Evt3:Events)

/ Group records by UserID,   count frequency of each EventID in each group and get the total

4

>A3.insert(0,     "Total", A3.sum(Evt1), A3.sum(Evt2), A3.sum(Evt3), A3.sum(Events))

/ Sum each event by column   and append results to the sequence

5

>A1.close()

/ Close database connection

  2. Execute the script to return the final result:

A3

UserID  

Evt1

Evt2

Evt3

Events

1

25

12

18

55

2

5

null

null

5

3

null

10

null

10

Total

30

22

18

70

A3 groups records by UserID, count EventIDs in each group and then gets the total.

esProc provides JDBC interface to let you integrate the script into a Java program conveniently:
public static void doWork() {

    Connection con = null;
    java.sql.Statement st;
   
    try{
        Class.forName("com.esproc.jdbc.InternalDriver");
        con = DriverManager.getConnection("jdbc:esproc:local://"); 
        // Call script events.dfx
        st=con.createStatement(); 
        ResultSet rst = st.executeQuery("call events");
        System.out.println(rst);
    } catch(Exception e){
        System.out.println(e);
    } finally{
        // Close database connection
        if (con!= null) {
            try {
                con.close();
            } catch(Exception e) {
                System.out.println(e);
            }
        }  
    }
}

Read How to Call an SPL Script in Java to learn more about integration of esProc SPL script into a Java program.