* How to Get Record with the Latest Date from Each Group in MongoDB

It’s not uncommon that, during a statistical analysis project, we need to check the record with the latest date in each group after records are grouped.
Take collect project for instance, we want to get the suiteStatus on the latest date in each group after documents are grouped by executionProject. Below is the source data:

{  "_id" : ObjectId("55d4410544c96d6f6578f893"),
      "executionProject" : "Project1",
      "suiteList" : [
          {
              "suiteStatus" : "PASS",
          }
      ],
      "runEndTime" : ISODate("2015-08-19T08:40:47.049Z"),
      "runStartTime" : ISODate("2015-08-19T08:40:37.621Z"),
      "runStatus" : "PASS",
      "__v" : 1
  }, {
      "_id" : ObjectId("55d44eb4c0422e7b8bffe76b"),
      "executionProject" : "Project1",
      "suiteList" : [
          {
              "suiteStatus" : "PASS",
          }
      ],
      "runEndTime" : ISODate("2015-08-19T09:39:13.528Z"),
      "runStartTime" : ISODate("2015-08-19T09:39:00.406Z"),
      "runStatus" : "PASS",
      "__v" : 1
  }, {
      "_id" : ObjectId("55d44f0bc0422e7b8bffe76f"),
      "executionProject" : "Project1",
      "suiteList" : [
          {
              "suiteStatus" : "FAIL",
          }
      ],
      "runEndTime" : ISODate("2015-08-19T09:46:31.108Z"),
   

    "runStartTime" : ISODate("2015-08-19T09:40:27.377Z"),
      "runStatus" : "PASS",
      "__v" : 1
  }, {
      "_id" : ObjectId("55d463d0c0422e7b8bffe789"),
      "executionProject" : "Project2",
      "suiteList" : [
          {
              "suiteStatus" : "FAIL"
          },
          {
              "suiteStatus" : "PASS"
          }
      ],
      "runEndTime" : ISODate("2015-08-19T11:09:52.537Z"),
      "runStartTime" : ISODate("2015-08-19T11:09:04.539Z"),
      "runStatus" : "FAIL",
      "__v" : 1
  }, {
      "_id" : ObjectId("55d464ebc0422e7b8bffe7c2"),
      "executionProject" : "Project3",
      "suiteList" : [
          {
              "suiteStatus" : "FAIL"
          }
      ],
      "runEndTime" : ISODate("2015-08-19T11:18:41.460Z"),
      "runStartTime" : ISODate("2015-08-19T11:13:47.268Z"),
      "runStatus" : "FAIL",
      "__v" : 10
  }

The expected result:

[ {
          "executionProject": "Project1",
          "suite-pass": 0,
          "suite-fail": 1,
          "runEndTime": ISODate("2015-08-19T09:46:31.108Z")
      }, {
          "executionProject": "Project2",
          "suite-pass": 1,
          "suite-fail": 1,
          "runEndTime": ISODate("2015-08-19T11:09:52.537Z")
      }, {
          "executionProject": "Project3",
          "suite-pass": 0,
          "suite-fail": 1,
          "runEndTime": ISODate("2015-08-19T11:18:41.460Z")
      }]

 

MongoDB has its way of handling this. It sorts documents, group them by executionProject, gets suite status (pass or fail) using the combination of $last$cond+$anyElementTrue+
$map+in+$eq, and then join the result to the latest runEndTime to output. The whole process is not so convenient.

It would be easier with esProc SPL. SPL script gets the record with the latest runEndTime from each group and then further processes this record.
Download esProc installation package
HERE.

Directions:

1. Write SPL script project.dfx in esProc:

A

B

1

=mongo_open("mongodb://127.0.0.1:27017/raqdb")

/ Connect to MongoDB database

2

=mongo_shell(A1,"project.find(,{_id:0})").fetch()

/ Get data of project collection

3

=A2.group(executionProject;   (r=~.maxp(runEndTime)).suiteList,   r.runEndTime)

/ Group documents and get the   record with the latest runEndTime from each group

4

=A3.new(executionProject,if(suiteList.(suiteStatus).pos("PASS"),     1, 0): suite-pass,   if(suiteList.(suiteStatus).pos("FAIL"),1,0): suite-fail,     runEndTime)

/ Get status of each selected   record

5

>A1.close()

/ Close database connection

  2. Perform debug execution to get the value of A3 as follows:

A3

executionProject

suiteList

runEndTime

Project1

[[FAIL]]

2015-08-19   17:46:31

Project2

[[FAIL],[PASS]]

2015-08-19   19:09:52

Project3

[[FAIL]]

2015-08-19   19:18:41

  3. Execute the script to return the final result:

A4

executionProject

suite-pass

suite-fail

runEndTime

Project1

0

1

2015-08-19   17:46:31

Project2

1

1

2015-08-19   19:09:52

Project3

0

1

2015-08-19   19:18:41

A3 groups documents, gets record from each group that contains the latest date, and extract the desired field value.

esProc supplies JDBC interface to let you integrate the script into a Java program easily.
public static void doWork() {
    Connection con = null;
    java.sql.Statement st;
   
    try{
        Class.forName("com.esproc.jdbc.InternalDriver");
        con = DriverManager.getConnection("jdbc:esproc:local://"); 
        // Call script project.dfx
        st=con.createStatement(); 
        ResultSet rst = st.executeQuery("call project");
        System.out.println(rst);
    }catch(Exception e){
        System.out.println(e);
    }finally{
        // Close database connection
        if (con!=null) {
            try {
                con.close();
            }catch(Exception e) {
                System.out.println(e);
            }
        }  
    }
}

You can use the same way to get the maximum value or minimum value from each group for a specific field after documents are grouped.
Read
How to Call an SPL Script in Java to learn more about integration of esProc SPL script into a Java program.