* How to Get Record with the Latest Date from Each Group in MongoDB
It’s not uncommon that, during a statistical analysis project, we need to check the record with the latest date in each group after records are grouped.
Take collect project for instance, we want to get the suiteStatus on the latest date in each group after documents are grouped by executionProject. Below is the source data:
{ "_id" : ObjectId("55d4410544c96d6f6578f893"), |
"runStartTime" : ISODate("2015-08-19T09:40:27.377Z"), |
The expected result:
[ { |
MongoDB has its way of handling this. It sorts documents, group them by executionProject, gets suite status (pass or fail) using the combination of $last+$cond+$anyElementTrue+
$map+in+$eq, and then join the result to the latest runEndTime to output. The whole process is not so convenient.
It would be easier with esProc SPL. SPL script gets the record with the latest runEndTime from each group and then further processes this record.
Download esProc installation package HERE.
Directions:
1. Write SPL script project.dfx in esProc:
A |
B |
|
1 |
=mongo_open("mongodb://127.0.0.1:27017/raqdb") |
/ Connect to MongoDB database |
2 |
=mongo_shell(A1,"project.find(,{_id:0})").fetch() |
/ Get data of project collection |
3 |
=A2.group(executionProject; (r=~.maxp(runEndTime)).suiteList, r.runEndTime) |
/ Group documents and get the record with the latest runEndTime from each group |
4 |
=A3.new(executionProject,if(suiteList.(suiteStatus).pos("PASS"), 1, 0): suite-pass, if(suiteList.(suiteStatus).pos("FAIL"),1,0): suite-fail, runEndTime) |
/ Get status of each selected record |
5 |
>A1.close() |
/ Close database connection |
2. Perform debug execution to get the value of A3 as follows:
A3 |
executionProject |
suiteList |
runEndTime |
Project1 |
[[FAIL]] |
2015-08-19 17:46:31 |
|
Project2 |
[[FAIL],[PASS]] |
2015-08-19 19:09:52 |
|
Project3 |
[[FAIL]] |
2015-08-19 19:18:41 |
3. Execute the script to return the final result:
A4 |
executionProject |
suite-pass |
suite-fail |
runEndTime |
Project1 |
0 |
1 |
2015-08-19 17:46:31 |
|
Project2 |
1 |
1 |
2015-08-19 19:09:52 |
|
Project3 |
0 |
1 |
2015-08-19 19:18:41 |
A3 groups documents, gets record from each group that contains the latest date, and extract the desired field value.
esProc supplies JDBC interface to let you integrate the script into a Java program easily.
public static void doWork() {
Connection con = null;
java.sql.Statement st;
try{
Class.forName("com.esproc.jdbc.InternalDriver");
con = DriverManager.getConnection("jdbc:esproc:local://");
// Call script project.dfx
st=con.createStatement();
ResultSet rst = st.executeQuery("call project");
System.out.println(rst);
}catch(Exception e){
System.out.println(e);
}finally{
// Close database connection
if (con!=null) {
try {
con.close();
}catch(Exception e) {
System.out.println(e);
}
}
}
}
You can use the same way to get the maximum value or minimum value from each group for a specific field after documents are grouped.
Read How to Call an SPL Script in Java to learn more about integration of esProc SPL script into a Java program.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL