Merging MongDB Subdocuments
MongoDB stores data in JSON-like BSON format. BSON enables a flexible, versatile and powerful storage model, yet at the same time compromise the data query efficiency.
To make up for this weakness, esProc SPL offers sufficient interfaces to make MongoDB query as convenient as the SQL-like database queries. Here we take a collection with nested subdocument as an example to illustrate how SPL handles MongoDB queries.
Sample data from Collection C1:
{ |
Task: Group the collection by name field. Each group contains all distinct users fields in a subdocument under a name. Below is the expected result:
{ |
SPL script:
A |
B |
||
1 |
=mongo_open("mongodb://localhost:27017/local?user=test&password=test") |
||
2 |
=mongo_shell(A1,"c1.find(,{_id:0};{name:1})") |
||
3 |
for A2;name |
=A3.(acls.read.users|acls.append.users|acls.edit.users|acls.fullControl.users) |
|
4 |
=B3.new(A3.name:_id,B3.union().id():readUsers) |
||
5 |
=@|B4.group@1(~._id,~.readUsers) |
||
6 |
=mongo_close(A1) |
A1: Connect to MongoDB database. The connection string’s format is mongo://ip:port/db?arg=value&….
A2: find() function retrieves documents from collection c1, sorts them and return result as a cursor. As no filtering condition is specified, it gets all fields except _id and sorts them by name.
A3: Fetch a set of documents with same name field value from A2’s cursor each time and do this iteratively. A3’s working scope covers an indented area from B3 to B5, within which A3 references a loop variable.
B3: Get all users fields from the current group of documents:
B4: Union all users in subdocuments in the current group.
B5: Remove duplicate records from each group in B4 and append it to B5. group@1 function deletes the duplicate records. Below is B5’s result:
If the final result set is too large to be wholly loaded into the memory, use export@j function to convert B4’s table sequence into JSON strings to append them to a target text file.
A6: Close MongoDB connection.
MongoDB boasts a lightweight, efficient storage model. Its cooperation with esProc can make data processing more effectively and extensively.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese version