Achieving Foreign Key Joins between MongoDB Collections in SPL

  As a distributed NoSQL database, MongoDB is very like a relational database. But since the documents in a single collection do not need to have the same set of fields, it performs poorly in handling joins, which is the strength of the relational databases. You have to hardcode a foreign key join with the MongoDB API, which is difficult and not intuitive. A simple alternative is the esProc SPL. Let’s look at how SPL deals with such a join.

  Collection UserCourseProgress records users and the courses they attend. Courseid field is the foreign key pointing to _id field in collection Course. Now we need to get the count of users who attends each course and display the course name as the title field values in Course.

UserCourseProgress

Course

{"userId":"u01",
  "courseid":"c01",
  "timespent":6000,
  score:99}
  {"userId":"u02",
  "courseid":"c01",
  "timespent":6000,
  score:99}
  {"userId":"u03",
  "courseid":"c01",
  "timespent":6000,
  score:99}
  {"userId":"u04",
  "courseid":"c01",
  "timespent":6000,
  score:99}
  {"userId":"u05",
  "courseid":"c01",
  "timespent":6000,
  score:99}
  {"userId":"u01",
  "courseid":"c02",
  "timespent":6000,
  score:99}
  {"userId":"u02",
  "courseid":"c02",
  "timespent":6000,
  score:99}
  {"userId":"u03",
  "courseid":"c03",
  "timespent":6000,
  score:99}

{"_id":"c01"
  "title":"Japanese159",
  "description":"Japanese  base",
  "category":"language"}
  {"_id":"c02"
  "title":"Chinese200",
  "description":"Chinese  middle",
  "category":"language"}
  {"_id":"c03"
  "title":"Political  science 280",
  "description":"Political  middle",
  "category":"politics"}
  {"_id":"c04"
  "title":"EE490",
  "description":"electronic  engineering hign",
  "category":"Electronic"}

 

  SPL code:


A

1

=mongo_open("mongodb://localhost:27017/local?user=test&password=test")

2

=mongo_shell(A1,"UserCourseProgress.aggregate([{$group:    {_id: {'primary':'$courseid'},'userCount':   {$sum: 1}}}, {$sort:{'userCount':-1}},{$project:{_id:0,'courseid':'$_id.primary','userCount':1}}])")

3

=mongo_shell(A1,"Course.find(,{title:1})").fetch()

4

=A2.switch(courseid,A3:_id)

5

=A4.new(userCount,courseid.title)

6

=mongo_close(A1)

  A1: Connect to MongoDB. The connection string format is mongo://ip:port/db?arg=value&….

  A2: Count users of each course. MongoDB’s aggregate() function is used to retrieve data from UserCourseProgress, where the parameter is an aggregate expression written in MongoDB’s syntax. The query returns a memory result, as shown below:

  undefined

  A3: find() function retrieves data from Course with a filtering condition being absent and returns a cursor. Since there are only several courses, we use fetch() function to import data in the cursor to the memory. Below is the result:

  undefined

  A4: switch() function replaces A2’s foreign key field values with the corresponding records in A2, as shown below:

  undefined

  A5: Object style access to the memory to form a new two-dimensional table, as shown below:

  undefined

  A6: Close the MongoDB connection.

  As you see, it’s as convenient, easy and intuitive to implement collection joins with SPL in MongoDB as performing joins in a relational database. SPL makes the loose table association in the NoSQL database not a problem anymore.