1.14 Grouping: get top/bottom N from each group

 

group() function groups records while retaining the grouped subsets and then summarizes grouped subsets. The ability of keeping the grouped subsets makes it suitable for scenarios where the grouped subsets need to be repeatedly used or where further complicated computations exist.

Find IDs of students whose scores rank in top 2 for each subject in every class based on the following Scores table.

CLASS STUDENTID SUBJECT SCORE
Class one 1 English 84
Class one 1 Math 77
Class one 1 PE 69
Class one 2 English 81
Class one 2 Math 80

SPL script:

A
1 =connect(“db”)
2 =A1.query@x(“select * from Scores”)
3 =A2.group(CLASS,SUBJECT;~.top(-2;SCORE):TOP2)
4 =A3.(TOP2).conj()

A1 Connect to the database;
A2 Query students’ scores;
A3 Group A2’s records by class and subject and from each group get top 2 records in terms of score;
A4 Concatenate all top 2 records for each subject in every class into a sequence.

Below is A3’s result (group() function groups records and generates grouped subsets. In this context, getting topN is a kind of aggregate operation that, in this case, gets the top two records from each subset.):

CLASS SUBJECT Members
Class one English [[Class one,4,English,96],[Class one,9,English,93]]
Class one Math [[Class one,13,Math,97],[Class one,10,Math,97]]

Execution result:

CLASS STUDENTID SUBJECT SCORE
Class one 4 English 96
Class one 9 English 93
Class one 13 Math 97
Class one 10 Math 97