6.15 Order-based grouping: by continuous same value

 

Sometimes the order of data is important for a grouping operation. Neighboring records having same values in a certain field or meeting certain conditions at the same time will be put into the same group. Examples include finding the country taking the first place in the Olympic gold medal table for the longest time and count the largest days when a stock rises consecutively. All these computing tasks need order-based grouping operations.

When grouping an ordered set, create a new group whenever the grouping field value changes.

Find the country that occupies the first place in terms of total count of medals for the longest time and information of medals according to the Olympic medal table. Below is part of the table:

Game Nation Gold Silver Copper
30 USA 46 29 29
30 China 38 27 23
30 UK 29 17 19
30 Russia 24 26 32
30 Korea 13 8 7

When A.group() function works with @o option, it creates a new group whenever the grouping field value changes.
When the function uses @1 option, it gets the first record in each group to form a new table sequence.

SPL script:

A
1 =T(“Olympic.txt”)
2 =A1.sort@z(GAME,GOLD,SILVER,COPPER)
3 =A2.group@o1(GAME)
4 =A3.group@o(NATION)
5 =A4.maxp(~.len())

A1 Import the Olympic medal table.
A2 Sort records in A1’s table by game, and the medal count (gold, silver and then copper) in descending order.
A3 Get one record for each game, which has the country having the greatest medal count because the records are already ordered.
A4 Create a new group whenever the country changes.
A5 Get the group having the largest number of records, which is the country taking the first place continuously for the most games.