6.4 Alignment grouping: keep one record at most for each group

 

It is common that a grouping operation requires the result set to be ordered by the order of a specified base set, which we call alignment grouping. Examples include calculating a company’s total sales amount in each city according to the order of Beijing, Shanghai, Guangzhou and Shenzhen, calculating the average salary of each department according to the specified order of departments, and so on.
It is probably that an alignment grouping operation generates one or more empty groups or that one or more members do not belong to any group.
The case that data is sorted by the order a specified field in the base table and keep one matching member at most for each group is suitable for computing scenarios where we want to view or use data according to the specified order.

Here our task is to find the GDP and population of first-tier cities according to the order of Beijing, Shanghai, Guangzhou and Shenzhen in the Chinese cities’ GDP table of a certain year. Below is part of the source data:

ID CITY GDP POPULATION
1 Shanghai 32679 2418
2 Beijing 30320 2171
3 Shenzhen 24691 1253
4 Guangzhou 23000 1450
5 Chongqing 20363 3372

SPL has A.align() function to perform alignment grouping, which by default keeps one matching member at most for each group.

SPL script:

A
1 =T(“CityGDP.txt”)
2 [“Beijing”,“Shanghai”,“Guangzhou”,“Shenzhen”]
3 =A1.align(A2,CITY)
4 =A3.new(CITY,GDP,POPULATION)

A1 Query city GDP table.
A2 Define a base sequence of cities.
A3 Use A.align() function to sort the city GDP table according to the specified sequence of cities and keep one matching member at most for each group.
A4 Create a result table comprising city, GDP and population fields.

Execution result:

CITY GDP POPULATION
Beijing 30320 2171
Shanghai 32679 2418
Guangzhou 23000 1450
Shenzhen 24691 1253