6.4 Alignment grouping: keep one record at most for each group
It is common that a grouping operation requires the result set to be ordered by the order of a specified base set, which we call alignment grouping. Examples include calculating a company’s total sales amount in each city according to the order of Beijing, Shanghai, Guangzhou and Shenzhen, calculating the average salary of each department according to the specified order of departments, and so on.
It is probably that an alignment grouping operation generates one or more empty groups or that one or more members do not belong to any group.
The case that data is sorted by the order a specified field in the base table and keep one matching member at most for each group is suitable for computing scenarios where we want to view or use data according to the specified order.
Here our task is to find the GDP and population of first-tier cities according to the order of Beijing, Shanghai, Guangzhou and Shenzhen in the Chinese cities’ GDP table of a certain year. Below is part of the source data:
ID | CITY | GDP | POPULATION |
---|---|---|---|
1 | Shanghai | 32679 | 2418 |
2 | Beijing | 30320 | 2171 |
3 | Shenzhen | 24691 | 1253 |
4 | Guangzhou | 23000 | 1450 |
5 | Chongqing | 20363 | 3372 |
… | … | … | … |
SPL has A.align() function to perform alignment grouping, which by default keeps one matching member at most for each group.
SPL script:
A | |
---|---|
1 | =T(“CityGDP.txt”) |
2 | [“Beijing”,“Shanghai”,“Guangzhou”,“Shenzhen”] |
3 | =A1.align(A2,CITY) |
4 | =A3.new(CITY,GDP,POPULATION) |
A1 Query city GDP table.
A2 Define a base sequence of cities.
A3 Use A.align() function to sort the city GDP table according to the specified sequence of cities and keep one matching member at most for each group.
A4 Create a result table comprising city, GDP and population fields.
Execution result:
CITY | GDP | POPULATION |
---|---|---|
Beijing | 30320 | 2171 |
Shanghai | 32679 | 2418 |
Guangzhou | 23000 | 1450 |
Shenzhen | 24691 | 1253 |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL