6.6 Alignment grouping: put non-matcheding records to a separate group
Group data in order of the specified field in the base table and put non-matching members in a separate group. This is suitable for computing scenarios where we want to know information of matching members as well as non-matching members.
Calculate the average salary of employees in states of [California, Texas, New York, Florida] and in all the other states as a whole based on the employee table. Below is part of the table:
ID | NAME | STATE | DEPT | SALARY |
---|---|---|---|---|
1 | Rebecca | California | R&D | 7000 |
2 | Ashley | New York | Finance | 11000 |
3 | Rachel | New Mexico | Sales | 9000 |
4 | Emily | Texas | HR | 7000 |
5 | Ashley | Texas | R&D | 16000 |
… | … | … | … | … |
SPL offers @n option to work with A.align() function to place non-matching records to a new group during an alignment grouping.
SPL script:
A | |
---|---|
1 | =T(“Employee.csv”) |
2 | [California,Texas,New York,Florida] |
3 | =A1.align@an(A2,STATE) |
4 | =A3.new(if (#>A2.len(),“Other”,STATE):STATE,~.avg(SALARY):AvgSalary) |
A1 Query Employee table.
A2 Create a sequence of states.
A3 Use A.align@an function to group records in Employee table by aligning them to A2’s sequence of states. @a option enables returning all matching members for each group, and @n option enables putting non-matching members in a separate group.
A4 Calculate average salary of each group, during which the group of non-matching records (which is the last group) is named “Other”.
Execution result:
STATE | AvgSalary |
---|---|
California | 7700.0 |
Texas | 7592.59 |
New York | 7677.77 |
Florida | 7145.16 |
Other | 7308.1 |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL