3.1 Get concatenation of two sets
The concatenation of set A and set B is made up of all members of set A and all members of set B, including all duplicates. For instance, the concatenation of [1,2] and [1,3] is [1,2,1,3], where member 1 appears twice in the result set.
Here is an example. A business has two sales channels – online and offline stores. Their sales data is stored in SALES_ONLINE table and SALES_STORE table respectively. The task is to query the total sales amount of each product in the year 2014. Below is part of data in the two tables:
SALES_ONLINE:
ID | CUSTOMERID | ORDERDATE | PRODUCTID | AMOUNT |
---|---|---|---|---|
1 | HANAR | 2014/01/01 | 17 | 3868.6 |
2 | ERNSH | 2014/01/03 | 42 | 1005.9 |
3 | LINOD | 2014/01/06 | 64 | 400.0 |
4 | OTTIK | 2014/01/07 | 10 | 1194.0 |
5 | OCEAN | 2014/01/09 | 41 | 319.2 |
… | … | … | … | … |
SALES_STORE:
ID | CUSTOMERID | ORDERDATE | SELLERID | PRODUCTID | AMOUNT |
---|---|---|---|---|---|
1 | EASTC | 2014/01/01 | 1 | 27 | 3063.0 |
2 | ERNSH | 2014/01/02 | 8 | 70 | 2713.5 |
3 | MAGAA | 2014/01/03 | 2 | 74 | 1675.0 |
4 | SAVEA | 2014/01/07 | 7 | 45 | 2018.2 |
5 | FOLIG | 2014/01/08 | 8 | 30 | 1622.4 |
… | … | … | … | … | … |
Structures of the two tables are not completely same. SALES_STORE has SELLERID field that stores data of salespeople, but both have ORDERDATE, PRODUCTID and AMOUNT fields. The grouping and aggregation after they are concatenated can be handled as usual.
SPL can use character “|” to calculate concatenation of sets. A|B, for instance, calculates concatenation of set A and set B.
SPL script:
A | |
---|---|
1 | =T(“SalesOnline.xlsx”).select(year(ORDERDATE)==2014) |
2 | =T(“SalesStore.xlsx”).select(year(ORDERDATE)==2014) |
3 | =A1|A2 |
4 | =A3.groups(PRODUCTID; sum(AMOUNT):AMOUNT) |
A1: Import SALES_ONLINE table from the original file and select sales records of the year 2014.
A2: Import SALES_STORE table from the original file and select sales records of the year 2014.
A3: Use the operator “|” to calculate concatenation of the two sets.
A4: Group and calculate total sales amount of each product.
Note that SPL concatenation does not require that sets have same structures. This is because SPL sequences support members of different structures. Yet we can directly access common fields of all members of a sequence, like PRODUCTID and AMOUNT here, as we access fields of a regular data table. This is really convenient.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL