Comparing row-based Excel
Two Excel files have multiple columns of data. The first row is the column name, and the second row is the data record. Compare the key columns in the two files. To deal with this problem, you can read the file into a data set, take out the record values of the key columns to form a set, and then calculate the result through the two sets.
There are sales order tables order_2018.xlsx and order_2019.xlsx for 2018 and 2019. The two files have the same columns. Part of the data is as follows:
1. Find the same
Find the data with the same key column value in the two files.
Example: Please find out the CustomerId and ProductId of users who have purchased the same product in the past two years, and save the data in the c_p.txt file.
The esProc SPL script is as follows:
A |
|
1 |
=T("e:/orders/order_2018.xlsx";CustomerId,ProductId) |
2 |
=T("e:/orders/order_2019.xlsx";CustomerId,ProductId) |
3 |
=[A1,A2].merge@io() |
4 |
=T("e:/orders/c_p.xlsx",A3) |
A1 Read order_2018.xlsx key column CustomerId, ProductId
A2 Read order_2019.xlsx key column CustomerId, ProductId
A3 Merging the two-year data, @i means to return the records that are included in common
A4 Write the result in A3 to c_p.xlsx
2. Find the difference
Example 1: Find out the order status of new customers in 2019 and save them in the file new_c.xlsx. The esProc SPL script is as follows:
A |
|
1 |
=T("e:/orders/order_2018.xlsx") |
2 |
=T("e:/orders/order_2019.xlsx") |
3 |
=A2.id(CustomerId)\A1.id(CustomerId) |
4 |
=A2.select(A3.contain(CustomerId)) |
5 |
=T("e:/orders/new_c.xlsx",A4) |
A1 Read order_2018.xlsx
A2 Read order_2019.xlsx
A3 Subtract the 2018 customer IDs from all customer IDs in 2019 to get the new customer IDs
A4 Filter the orders of new customers from the 2019 order table
A5 Write the result in A4 to new_c.xlsx
Example 2: Find out the IDs of all lost customers in 2019 and save them in the file lost_c.xlsx. The esProc SPL script is as follows:
A |
|
1 |
=T("e:/orders/order_2018.xlsx") |
2 |
=T("e:/orders/order_2019.xlsx") |
3 |
=A1.id(CustomerId)\A2.id(CustomerId) |
4 |
=file("e:/orders/lost_c.xlsx").xlsexport(A3) |
A1 Read order_2018.xlsx
A2 Read order_2019.xlsx
A3 Subtract all customer IDs in 2018 from all customer IDs in 2018 to get the customer IDs that were lost
A4 Write the result in A3 to lost_c.xlsx, because the result in A3 is a sequence, not a table sequence, so the T function cannot be used to save data here
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_Desktop/
SPL Learning Material 👉 https://c.scudata.com
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProcDesktop
Linkedin Group 👉 https://www.linkedin.com/groups/14419406/
Chinese version