From SQL to SPL: Complement a certain average value to ensure that the total sum remains unchanged
An invoice table in the SQL Server database has one amount for each project, and each project in the project table has multiple accounts, and the two are associated through ProjectID.
# Invoices
InvoiceID |
Amount |
ProjectID |
1 |
100.0000 |
1 |
2 |
100.0000 |
2 |
3 |
100.0000 |
3 |
4 |
100.0000 |
4 |
#Projects
ProjectID |
AccountCode |
1 |
12345 |
2 |
12345 |
2 |
7890 |
3 |
800 |
3 |
234 |
3 |
987 |
4 |
800 |
4 |
234 |
4 |
987 |
4 |
2579 |
Now we need to associate the two tables and add a SplitAmount field. Roughly on average divide the amount according to the number of accounts in the project, for example, 100 is divided into 3 parts. The amount of N-1 accounts should be rounded to 2 decimal places according to 1/N, which is 33.33. The Nth account should complement the average value to ensure that the total amount remains unchanged, which is 100-33.33 * 2=33.34.
InvoiceID |
Amount |
ProjectID |
AccountCode |
SplitAmount |
1 |
100.0 |
1 |
12345 |
100.0 |
2 |
100.0 |
2 |
12345 |
50.0 |
2 |
100.0 |
2 |
7890 |
50.0 |
3 |
100.0 |
3 |
800 |
33.34 |
3 |
100.0 |
3 |
234 |
33.33 |
3 |
100.0 |
3 |
987 |
33.33 |
4 |
100.0 |
4 |
800 |
25.0 |
4 |
100.0 |
4 |
234 |
25.0 |
4 |
100.0 |
4 |
987 |
25.0 |
4 |
100.0 |
4 |
2579 |
25.0 |
SQL:
select *,
SplitAmount
+ case when rn = 1
then i.Amount - sum (i.SplitAmount)
over (partition by i.ProjectID)
else 0
end as AdjustedSplitAmount
from(
select
I.*, P.AccountCode,
round(I.Amount / count(I.InvoiceID) over (partition by P.ProjectID), 2) as SplitAmount,
row_number() over (partition by P.ProjectID order by p.AccountCode) as rn
from
#Invoices I Inner Join #Projects P on I.ProjectID = P.ProjectID
) i
After SQL grouping, it must aggregate immediately, and cannot retain the grouped subsets and directly add SplitAmount field on the subsets according to the rules. It requires indirect implementation using nested subqueries and window functions, and the sequence numbers also need to be extra generated using window functions. The overall code is cumbersome.
With grouped subsets, SPL code can be more natural:
A |
|
1 |
=mssql.query("select I.*,P.AccountCode, from #Invoices I,#Projects P where I.ProjectID = P.ProjectID") |
2 |
=A1.group(ProjectID) |
3 |
=A2.(cnt=~.count(),avg=round(Amount/cnt,2), ~. derive(avg+if(#==1, Amount-avg*cnt): SplitAmount)) |
4 |
=A3.conj() |
A1: Simple join, load data.
A2: Group, but not aggregate.
A3: Process each group of data and directly add SplitAmount field according to the rules. # is the natural sequence number, and there is no need for additional calculation.
A4: Merge groups.
Question source:https://stackoverflow.com/questions/78235564/split-invoice-total-into-multiple-rows-but-split-always-equals-total
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL