From SQL to SPL: Complement a certain average value to ensure that the total sum remains unchanged

An invoice table in the SQL Server database has one amount for each project, and each project in the project table has multiple accounts, and the two are associated through ProjectID.

# Invoices

InvoiceID

Amount

ProjectID

1

100.0000

1

2

100.0000

2

3

100.0000

3

4

100.0000

4

#Projects

ProjectID

AccountCode

1

12345

2

12345

2

7890

3

800

3

234

3

987

4

800

4

234

4

987

4

2579

Now we need to associate the two tables and add a SplitAmount field. Roughly on average divide the amount according to the number of accounts in the project, for example, 100 is divided into 3 parts. The amount of N-1 accounts should be rounded to 2 decimal places according to 1/N, which is 33.33. The Nth account should complement the average value to ensure that the total amount remains unchanged, which is 100-33.33 * 2=33.34.

InvoiceID

Amount

ProjectID

AccountCode

SplitAmount

1

100.0

1

12345

100.0

2

100.0

2

12345

50.0

2

100.0

2

7890

50.0

3

100.0

3

800

33.34

3

100.0

3

234

33.33

3

100.0

3

987

33.33

4

100.0

4

800

25.0

4

100.0

4

234

25.0

4

100.0

4

987

25.0

4

100.0

4

2579

25.0

SQL

select *,
       SplitAmount 
       + case when rn = 1 
              then i.Amount - sum  (i.SplitAmount) 
                              over (partition by i.ProjectID)
              else 0
              end  as AdjustedSplitAmount
from(
  select 
      I.*, P.AccountCode,
      round(I.Amount / count(I.InvoiceID) over (partition by P.ProjectID), 2) as SplitAmount,
      row_number() over (partition by P.ProjectID order by p.AccountCode) as rn
  from 
      #Invoices I Inner Join #Projects P on I.ProjectID = P.ProjectID
) i

After SQL grouping, it must aggregate immediately, and cannot retain the grouped subsets and directly add SplitAmount field on the subsets according to the rules. It requires indirect implementation using nested subqueries and window functions, and the sequence numbers also need to be extra generated using window functions. The overall code is cumbersome.

With grouped subsets, SPL code can be more natural:


 A

1

=mssql.query("select I.*,P.AccountCode, from #Invoices I,#Projects P where I.ProjectID = P.ProjectID")

2

=A1.group(ProjectID)

3

=A2.(cnt=~.count(),avg=round(Amount/cnt,2), ~. derive(avg+if(#==1, Amount-avg*cnt): SplitAmount))

4

=A3.conj()

A1 Simple join, load data.

A2 Group, but not aggregate.

A3 Process each group of data and directly add SplitAmount field according to the rules. # is the natural sequence number, and there is no need for additional calculation.

A4 Merge groups.

Question sourcehttps://stackoverflow.com/questions/78235564/split-invoice-total-into-multiple-rows-but-split-always-equals-total