SQL: Perform statistics at different granularities based on the time span of the data
A certain view in MS SQL will generate data with different time spans. When the time span is less than 30 days:
id |
branchId |
totalPrice |
createdAt |
85 |
1 |
2718.66 |
2022-07-03 08:49:27.727 |
26 |
1 |
3832.69 |
2022-07-06 09:08:06.880 |
89 |
1 |
9569.85 |
2022-07-07 04:13:09.230 |
80 |
1 |
1523.62 |
2022-07-07 04:38:29.313 |
15 |
1 |
2500.21 |
2022-07-11 09:01:05.183 |
5 |
1 |
6874.03 |
2022-07-14 23:54:05.590 |
45 |
1 |
9188.03 |
2022-07-17 05:35:48.560 |
98 |
1 |
4426.17 |
2022-07-21 17:35:31.617 |
54 |
1 |
3862.86 |
2022-07-22 05:18:28.553 |
70 |
1 |
5668.82 |
2022-07-22 06:12:33.867 |
65 |
1 |
3653.67 |
2022-07-26 08:29:03.587 |
Grouping by branchId and day is required, and the total price should be summed up as follows:
branchId |
sumTotalPrice |
timeFrame |
1 |
2718.66 |
2022-07-03 |
1 |
3832.69 |
2022-07-06 |
1 |
11093.47 |
2022-07-07 |
1 |
2500.21 |
2022-07-11 |
1 |
6874.03 |
2022-07-14 |
1 |
9188.03 |
2022-07-17 |
1 |
4426.17 |
2022-07-21 |
1 |
9531.68 |
2022-07-22 |
1 |
3653.67 |
2022-07-26 |
When the time span is greater than or equal to 30 days and less than 365 days:
id |
branchId |
totalPrice |
createdAt |
52 |
4 |
7502.97 |
2023-11-01 17:49:51.110 |
56 |
4 |
7337.75 |
2023-11-06 15:38:57.567 |
44 |
4 |
9385.97 |
2024-01-18 11:19:04.460 |
Then group by branchId and month, and sum up the total price. Note: The format remains unchanged, and the output time field is taken as the first day of the current month.
branchId |
sumTotalPrice |
timeFrame |
4 |
14840.72 |
2023-11-01 |
4 |
9385.97 |
2024-01-01 |
When the time span is greater than or equal to 365 days:
id |
branchId |
totalPrice |
createdAt |
22 |
2 |
5589.39 |
2020-05-23 15:22:14.703 |
46 |
2 |
6103.08 |
2020-08-18 03:58:14.973 |
48 |
2 |
4905.96 |
2020-10-14 23:57:48.680 |
85 |
2 |
8953.03 |
2021-08-15 11:16:34.627 |
6 |
2 |
8132.46 |
2021-08-26 21:27:21.627 |
53 |
2 |
1913.24 |
2021-09-20 17:41:13.793 |
4 |
2 |
3164.81 |
2022-03-18 04:24:40.840 |
28 |
2 |
3506.16 |
2022-05-20 17:48:44.330 |
37 |
2 |
7256.73 |
2022-07-25 20:45:16.497 |
16 |
2 |
7470.38 |
2023-01-22 18:33:07.163 |
27 |
2 |
5957.58 |
2023-03-22 03:04:02.687 |
99 |
2 |
7722.43 |
2023-04-14 21:22:38.160 |
81 |
2 |
4517.39 |
2023-04-25 11:25:17.900 |
70 |
2 |
5562.04 |
2023-05-10 08:19:35.200 |
55 |
2 |
9343.17 |
2023-11-17 08:56:09.870 |
94 |
2 |
1056.38 |
2024-01-03 18:36:24.743 |
47 |
2 |
8434.11 |
2024-03-23 09:07:31.743 |
Group by branchId and year, also sum up the total price. Note: The time field is taken as the first day of the current year.
branchId |
sumTotalPrice |
timeFrame |
2 |
16598.43 |
2020-01-01 |
2 |
18998.73 |
2021-01-01 |
2 |
13927.7 |
2022-01-01 |
2 |
40572.99 |
2023-01-01 |
2 |
9490.49 |
2024-01-01 |
SPL code:
A |
|
1 |
=MSSQL.query("select * from tb”) |
2 |
=interval(A1.min(date(createdAt)),A1.max(date(createdAt))) |
3 |
=A1.groups(branchId, date(createdAt,if(A2>=356:"yyyy",A2<30: "yyyy-MM-dd";"yyyy-MM")):timeFrame; sum(totalPrice):sumTotalPrice) |
4 |
=A3.new(#1,#3,#2) |
A1: Query the view through JDBC.
A2: Calculate the time span.
A3: Group and aggregate. When the span is greater than 365 days, the time format mask is yyyy; When the span is less than 30 days, the mask is yyyy-MM-dd, and for other spans, it is set to yyyy-MM. The date function returns the first day of the current year for the yyyy mask time string and the first day of the current month for the yyyy-MM mask time string.
A4: Adjust the field order to meet formatting requirements.
Problem source:https://stackoverflow.com/questions/78390673/select-and-group-results-by-dynamic-date
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL