Summarize an XLS File Every N Rows
Below is part of data.xls:
A |
B |
|
1 |
01.01.2017 00:15 |
1,830,868 |
2 |
01.01.2017 00:30 |
1,832,884 |
3 |
01.01.2017 00:45 |
1,905,648 |
4 |
01.01.2017 01:00 |
1,957,952 |
5 |
01.01.2017 01:15 |
1,968,672 |
6 |
01.01.2017 01:30 |
1,868,292 |
7 |
01.01.2017 01:45 |
1,904,240 |
8 |
01.01.2017 02:00 |
1,931,128 |
9 |
01.01.2017 02:15 |
1,888,136 |
10 |
01.01.2017 02:30 |
2,069,428 |
11 |
01.01.2017 02:45 |
1,863,588 |
12 |
01.01.2017 03:00 |
1,864,380 |
13 |
01.01.2017 03:15 |
2,106,224 |
14 |
01.01.2017 03:30 |
2,416,920 |
Computing goal: Perform aggregate every 4 rows to calculate the group number and the sum. Below is part of the result table:
A |
B |
|
1 |
1 |
7,527,352 |
2 |
2 |
7,672,332 |
3 |
3 |
7,685,532 |
4 |
4 |
9,501,864 |
The key point here is to summarize table every N rows, rather than group it by a specific column and then do the summarization.
Just handle it according to the following steps:
1. Start esProc (Download esProc installation package and free DSK edition license HERE. You will be prompted to load the license file when you run esProc for the first time).
2. Write a script and execute it.
A |
|
1 |
=file("D:/data.xls").xlsimport() |
2 |
=A1.groups((#-1)\4+1;sum(#2)) |
3 |
=file("D:/new.xls").xlsexport(A2) |
In this script, groups()function is used to group the table and perform aggregate. The number sign (#) represents a row number. #2 is column 2. (#-1)\4 means that the current row number subtracts 1 and then divided by 4 to get the quotient. The table is thus grouped according to whether or not the quotient is the same.
If the xls file is too large to fit into the memory, we can use the following script:
A |
B |
C |
|
1 |
=file("D:\data.xls").xlsimport@c() |
||
2 |
for A1,4*1000 |
=A2.groups((#-1)\4+1;sum(_2)) |
/ Process a batch of rows each time (4000 rows) |
3 |
=B2.run(#1=(#A2-1)*1000+#1) |
/ Modify the sequence number |
|
4 |
=file("D:\new.xls").xlsexport@a(B3) |
/ Export result to a new file |
Note: Sequence number = batch number * 1000 + relative sequence number of the current batch.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_Desktop/
SPL Learning Material 👉 https://c.scudata.com
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProcDesktop
Linkedin Group 👉 https://www.linkedin.com/groups/14419406/
Chinese version