Summarize an XLS File Every N Rows

Below is part of data.xls:

A

B

1

01.01.2017 00:15

1,830,868

2

01.01.2017 00:30

1,832,884

3

01.01.2017 00:45

1,905,648

4

01.01.2017 01:00

1,957,952

5

01.01.2017 01:15

1,968,672

6

01.01.2017 01:30

1,868,292

7

01.01.2017 01:45

1,904,240

8

01.01.2017 02:00

1,931,128

9

01.01.2017 02:15

1,888,136

10

01.01.2017 02:30

2,069,428

11

01.01.2017 02:45

1,863,588

12

01.01.2017 03:00

1,864,380

13

01.01.2017 03:15

2,106,224

14

01.01.2017 03:30

2,416,920

Computing goal: Perform aggregate every 4 rows to calculate the group number and the sum. Below is part of the result table:

A

B

1

1

7,527,352

2

2

7,672,332

3

3

7,685,532

4

4

9,501,864

The key point here is to summarize table every N rows, rather than group it by a specific column and then do the summarization.

Just handle it according to the following steps:

1. Start esProc (Download esProc installation package and free DSK edition license HERE. You will be prompted to load the license file when you run esProc for the first time).

2. Write a script and execute it.

A

1

=file("D:/data.xls").xlsimport()

2

=A1.groups((#-1)\4+1;sum(#2))

3

=file("D:/new.xls").xlsexport(A2)

In this script, groups()function is used to group the table and perform aggregate. The number sign (#) represents a row number. #2 is column 2. (#-1)\4 means that the current row number subtracts 1 and then divided by 4 to get the quotient. The table is thus grouped according to whether or not the quotient is the same.

If the xls file is too large to fit into the memory, we can use the following script:

A

B

C

1

=file("D:\data.xls").xlsimport@c()

2

for A1,4*1000

=A2.groups((#-1)\4+1;sum(_2))

/ Process a batch of rows   each time (4000 rows)

3

=B2.run(#1=(#A2-1)*1000+#1)

/ Modify the sequence number

4

=file("D:\new.xls").xlsexport@a(B3)

/ Export result to a new   file

Note: Sequence number = batch number * 1000 + relative sequence number of the current batch.