Number Records by Category and Intra-group with Sequence Numbers

Problem description & analysis

data.xlsx records unordered course-related data. Below is part of the file:

A

B

C

1

Course

Date

Time

2

Word

1-Sep-20

9:00

3

Word

1-Sep-20

9:00

4

PowerPoint

1-Sep-20

9:00

5

Word

1-Sep-20

12:00

6

PowerPoint

1-Sep-20

12:00

7

Excel

1-Sep-20

12:00

8

Word

1-Sep-20

12:00

We want to add a new ID field BatchID that gives same values for records having same Course, Date and Time values. Each BatchID is the first three letters of the corresponding Course value plus a sequence number. Data will first be grouped by Course and then each group will be divided into subgroups by Date and Time. The sequence number in each BatchID value is that of the corresponding subgroup in each group.

A

B

C

D

1

Course

Date

Time

Batch ID

2

Word

1-Sep-20

9:00

Wor001

3

Word

1-Sep-20

9:00

Wor001

4

PowerPoint

1-Sep-20

9:00

Pow001

5

Word

1-Sep-20

12:00

Wor002

6

PowerPoint

1-Sep-20

12:00

Pow002

7

Excel

1-Sep-20

12:00

Exc001

8

Word

1-Sep-20

12:00

Wor002

 

The task involves computations after the multilevel grouping and the use of intra-group sequence numbers.

Solution

We write the following script in esProc:

A

1

=file("040/data.xlsx").xlsimport@t()

2

=A1.group(Course).(~.group(Date,Time))

3

=A2.conj(~.news(~;Course,Date,Time,left(Course,3)/string(A2.~.#,"000"):'Batch     ID'))

4

=file("040/result.xlsx").xlsexport@t(A3)

A1,A4: Import the original Excel file and export the result Excel file.

A2: Group A1 by Course and then each group by Date and Time.

A3: Generate new records according to subgroups of each group and value of the new column BatchID; concatenate records of subgroups and then their parent level groups. Expression A2.~.# represents the sequence number of each subgroup in its parent group.

The algorithm can generate eligible BatchID but with shuffled records. To maintain the original order of the records, we can add an index column before records in each group and sort records by the indexes after concatenation.

Q & A Collection

https://stackoverflow.com/questions/63899978/excel2016-generate-id-based-on-multiple-criteria-no-vba