Align Data with Duplicates in Specified Order

Example

There is a data table in data.xlsx, where some index numbers are stored in Header1 field, but the index numbers may have duplicate values. As shown in the figure below, there are two 101s in the red box. The initial table is as follows:

..

Another file target.xlsx has the following data:

..

The task is to sort the data in data.xlsx in the order of MyTarget, where the first 101 represents the first record where Header1 is 101, the second 101 is the second record, and the third 101 cannot find a third record, so the corresponding row is blank. The final result is shown in the figure below:

..

Write SPL script:


A

1

=T("e:/work/data.xlsx").derive(key)

2

=T("e:/work/target.xlsx").derive(key)

3

=A1.group(Header1).run(~.run(key=Header1/"_"/#))

4

=A2.group(MyTarget).run(~.run(key=MyTarget/"_"/#))

5

=A1.align(A2:key,key)

6

=A5.new(A2(#).MyTarget:Header1,~.Header2,~.Header3)

7

=T("e:/work/data2.xlsx",A6)

A1 Read the data of data.xlsx and add a column of key to generate the key values used for alignment

 

A2 Read the data of target.xlsx and add a column of key to generate the key values used for alignment

 

A3 Group A1 by Header1, loop through each group, then loop through each record in the group, and set the value of key as Header1+an underscore+its corresponding sequence number in the group

 

A4 Same as A3

 

A5 Align the keys in A1 in the order of keys in A2

 

A6 Use A5 to create a new data set, use the MyTarget value of the corresponding row number # in A2 as the value of the new column Header1, and then take out the Header2 and Header3 in the original A5

 

A7 Store the results in A6 to data2.xlsx