Alignment Operations between Ordered Sets
【Abstract】
We can perform alignment operations on two ordered sets. Such as comparing the sizes of two sets, checking whether two sets are equal, etc. Read this article to find ways of handling these operations conveniently and efficiently and their esProc sample scripts.
The alignment operations over ordered sets include comparison operators (>, <, ==, etc.) and basic arithmetic operations (represented by +, , *, /, %, \). The sets under discussion are ordered sets whose members are already sorted out. The alignment operations on two ordered sets compare their members or do arithmetic over members in corresponding positions in order.
1. Size comparison
We can use operators ">" and "<" to compare corresponding members of two sets from the first pair in order. To compare [1,3,1] and [1,2,2], for instance, we begin from the first pair of members, [1,1]. They are equal, and then go on to compare the second pair. Since 3>2, there is no need to compare other members and we get [1,3,1] > [1,2,2].
【Example 1】 Based on the Olympic Medal table, find the Games when China is before Russia in the table. Below is part of the table:
Game 
Nation 
Medal 
Game 
30 
USA 
46,29,29 
30 
30 
China 
38,27,23 
30 
30 
UK 
29,17,19 
30 
30 
Russia 
24,26,32 
30 
30 
Korea 
13,8,7 
30 
… 
… 
… 
… 
According to the Olympic rule of ranking, nations are ranked in descending order by the number of gold medals they have won. If nations have same number of gold medals, they will be ranked by the number of silver medals in descending order; and then by the number of copper medals for nations having same total silver medals. Nations getting same total copper medals come equal.
【SPL script】
A 
B 

1 
=file("Olympic.csv").import@cqt() 
/ Import the Olympic medal table for all games 
2 
=A1.run(Medal=Medal.split@c()) 
/ Split each Medal value into a set of medal totals by comma 
3 
=A2.group(Game) 
/ Group A2’s records by game 
4 
=A3.select(~.select(Nation=="China").Medal>~.select(Nation=="Russia").Medal) 
/ Use ">" to compare China’s medal total set and Russia’s in the order of gold, silver and copper, and select the games when China goes ahead 
5 
=A4.(Game) 
/ List the eligible games 
A5’s result:
Game 
23 
25 
28 
29 
30 
2. Equality comparison
It’s common to compare two ordered sets to find if they are equal. Specific scenarios include the comparison of file content or table values, and others (compare [1,2,3] and [2,1,3] to see if they are equal, for instance). Generally two sets are regarded as unequal if any members in the corresponding positions are not consistent. But in certain occasions the order is unimportant and the greatest concern is to find if both sets have same members.
【Example 2】 The following file stores random samples. We want to find if two random samples get same IDs. Below is part of the source data:
ID 
Predicted_Y 
Original_Y 
10 
0.012388464367608093 
0.0 
11 
0.01519899123978988 
0.0 
13 
0.0007920238885061248 
0.0 
19 
0.0012656367468159102 
0.0 
21 
0.009460545997473379 
0.0 
23 
0.024176791871681664 
0.0 
… 
… 
… 
【SPL script】
A 
B 

1 
=file("p_old.csv").import@ct() 
/ Import the first file exported 
2 
=file("p_new.csv").import@ct() 
/ Import the second file exported 
3 
=cmp(A1.(ID),A2.(ID)) 
/Use cmp() function to compare if the IDs in the two files are equal (the corresponding member values are equal and their orders are consistent) 
A3’s result:
Member 
0 
The result is 0, which means the two files contain completely same IDs.
If the order of IDs are different, we can use eq() function to find whether the two sets have same members.
A 
B 

3 
=A1.(ID).eq(A2.(ID)) 
/Use eq() function to compare if the two files contain same IDs whose order can be different 
3. Basic arithmetic operations on sets
The alignment operations include the basic arithmetic calculations (+,,*,/,%,\). For instance, the set of sales amounts of store A in 3 days is [2,3,4] and that of sales amount of store B is [3,1,3], so the set of total sales amounts of the two stores is [5,4,7].
【Example 3】 Calculate the daily relative yield between SZSE 300 (399007) and SZSE Component Index (399001) during the period of December 24 26, 2019. Below is part of the source data:
Date 
Code 
Name 
Open 
Close 
Amount 
2020/2/18 
399001 
Shenzhen 
11244.7651 
11306.4863 
3.19E+11 
2020/2/17 
399001 
Shenzhen 
10974.9328 
11241.4993 
3.12E+11 
2020/2/14 
399001 
Shenzhen 
10854.4551 
10916.3117 
2.77E+11 
2020/2/13 
399001 
Shenzhen 
10936.5011 
10864.3222 
2.87E+11 
2020/2/12 
399001 
Shenzhen 
10735.0475 
10940.7952 
2.66E+11 
… 
… 
… 
… 
… 
… 
【SPL script】
A 
B 

1 
=connect("db") 
/Connect to data sourece 
2 
=["399007","399001"].(A1.query("select * from StockIndex where code=? and date between'20191223'and'20191226'",~)) 
/ Import data of SZSE 300 and SZSE Component Index from Dec. 2326, 2019; the data of Dec. 23 is for calculating the growth rate 
3 
=A2.(~.calc(to(2,4),Close/Close[1])) 
/ Calculate the growth rate for each of the 3 days 
4 
=A3(1)A3(2) 
/ Use "" to perform alignment subtraction between the two set to get the relative yields; the sign ?? (?∈{+,,*,/,%,\}) is used to perform alignment operations between sets 
A4’s result:
Member 
0.0031349096521252617 
0.0011897141619391371 
4.4910504685946595E4 
Find more examples in SPL CookBook.
Chinese version