Get Intersection of Two Subsets in Excel
We have an Excel table, as shown below:
A 
B 

1 
1 
6 
2 
5 
6 
3 
1 
3 
4 
5 
45 
5 
3 
3 
6 
2 
4 
7 
2 
23 
8 
3 
2 
9 
2 
2 
10 
3 
25 
11 
4 
23 
12 
2 
45 
13 
4 
12 
14 
1 
11 
15 
5 
3 
16 
1 
4 
17 
6 
3 
18 
7 
5 
19 
8 
5 
20 
9 
5 
Column A contains ids. Column B contains values. The computing goal is to calculate the intersection of values corresponding to every two ids and get the largest value. For instance, values corresponding to id 1 are 6, 3, 11 and 4, and those corresponding to id 2 are 4, 23 and 2, and the largest value of their intersection is 4. The expected final result is as below:
Directions:
1. Start esProc (Download esProc installation package and free DSK edition license HERE. You will be prompted to load the license file when you run esProc for the first time).
2. Open Excel and load addin file ExcelRaq.xll through Option –> addins in Excel and click Go button on the dialog. The addin file is located in [esProc installation directory \bin]. Related information is easy to get if you don’t know how to load the addin.
3. Write script book1.dfx in esProc:
A 
B 

1 
if arg2==arg3 
>val=null 
2 
else 
=arg1.select(~(1)==arg2).(~(2)) 
3 
=arg1.select(~(1)==arg3).(~(2)) 

4 
>val=(B2^B3).max() 

5 
return val 
A1:B1 If ids in two rows are same, return null;
A2:B4 If ids in two rows are different, get corresponding values for each, calculate intersection and get the largest value;
arg1 is the first parameter passed in from Excel. Its valid data area is A1:B20, which is a twodimensional array. arg2 is the second parameter passed in from Excel, which is an id; arg3 is the third parameter passed in from Excel, which is another id. We can define arguments in esProc in Program>Parameter:
4. In Excel, select H4 and enter expression =esproc("book1",A1:B20,H2,D4) in it(as shown in the above). Press Enter to view the result.