Get Intersection of Two Subsets in Excel

 

We have an Excel table, as shown below:

A

B

1

1

6

2

5

6

3

1

3

4

5

45

5

3

3

6

2

4

7

2

23

8

3

2

9

2

2

10

3

25

11

4

23

12

2

45

13

4

12

14

1

11

15

5

3

16

1

4

17

6

3

18

7

5

19

8

5

20

9

5

Column A contains ids. Column B contains values. The computing goal is to calculate the intersection of values corresponding to every two ids and get the largest value. For instance, values corresponding to id 1 are 6, 3, 11 and 4, and those corresponding to id 2 are 4, 23 and 2, and the largest value of their intersection is 4. The expected final result is as below:

undefined

Directions:

1. Start esProc (Download esProc installation package and free DSK edition license HERE. You will be prompted to load the license file when you run esProc for the first time).

2. Open Excel and load add-in file ExcelRaq.xll through Option –> add-ins in Excel and click Go button on the dialog. The add-in file is located in [esProc installation directory \bin]. Related information is easy to get if you don’t know how to load the add-in.

3. Write script book1.dfx in esProc:

A

B

1

if arg2==arg3

>val=null

2

else

=arg1.select(~(1)==arg2).(~(2))

3

=arg1.select(~(1)==arg3).(~(2))

4

>val=(B2^B3).max()

5

return val

A1:B1 If ids in two rows are same, return null;

A2:B4 If ids in two rows are different, get corresponding values for each, calculate intersection and get the largest value;

arg1 is the first parameter passed in from Excel. Its valid data area is A1:B20, which is a two-dimensional array. arg2 is the second parameter passed in from Excel, which is an id; arg3 is the third parameter passed in from Excel, which is another id. We can define arguments in esProc in Program->Parameter:

undefined

4. In Excel, select H4 and enter expression =esproc("book1",A1:B20,H2,D4) in it(as shown in the above). Press Enter to view the result.