How to Perform Different Types of JOINs on Two Excel Sheets

An Excel file contains 3 types of sheets. Sheet A is the base table, which contains data in the following layout:

A

B

C

1

interval1

interval2

interval3

2

1 hour

1 day

1 week

3

2 hours

2 days

2 weeks

4

3 hours

3 days

3 weeks

5

4 hours

4 days

4 weeks

Sheet B1\B2…Bn are same-structure associated tables. They all have columns interval1, interval2 and interval3. Below is part of the data in one of the sheet:

A

B

C

D

E

F

G

1

interval1

interval2

interval3

Type

value1

value2

value3

2

2 hours

1 day

7 week

circle

37

108.1

4.1

3

3 hours

3 days

7 weeks

Line

39

117.5

4.2

4

4 hours

4 days

7 weeks

Line

35

127

4.3

Sheet C describes the type of JOIN relationship between sheet A and sheet B1\B2..Bn. There are three types of JOIN. crossJoin denotes Cartesian product; leftJoinBig means left join through interval1; and leftJoinSmall is left join through interval1 and interval2. Below is part of data in sheet C:

A

B

1

table

joinType

2

B1

crossJoin

3

B2

leftJoinBig

4

B3

leftJoinSmall

Task: Join sheet A and sheet B1\B2..Bn according to the JOIN types in sheet C. That is, get interval1 column from sheet A and other columns from sheet B to generate n Excel files.

Take sheet B as an example (actually sheet Bs are different from each other). If joinType==crossJoin, the join result will be:

A

B

C

D

E

F

G

1

interval1

interval2

interval3

Type

value1

value2

value3

2

1 hour

1 day

7 week

Circle

37

108.1

4.1

3

1 hour

3 days

7 weeks

Line

39

117.5

4.2

4

1 hour

4 days

7 weeks

Line

35

127

4.3

5

2 hours

1 day

7 week

circle

37

108.1

4.1

6

2 hours

3 days

7 weeks

Line

39

117.5

4.2

7

2 hours

4 days

7 weeks

Line

35

127

4.3

8

3 hours

1 day

7 week

circle

37

108.1

4.1

9

3 hours

3 days

7 weeks

Line

39

117.5

4.2

10

3 hours

4 days

7 weeks

Line

35

127

4.3

11

4 hours

1 day

7 week

circle

37

108.1

4.1

12

4 hours

3 days

7 weeks

Line

39

117.5

4.2

13

4 hours

4 days

7 weeks

Line

35

127

4.3

If joinType==leftJoinBig, the join result should be:

A

B

C

D

E

F

G

1

interval1

interval2

interval3

Type

value1

value2

value3

2

1 hour

3

2 hours

1 day

7 week

circle

37

108.1

4.1

4

3 hours

3 days

7 weeks

line

39

117.5

4.2

5

4 hours

4 days

7 weeks

line

35

127

4.3

If joinType==leftJoinSmall, the join result is as follows:

A

B

C

D

E

F

G

1

interval1

interval2

interval3

type

value1

value2

value3

2

1 hour

3

2 hours

4

3 hours

3 days

7 weeks

line

39

117.5

4.2

5

4 hours

4 days

7 weeks

line

35

127

4.3

To implement the three types of JOINs between sheet A and different sheet Bs, we need to traverse through sheet C. This needs a script instead of a formula. A JOIN is a structured computation. But VBA lacks corresponding function to do this directly. The solution code will be rather complicated.

Directions to get it done with esProc:

1. Start esProc (Download esProc installation package and free DSK edition license HERE. You will be prompted to load the license file when you run esProc for the first time).

2. Write and execute script in esProc:

A

B

1

=file("data.xlsx").xlsopen()

2

=A1.xlsimport@t(;"C")

3

=tableA=A1.xlsimport@t(;"A")

4

for A2

=tableB=A1.xlsimport@t(;A4.table)

5

=case(A4.joinType,
   "crossJoin",xjoin(tableA:A;tableB:B),  
   "leftJoinBig",xjoin@1(tableA:A;tableB:B,A.interval1==interval1),  
   "leftJoinSmall",xjoin@1(tableA:A;tableB:B,A.interval1==interval1    && A.interval2==interval2))

6

=B5.new(A.interval1,B.interval2,B.interval3,B.type,B.value1,B.value2,B.value3)

7

=file(A4.table+A4.joinType+".xlsx").xlsexport@t(B6)

The script function case checks the JOIN type using branching statement. xjoin() function calculates Cartesian product; and @1 option denotes a left join.