Calculate the Intersection, Union, and Difference on Uncertain Number of Sets

 

Example

There are multiple Excel files: Book1.xlsx, Book2.xlsx, Book3.xlsx,..., and they are all named as Book[N].xlsx, where N is a number starting from 1. The data in the files are as follows:

Book1.xlsx

team

member1

member2

member3

member4

member5

2

Nicholas

Kaitlyn

Hailey

David

Christopher

1

Alexis

Cole

Justin

Benjamin

Natalie

5

Logan

Dylan

Elizabeth

Jose

John

3

Isabella

Nicholas

Angela

Daniel

Alexander

4

Madison

Nicholas

Luis

Hannah

John

Book2.xlsx

team

member1

member2

member3

member4

member5

4

Madison

Nicholas

Luis

Hannah

John

8

Morgan

Jose

Joseph

Cameron

Destiny

9

Tyler

Jessica

Elizabeth

Alyssa

Destiny

6

Robert

John

Brandon

Jacob

Hailey

7

Abigail

Sarah

Hailey

Sydney

Joseph

5

Logan

Dylan

Elizabeth

Jose

John

10

John

Victoria

Brandon

Victoria

Katherine

Book3.xlsx

team

member1

member2

member3

member4

member5

10

John

Victoria

Brandon

Victoria

Katherine

11

Dylan

Ian

Jose

Antony

Rebecca

12

Nathan

Austin

Logan

Michael

Kaitlyn

13

Jennifer

Matthew

Samantha

Noah

Olivia

9

Tyler

Jessica

Elizabeth

Alyssa

Destiny

8

Morgan

Jose

Joseph

Cameron

Destiny

Book4.xlsx

...

The task is to calculate the intersection, union, and difference of all the data in Book[N].xlsx.

 

Write SPL script:


A

1

=directory("book*.xlsx")

2

=A1.(file(~).xlsimport@t())

3

=A2.merge@ou(team)

4

=A2.merge@oi(team)

5

=A2.merge@od(team)

A1 List all file names of Book[N].xlsx in order of the file name

A2 Read the data in each Excel file in turn, and create a set with multiple pieces of data

A3 Union multiple pieces of data by the team column, and the results are as follows:

team

member1

member2

member3

member4

member5

2

Nicholas

Kaitlyn

Hailey

David

Christopher

1

Alexis

Cole

Justin

Benjamin

Natalie

5

Logan

Dylan

Elizabeth

Jose

John

3

Isabella

Nicholas

Angela

Daniel

Alexander

4

Madison

Nicholas

Luis

Hannah

John

8

Morgan

Jose

Joseph

Cameron

Destiny

9

Tyler

Jessica

Elizabeth

Alyssa

Destiny

6

Robert

John

Brandon

Jacob

Hailey

7

Abigail

Sarah

Hailey

Sydney

Joseph

10

John

Victoria

Brandon

Victoria

Katherine

11

Dylan

Ian

Jose

Antony

Rebecca

12

Nathan

Austin

Logan

Michael

Kaitlyn

13

Jennifer

Matthew

Samantha

Noah

Olivia

15

Samantha

Ian

Katherine

Alexander

Joshua

14

Abigail

Antony

Hailey

Rachel

William

A4 Intersect multiple pieces of data by the team column, but the result is empty.

A5 Calculate the difference set over multiple pieces of data by the team column, and the results are as follows:

team

member1

member2

member3

member4

member5

2

Nicholas

Kaitlyn

Hailey

David

Christopher

1

Alexis

Cole

Justin

Benjamin

Natalie

3

Isabella

Nicholas

Angela

Daniel

Alexander