From SQL to SPL: Calculate the number of intersections between adjacent subsets after grouping

A certain database table records the execution status of a project, with multiple people participating in the project every day, and one person can participate in multiple tasks of the project in one day.

EMP_ID

EMP_DATE

EMP_TASK

A1

04-01-2024

345

A2

04-01-2024

546

A3

04-01-2024

232

A4

04-01-2024

8000

A5

04-01-2024

2344

A1

04-02-2024

456

A2

04-02-2024

9280

A3

04-02-2024

324

A2

04-02-2024

754

A8

04-02-2024

75

A2

04-03-2024

400

A3

04-03-2024

234

A3

04-04-2024

100

Now we need to calculate: how many people have also participated in the project the previous day for every day. That is, after grouping by date, calculate the number of intersections between daily and previous day's personnel. The first day is special, assuming that everyone participated in the project the previous day.

EMP_DATE

COUNT

04-01-2024

5

04-02-2024

3

04-03-2024

2

04-04-2024

1

We can first group by date, and then perform intersection operations on the grouped subsets, which requires retaining the subsets after grouping. But after SQL grouping, it must aggregate immediately, and subsets cannot be retained, and calculating intersection is also impossible. It needs to do it another way, to group by person first and determine whether each person appeared on a certain date and the previous day, and then group and aggregate these dates, it involves multiple layers of nesting and window functions, which is very troublesome.

After grouping in SPL, subsets can be retained and adjacent subsets can be referenced. Code can be written directly according to the idea.


 A

1

=dbConn.query("select distinct EMP_ID,EMP_DATE from tb")

2

=A1.group(EMP_DATE)

3

=A2.new(EMP_DATE,if(#==1,~.len(),(~.(EMP_ID)^~[-1].(EMP_ID)).len()):COUNT)

A1 Load data from the database and deduplicate EMP-ID.

A2 Group by date, but do not aggregate.

A3 Create a new two-dimensional table based on the grouping results. If the current group is Group 1, directly return the number of members in the group; If it is not the first group, then calculate the intersection of the EMP-IDs of the current group and the previous group, and then calculate the number of members. ^ is used for calculating intersection, [-1] represents the previous group.

Question sourcehttps://stackoverflow.com/questions/78324132/how-do-i-perform-recursive-search-in-oracle