Inner Join of an EXCEPT Subquery on Column Names That Do Not Appear in the Output

Question

Source:https://stackoverflow.com/questions/70510156/inner-join-of-an-except-subquery-on-column-names-that-do-not-appear-in-the-outpu

I have a table of students and classes. I would like to find what classes were dropped from one semester to another (and a similar query for classes added).

Student Class Semester

==============================

Alice English 11

Alice Geometry 11

Alice English 12

Bob Spanish 11

Bob Spanish 12

My approach is to use an except (same as minus):

select distinct Class

from table

where table.Student = 'Alice'

and table.Semester = 11

except

select distinct Class

from table

where table.Student = 'Alice'

and table.Semester = 12

This works correctly and returns Geometry. However, I need to use this as a subquery like this:

select Student, string_agg(X.Class, ',') as 'Deleted_Classes',

count(X) as 'Num_deleted',

SemesterTable.Semester as semester,

lag(Semester, 1)

over (partition by StudentTable.Student

order by SemesterTable.Semester) as Prev_Semester,

from

StudentTable

SemesterTable

inner join (

<<<Same query from above>>>

) X on _______

where X.Num_deleted > 0

My problems is with the____section - inner joins can only be joined on columns that appear in the output. But, my except query doesn't return values of previous and current semesters (it might even return nothing at all if no classes were dropped). So how do I join the subquery into the main table? My desired output is:

Student Semester Prev Semester Deleted_Classes

========================================================

Alice 12 11 Geometry

Alice appears because she had a change in her schedule, but Bobis omitted because there was no change in his schedule.

Answer

Group rows ordered by Student and Semester according to Student, and in each group, group rows by Semester and perform inter-row calculations. The difference between the set of Classes in the spring semester and that in the fall semester is the newly added classes, and the difference between the fall semester and the spring semester contains the cancelled classes. It is a hassle to code this in SQL because you will need the window function, CROSSAPPLY and OUTERAPPLY. The SQL statement will be lengthy and difficult to understand. An alternative is to move data out of database and process it in Python or SPL. SPL, the open-source Java package, is easy to be integrated into a Java program and generate simple code. It can get this done with only two lines of code

A

1

=MSSQL.query@x("select * from Classes order by 1,3")

2

=A1.group@o(#1).conj(~.group@o(#1,#3;~.(#2)).new(Student,#2[+1]:Semester,#2:Prev_Semester,(#3\#3[+1]).concat@c():Deleted_Classes,(#3[+1]\#3).concat@c():Added_Classes).m(:-2))

 

View SPL source code.