Performance Optimization - 6.5 [Foreign key association] Inner join syntax
Performance Optimization - 6.4 [Foreign key association] Time key
We know that the foreign key of fact table does not always have corresponding dimension table record, and there may be invalid value. A common practice is: if no corresponding dimension table record is found for the foreign key, delete the fact table record; if a corresponding record is found, perform addressization and association, which is called inner join.
SPL provides the inner join syntax for cursors:
A | |
---|---|
1 | =file(“product.btx”).import@b().keys@i(id) |
2 | =file(“orders.btx”).cursor@b(p_id,quantity) |
3 | =A2.switch@i(p_id,A1) |
4 | =A3.groups(p_id.vendor;sum(p_id.price*quantity)) |
The switch@i() will delete the fact table record that cannot be associated.
The join() also has the @i option, which means the same thing.
The switch@i() will first fetch the record from the cursor and then judge association. Even if it cannot associate, the record is already generated. Reviewing the pre-cursor filtering discussed in Chapter 4 tells us that to obtain better performance, it needs to perform the association judgement before generating record.
SPL provides the mechanism for the composite table cursor:
A | |
---|---|
1 | =file(“product.btx”).import@b().keys@i(id) |
2 | =file(“orders.ctx”).open().cursor(p_id,quantity;p_id:A1) |
3 | =A2.groups(p_id.vendor;sum(p_id.price*quantity)) |
Having written the association relationship in the filter condition parameter of composite table cursor, SPL will determine whether it can be associated before generating records. If it can, addressization conversion will be done at the same time.
The reason why it is similar to the join()function is that when there are multiple associated fields, it is not clear which field to be addressized, so it is not appropriate to use this syntax. SPL provides a more general fjoin() function, which can implement a solution with richer addressization and simultaneous filtering. However, these contents are no longer related to algorithm principle, so we will not explain the detail here. Interested readers can refer to the information on the SPL forum.
If it only needs to determine whether there exists association without performing foreign key addressization, using the conventional conditional syntax will be OK:
A | |
---|---|
1 | =file(“product.btx”).import@b().keys@i(id) |
2 | =file(“orders.ctx”).open().cursob(quantity;A1.find(p_id)) |
3 | =A2.total(sum(quantity)) |
This code can also handle multiple associated fields.
Performance Optimization - 6.6 [Foreign key association] Index reuse
Performance Optimization - Preface
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL