"[url] We know that the foreign key of fact table does not always have corresponding dimension ta .."

mars RaqForum 25 No.
368 View • 2 Years ago

Performance Optimization - 6.5 [Foreign key association] Inner join syntax

Performance Optimization - 6.4 [Foreign key association] Time key

We know that the foreign key of fact table does not always have corresponding dimension table record, and there may be invalid value. A common practice is: if no corresponding dimension table record is found for the foreign key, delete the fact table record; if a corresponding record is found, perform addressization and association, which is called inner join.

SPL provides the inner join syntax for cursors:

	A
1	=file(“product.btx”).import@b().keys@i(id)
2	=file(“orders.btx”).cursor@b(p_id,quantity)
3	=A2.switch@i(p_id,A1)
4	=A3.groups(p_id.vendor;sum(p_id.price*quantity))

The switch@i() will delete the fact table record that cannot be associated.

The join() also has the @i option, which means the same thing.

The switch@i() will first fetch the record from the cursor and then judge association. Even if it cannot associate, the record is already generated. Reviewing the pre-cursor filtering discussed in Chapter 4 tells us that to obtain better performance, it needs to perform the association judgement before generating record.

SPL provides the mechanism for the composite table cursor:

	A
1	=file(“product.btx”).import@b().keys@i(id)
2	=file(“orders.ctx”).open().cursor(p_id,quantity;p_id:A1)
3	=A2.groups(p_id.vendor;sum(p_id.price*quantity))

Having written the association relationship in the filter condition parameter of composite table cursor, SPL will determine whether it can be associated before generating records. If it can, addressization conversion will be done at the same time.

The reason why it is similar to the join()function is that when there are multiple associated fields, it is not clear which field to be addressized, so it is not appropriate to use this syntax. SPL provides a more general fjoin() function, which can implement a solution with richer addressization and simultaneous filtering. However, these contents are no longer related to algorithm principle, so we will not explain the detail here. Interested readers can refer to the information on the SPL forum.

If it only needs to determine whether there exists association without performing foreign key addressization, using the conventional conditional syntax will be OK: