Learn performance optimization skills from TPCH tests - Q12
I Query Requirement
Q12 queries the shipping mode and order priority, which is helpful for making decisions: whether choosing a cheaper shipping mode will result in consumers receiving goods more after the contract date, which has a negative impact on emergency priority orders.
Q12 is characterized by two table join query operations with grouping, sorting and aggregation operations.
II Oracle Execution
The query SQL written in Oracle is as follows:
select /*+ parallel(n) */
l_shipmode,
sum(case
when o_orderpriority = '1-URGENT'
or o_orderpriority = '2-HIGH'
then 1
else 0
end) as high_line_count,
sum(case
when o_orderpriority <> '1-URGENT'
and o_orderpriority <> '2-HIGH'
then 1
else 0
end) as low_line_count
from
orders,
lineitem
where
o_orderkey = l_orderkey
and l_shipmode in ('TRUCK', 'MAIL')
and l_commitdate < l_receiptdate
and l_shipdate < l_commitdate
and l_receiptdate >= date '1995-01-01'
and l_receiptdate < date '1995-01-01' + interval '1' year
group by
l_shipmode
order by
l_shipmode;
Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.
Script execution time, Unit: seconds
Number of parallel |
1 |
2 |
4 |
8 |
12 |
Oracle |
441 |
341 |
264 |
180 |
184 |
III SPL Optimization
The optimization principle of JOIN between orders and lineitem primary-sub tables here is similar to that in Q3.
Notice that the condition of high_line_count is exactly the opposite of that of low_line_count in SQL, one of which can be calculated by the other with a simpler formula. Moreover, these two columns are only related to the orders table, so they can be calculated first and then joined with the lineitem, which leads to much less computation than JOIN first and calculation later, because the orders table is much smaller than the lineitem table.
The SPL script is as follows:
A |
|
1 |
=now() |
2 |
1995-1-1 |
3 |
=elapse@y(A2, 1) |
4 |
=["MAIL", "TRUCK"] |
5 |
=file("orders.ctx").open().cursor@m(O_ORDERKEY,O_ORDERPRIORITY).run(O_ORDERPRIORITY=if(O_ORDERPRIORITY=="1-URGENT" || O_ORDERPRIORITY=="2-HIGH",1,0)) |
6 |
=file("lineitem.ctx").open().news(A5,L_SHIPMODE,L_RECEIPTDATE,L_SHIPDATE,L_COMMITDATE,O_ORDERPRIORITY;L_RECEIPTDATE>=A2 && L_RECEIPTDATE<A3 && A4.contain(L_SHIPMODE)).select( L_COMMITDATE<L_RECEIPTDATE && L_SHIPDATE<L_COMMITDATE) |
7 |
=A6.groups(L_SHIPMODE;sum(O_ORDERPRIORITY):high_line_count, sum(1-O_ORDERPRIORITY):low_line_count) |
8 |
return interval@ms(A1,now()) |
Script execution time, Unit: seconds
Number of parallel |
1 |
2 |
4 |
8 |
12 |
Oracle |
441 |
341 |
264 |
180 |
184 |
SPL composite table |
200 |
100 |
50 |
28 |
20 |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese Version