Learn performance optimization skills from TPCH tests - Q12

 

I Query Requirement

Q12 queries the shipping mode and order priority, which is helpful for making decisions: whether choosing a cheaper shipping mode will result in consumers receiving goods more after the contract date, which has a negative impact on emergency priority orders.

Q12 is characterized by two table join query operations with grouping, sorting and aggregation operations.

II Oracle Execution

The query SQL written in Oracle is as follows:

 select  /*+ parallel(n) */

         l_shipmode,

         sum(case

                   when o_orderpriority = '1-URGENT'

                            or o_orderpriority = '2-HIGH'

                            then 1

                   else 0

         end) as high_line_count,

         sum(case

                   when o_orderpriority <> '1-URGENT'

                            and o_orderpriority <> '2-HIGH'

                            then 1

                   else 0

         end) as low_line_count

from

         orders,

         lineitem

where

         o_orderkey = l_orderkey

         and l_shipmode in ('TRUCK', 'MAIL')

         and l_commitdate < l_receiptdate

         and l_shipdate < l_commitdate

         and l_receiptdate >= date '1995-01-01'

         and l_receiptdate < date '1995-01-01' + interval '1' year

group by

         l_shipmode

order by

         l_shipmode;

Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.

Script execution time, Unit: seconds

Number of parallel

1

2

4

8

12

Oracle

441

341

264

180

184

III SPL Optimization

The optimization principle of JOIN between orders and lineitem primary-sub tables here is similar to that in Q3.

Notice that the condition of high_line_count is exactly the opposite of that of low_line_count in SQL, one of which can be calculated by the other with a simpler formula. Moreover, these two columns are only related to the orders table, so they can be calculated first and then joined with the lineitem, which leads to much less computation than JOIN first and calculation later, because the orders table is much smaller than the lineitem table.

 

The SPL script is as follows

 


A

1

=now()

2

1995-1-1

3

=elapse@y(A2, 1)

4

=["MAIL",   "TRUCK"]

5

=file("orders.ctx").open().cursor@m(O_ORDERKEY,O_ORDERPRIORITY).run(O_ORDERPRIORITY=if(O_ORDERPRIORITY=="1-URGENT"   || O_ORDERPRIORITY=="2-HIGH",1,0))

6

=file("lineitem.ctx").open().news(A5,L_SHIPMODE,L_RECEIPTDATE,L_SHIPDATE,L_COMMITDATE,O_ORDERPRIORITY;L_RECEIPTDATE>=A2   && L_RECEIPTDATE<A3 && A4.contain(L_SHIPMODE)).select(  L_COMMITDATE<L_RECEIPTDATE && L_SHIPDATE<L_COMMITDATE)

7

=A6.groups(L_SHIPMODE;sum(O_ORDERPRIORITY):high_line_count,   sum(1-O_ORDERPRIORITY):low_line_count)

8

return interval@ms(A1,now())

 

Script execution time, Unit: seconds

Number of parallel

1

2

4

8

12

Oracle

441

341

264

180

184

SPL composite table

200

100

50

28

20