Learn performance optimization skills from TPCH tests - Q18

 

I.   Query Requirement

  Q18 queries to obtain the information of suppliers that can supply more parts than the specified quantities, which can be used to testify whether there are sufficient suppliers for large orders and urgent tasks.

  Q18 is characterized by three tables join operations with grouping, sorting, aggregation and IN sub-query. Query statements do not grammatically limit how many tuples are returned, but the TPC-H standard stipulates that only the first 100 rows of the query results are returned (usually depending on the application).

II.  Oracle Execution

  The query SQL written in Oracle is as follows:

    select * from (
	select  /*+ parallel(n) */
	    c_name,c_custkey,o_orderkey,o_orderdate,o_totalprice,
	    sum(l_quantity)
	from
	    customer,orders,lineitem
	where
	    o_orderkey in (
		select
		    l_orderkey
		from
		    lineitem
		group by
		    l_orderkey
                having
		    sum(l_quantity) > 314
	    )
	    and c_custkey = o_custkey
	    and o_orderkey = l_orderkey
	group by
	    c_name,
	    c_custkey,
	    o_orderkey,
	    o_orderdate,
	    o_totalprice
	order by
	    o_totalprice desc,
	    o_orderdate
    ) where rownum<=100;

  Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.

  Script execution time, Unit: seconds

Number of parallel

1

2

4

8

12

Oracle

1248

739

533

412

344

III. SPL Optimization

  Analyze this query, if we name the following sub-query

    select l_orderkey,sum(l_quantity) lq
    from lineitem
    group by l_orderkey

  as view V, the main body of the original query is equivalent to

    select  /*+ parallel(n) */
	c_name,c_custkey,o_orderkey,o_orderdate,o_totalprice,
	sum(lq)
    from
	customer,orders,lo
    where
	c_custkey = o_custkey
	and o_orderkey = l_orderkey
	and lq>314
    group by
	c_name,
	c_custkey,
	o_orderkey,
	o_orderdate,
	o_totalprice

 

  This is a JOIN operation between the ORDERS table, which has a foreign key association, and its homo-dimension table LO. We know that the lineitem table is a sub-table of ORDERS and ordered to l_orderkey as well, and the LO table calculated by lineitem can still ensure ordered to l_orderkey. Thus it can perform high-speed merge join with ORDERS.

  CUSTOMER table, as a foreign key table, can join with ORDERS in the result and may not participate in the previous operation.

 

  The SPL script is as follows


A

1

=now()

2

>quantity=314

3

=file("lineitem.ctx").open().cursor@m(L_ORDERKEY,L_QUANTITY)

4

=A3.group@s(L_ORDERKEY;sum(L_QUANTITY):quantities).select(quantities>quantity).fetch()

5

=file("orders.ctx").open()

6

=A4.joinx@q(L_ORDERKEY,A5:O_ORDERKEY,O_CUSTKEY,O_TOTALPRICE,O_ORDERDATE)

7

=A6.total(top(100;[-O_TOTALPRICE,O_ORDERDATE])).derive@o().keys@i(O_CUSTKEY)

8

=file("customer.ctx").open().cursor@m(C_CUSTKEY,C_NAME;A7.find(C_CUSTKEY)).fetch().keys@i(C_CUSTKEY)

9

=A7.switch(O_CUSTKEY,A8)

10

=A9.new(O_CUSTKEY.C_NAME:c_name,O_CUSTKEY.C_CUSTKEY:c_cuskey,L_ORDERKEY:o_orderkey,O_ORDERDATE,O_TOTALPRICE,quantities)

11

return interval@ms(A1,now())

 

  Script execution time, Unit: seconds

Number of parallel

1

2

4

8

12

Oracle

1248

739

533

412

344

SPL composite table

152

76

38

21

16