Learn performance optimization skills from TPCH tests - Q19

 

I.   Query Requirement

  Q19 queries the total discounted revenue for three different types of parts that are shipped by air or delivered in person. Parts are selected based on specific brands, containers and size range.

  Q19 is characterized by three table join operations with aggregation and IN sub-query.

II.  Oracle Execution

  The query SQL written in Oracle is as follows:

    select  /*+ parallel(n) */
	sum(l_extendedprice * (1 - l_discount)) as revenue
    from
	lineitem,part
    where
	(
	    p_partkey = l_partkey
	    and p_brand = 'Brand#32'
	    and p_container in ('SM CASE', 'SM BOX', 'SM PACK', 'SM PKG')
	    and l_quantity >= 7 and l_quantity <= 7 + 10
	    and p_size between 1 and 5
	    and l_shipmode in ('AIR', 'AIR REG')
	    and l_shipinstruct = 'DELIVER IN PERSON'
	)
	or
	(
	    p_partkey = l_partkey
	    and p_brand = 'Brand#23'
	    and p_container in ('MED BAG', 'MED BOX', 'MED PKG', 'MED PACK')
	    and l_quantity >= 18 and l_quantity <= 18 + 10
	    and p_size between 1 and 10
	    and l_shipmode in ('AIR', 'AIR REG')
	    and l_shipinstruct = 'DELIVER IN PERSON'
	)
	or
	(
	    p_partkey = l_partkey
	    and p_brand = 'Brand#45'
	    and p_container in ('LG CASE', 'LG BOX', 'LG PACK', 'LG PKG')
	    and l_quantity >= 22 and l_quantity <= 22 + 10
	    and p_size between 1 and 15
	    and l_shipmode in ('AIR', 'AIR REG')
	    and l_shipinstruct = 'DELIVER IN PERSON'
	);

  Where /*+ parallel(n) */ is the parallel query syntax of Oracle, and n is the parallel number.

  Script execution time, Unit: seconds

Number of parallel

1

2

4

8

12

Oracle

395

277

204

176

154

III. SPL Optimization

  This query is a filtering query on two joined tables, which is simple in structure, but along with complex filtering conditions. We have noticed that many items in conditional expressions are only related to the smaller table PART, and the computational complexity of these items is relatively high (with IN operation). If these items are calculated based on the PART table in advance, the computational complexity will be much less than that based on the join result of two tables, because the join result is the same size as the lineitem table, which is much larger.

 

  The SPL script is as follows


A

1

=now()

2

>brand1="Brand#32"

3

>brand2="Brand#23"

4

>brand3="Brand#45"

5

>quantity1=7

6

>quantity2=18

7

>quantity3=22

8

=["SM CASE", "SM BOX", "SM   PACK", "SM PKG"]

9

=["MED BAG", "MED BOX",   "MED PKG", "MED PACK"]

10

=["LG CASE", "LG BOX", "LG   PACK", "LG PKG"]

11

=["AIR","AIR REG"]

12

=file("part.ctx").open().cursor@m(P_PARTKEY,P_BRAND,P_SIZE,P_CONTAINER,0:FLAG;P_SIZE>=1)

13

=A12.run(FLAG=if(P_BRAND==brand1 &&   A8.contain(P_CONTAINER) && P_SIZE<=5:1, P_BRAND==brand2   && A9.contain(P_CONTAINER) && P_SIZE<=10:2,   P_BRAND==brand3 && A10.contain(P_CONTAINER) &&   P_SIZE<=15:3; 0)).select(FLAG>0).fetch().derive@o().keys@i(P_PARTKEY)

14

=file("lineitem.ctx").open().cursor@m(L_PARTKEY,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT;L_PARTKEY:A13,L_SHIPINSTRUCT=="DELIVER     IN PERSON" && A11.contain(L_SHIPMODE) &&   L_QUANTITY>= quantity1 && L_QUANTITY<=quantity3+10)

15

=A14.select(case(L_PARTKEY.FLAG,1:L_QUANTITY<=quantity1+10,2:L_QUANTITY>=quantity2   && L_QUANTITY<=quantity2+10;L_QUANTITY>=quantity3))

16

=A15.total(sum(L_EXTENDEDPRICE*(1-L_DISCOUNT)))

17

return interval@ms(A1,now())

  A13 generates a new FLAG field in the PART table, which is used to calculate the complex conditions related to PART table. When the records satisfy the OR condition 1, 2 and 3 in SQL, assign them as 1, 2 and 3 respectively; if none of them satisfies, assign it as 0, and then filter them based on whether the FLAG value is larger than 0. In A15, the value of FLAG is again used to select which quantity value to filter L_QUANTITY. FLAG-related complex formulas only need to calculate the row number of the PART table.

 

  Script execution time, Unit: seconds

Number of parallel

1

2

4

8

12

Oracle

395

277

204

176

154

SPL composite table

158

80

40

21

15