Performance Optimization Exercises Using TPC-H – Q6

 

SQL code and analysis

Below is the SQL query statement:

select
    sum(l_extendedprice * l_discount) as revenue
from
    lineitem
where
    l_shipdate >= date '1995-01-01'
    and l_shipdate < date '1995-01-01' + interval '1' year
    and l_discount between 0.05 - 0.01 and 0.05 + 0.01
    and l_quantity < 24;

This is a simple aggregate operation on the filtered single table.

SPL solution

SPL uses a parallel query.


A

1

=now()

2

1995-1-1

3

=elapse@y(A2,1)

4

=0.05-0.01

5

=0.05+0.01

6

>quantity=24

7

=file("lineitem.ctx").open().cursor@m(L_EXTENDEDPRICE*L_DISCOUNT:dp;L_SHIPDATE>=A2 && L_SHIPDATE<A3 && L_DISCOUNT>=A4 && L_DISCOUNT<=A5 && L_QUANTITY<quantity)

8

=A7.total(sum(dp))

9

=interval@ms(A1,now())

The technique of performing filtering at cursor creation is used to reduce the amount of data to be retrieved.

Further optimization

1. Optimization method

This example will use the date-conversion optimization method explained in Q1, where the lineitem table’s L_SHIPDATE is already converted.

2. Code for data conversion

2.1 Conversion on lineitem table

Copy lineitem_5.ctx and rename it lineitem_6.ctx.

3. Code after data conversion

Computing code:


A

1

=now()

2

1995-1-1

3

=days@o(elapse@y(A2,1))

4

=days@o(A2)

5

=0.05-0.01

6

=0.05+0.01

7

>quantity=24

8

=file("lineitem_6.ctx").open().cursor@m(L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A4 && L_SHIPDATE<A3 && L_DISCOUNT>=A5 && L_DISCOUNT<=A6 && L_QUANTITY<quantity)

9

=A8.total(sum(L_EXTENDEDPRICE*L_DISCOUNT))

10

=interval@ms(A1,now())

Using enterprise edition’s column-wise computation

1. Original data


A

1

=now()

2

1995-1-1

3

=elapse@y(A2,1)

4

=0.05-0.01

5

=0.05+0.01

6

>quantity=24

7

=file("lineitem.ctx").open().cursor@mv(L_EXTENDEDPRICE,L_DISCOUNT; L_SHIPDATE>=A2 && L_SHIPDATE<A3 && L_QUANTITY<quantity && L_DISCOUNT>=A4 && L_DISCOUNT<=A5)

8

=A7.groups(;sum(L_EXTENDEDPRICE*L_DISCOUNT))

9

=interval@ms(A1,now())

2. Optimized data


A

1

=now()

2

1995-1-1

3

=days@o(elapse@y(A2,1))

4

=days@o(A2)

5

=0.05-0.01

6

=0.05+0.01

7

>quantity=24

8

=file("lineitem_6.ctx").open().cursor@mv(L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A4 && L_SHIPDATE<A3 && L_DISCOUNT>=A5 && L_DISCOUNT<=A6 && L_QUANTITY<quantity)

9

=A8.groups(;sum(L_EXTENDEDPRICE*L_DISCOUNT))

10

=interval@ms(A1,now())

Test result

Unit: Second


Regular

Column-wise

Before optimization

6.6

5.2

After optimization

5.5

3.3