Performance Optimization Exercises Using TPC-H – Q6
Ⅰ SQL code and analysis
Below is the SQL query statement:
select
sum(l_extendedprice * l_discount) as revenue
from
lineitem
where
l_shipdate >= date '1995-01-01'
and l_shipdate < date '1995-01-01' + interval '1' year
and l_discount between 0.05 - 0.01 and 0.05 + 0.01
and l_quantity < 24;
This is a simple aggregate operation on the filtered single table.
Ⅱ SPL solution
SPL uses a parallel query.
A |
|
1 |
=now() |
2 |
1995-1-1 |
3 |
=elapse@y(A2,1) |
4 |
=0.05-0.01 |
5 |
=0.05+0.01 |
6 |
>quantity=24 |
7 |
=file("lineitem.ctx").open().cursor@m(L_EXTENDEDPRICE*L_DISCOUNT:dp;L_SHIPDATE>=A2 && L_SHIPDATE<A3 && L_DISCOUNT>=A4 && L_DISCOUNT<=A5 && L_QUANTITY<quantity) |
8 |
=A7.total(sum(dp)) |
9 |
=interval@ms(A1,now()) |
The technique of performing filtering at cursor creation is used to reduce the amount of data to be retrieved.
Ⅲ Further optimization
1. Optimization method
This example will use the date-conversion optimization method explained in Q1, where the lineitem table’s L_SHIPDATE is already converted.
2. Code for data conversion
2.1 Conversion on lineitem table
Copy lineitem_5.ctx and rename it lineitem_6.ctx.
3. Code after data conversion
Computing code:
A |
|
1 |
=now() |
2 |
1995-1-1 |
3 |
=days@o(elapse@y(A2,1)) |
4 |
=days@o(A2) |
5 |
=0.05-0.01 |
6 |
=0.05+0.01 |
7 |
>quantity=24 |
8 |
=file("lineitem_6.ctx").open().cursor@m(L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A4 && L_SHIPDATE<A3 && L_DISCOUNT>=A5 && L_DISCOUNT<=A6 && L_QUANTITY<quantity) |
9 |
=A8.total(sum(L_EXTENDEDPRICE*L_DISCOUNT)) |
10 |
=interval@ms(A1,now()) |
Ⅳ Using enterprise edition’s column-wise computation
1. Original data
A |
|
1 |
=now() |
2 |
1995-1-1 |
3 |
=elapse@y(A2,1) |
4 |
=0.05-0.01 |
5 |
=0.05+0.01 |
6 |
>quantity=24 |
7 |
=file("lineitem.ctx").open().cursor@mv(L_EXTENDEDPRICE,L_DISCOUNT; L_SHIPDATE>=A2 && L_SHIPDATE<A3 && L_QUANTITY<quantity && L_DISCOUNT>=A4 && L_DISCOUNT<=A5) |
8 |
=A7.groups(;sum(L_EXTENDEDPRICE*L_DISCOUNT)) |
9 |
=interval@ms(A1,now()) |
2. Optimized data
A |
|
1 |
=now() |
2 |
1995-1-1 |
3 |
=days@o(elapse@y(A2,1)) |
4 |
=days@o(A2) |
5 |
=0.05-0.01 |
6 |
=0.05+0.01 |
7 |
>quantity=24 |
8 |
=file("lineitem_6.ctx").open().cursor@mv(L_EXTENDEDPRICE,L_DISCOUNT;L_SHIPDATE>=A4 && L_SHIPDATE<A3 && L_DISCOUNT>=A5 && L_DISCOUNT<=A6 && L_QUANTITY<quantity) |
9 |
=A8.groups(;sum(L_EXTENDEDPRICE*L_DISCOUNT)) |
10 |
=interval@ms(A1,now()) |
Ⅴ Test result
Unit: Second
Regular |
Column-wise |
|
Before optimization |
6.6 |
5.2 |
After optimization |
5.5 |
3.3 |
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/2bkGwqTj
Youtube 👉 https://www.youtube.com/@esProc_SPL