SQL Performance Enhancement: highly concurrent queries with associated dimension table

 

Problem description

In SQL Performance Enhancement: Queries on Highly Concurrent Accounts (hereinafter referred to as the previous article), we have discussed highly concurrent queries on single tables. Sometimes the query may also involve associative operations when the situation is complex. For example, when querying accounts, the detail data should be associated with the store table and other dimension tables, and the final result should include the fields of dimension tables, such as store name, address, phone number, etc. The data structure is as follows:

undefined 

The SQL statement is:

select d.id,d.amt,d.detail_date,s.name,s.phone_number,s.address,… from detail d
       left join store s on d.store_id=s.id
where d.id='1010087'
and d.detail_date>= to_date('2021-01-10', 'yyyy-MM-dd')
and d.detail_date<to_date('2021-01-25', 'yyyy-MM-dd')
and …

To speed up the query response, we generally index the account id field of the “detail” table as follows:

create index index_detail_1 on detail(id)

 

Solution

1. Quick query and association in memory

The dimension tables used as code are generally small enough to be pre-loaded in memory and indexed. Then we can adapt to the solution mentioned in the previous article, first quickly querying the detail data which satisfy the conditions and then associating them with the dimension tables in memory. Due to the indexed dimension tables and the small amount of retrieved detail data, the performance loss of associative operation is basically negligible, even if there are multiple associated dimension tables, the performance will still be very fast, responding in seconds.

 

2. Newly-added data

Follow the solution mentioned in the previous article.

 

Code sample

1. Pre-process the data: row-based storage and index

Please refer to the previous article.

 

2. Pre-load the index and dimension tables

The index and dimension tables should be loaded into memory to save the loading time for each query when the system is initiated, or they are changed.


A

B

1

if !ifv(detail)

=file("detail.ctx").open().index@3(index_id)

2


=env(detail,B1)

3

if !ifv(store)

=file("store.btx").import@b(id,name,...).keys(id)

4


=env(store,B3)

A1: check whether global variable “detail” exists, if so, it means the index has already been loaded.

 

B1: if global variable “detail” does not exist, then open the composite table to load the three-level index. @2 or @3 represents to load level 2 or level 3 index. Level 3 index performs better but requires more memory. Whether to load level 2 or level 3 index is determined by the size of the index and the amount of available memory.

 

B2: assign global variable detailas B1.

 

A3-B4: pre-load store dimension table, and the primary key with index is the id number of the stores.

 

3. Query the account and perform association

The query code is very easy because the detail table and dimension tables with index are loaded in advance:

 


A

1

=detail.icursor(;id==1010087 && detail_date>=date("2021-01-10") && detail_date<date("2021-01-25") && ,index_id).fetch()

2

=A1.switch(store_id,store)

3

return A3.new(id,store_id.store_id:store_id,store_id.store_name:store_name,,amt,detail_date,)

 

A1: icursor is the cursor with index. In practice, the conditions like account id and time period are the given parameters.

 

A2: read the detail data into memory, and associate it with the store dimension table.

 

A3: refer the associated fields and return the query result.

 

4. Newly-added data

Please refer to the code mentioned in the previous article.