Performance Optimization - 5.8 [Ordered traversal] Index sorting
We know that the essence of an index is sorting. If we want to sort a data table by the to-be-searched key (TBS key), can we use the existing index?
Unfortunately, it does not work in most cases.
If the original data table is not sorted by the TBS key, even if the index is already built, and then read the physical location of each record in order from the index, and there is no need to do sort operation, the index sorting algorithm will not have much advantage over directly performing the big sorting algorithm. The reason is that although the big sorting algorithm needs to perform sort operation, its access to external storage data tables is basically continuous and a large number of data will be read and written each time, while for index sorting, although using index to read data in order avoids the sort operation, it may lead to a large amount of unnecessary read actions since the physical position of the data in external storage is often discontinuous, and it is very likely to have a worse overall performance than the big sorting algorithm.
In fact, index has little effect on most traversal-pattern operations, and it is mainly used for scenarios where there are very few returned result sets.
However, although the overall sorting time using a ready-made index may not necessarily be faster than big sorting algorithm, it has one advantage, that is, it can quickly start data output. In contrast, big sorting algorithm needs to traverse all data and generate the buffer files before it starts outputting data, so there will be a long waiting period.
What are the application scenarios for quickly starting to output data?
For example, when the big data needs to be transmitted in order, remote transmission itself is very slow, often not much faster than big sorting. If the data can be transmitted earlier, a lot of time will be saved. In this case, using the index sorting will be more advantageous, as this algorithm can immediately start returning data for transmission without waiting.
A | |
---|---|
1 | =file(“data.ctx”).open() |
2 | =A1.index(file(“data.idx”);ID;…) |
3 | =A1.icursor@s(…;;file(“data.idx”)) |
The icursor@s() will ensure that the data sorted by specified index is returned (if there is no option, the outputting order will not be ensured, and will generally be the order of the physical location, which is faster). The data will also be returned as a cursor, and taken out by fetch() (for transmission).
Performance Optimization - 6.1 [Foreign key association] Foreign key addressization
Performance Optimization - Preface
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL