What should we do when SQL (and stored procedure) runs too slowly?


As the most commonly used data processing language, SQL is widely used in many scenarios, such as query and batch operating. When the amount of data is large, SQL (and stored procedure) often leads to very slow running. In this case, it needs to optimize SQL. To optimize SQL, there are some specific systemized ways. Normally, we need to check the execution plan first to find out the reason, and then recode in a targeted manner to achieve the purpose of optimization. For example, for the judgment of continuous numbers, we can use between to replace in, use the select statement to specify the field name, use union all to replace union, and rewrite exists into join. Of course, some optimization methods are available in practice, such as creating an index, and using temporary table/aggregation table. In short, there are many optimization methods, and I believe that any DBA will not be a stranger to them.

Unfortunately, however, there are still quite a few cases that cannot run faster no matter which optimization method is used. Visit: The impasse of SQL performance optimizing, where you can find some of such cases and corresponding technical analysis. Due to the limitation of relational algebra (the theoretical basis of SQL), SQL lacks the support from characteristics such as discreteness and ordered set, which makes it extremely difficult or even unable to express some high-performance algorithms, and hence it has to use a relatively cumbersome low-performance algorithm and could do nothing when hardware resources are being wasted. If you want to understand SQL’s defects in theoretical foundation, visit SPL: a database language featuring easy writing and fast running, where you can find the comprehensive explanation. In other words, the reason why using SQL will lead to slow running is caused by its theoretical defects, and this problem can only be partially solved if we rely only on the database to optimize at the engineering level (indeed, many commercial databases are able to automatically recognize some SQL codes and convert them into high-performance algorithms), but cannot be solved fundamentally (the database optimization engine will get "confused” when the code is complicated and has to execute it as low-performance algorithms according to the writing logic of SQL). Of course, the theoretical defects cannot be solved by placing hope on the replacement of database. As long as SQL is still used, it still cannot be solved even if a distributed database or an in-memory database is used. It is true that the performance can be improved to a certain extent after consuming more expensive resources, but there is still a huge gap between the improved performance and the performance that the hardware should have been able to implement.

Is there any other solution?

Discard SQL, and also the relational database together!

Then what should we use?

SQL can't describe these high-performance algorithms. How about Java or C++?

No problem! Theoretically, any algorithm can be implemented in Java and C++, and because the lower-level actions of computer can be controlled, the code written in the two languages can usually run at high speed (as long as the programmer’s ability is good enough).

However, don't count your chickens before they hatch. Although the high-performance algorithms can be coded in either of these languages, they are too native, and do not provide any high-performance computing library for data processing. Therefore, to implement such algorithms in Java or C++, you have to start from scratch, which will make you exhausted. Let’s take hash join as an example. If Java is used, at least hundreds of lines of code are required to implement the hash join, which needs not only to design appropriate hash functions, but to solve possible hash conflicts, the amount of work involved in performing this process will be large. In addition, it is not easy to perform multi-thread programming in Java, yet the parallel computing is an effective way to improve computing performance. Similarly, there are many other algorithms involving in the calculation of structured data, and you can imagine how complex it will be if such algorithms are coded by yourself. If the implementation of a calculation is too complicated, and its development cost far exceeds the performance optimization itself, then there is no significance to optimize it.

Python also faces similar problem. Although Python is much richer than Java in terms of structured data computing library, it does not provide necessary high-performance algorithm libraries and storage schemes. For example, it does not provide the cursor types and related operations for big data, nor does it provide an effective parallel mechanism. To use Python to implement these high-performance algorithms, you have to develop them yourself. However, as an interpreted-execution language, the operating efficiency of Python itself is not high, and the algorithms developed based on Python often fail to meet the high-performance requirements. Likewise, Scala also lacks sufficient high-performance computing libraries, and it is also cumbersome for the programmers to write the algorithms themselves. For programmers who are not familiar with these algorithms, the efficiency of the code written by themselves from scratch is often not as good as the speed of optimized commercial database SQL.

Then, do we have to tolerate the slowness of SQL?

No, SPL is an excellent alternative!

SPL and high performance

Open-source SPL (Structured Process Language) is a programming language specially used for processing the structured data, which can make the calculation that is originally coded in SQL run faster.

Why? Is there a cool tech that can speed up hardware in SPL?

Not really. Software cannot speed up the computing performance of hardware, so does SPL. To put it simply, the reason why using SPL can make calculation run faster is that it uses higher-performance algorithms as mentioned above. SPL provides a large number of basic high-performance algorithm libraries. The code based on these libraries can effectively reduce the amount of calculation. When coding in SPL, we can use these algorithms in combination. If each calculation runs a little faster, it will be much faster as a whole, so as to improve the computing performance.

The high-performance algorithms designed in SPL have already been encapsulated, such as multipurpose traversal, ordered merge, foreign key pre-association, tag bit dimension, parallel computing. Many of them are the original inventions of SPL and appear for the first time in the industry.


Based on these encapsulated algorithm libraries, it will be very convenient to program. You can use them directly without developing from scratch, which not only brings high performance, but improves development speed. From this perspective, fast in running and easy in writing are the same thing in fact, that is, it can write high-performance algorithms efficiently. In contrast, since Java, C++, Python and Scala lack these algorithm libraries, it is difficult to achieve high performance.

Here below are some high-performance algorithm examples in SPL and comparison cases with SQL:

Performance optimization skill: Use Multi-purpose traverse to speed up multiple grouping

Performance optimization skillTopN

Performance optimization skill: Pre-Joining

Performance optimization skill: Numberizing Foreign Key

Performance optimization skill: Attached Table

One-side partitioning algorithm of SPL

Performance optimization skill: ordered grouping

For the same computing task, since SPL adopts the concept that differs from SQL, the computing method with different complexity (lower) can be used.

In practice, SPL has already performed many performance-optimization cases, and the running speed of these cases is increased by several times at least, dozens of times at most, and even thousands of times in extreme cases. An order of magnitude increase in speed is basically the norm.

For example, in the case “optimizing batch operating of vehicle insurance policies of an insurance company” (Open-source SPL optimizes batch operating of insurance company from 2 hours to 17 minutes), using SPL reduces the computation time from 2 hours to 17 minutes, while reducing code amount by 2/3. In this case, the unique multipurpose traversal technology of SPL is used, which can perform multiple operations in one traversal process of big data, thereby effectively reducing the access amount to external storage. This case involves three association and aggregation operations on a large table. If SQL is used, the large table needs to be traversed three times, while using SPL only needs to traverse it once. Moreover, different methods are used for association operation in this case, a huge performance improvement is thus obtained.

Another case (Open-source SPL turns pre-association of query on bank mobile account into real-time association), using SPL turns the mobile account query that can only be pre-associated originally into real-time association, while reducing the number of servers from 6 to 1. This case makes full use of SPL's ordered storage mechanism. Reading all data of one account in one go effectively reduces the hard disk access time (physical continuous storage), and with the aid of the foreign key real-time association technology that distinguishes dimension table from fact table, it implements real-time association query by just using a single machine. As a result, the performance is improved significantly, and the number of hardware required is greatly reduced.

The following are some common business scenarios that can use SPL's algorithm library to obtain high performance:

How to make JOIN run faster?

How does the in-memory database bring memory’s advantage into play?

How to make the columnar storage data warehouse more efficient

How to cope with high-concurrency account query?

What is the key to make the multi-tag user profile analysis run faster?

Speed up Big Data User Behavior Analysis using Bi-dimension Ordering Structure

To learn the reason of SPL's high performance, visit: How the performance improvement by orders of magnitude happened, where a detailed analysis and more actual optimization cases are given for reference.

Further discussion

Now, you may think that the calculation can be speeded up as long as learning SPL syntax. Is it true?

Actually, it's not that simple.

About algorithms

Higher performance obtained in SPL doesn’t attribute to SPL syntax. Although SPL syntax has its own features, it is not the fundamental reason of running fast. The most important thing is to master and exert high-performance algorithms.

Optimizing the performance needs two steps: step 1, design a low-complexity computing scheme; step 2, implement the scheme at a low enough cost. Step 1 is the more important, which should be worked out by programmers with experience and knowledge of certain degree (the ability to master and exert high-performance algorithms), and the step 2 could be implemented in SPL. In other words, SPL is not used for designing a solution of problem, but making the solution easier to implement.

SPL syntax is very simple and much easier than Java. Basically, you can get started within two hours, and become fairly proficient in two or three weeks. But for algorithm, it is not that case, you need to study hard and practice again and again before mastering it. Conversely, as long as you master the algorithm, what kind of syntax is relatively less important (of course, using too sketchy language like SQL still doesn't work). It’s just like treating a patient, only after finding out the pathological cause can the doctor determine which medicine works. To cure the disease, you can either buy the ready-made medicine directly (just like using encapsulated SPL) or collect the herbs in the mountain (like using Java/C++ to hard code). Both methods can cure the disease, the difference is the degree of trouble and the cost of payment.

Since the data structure and algorithm knowledge learned in college are rarely put into practice, many programmers have forgotten this part of knowledge after working for a few years. However, if a programmer does not understand such basic knowledge, he won’t be able to design high-performance algorithm scheme. In view of this situation, SPL provides a special subject on high performance, which covers not only the high-performance algorithms and optimization skills, but the performance optimization books so as to teach people how to exert them.

SPL High-performance algorithm

Performance Optimization - Preface

About storage

Another key point that should be considered in high-performance computing is the data storage, which is closely related to algorithm. High-performance computing is inseparable from a reasonable data storage schema. When using SPL to carry out high-performance computing, it could not be performed based on database, and instead, it needs to move the data out of database and reorganize.


The slow data computing tasks can be divided into two categories: compute intensive and data intensive. If it is just a compute intensive task, the data amount involved is not large, while the calculation amount is large. Since the large calculation amount is not caused by large data amount, there is no need to change the storage schema, and the performance can also be greatly improved as long as a good algorithm is adopted. In other words, you can continue to use original storage schema (such as databases) to optimize the performance through SPL. In contrast, the data intensive task also involves a large amount of calculation, but the difference is that the large calculation amount is mainly caused by large data amount. In this case, if you do not change the storage schema, the data read time is likely to be very long, and the overall operation time cannot be effectively optimized even if the calculation time can be optimized to 0.

Unfortunately, most of the slow computing scenarios we face are data intensive. If the data is still stored in the database, it will take a long time to read the data (IO efficiency is very low) for the reason that the data reading speed via database’s data-fetching driver (such as JDBC) is generally very slow, and this time often far exceeds the subsequent calculation time in SPL, and hence it is impossible to achieve the optimization effect. Moreover, quite a few algorithms in SPL require the data to be stored in a certain way. For example, the one-side partitioning algorithm requires the data to be stored in order. However, the conventional relational database cannot guarantee such requirement, and thus these algorithms cannot be carried out.

In order to solve this problem, SPL provides its own storage mechanism, that is, it directly uses file system, exporting the data from database to files of specific format. Through this mechanism, not only can a higher IO access efficiency and flexible management capability of file system be obtained, but the data storage advantages of own format can be fully utilized such as columnar storage, ordering, compression, and parallel segmentation, thereby efficiently giving play to the effect of high-performance algorithms.

Using file to store data can also effectively reduce the time of writing data to database, and further improve computing performance. Some calculation scenarios need not only to read the data from data source but to store the calculation results, that is, writing the results to database for subsequent calculation. For example, ETL is a typical calculation that involves both read and write. In addition, some calculations on big data or complex calculations need to temporarily store the intermediate results for reuse in subsequent calculation. We know that writing data to database is a very slow action, and the performance of computing scenario that involves this action is naturally low. By using SPL, the data that originally needs to be written to database can be stored in the file (though this is an engineering-level advantage, it still improves the read and write performance by nearly an order of magnitude), and then using the file computing capability of SPL to calculate directly, a high performance is thus achieved.

About real-time query

If the data are all moved out of database, is it impossible to implement real-time data calculation? After all, new data are always generated continuously.

Yes, it can.

For full real-time query, SPL provides multi-source mixed computing capability to meet such scenarios. For the large amount of cold data that is no longer modified, we can use SPL's high-performance file to store, which can obtain higher computing performance; for the small amount of hot data, we still store them in original data source, which allows SPL to read and calculate them directly (SPL supports diverse data sources). Since the amount of hot data is not large, querying directly based on the production data source will not have much impact on it, and the access time will not be too long. In this way, real-time query for all data can be implemented through mixed computing of cold and hot data. The only thing we need to do is to periodically store cold data as SPL's high-performance storage, and keep the small amount of recently generated hot data in original data source. The overall framework is as follows:


The way to get started

From the previous analysis, we know that to accomplish a performance optimization task, you must be proficient in high-performance algorithms and storage mechanisms. However, as can be seen from the above course books that there are a lot of things to learn, and it is not easy to master them all. In particular, many programmers get accustomed to the thinking way in SQL, and it is difficult for them to use other way. When they face a performance optimization task, they often do not know how to start even if they have a powerful tool like open-source SPL. It is like that when a cart driver wants to let his cart run faster, he is accustomed to using the reins and whips. But when he sees the steering wheel and accelerator on the car for the first time, he will be in confusion.

To solve this problem, SPL team also provides corresponding consulting service: you can tell us the performance problem you encountered, and work together with us to discuss and design an optimization solution, and a POC can also be conducted when necessary.

If you are facing a performance problem, and want us to help you, we hope you can provide following necessary information: business scenario, pain points you are encountering, data amount, concurrency number and response time of your current computing. It would be better if you provide SQL scripts, table structures and test data. E-mail us at: spl@scudata.com if you have any questions.

Trust us because we never failed.

Having experienced one or two cases, programmers would become familiar with the thinking way in SPL (it's like they understand how to operate the steering wheel and accelerator), after that, it will not be a problem to optimize the performance themselves.

In the field of calculations, speed determines the winner. However, only by mastering the essence and method of speed can you be invincible. Do you think so?