What are the disadvantages of using stored procedures for report data preparation?

The data preparations of some reports are too complex to write in SQL, which can then be implemented by using stored procedures. Stored procedures support multi-step operations and can write complicated calculations, while the in-database calculation can make full use of the computing ability of database.

 

However, the disadvantages of stored procedures are also obvious. They are difficult to modify, debug and migrate, and they may fail if the database is scaled up/out or replaced. Creating and deploying stored procedures often requires high database permissions. Granting the stored procedures for report data preparation to the database administrator will increase the workload of administrator and the cost of report development. However, granting the over-high authorities to the report developer will bring about security risks in data usage and operation. In addition, stored procedures also lead to excessive coupling of the report and database: the changing of reports may need frequent database operations, and one stored procedure may also be invoked by multiple reports, thus resulting in more and more undeletable stored procedures. As a result, the database administration will become chaotic and the performance will be greatly affected as well.

 

Due to these shortcomings, more and more companies are prohibiting stored procedures for data processing, including report data preparation. So is there any technology that can inherit the advantages of stored procedures and avoid their disadvantages?

 

These problems can be solved by esProc SPL, an open-source data processing engine specializes in structured data computation and has a rich library of computation classes to satisfy the complex report data preparations. Meanwhile, it naturally supports a variety of data sources (RDB, NoSQL, JSON, CSV, Webservice, etc.) and can also realize mixed calculations across data sources.

 

The syntax of esProc SPL is very concise, and it is very easy to write and debug algorithms with an independent IDE.

undefined 

SPL does not depend on the database, so the changing of data source needs to change the connection only rather than the calculation script, which is highly migratable. What’s more, that SPL implements calculations outside the database can fully reduce the workload of the database and improve the performance of report queries with more efficient algorithms. From the perspective of stored procedures, SPL can be considered as outside-database stored procedures.

 

esProc can perform calculations in multiple steps just like stored procedures, and the interpreted execution of SPL, the script of esProc, supports hot switching. For example, find the orders of the top n customers (large customers) that account for half of the sales.


A

1

=db.query("select * from orders")

2

=A1.groups(customer;sum(amount):amount).sort(amount:-1)

3

=A2.sum(amount)/2

4

=0

5

=A2.pselect((A4=A4+amount,A4>=A3))

6

=A2.(customer).to(,A5)

7

=A1.select(A6.pos(A1.customer))

A step-by-step approach is used to first find large customers that satisfy the criteria and then query the detailed order information of those customers.

 

esProc can be used as an embedded JDBC driver integrated into the reporting tools, which access the SPL calculations via JDBC in the same way as accessing a database.

 

SPL is deployed together with the reports, running outside the database, and the corresponding SPL scripts for data preparation can be discarded as soon as the reports are offline, affecting no other report. In this way, the coupling between the database (source) and the report application is fully reduced.