How to realize the complex logic calculation after the multi-source join in BIRT

 

Keywordsjoin across databases     join and calculate

The data sources join of BIRT and the transformation to the same database by ETL are difficult to solve these problems. It can be demonstrated by an example:

The salary of employees consists of basic salary and performance salary. Basic salary (employee table) is stored in MSSQL database of financial management software; performance salary is calculated by performance score, and performance score (performance table) is stored in Oracle database of performance appraisal system. Obviously, salary needs cross database calculation of two databases. The join relationship between the two tables is shown in the following figure:

undefined

Different posts have different performance pay algorithms. For example: the posts are divided into normal and sales. Normal post has performance score but no performance salary, and its pretax salary = base pay. Pretax salary of sales = base pay + performance salary. Performance salary is calculated as follows:

 (1) If the base pay is less than 5000: base pay * (performance score / 100);           

(2) If the base pay is between 5000 and 8000: base pay * ((performance score * 0.9) / 100);           

(3) If the base pay is more than 8000: base pay * ((performance score * 0.8) / 100);

The difficulties of this report are as follows: 1. Employee table and performance table belong to different databases and need to be calculated across databases. 2. The algorithm is complex, it is impossible to achieve the goal only by simply joining two tables, and further calculations are needed. The query process is as follows:

undefined

Analyze the shortcomings of the existing solutions:

1. BIRT has its own solution of data sources join, but it can only provide simple internal and external join, and has no ability of secondary calculation, so it is difficult to deal with the complex algorithm with logical judgment in the loop and multi result set combination.

2. Using ETL to synchronize to one database in advance, the cross- database problem is transformed into the same database problem. But the cost of ETL development is too high, and there are some problems such as data synchronization and real-time.

User-defined data source can solve this problem, but the hard coding method of BIRT java bean data source is too complex and the workload is huge.

It is recommended to use esProc, which is an independent data computing engine, has the computing power independent of the database, supports the mixed operation of heterogeneous data sources, and can easily cope with all kinds of recalculation after join. In fact, esProc can be regarded as a BIRT java bean data source with simpler syntax. For example, the above problem can be written in esProc as follows:


A

B

C

1

=oracleDB.query@x("select empID,score from   performance")

2

=mssqlDB.query@x("select empID,empName,basePay,post from   employee")

3

=A2.derive(preTax)



4

=sales=A3.select(post=="Sales")



5

=sales.switch(empID,A1:empID)



6

for sales

if A6.basePay<5000

>A6.preTax=A6.basePay*(1+A6.empID.score/100)

7


else if A6.basePay>=5000 && A6.basePay<8000

>A6.preTax=A6.basePay*(1+A6.empID.score*0.9/100)

8


else if A6.basePay>=8000

>A6.preTax=A6.basePay*(1+A6.empID.score*0.8/100)

9

=normal=A3.select(post=="Normal").run(preTax=basePay)

10

=sales|normal



Because esProc is a specialized computing language, the code of cross-database computing is more concise and understandable than that of conventional methods.

In fact, there are many similar cross database and split database calculation problems. It would be very simple with the help of esProc SPL. Please refer to Statistical Query after Database Split.

esProc provides JDBC driver, and can be easily integrated with BIRT and other reporting tools. Please refer to How to Call an SPL Script in BIRT.

Read Getting Started with esProc to download and install esProc, get a license for free and find related documentation.