How to implement dynamic join query of cross heterogeneous database in BIRT

 

Keywordscross heterogeneous database       dynamic join

The data sources join of BIRT and the transformation to the same database by ETL are difficult to solve these problems. It can be demonstrated by an example:

The transaction details data (trade table) is stored in the database DB2 of the production system, and the other part of the business data (network table, account table) is stored in MySQL of the business system. Their join relationship is shown in the following figure:

undefined

The so-called "dynamic join" refers to that the user enters parameters in the foreground interface. The report determines which table trade is joined with through parameters, and displays the joined data in the report. In the actual operation, data filtering and aggregation may be required. The query process is as follows:

undefined

For example, query scenario 1: according to the incoming parameters, the report can dynamically join the outAccount field in the trade table with the accountNo field in the account table. Finally, the query result shows all the fields in the trade table, as well as the name, gender, city and other fields of the account table.

Common solutions and shortcomings to solve such problems are analyzed as follows:

1. The problem with BIRT data sources join is that it requires that the table name and field name are known and determined, but such reports are dynamically joined by parameters, so they cannot be implemented.

2. ETL can be used to extract data from the production database to the business database, so that cross database problems can be transformed into the same database. The idea of this solution is simple but there are many difficulties in details. First, real-time query: in order to query data in real time, we need to use triggers and other functions in the production database to detect the real-time changes of data and push the data to the business database, but the production database cannot be easily changed, so real-time query cannot be realized. Non real time query is also difficult to do, because the data of the production database is extremely large, it is impossible to get all of them every time. Only incremental extraction method can be used, and to judge the increment, time stamp field needs to be added to the trade table. Similarly, this change is not allowed in the production database and therefore cannot be implemented.

3. In terms of capability, BIRT java bean data source can really solve the report problem. It is more flexible than data sources join and does not need to modify the production database. But there is only one defect in this solution: the code is too complex, because the data calculation is not Java specialty. It is not realistic to use hard coding method to solve every cross-database problem.

It is recommended to use esProc, which is an independent data computing engine, has the computing power independent of the database, supports the mixed operation of heterogeneous data sources, and is more suitable for further calculation after dynamic join. In fact, esProc can be regarded as a BIRT java bean data source with simpler syntax. For example, to implement the above problem, the script of esProc only needs 6 lines:


A

1

=DB2.query("select   runningNo,networkNo,outAccount,amount from trade")

2

="select  "+crossJoinField+","+crossOtherFields+"from  "+crossTable

3

=Mysql.query(A2)

4

=join(A1:trade,${tradeJoinField};A3:cross,${crossJoinField})

5

=crossOtherFields.array().("cross."+~).string()

6

=A4.new(trade.runningNo,trade.networkNo,trade.outAccount,trade.amount,${A5})

Where tradeJoinField, crossJoinField, crossOtherFields and crossTable are input parameters. Finally, the calculation results are returned to the dataset of BIRT for report presentation. From this example, we can see that the parameter usage of esProc is very flexible, and that esProc is very suitable for such dynamic joined queries. Because it is a special computing language, the code of cross database computing is more concise and understandable than the conventional method.

In fact, there are many similar cross database and split database calculation problems. It would be very simple with the help of esProc SPL. Please refer to Statistical Query after Database Split

esProc provides JDBC driver, and can be easily integrated with BIRT and other reporting tools. Please refer to How to Call an SPL Script in BIRT

Read Getting Started with esProc to download and install esProc, get a license for free and find related documentation.