Is ORM a Convenient Option for Data Migration between Databases?

 

Programmers have been trying to use the ORM technique to convert SQL statements between different types of databases. The issue is that the technique is only suitable for translating simple SQL statements in OLTP scenarios. Complicated SQL statements in OLAP cases are still hard to migrate. esProc SPL, however, (Structured Process Language) supports a wealth of functions to be able to describe various algorithms and thus handle a rich variety of computing scenarios. Looking Is ORM a Convenient Option for Data Migration between Databases? for details.

 

Object-relational Mapping (ORM) is a programming technique that lets users query and manipulates data from a database using an object-oriented paradigm. The platform provides upward general query interface (such as HQL) and generates SQL statements downwards for different types of databases. Thanks to the uniform upward interface, the technique is considered convenient to data migration between databases. But is it really suitable for this purpose?

Hibernate and MyBatis are representative of the ORM technique. Let’s examine them one by one.

As SQL has a certain degree of generality, Hibernate-generated SQL statements for the simplest computing scenarios can operate in any databases and data migration between databases is simply achievable. Below, for instance, is an HQL statement for achieving a conditional query:

from   OrdersEntity where (amount between    2000 and 3000) and UPPER(client)    like '%s%' 

It can be converted into the following SQL statement:

select   ordersenti0_.OrderID as orderid1_1_, ordersenti0_.Client as client2_1_,   ordersenti0_.SellerId as sellerid3_1_, ordersenti0_.Amount as amount4_1_, ordersenti0_.OrderDate   as orderdat5_1_ from orders ordersenti0_ where (ordersenti0_.Amount between   2000 and 3000) and (upper(ordersenti0_.Client) like '%S%')

The SQL can be executed in a series of databases, including MySQL and Oracle.

Hibernate can correctly handle the special data types and functions to make sure successful migration even the rules for them in different databases are inconsistent. To truncate the year part, for instance:

HQL:select client,year(orderDate),sum(amount),count(orderId) from   OrdersEntity group by client,year(orderDate) having sum(amount)>2000

 

MySQL:select ordersenti0_.Client as col_0_0_,   year(ordersenti0_.OrderDate) as col_1_0_, sum(ordersenti0_.Amount) as   col_2_0_, count(ordersenti0_.OrderID) as col_3_0_ from orders ordersenti0_   group by ordersenti0_.Client , year(ordersenti0_.OrderDate) having   sum(ordersenti0_.Amount)>2000

 

Oracle:select ordersenti0_.Client as col_0_0_, extract(year from   ordersenti0_.OrderDate) as col_1_0_, sum(ordersenti0_.Amount) as col_2_0_,   count(ordersenti0_.OrderID) as col_3_0_ from system.orders ordersenti0_ group   by ordersenti0_.Client , extract(year from ordersenti0_.OrderDate) having   sum(ordersenti0_.Amount)>2000

Hibernate correctly selects functions according to different databases. It uses year function for MySQL and extract function for Oracle.

The framework even correctly translate operations on classes into SQL statements. To get the first three records, for instance:

HQL+JAVA:session.createQuery("select orderId,client,amount,orderDate   from    OrdersEntity").setMaxResults(3);

 

MySQL:select ordersenti0_.OrderID as col_0_0_, ordersenti0_.Client as   col_1_0_, ordersenti0_.Amount as col_2_0_, ordersenti0_.OrderDate as col_3_0_   from orders ordersenti0_ limit ?

 

Oracle:   select * from (select ordersenti0_.OrderID as col_0_0_, ordersenti0_.Client   as col_1_0_, ordersenti0_.Amount as col_2_0_, ordersenti0_.OrderDate as   col_3_0_ from test.orders ordersenti0_) where rownum <= ?

Yet, Hibernate cannot handle more complicated cases.

To achieve unsupported functions in HQL

Compared with SQL, Hibernate HQL supports far fewer functions, including calculating absolute value, finding length of a string, and getting the year/month/date (as the above instance shows). The language does not have date functions and string functions that are frequently-used in project development. The former is used to increase or reduce dates and to get the day or quarter ordinal in the year. The latter is used to do the replacement, perform left truncation and get ASCII code. To implement HQL functions of similar functionalities, we have to embed SQL in HQL, which results in un-migratable HQL statements.

Suppose we are using MySQL database to do a specific computation. We need to let HQL use a user-defined function udf_datasub that it does not support to achieve the date increase/decrease. Below is the HQL statement:

select udf_dateadd (orderDate,3)  from   OrdersEntity

To do that we need to make it inherit the parent class org.hibernate.dialect.MySQLDialect and write user-defined JAVA class, in which we register name of the target HQL function, udf_dateadd, and reference MySQL function date_add. Below is the core code:

registerFunction("udf\_dateadd",   new SQLFunctionTemplate( DateType.INSTANCE,"date\_add(?1,INTERVAL ?2   DAY)") );

Below is the SQL statement for MySQL generated from executing HQL:

select   date\_add(ordersenti0\_.OrderDate,INTERVAL 3 DAY) as col\_0\_0_ from orders   ordersenti0_

When switched to Oracle database, Hibernate generates the same SQL. The problem is that there isn’t date_add function in Oracle. Error will be reported at execution, and the migration fails. In this case, we need to modify the user-defined JAVA class by change the core code as follows:

registerFunction("udf_dateadd",   new SQLFunctionTemplate( DateType.INSTANCE,"?1 + ?2)") );

But transparent migration become impossible.

The functions that cannot be transparently migrated in HQL also include week\quarter\replace\left\char (Take MySQL as an example). Using SQL functions in development project with HQL has been a normal.

To achieve indescribable operations in HQL

There are also many operations that can be described in SQL but cannot be described in HQL. These include subqueries in FROM clause, order-based calculations and window functions. You cannot use HQL to migrate a syntactic component in such a SQL statement.

HQL supports converting a subquery that returns a single field in the SELECT statement, but it does not support translating one in the FROM clause. For instance, HQL cannot generate the following SQL. Error will be reported if we code the same target in HQL.

_select orderId, m from (select orderId, month(orderDate)   m from OrdersEntity) t1_

Here it’s impossible to do the data migration with HQL. We can only use SQL to make it.

There are indeed more complicated methods to achieve the subqueries if you really want to use HQL. One is to create a view in the database and convert the subquery into an ordinary table for query (Hibernate does not distinguish a view and a table). To migrate data between databases, we just need to re-create the view in the target database. Another is to create a virtual view class (use @Subselect to comment it). The method makes migration easier. But as it requires a great amount of code in the development and debugging phases, it is usually more complicated than the first method.

Both methods are inconvenient. Moreover, a business database normally won’t allow adding a view. In view these, many people directly use a SQL statement (org.hibernate.Session.createSQLQuery(SQL)) to solve their immediate problem. Yet another issue is that, if the SQL statement contains a function to be migrated, HQL won’t automatically convert that function when we switch to a different database. Such a case happens between MySQL year function and Oracle extract function.

HQL does not support rownum, too. A SQL statement containing rownum cannot be expressed in HQL, like the one below:

WITH t AS (SELECT sales, month(orderDate) m,value,rownum   num FROM tbl ORDER BY month(orderDate) ASC)

SELECT   sales,m,v1,v2, (v1-v2)/v2 loopRate

FROM   (

                SELECT t1.sales,t1.m,t1.value v1,t2.value v2

                FROM

                t t1 LEFT JOIN t t2 ON t1.num=t2.num+1 WHERE t1.sales=t2.sales) t

Besides using several subqueries that, the above SQL contains the pseudo-ordinal field rownum to achieve the order-based calculation. Yet both are unsupported by HQL and thus cannot phrased in the language. The calculations can only be written with the un-migratable SQL.

Though the above link relative ratio calculation can be simplified using the window function:

      SELECT sales,m,v1,v2,   (v1-v2)/v2 loopRate

       FROM (

                SELECT sales, month(orderDate) m,value v1,

                LAG(value,1) OVER(PARTITION BY sales ORDER BY month(orderDate)) v2 FROM tbl)   t

We have to write it directly in un-migratable SQL because HQL does not support window functions.

Hailed as “full-automatic ORM” thanks to its theoretical support of migratability during the whole computing process from input (HQL) to output (Object), Hibernate, in practice, depends on SQL to write many functions and operations. Unless you are developing a small or relatively simple project, the migration is not smooth.

 

Mybatis is the “semi-automatic ORM” because it generates migratable JAVA code only in the output end and has un-migratable SQL code in the input end. People have learned that this is a weakness and hope to make up for it by developing various third-party extension packages, such as Mybatis-plus and Mybatis-ext. Generally, the packages supply upward functional programming interface and generates SQL statements downward to achieve migratable input end. However, they do not support even the simplest calculations, such as join queries, and, compared with Hibernate, depend more on SQL and are more un-migratable. Other ORM-type techniques are more immature and un-migratable.

It seems that all ORM-related techniques for data migration between databases are rudimentary. They grudgingly meet the needs of migrating simple SQL statements in OLTP scenarios. And they fall down on the job when SQL statements are complicated in OLAP scenarios.

 

If only there was an alternative.

And we do have one – it is esProc SPL. SPL has a set of standard SQL query syntax that boasts plenty of built-in functions (the number is on the rise) for describing various common operations. sqltranslate, one of the SPL functions, translates the standard SQL statements to different databases’ SQL dialects and achieves SQL migration between databases.

Below is a standard SQL statement:

select   client,year(orderDate),sum(amount),count(orderId) from OrdersEntity group by   client,year(orderDate) having sum(amount)>2000

sqltranslate ("MYSQL") translates it into the following statement:

select   client,year(orderDate),sum(amount),count(orderId) from OrdersEntity group by   client,year(orderDate) having sum(amount)>2000

And .sqltranslate("ORACLE") translates it into the statement below:

select   client,EXTRACT(YEAR FROM orderDate),sum(amount),count(orderId) from   OrdersEntity group by client,EXTRACT(YEAR FROM orderDate) having   sum(amount)>2000

The SPL function selects correct functions for different databases.

Standard SQL directly supports a wealth of functions

On top the basic functions, standard SQL also supports a great number of functions commonly-used in real-world projects, including the date functions for date increase/decrease, getting the day or quarter ordinal in the year and the string functions for replacement, performing left truncation and getting ASCII code. This makes SQL migration between databases more convenient. Below, for instance, is the standard SQL for increasing/decreasing dates:

select   ADDDAYS(OrderDate,3) from orders orders

It is translated into MySQL SQL as follows:

select   OrderDate+INTERVAL 3 DAY from orders orders

And Oracle SQL as follows:

select   OrderDate+NUMTODSINTERVAL(3,'DAY') from orders orders

Standard SQL can describe various computations

Different SQL dialects support different functions. But they have much fewer differences in computational statements. This means that databases are more compatible in terms of statements. By translating only the functions and copying the statements, SPL standard SQL functionality is able to express more computations. For instance, the following subquery remains the same for any type of database when being translated and can be correctly executed:

_select orderId, m from (select orderId,   month(orderDate) m from Orders) t1_

Since SPL does not translate statements, it cannot migrate computations involving certain databases, such as the earlier versions of MySQL that do not support window functions. So the following link relative ration calculation can only be migrated between high versions of MySQL and Oracle/SQLSVR, etc.

      SELECT sales,m,v1,v2,   (v1-v2)/v2 loopRate

       FROM (

                SELECT sales, month(orderDate) m,value v1,

                LAG(value,1) OVER(PARTITION BY sales ORDER BY month(orderDate)) v2 FROM tbl)   t

Some databases, such as Oracle and DB2, support rownum, while some, such as MySQL and SQLSVR, do not support rownum pseudo fields. A standard SQL statement containing rownum can only be migrated between Oracle and DB2 but cannot be migrated to MySQL and SQLSVR.

SPL can also migrate complicated OLAP statements. Details are explained in How to Write SQL Queries Adaptable to Different Databases.