SPL Simplified SQL - Interline Computing

 

  In the development of database application, we often need to face complex SQL computing, and inter-line computing is one of them. For example, divide the monthly sales by the sales of the previous month to obtain the ratio of last period, or divide the monthly sales by the sales of the same month last year for the same period ratio. Some database systems do not provide window functions of SQL2003 standard, or support is not complete, so it is necessary to change the way of thinking to complete inter-line computing, using join operation instead, which is not only difficult to understand but also inefficient. Even if window functions can be used, there are still some problems such as nested sub-queries, and the SQL statements are still complex and lengthy. SPL can implement inter-line computation through simpler and more intuitive code. Here is an example.

        

  Table sales stores order data for many years, some of which are as follows:

OrderID

Client

SellerId

OrderDate

Amount

10248

VINET

5

2013/7/4

2440

10249

TOMSP

6

2013/7/5

1863.4

10250

HANAR

4

2013/7/8

1813

10251

VICTE

3

2013/7/8

670.8

10252

SUPRD

4

2013/7/9

3730

10253

HANAR

3

2013/7/10

1444.8

10254

CHOPS

5

2013/7/11

625.2

10255

RICSU

9

2013/7/12

2490.5

10256

WELLI

3

2013/7/15

517.8

 

  It is now required to calculate the ratio of last period and the same period ratio for the sales of each month in the specified period according to the table.

 

  SPL codes


A

1

=db.query("select * from sales where   ORDERDATE>=? and ORDERDATE<=?",begin,end)

2

=A1.groups(year(ORDERDATE):y,month(ORDERDATE):m;sum(AMOUNT):mAmount)

3

=A2.derive(mAmount/mAmount[-1]:lrr)

4

=A3.derive(mAmount/mAmount[-12]:yoy)

5

=A4.sort(y:-1,m)

6

>file("sales.csv").export@ct(A5)

 

  A1When querying data from a database by time period, begin and end are grid parameters, such as begin= "2013-07-01" and end= "2015-03-31". Some of the query results are as follows:

   undefined

        

  A2=A1.groups(year(ORDERDATE):y,month(ORDERDATE):m;sum(AMOUNT):mAmount)

  This code groups orders by year and month, and aggregates monthly sales. The groups function can perform grouping aggregation. Its parameters are divided into two parts. Before the semicolon, the grouping expression is: year(ORDERDATE):y,month(ORDERDATE):m. After the semicolon, the aggregation expression is sum(AMOUNT):mAmount. The aggregated field is named mAmount. Part of the calculation results are as follows:

  undefined

 

  A3=A2.derive(mAmount/mAmount[-1]:lrr)

  This code adds a new field lrr on the basis of A2, i.e. monthly ratio to the previous period, and its expression is mAmount/mAmount[-1]. SPL can use [N] or [-N] to express post-N or previous N record relative to the current record, so mAmount in the code represents current sales and mAmount [-1] represents last month sales. The results are as follows:

  undefined

  It should be noted that the ratio of the initial month to the previous period is empty (July 2013).

 

  A4=A3.derive(mAmount/mAmount[-12]:yoy)    

  This code adds a new field yoy on the basis of A4, which is the same period ratio of monthly sales. It should be noted that the same period ratios for the months of the initial year (that is, 2013) are empty. Some of the results are as follows:

  undefined

 

  A5=A4.sort(y:-1,m:-1)

  To look clearer, we rank A5 in reverse chronological order. The result is as follows:

  undefined

 

  A6>file("sales.csv").export@ct(A5))

  This code exports the calculation results to the "sales. csv" file for viewing through Excel and other tools:

  undefined

 

  In addition to exporting data, SPL can also be directly invoked by reporting tools or Java programs. The method of invoking SPL is similar to that of ordinary database. The JDBC interface provided by SPL can return the results of calculation in the form of ResultSet to the Java main program. Specific methods can refer to relevant documents.How to call an SPL script in Java