Calculate the growth of TopN in one line

 

For example, suppose we want to calculate the PM2.5 growth rate on the three most polluted days of a year, each relative to the day before. This growth rate shows whether severe pollution appeared suddenly or built up gradually. Finding the three most polluted days with an SQL statement is not difficult:

select top 3 * from T order by pm25 desc

But the later steps are more troublesome. For each of the three days, we have to find the previous day and divide the day's value by the previous day's value. Unless you are an SQL expert, this is hard to write correctly.


Describing the calculation in esProc SPL makes the process clear. First, get the raw data from the database:

>T=connect("mysqlDB").query("select * from T")

Then a single line of code completes the whole calculation:

>t3=T.ptop(-3, pm25),t3=t3.run(~=T(~).pm25/T(~-1).pm25-1)

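This works because SPL supports ordered-set calculation: it is easy to get the position of a record in the set, and just as easy to look up other records by relative or absolute position. In the line above, ptop(-3, pm25) yields the positions of the three records with the highest pm25, and T(~) and T(~-1) fetch the record at each position and the record just before it. Note that the expression relies on the rows of T being in date order.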
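
To make the position-based logic concrete, here is a minimal Python sketch of what the SPL line above appears to compute. It is an illustration only, not how esProc implements ptop. It assumes the records are already sorted by date; the sample readings and the helper names top_positions and growth_rates are hypothetical.

```python
# Hypothetical daily PM2.5 readings, already ordered by date.
readings = [
    {"date": "2023-01-01", "pm25": 40.0},
    {"date": "2023-01-02", "pm25": 50.0},
    {"date": "2023-01-03", "pm25": 200.0},
    {"date": "2023-01-04", "pm25": 100.0},
    {"date": "2023-01-05", "pm25": 150.0},
    {"date": "2023-01-06", "pm25": 75.0},
]


def top_positions(rows, n, key):
    """Return the 0-based positions of the n rows with the largest key."""
    order = sorted(range(len(rows)), key=lambda i: rows[i][key], reverse=True)
    return order[:n]


def growth_rates(rows, positions, key):
    """For each position, compute value / previous-day value - 1.

    Position 0 has no previous day, so its rate is reported as None.
    """
    rates = []
    for i in positions:
        if i == 0:
            rates.append(None)
        else:
            rates.append(rows[i][key] / rows[i - 1][key] - 1)
    return rates


# Step 1: positions of the three most polluted days.
positions = top_positions(readings, 3, "pm25")
# Step 2: growth rate of each of those days relative to the day before.
rates = growth_rates(readings, positions, "pm25")

for i, rate in zip(positions, rates):
    print(readings[i]["date"], rate)
```

The key design point is that the first step returns positions rather than records, so the second step can reach the neighbouring record by simple index arithmetic. Unlike SPL, Python positions start at 0, and the sketch handles the first day explicitly because it has no predecessor.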

SPL extends SQL's top-N for ordered sets. Top-N in SPL can return the values, the records, or the positions of the records in the set, which covers a wider range of calculations. It can also be applied to each subset after grouping, which makes post-grouping operations more powerful. Refer to TopN and variants.
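
The three kinds of result and the per-group use can be sketched in Python as follows. This is a rough illustration under assumptions, not SPL's API: the function top_n, its result parameter, and the two-city sample data are all hypothetical.

```python
from collections import defaultdict

# Hypothetical readings from two cities, ordered by day within each city.
rows = [
    {"city": "A", "day": 1, "pm25": 80},
    {"city": "A", "day": 2, "pm25": 120},
    {"city": "A", "day": 3, "pm25": 95},
    {"city": "B", "day": 1, "pm25": 60},
    {"city": "B", "day": 2, "pm25": 30},
    {"city": "B", "day": 3, "pm25": 70},
]


def top_n(rows, n, key, result="record"):
    """Return the top n rows by key as values, records, or 0-based positions."""
    order = sorted(range(len(rows)), key=lambda i: rows[i][key], reverse=True)[:n]
    if result == "position":
        return order
    if result == "value":
        return [rows[i][key] for i in order]
    return [rows[i] for i in order]


# Split the rows into one subset per city, keeping their original order.
groups = defaultdict(list)
for row in rows:
    groups[row["city"]].append(row)

# Apply top-N to each grouped subset, once per kind of result.
top2_values = {c: top_n(g, 2, "pm25", "value") for c, g in groups.items()}
top2_positions = {c: top_n(g, 2, "pm25", "position") for c, g in groups.items()}
top2_records = {c: top_n(g, 2, "pm25") for c, g in groups.items()}

print(top2_values)
print(top2_positions)
```

Positions here are relative to each city's subset, which is what makes a follow-up step such as "compare with the previous day in the same city" straightforward.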


When the data is not in a database, SPL can still perform complex calculations conveniently:

=file("d:/t.csv").import(;,",").enum...

It is also easy to embed esProc into Java applications. Please refer to How to Call an SPL Script in Java.

For specific usage, please refer to Getting started with esProc.