# Order-based Aggregation by Intervals

Question

The following data table has two fields – a and b.

 a b 10 3 21 4 33 6 46 7 52 10

This is the result I want:

 a<10 sum(b) a<20 sum(b) a<30 sum(b) a<40 sum(b) a<50 sum(b)

Values of a field can be divided into over 1000 intervals.

Answer

To aggregate data by intervals in SQL:

with

cr as

(

select “a<10” x, b b from test3 where a < 10 union all

select “a<20” x, b b from test3 where a < 20 union all

select “a<30” x, b b from test3 where a < 30 union all

select “a<40” x, b b from test3 where a < 40 union all

select “a<50” x, b b from test3 where a < 50

)

select x,sum(b) from cr group by x;

It’s complicated when there are many conditional intervals.

SPL can divide a series of numbers into multiple intervals dynamically to generate a conditional sequence, perform aggregation by each condition, and generate the resulting table sequence. No matter the number of the intervals, the SPL solution is always a piece of serval-line code:

 A 1 =connect(“db”) 2 =A1.query(“select   a,b from test3 order by a”) 3 =(A2.m(-1).a – 1)\10+1 4 =A3.new(~*10:C,A2.select(a

A1: Connect to database db;

A2: Retrieve data from the data table and order data by a field;

A3: Find the last interval number. A2.m() function finds a certain record from A2’s table sequence with the parameter being the record’s sequence number; a negative number represents a reciprocal. Here we get the value of a field in the last record and find the interval where the largest value of a field falls. The last a value – 52– is contained in the interval [51,60], whose sequence number is 6;

A4: new() function generates a new table sequence by calculating field values of A3’s record sequence. Each parameter corresponds to a new field value; the colon introduces a new field name. As the result of A3 is 6, A3.new() loops over the records 6 times; the sign ~ represents the loop number. Values of C field are loop number*10; values of D field are sums of the results of query performed over A2’s table sequence according to the condition a<C.

A2

A3

A4