How to group according to field value changes in time sequence in SQL
Key words:SQL ordered grouping; Adjacent row comparison
The sorted (usually sorted by time) data is grouped and counted according to the change of a field, that is, the value of the grouping field is compared with the value of the previous row. If it is the same, it is divided into the same group as the previous row, and if it is different, a new group is created.
It's hard to do this with SQL!
The set of SQL is order-less. In early SQL, there was no method of adjacent row reference. In SQL 2003 standard, window function is added, which can reference adjacent row, but grouping is still very difficult, and you need to use subquery to create grouping sequence number artificially.
For example:Query the city where a person stays for a certain period of time, and list the start and end time. The existing database table footmark is as follows:
NAME |
FOOTDATE |
CITY |
Tom |
2020-01-02 08:30:00 |
New York |
Tom |
2020-01-03 08:30:00 |
New York |
Tom |
2020-01-04 13:30:05 |
New York |
Tom |
2020-01-04 16:36:00 |
Washington |
Tom |
2020-01-05 08:30:00 |
Washington |
Tom |
2020-01-06 12:30:00 |
Washington |
Tom |
2020-01-06 17:30:25 |
New York |
Tom |
2020-01-07 09:30:05 |
New York |
Tom |
2020-01-09 16:30:00 |
New York |
… |
… |
… |
The expected result is as follows:
NAME |
CITY |
STARTDATE |
ENDDATE |
Tom |
New York |
2020-01-02 08:30:00 |
2020-01-04 13:30:05 |
Tom |
Washington |
2020-01-04 16:36:00 |
2020-01-06 12:30:00 |
Tom |
New York |
2020-01-06 17:30:25 |
2020-01-09 16:30:00 |
Take Oracle as an example. It is written in SQL as follows:
WITH A AS
( SELECT NAME, FOOTDATE, CITY,
CASE WHEN CITY=LAG(CITY) OVER (PARTITION BY NAME ORDER BY FOOTDATE) THEN 0 ELSE ROWNUM END FLAG
FROM FOOTMARK ORDER BY NAME, FOOTDATE),
B AS
( SELECT NAME, FOOTDATE, CITY,
MAX(FLAG) OVER (PARTITION BY NAME ORDER BY FOOTDATE) FLAG
FROM A),
C AS
( SELECT NAME, CITY, FLAG,
MIN(FOOTDATE) STARTDATE,
MAX(FOOTDATE) ENDDATE
FROM B
GROUP BY NAME, CITY, FLAG
ORDER BY NAME, FLAG )
SELECT NAME, CITY, STARTDATE, ENDDATE FROM C;
Here, the flag is an artificial grouping sequence number. The SQL is hard to write and understand.
For grouping by order, it would have been much simpler if you had used esProc SPL. Only one line is needed:
connect("mydb").query("SELECT * FROM FOOTMARK ORDER BY NAME,FOOTDATE").groups@o(NAME,CITY;min(FOOTDATE):STARTDATE,max(FOOTDATE):ENDDATE)
SPL is based on the ordered sets and provides the option @o of grouping by order. It is very easy to solve this problem.
SPL provides multiple grouping methods, such as equivalent grouping, ordered grouping, ordered conditional grouping, sequence number grouping, nested grouping, big data ordered grouping, and ordered conditional grouping of big data. For details, please refer to grouped subsets
esProc SPL is a professional script language to solve difficult SQL problems. It has simple syntax, conforms to natural thinking, and is a natural step-by-step, procedure oriented computing language. It uses a unified syntax independent of database, and the algorithm can be seamlessly migrated between databases. A desktop computing tool, esProc is ready to use, with simple configuration and complete debugging functions: breakpoints and single step execution can be set, and the result of each step can be viewed. Please refer to: http://www.raqsoft.com/html/sql-enhancer.html
It's also easy to embed esProc into Java applications,please refer to How to Call an SPL Script in Java
For specific usage, please refer to Getting started with esProc
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL