"Key words：SQL ordered grouping; Adjacent row comparison The sorted (usually sorted by time) data .."

sjr RaqForum 37 No.
341 View • 4 Years ago

How to group according to field value changes in time sequence in SQL

SQL Headache(24) compare rows(1) position(2)

Key words：SQL ordered grouping; Adjacent row comparison

The sorted (usually sorted by time) data is grouped and counted according to the change of a field, that is, the value of the grouping field is compared with the value of the previous row. If it is the same, it is divided into the same group as the previous row, and if it is different, a new group is created.

It's hard to do this with SQL!

The set of SQL is order-less. In early SQL, there was no method of adjacent row reference. In SQL 2003 standard, window function is added, which can reference adjacent row, but grouping is still very difficult, and you need to use subquery to create grouping sequence number artificially.

For example：Query the city where a person stays for a certain period of time, and list the start and end time. The existing database table footmark is as follows:

NAME	FOOTDATE	CITY
Tom	2020-01-02 08:30:00	New York
Tom	2020-01-03 08:30:00	New York
Tom	2020-01-04 13:30:05	New York
Tom	2020-01-04 16:36:00	Washington
Tom	2020-01-05 08:30:00	Washington
Tom	2020-01-06 12:30:00	Washington
Tom	2020-01-06 17:30:25	New York
Tom	2020-01-07 09:30:05	New York
Tom	2020-01-09 16:30:00	New York
…	…	…

The expected result is as follows：

NAME	CITY	STARTDATE	ENDDATE
Tom	New York	2020-01-02 08:30:00	2020-01-04 13:30:05
Tom	Washington	2020-01-04 16:36:00	2020-01-06 12:30:00
Tom	New York	2020-01-06 17:30:25	2020-01-09 16:30:00

Take Oracle as an example. It is written in SQL as follows:

WITH A AS

( SELECT NAME, FOOTDATE, CITY,

CASE WHEN CITY=LAG(CITY) OVER (PARTITION BY NAME ORDER BY FOOTDATE) THEN 0 ELSE ROWNUM END FLAG

FROM FOOTMARK ORDER BY NAME, FOOTDATE),

B AS

( SELECT NAME, FOOTDATE, CITY,

MAX(FLAG) OVER (PARTITION BY NAME ORDER BY FOOTDATE) FLAG

FROM A),

C AS

( SELECT NAME, CITY, FLAG,

MIN(FOOTDATE) STARTDATE,

MAX(FOOTDATE) ENDDATE

FROM B

GROUP BY NAME, CITY, FLAG

ORDER BY NAME, FLAG )

SELECT NAME, CITY, STARTDATE, ENDDATE FROM C;

Here, the flag is an artificial grouping sequence number. The SQL is hard to write and understand.

For grouping by order, it would have been much simpler if you had used esProc SPL. Only one line is needed:

connect("mydb").query("SELECT * FROM FOOTMARK ORDER BY NAME,FOOTDATE").groups@o(NAME,CITY;min(FOOTDATE):STARTDATE,max(FOOTDATE):ENDDATE)

SPL is based on the ordered sets and provides the option @o of grouping by order. It is very easy to solve this problem.

SPL provides multiple grouping methods, such as equivalent grouping, ordered grouping, ordered conditional grouping, sequence number grouping, nested grouping, big data ordered grouping, and ordered conditional grouping of big data. For details, please refer to grouped subsets

esProc SPL is a professional script language to solve difficult SQL problems. It has simple syntax, conforms to natural thinking, and is a natural step-by-step, procedure oriented computing language. It uses a unified syntax independent of database, and the algorithm can be seamlessly migrated between databases. A desktop computing tool, esProc is ready to use, with simple configuration and complete debugging functions: breakpoints and single step execution can be set, and the result of each step can be viewed. Please refer to: http://www.raqsoft.com/html/sql-enhancer.html

It's also easy to embed esProc into Java applications，please refer to How to Call an SPL Script in Java

For specific usage, please refer to Getting started with esProc

SPL Official Website 👉 http://www.scudata.com

SPL Feedback and Help 👉 https://www.reddit.com/r/esProc

SPL Learning Material 👉 http://c.scudata.com

SPL Source Code and Package 👉 https://github.com/SPLWare/esProc

Discord 👉 https://discord.gg/ydhVnFH9

Youtube 👉 https://www.youtube.com/@esProc_SPL

SQL Headache(24) compare rows(1) position(2)

Developer

sjr • 341 View • 4 Years ago