Count Continuously Appeared Numbers
【Question】
I need to write a SQL query to find all numbers that continuously appear at least 3 times. For example, 1 is the only number that appears at least 3 times continuously in the following logs table:
+----+-----+
|id|num|
+----+-----+
|1|1|
|2|1|
|3|1|
|4|2|
|5|1|
|6|2|
|7|2|
+----+-----+
Is there any suggestion about how to do this? Detailed solution with explanations is appreciated. Thanks.
Below is a solution:
select distinct(a.num)
from Logs a,Logs b,Logs C
where a.id=b.id+1
and a.num=b.num
and b.id=c.id+1
and b.num=c.num
【Answer】
The above solution self-joins logs tables three times to find 3 continuous records with same num values. It has two weaknesses: 1. Slow when data volume is large; 2. Tedious as you need to self-join the table multiple times if you want to find a number of continuously records with same num values.
SPL (Structured Process Language) supports ordered sets. So there’s no need to join a table itself for an indefinite number of times. This lets you write a simple script, as shown below:
A |
|
1 |
$select id,num from logs |
2 |
=A1.group@o(num) |
3 |
=A2.select(~.len()>=3).(~.num) |
A1: Retrieve data in SQL;
A2: Group rows by comparing each with its next neighbor according to num value;
A3: Get records with num values that appear continuously at least 3 times.
To get records with a number of (n) same num values, just change ~.len()>=3 in A3 to ~.len()>=n.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL