String Split – Dynamically Count Records Containing Certain Strings
【Question】
I have a table:
id,zb1,zb2
1,abcd,adfg
2,adef’,edf
3,ae,acd
4,bde,bc
For each of values a, b, c, d, e, f, g, I want to find the number of records where zb1 field contains it and the number of records where zb2 contains it. Below is the desired result:
zb1 zb2
a:3 a:2
b:2 b:1
c:1 c:2
d:3 d:3
e:1 e:0
f:1 f:1
g:0 g:1
The table structure:
CREATE TABLE `cs`(
`id`int(5)NOT NULL,
`zb1`varchar(10)NOT NULL,
`zb2`varchar(10)NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Output data from table`cs`:
INSERT INTO`cs`VALUES(1,‘abcd’,‘adfg’);
INSERT INTO`cs`VALUES(2,‘adef’,‘edf’);
INSERT INTO`cs`VALUES(3,‘ae’,‘acd’);
INSERT INTO`cs`VALUES(0,‘bde’,‘bc’);
【Answer】
It’s inconvenient to perform string split in SQL, let alone there will be a dynamic result set. If the data volume is small, you can import data into the memory. Here’s how I handle the task in SPL (Structured Process Language):
A |
|
1 |
$select zb1,zb2 from tb |
2 |
=A1.conj(zb1.split("")) |
3 |
=A1.conj(zb2.split("")) |
4 |
=(A2|A3).id() |
5 |
=A4.new(~/":"/A2.count(~==A4.~):zb1,~/":"/A3.count(~==A4.~):zb2) |
A1: Import the data table;
A2,A3: Split every value in zb1 field and zb2 field respectively and concatenate them again;
A4: Get distinct members;
A5: Create a new table sequence and count the records containing those values respectively in each of the two fields.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL