Split Strings into Sequences for Comparison
【Question】
I have an author field. Each value is like this: author1, author2, author3. I want to get records containing same authors (it’s not necessarily that all of them are same) but in different orders. For example, “Tom,Jack,Bob” and “Bob,Tom,Jack”.
【Answer】
You can perform string split in SQL. But it’s much easier to handle it in SPL (Structured Process Language). Here’s the SPL script:
A |
|
1 |
$select id,author from tb |
2 |
=A1.group(author.split().sort()).select(~.count()>1).conj() |
A2 splits each author value as a sequence, and sorts and groups the field. Then select groups containing repeated members but in different order.
Suppose data looks like this:
id author
1 Tom, Jack, Bob
2 Jack, Bob, Tom
3 May, Tom, Jack
4 Bob, Jack, Tom
5 aaa,bbb, Tom
6 Tom, May, Jack
Here’s the desired result:
1 Tom, Jack, Bob
2 Jack, Bob, Tom
4 Bob, Jack, Tom
6 Tom, May, Jack
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL