Finding Duplicate Values in Multiple Colums in a SQL Table and Count for chars

Question

Source:https://stackoverflow.com/questions/70513319/finding-duplicate-values-in-multiple-colums-in-a-sql-table-and-count-for-chars

I have the following table structure:

idname1name2name3...

1HansPeterFrank

2HansFrankPeter

3HansPeterFrank

4PaulPeterHans

 

I use the following command to display duplicates and counts:

SELECTCOUNT(name1),name1,name2,name3

FROMtable

GROUPBYname1,name2,name3

HAVING(COUNT(name1)>1)AND(COUNT(name2)>1)AND(COUNT(name3)>1)

 

This command gives me a count of 2. I would like to know how the second line could also be counted as a duplicate.

Unfortunately, the solution to the original question (Finding duplicate values in multiple colums in a SQL table and count) does not work for char.

Answer

You just need to sort rows by all columns, group them, count the duplicates, and find the record with the most duplicates. It is rather a hassle to get this written in SQL, which uses a nested query containing window function. The SQL statement is long and difficult to read. An alternative is to export data out of database and handle it in Python or SPL. SPL, the open-source Java package, is easier to be integrated into a Java program and generate much simpler code. It finishes the task with only three lines of code:

A

1

=sqlite.query("select name1,name2,name3 from names")

2

=A1.group([#1,#2,#3].sort();~.len()).maxp(#2)

3

=create(count,name1,name2,name3).record([A2.#2,A2.#1(1),A2.#1(2),A2.#1(3)])

 

View SPL source code.