Find Duplicates in SPL

Question

I am working on an upload feature that needs to read data from a .txt file. The file’s format is something like this:
13500000000|1
13500000001|1
13500000002|1
13500000003|1
13500000003|1

I need to check whether the file contains duplicate data and, if it does, prompt the user with a message. Are there any suggestions for doing this? Thanks.

 

Answer

This is simple to handle in SPL (Structured Process Language). Group the records by the first column and select the groups that contain more than one record. The records in those groups are the duplicates.

     A
1    =file("E:\\s.txt").import@i()
2    =A1.group().select(~.len()>1)

A1: Import the contents of s.txt and return them as a sequence.


A2: Group the records and select the groups that hold more than one member. For the sample data, the only such group is the one for 13500000003|1, which appears twice.
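For readers who want to run the same check directly in Java, for example inside the upload handler, the group-and-select logic can be sketched as below. This is only a minimal sketch and not part of the SPL solution: the class name DuplicateFinder and the methods findDuplicates and buildMessage are hypothetical, the message wording is an assumption, and the sample lines are hard-coded rather than read from s.txt.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, named for illustration only.
public class DuplicateFinder {

    // Group identical lines and keep only the groups with more than one member.
    // Returns line -> number of occurrences, in first-seen order.
    public static Map<String, Long> findDuplicates(List<String> lines) {
        Map<String, Long> counts = lines.stream()
                .collect(Collectors.groupingBy(
                        line -> line, LinkedHashMap::new, Collectors.counting()));
        // Drop every line that occurs only once.
        counts.values().removeIf(n -> n < 2);
        return counts;
    }

    // Build the text shown to the user (the wording is an assumption).
    public static String buildMessage(Map<String, Long> duplicates) {
        if (duplicates.isEmpty()) {
            return "No duplicate data found.";
        }
        return duplicates.entrySet().stream()
                .map(e -> e.getKey() + " (x" + e.getValue() + ")")
                .collect(Collectors.joining(", ", "Duplicate data found: ", ""));
    }

    public static void main(String[] args) {
        // Sample lines taken from the question.
        List<String> lines = Arrays.asList(
                "13500000000|1",
                "13500000001|1",
                "13500000002|1",
                "13500000003|1",
                "13500000003|1");
        System.out.println(buildMessage(findDuplicates(lines)));
    }
}
```

In a real upload handler the list would come from the uploaded file, for instance via java.nio.file.Files.readAllLines, and the message would be returned to the user instead of printed.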

An SPL script can be embedded into a Java program for further computation. More details are explained in How to Call an SPL Script in Java.