How to Delete Rows with Duplicates in a Specific Column

Problem description & analysis

Below is the data of Excel file book1.xlsx:

..

We are trying to perform distinct by column: find duplicate values in Container column and remove the rows where these values are located. Below is the expected result:

..

Solution & explanation

We write the following script (p1.dfx) in esProc:


A

1

=clipboard().import@t()

2

=A1.group(Container).select(~.len()==1).conj()

3

=file("result.xlsx").xlsexport@t(A2)

Explanation:

A1   Import data from the clipboard as a table sequence.

A2  Group the table sequence by Container column, get groups that contain only one record, and concatenate the eligible groups.

A3  Export A2s result to result.xlsx.

After the code is executed, result.xlsx is what we expect.

Q & A Collection

https://stackoverflow.com/questions/64219953/remove-duplicates-based-on-specific-column-name