jinxing RaqForum 41 No.
162 View • 5 Months ago

Clear duplicate lines and lines having missing values from a csv file

In the csv file below, some lines have null values, some have NaN values, and there are duplicate lines.

Sno,Country,noofDeaths
1,,32432
2,Pakistan,NaN
3,USA,3332
4,RUSSIA,
5,JAPAN,567
3,USA,3332

Use Java to delete every line that contains a null (empty) value or a NaN value, and to remove the duplicate lines. Below is the expected result:

Sno,Country,noofDeaths
3,USA,3332
5,JAPAN,567

Write the SPL script:

A1: Parse the csv file as a two-dimensional table.

A2: Convert each record of the table to a sequence of its field values and intersect it with [null,NaN]. Keep only the records whose intersection is empty, that is, the records that contain neither null nor NaN.

A3: Group A2’s records, and get the first record from each group while keeping the original order.
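For comparison, the same three steps can be sketched in plain Java. This is only a rough illustration, not the SPL script itself: the class name CsvCleaner and the methods clean and hasMissingValue are made up for this example, the first non-blank line is assumed to be the header, and a field is assumed to be missing when it is empty or reads "null" or "NaN". The comma split is naive and does not handle quoted fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CsvCleaner {

    // Assumption: a field counts as missing when it is empty, "null", or "NaN".
    static boolean hasMissingValue(String[] fields) {
        for (String field : fields) {
            String value = field.trim();
            if (value.isEmpty()
                    || value.equalsIgnoreCase("null")
                    || value.equalsIgnoreCase("NaN")) {
                return true;
            }
        }
        return false;
    }

    // Keeps the header, drops lines with missing values, and keeps only the
    // first occurrence of each remaining line, in the original order.
    static List<String> clean(List<String> lines) {
        List<String> result = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        boolean headerDone = false;
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            if (!headerDone) {
                result.add(line); // first non-blank line is treated as the header
                headerDone = true;
                continue;
            }
            // limit -1 keeps trailing empty fields, e.g. "4,RUSSIA," yields 3 fields
            String[] fields = line.split(",", -1);
            if (hasMissingValue(fields)) {
                continue;
            }
            if (seen.add(line)) { // add() returns false for a duplicate line
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "Sno,Country,noofDeaths",
                "1,,32432",
                "2,Pakistan,NaN",
                "3,USA,3332",
                "4,RUSSIA,",
                "5,JAPAN,567",
                "3,USA,3332");
        clean(input).forEach(System.out::println);
    }
}
```

Running main on the sample data prints the header followed by 3,USA,3332 and 5,JAPAN,567, which matches the expected result above. To work on a real file, the lines could be read with java.nio.file.Files.readAllLines and passed to clean.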
