Find the row containing the maximum value from a text file

Here is a semicolon-separated CSV file. Fields whose value is "unknown" represent invalid data.

name;height;mass;hair_color;skin_color;eye_color;birth_year;gender
Luke Skywalker;172;77;blond;fair;blue;19BBY;male
C-3PO;167;75;n/a;gold;yellow;112BBY;n/a
R2-D2;96;32;n/a;white, blue;red;33BBY;n/a
Darth Vader;202;136;none;white;yellow;41.9BBY;male
Leia Organa;150;49;brown;light;brown;19BBY;female
Owen Lars;178;120;brown, grey;light;blue;52BBY;male
Beru Whitesun lars;165;75;brown;light;blue;47BBY;female
Grievous;216;159;none;brown, white;green, yellow;unknown;male
Lily;216;159;none;brown, white;green, yellow;unknown;male
unknown;unknown;216;black;dark;dark;unknown;male
Rey;unknown;unknown;brown;light;hazel;unknown;female
Poe Dameron;unknown;unknown;brown;light;brown;unknown;male

Task: Use Java to filter the data and get the name field of the row that contains the maximum value of the mass field. Requirements: process the data in stream style; skip invalid data; if several names qualify, concatenate them with semicolons, for example Grievous;Lily.

Write the following SPL script:


    A
1   =T@c("data.csv"; ";")
2   =A1.select(mass!="unknown" && name!="unknown")
3   =A2.total(top(1;-mass))
4   =A3.(name).concat(";")

A1: Parse the CSV file as a stream-style two-dimensional table; the @c option reads it through a cursor.

A2: Filter out the invalid data.

A3: Find the record(s) whose mass field value ranks at the top.

A4: Get all values of name field and concatenate them with the semicolon.
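For comparison, the same four steps can be sketched with plain Java streams. This is only an illustration under stated assumptions, not the method used by the SPL script: the class name MaxMassNames, the Person helper and the namesWithMaxMass method are hypothetical, the first line is assumed to be the header, and a row is treated as invalid when its name or mass is "unknown" or its mass is not numeric.

```java
import java.util.List;
import java.util.Objects;
import java.util.OptionalDouble;
import java.util.stream.Collectors;

class MaxMassNames {

    // Hypothetical helper: keeps only the two fields the task needs.
    static final class Person {
        final String name;
        final double mass;

        Person(String name, double mass) {
            this.name = name;
            this.mass = mass;
        }
    }

    // Parse one data line; return null for rows treated as invalid
    // (assumption: name or mass is "unknown", or mass is not a number).
    static Person parse(String line) {
        String[] fields = line.split(";", -1);
        if (fields.length < 3) {
            return null;
        }
        String name = fields[0].trim();
        String mass = fields[2].trim();
        if (name.equals("unknown") || mass.equals("unknown")) {
            return null;
        }
        try {
            return new Person(name, Double.parseDouble(mass));
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Return the names of all valid rows that share the maximum mass,
    // joined with semicolons; an empty string if no row is valid.
    static String namesWithMaxMass(List<String> lines) {
        List<Person> valid = lines.stream()
                .skip(1)                              // assumed header line
                .filter(l -> !l.trim().isEmpty())     // ignore blank lines
                .map(MaxMassNames::parse)
                .filter(Objects::nonNull)             // skip invalid rows
                .collect(Collectors.toList());

        OptionalDouble max = valid.stream().mapToDouble(p -> p.mass).max();
        if (!max.isPresent()) {
            return "";
        }
        double top = max.getAsDouble();
        return valid.stream()
                .filter(p -> p.mass == top)           // keep every tie
                .map(p -> p.name)
                .collect(Collectors.joining(";"));
    }

    public static void main(String[] args) {
        // A few rows from the sample file above, inlined so the sketch runs on its own.
        List<String> lines = List.of(
                "name;height;mass;hair_color;skin_color;eye_color;birth_year;gender",
                "Darth Vader;202;136;none;white;yellow;41.9BBY;male",
                "Grievous;216;159;none;brown, white;green, yellow;unknown;male",
                "Lily;216;159;none;brown, white;green, yellow;unknown;male",
                "unknown;unknown;216;black;dark;dark;unknown;male",
                "Rey;unknown;unknown;brown;light;hazel;unknown;female");
        System.out.println(namesWithMaxMass(lines)); // prints Grievous;Lily
    }
}
```

To read the real file instead of the inlined rows, the list could be obtained with Files.readAllLines(Path.of("data.csv")). The sketch makes two passes over the valid rows (one to find the maximum, one to collect the ties), which keeps the logic easy to follow at the cost of holding the valid rows in memory.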

See "How to Call a SPL Script in Java" to learn how to integrate SPL into a Java application.

Source: https://stackoverflow.com/questions/72018171/filtering-from-csv-files-using-java-stream