Split a CSV File into Separate Files by Specific Columns

 

Problem description & analysis

Below is the content of CSV file csv.csv:

X,Y,Z,Sc1,Sc2,Sc3,Sc4

1,0,0,5,7,9,10

0,1,1,6,8,4,0

0,0,1,3,3,8,2

We are trying to divide the file into four smaller files, in which X, Y, and Z fields are always retained. Below is the expected result:

File 1       | File 2       | File 3       | File 4

----------------------------------------------------------

[Sc1, X,Y,Z] | [Sc2, X,Y,Z] | [Sc3, X,Y,Z] | [Sc4,  X,Y,Z]

[5,   1,0,0] | [7,   1,0,0] | [9,   1,0,0] | [10,   1,0,0]

[6,   0,1,1] | [8,   0,1,1] | [4,   0,1,1] | [0,   0,1,1]

[3,   0,0,1] | [3,   0,0,1] | [8,   0,0,1] | [2,   0,0,1]

Solution

We write the following script p1.dfx in esProc:

A

1

=file("csv.csv").import@cw()

2

=4.(file("File"/~/".csv"))

3

=A2.run(~.write(A1.([~(3+A2.#),~(1),~(2),~(3)]).concat@nc()))

Explanation:

A1   Import the CSV file as a sequence of sequences.

A2  The n files are File[n].csv, in which n is 4.

A3  Get desired columns from A1’s sequence of sequences in order, concatenate them into a string to write into the corresponding file.

See How to Call an SPL Script in Java to learn about the integration of an SPL script with a Java program.

Q & A Collection

https://stackoverflow.com/questions/63346048/i-need-to-take-a-csv-file-and-split-it-into-separate-files-based-on-column-heade