Join Up Multiple Same-format 2D Tables

Question

Is there a Java library that would allow me to parse CSV files that have headers defined on certain lines? Here's an example for such a CSV:

$ID$,$Customer$

Cust1, Jack

Cust2 , Rose

$Name$,$Location$

Sherlock,London

Clouseau,Paris

 

The "$" symbol indicates the presence of headers on that line, and the values in subsequent rows map to these headers.

 

Answer

As I understand the question, the file holds several two-dimensional tables. Each has the same number of rows, and the first row of each is its header row, marked by "$". You need to join these two-column tables side by side into one wider, standard table. The algorithm is: split the rows into separate tables, starting a new table at every row that begins with "$"; create an empty 2D table whose column headers are the first rows of all the tables combined; then, beginning from the 2nd row, take the rows with the same sequence number from every table, concatenate them in order, and write the result into the empty table.

The algorithm involves grouping, order-based operations and a dynamically structured 2D table, which takes a fair amount of code in Java. It is simple to do in SPL (Structured Process Language):

     A
1    =file("d:\\source.csv").import@c()
2    =A1.group@i(left(#1,1)=="$")
3    =create(${A2.conj(~(1).array()).concat@c()})
4    =to(2,A2(1).len()).conj((t=~,A2.(~(t).array()).conj()))
5    =A3.record(A4)

A1: Import source.csv as a 2D table.

A2: Group the rows. A new group starts at every row whose first field begins with "$", so each group holds one header row followed by its data rows.

A3: Create a new 2D table whose column headers are the values of the first row of every group, taken in group order.

A4: Beginning from the 2nd row, take the row with the same sequence number from every group, concatenate those rows in group order, and collect all the values into one sequence.

A5: Populate A3’s table sequence row by row with the members of A4’s sequence, in order.
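
For comparison, here is a rough plain-Java sketch of the same grouping and joining steps. It is not a translation of the SPL script and does not come from any library: the class name HeaderBlockJoiner and both method names are made up for illustration. It also makes a few assumptions the original does not state: fields are split on plain commas with no quoting support, surrounding whitespace is trimmed, the "$" markers are stripped from the merged headers, and blocks with differing row counts are rejected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of SPL or any Java library.
class HeaderBlockJoiner {

    // Split the lines into blocks. A line whose first field starts with '$'
    // opens a new block, so each block is one header row plus its data rows.
    static List<List<String[]>> groupBlocks(List<String> lines) {
        List<List<String[]>> blocks = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            String[] fields = line.split(",", -1); // assumption: no quoted commas
            for (int i = 0; i < fields.length; i++) {
                fields[i] = fields[i].trim();      // assumption: whitespace is noise
            }
            if (fields[0].startsWith("$")) {
                blocks.add(new ArrayList<>());
            } else if (blocks.isEmpty()) {
                throw new IllegalArgumentException("data row before any header row");
            }
            blocks.get(blocks.size() - 1).add(fields);
        }
        return blocks;
    }

    // Join the blocks side by side. Row 0 of the result is the merged header
    // row (with '$' removed); row r is the r-th row of every block concatenated
    // in block order.
    static List<List<String>> join(List<List<String[]>> blocks) {
        List<List<String>> table = new ArrayList<>();
        if (blocks.isEmpty()) {
            return table;
        }
        int rows = blocks.get(0).size();
        for (List<String[]> block : blocks) {
            if (block.size() != rows) {
                throw new IllegalArgumentException("blocks differ in row count");
            }
        }
        for (int r = 0; r < rows; r++) {
            List<String> merged = new ArrayList<>();
            for (List<String[]> block : blocks) {
                for (String field : block.get(r)) {
                    merged.add(r == 0 ? field.replace("$", "") : field);
                }
            }
            table.add(merged);
        }
        return table;
    }
}
```

Calling HeaderBlockJoiner.join(HeaderBlockJoiner.groupBlocks(lines)) on the six non-blank lines of the sample file gives three rows: ID, Customer, Name, Location; then Cust1, Jack, Sherlock, London; then Cust2, Rose, Clouseau, Paris.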

The SPL script is integration-friendly. See "How to Call an SPL Script in Java" to learn how to integrate it with a Java application.