Removing Empty Lines in Parsing Structured Data

Question

I have text data like this:

name = abc

id = 123

Place = xyz

Details = some texts with two line

 

name = aaa

id = 54657

Place = dfd

Details = some texts with some lines

 

I need to put this data into a table or CSV file, and my output should look like:

name       id     Place       Details   

abc        123     xyz         Some texts with two line

aaa        54657   dfd         Some texts with some lines

 

Answer

I don’t know how many empty lines there are in your text data, but you just need to delete the empty lines, retrieve the data to the right of the equals signs, group the values every four lines, and populate an empty two-dimensional table with each group. Since the process is rather complicated to hardcode in Java, you can write it in esProc SPL and then integrate the SPL script into your Java program via JDBC. Here’s the SPL script:

      A

1     =file("D:\\source.txt").import@i()

2     =A1.select(~).(~.split@t("=")(2))

3     =A2.group((#-1)\4)

4     =A3.new(~(1):name,~(2):id,~(3):place,~(4):details)

A1: Import the text file as a sequence with one member per line;

A2: Drop the empty members of A1, split each remaining member into a sequence on the separator "=", trim the spaces at both ends of each of the two parts, and then return the second part, that is, the data to the right of the equals sign;

A3: Group A2’s sequence into groups of four members each, in order;

A4: Create a new table sequence with the fields name, id, place and details from A3’s groups to hold the final result set.
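
For readers who want to follow the logic outside SPL, the same four steps can be sketched in Python. This is only an illustration, not part of the SPL solution: the names FIELDS and parse_records are hypothetical, and the sketch assumes that every non-empty line contains an equals sign and that each record has exactly four lines.

```python
FIELDS = ["name", "id", "place", "details"]  # hypothetical column names, mirroring A4


def parse_records(text, size=len(FIELDS)):
    """Turn 'key = value' lines into a list of dicts, one dict per record."""
    # A1 and the select in A2: split into lines, drop empty or whitespace-only ones
    lines = [line for line in text.splitlines() if line.strip()]
    # The split in A2: keep the trimmed text to the right of the first "="
    values = [line.split("=", 1)[1].strip() for line in lines]
    # A3: group the values every `size` members, in order
    groups = [values[i:i + size] for i in range(0, len(values), size)]
    # A4: attach a field name to each member of a group
    return [dict(zip(FIELDS, group)) for group in groups]


sample = """name = abc

id = 123

Place = xyz

Details = some texts with two line

name = aaa

id = 54657

Place = dfd

Details = some texts with some lines
"""

records = parse_records(sample)
for record in records:
    print(record)
```

Each dict in records corresponds to one row of the table built in A4.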

You can export A4’s result set to a text file directly:

A5    =file("D:\\result.txt").export@t(A4)
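
To show the kind of file such an export yields, here is a rough Python sketch that writes a header row followed by one row per record. It is not the SPL export function itself. The tab-separated layout is an assumption, the name write_table is hypothetical, the two records are taken from the question, and an in-memory buffer stands in for the result file.

```python
import csv
import io

FIELDS = ["name", "id", "place", "details"]  # hypothetical column names


def write_table(records, out, delimiter="\t"):
    """Write a header row and one delimited row per record to a text stream."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, delimiter=delimiter,
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)


records = [
    {"name": "abc", "id": "123", "place": "xyz",
     "details": "some texts with two line"},
    {"name": "aaa", "id": "54657", "place": "dfd",
     "details": "some texts with some lines"},
]

# Stands in for open("result.txt", "w", newline="") in a real run
buffer = io.StringIO()
write_table(records, buffer)
print(buffer.getvalue())
```

Passing delimiter="," instead would produce a CSV file, which is the other format the question asks about.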

Or update the result set to the database:

A6    =myDB1.update@i(A4, tableName,name:name,id:id,place:place,details:details;id)