Batch Convert a Large JSON File Where Each Row Is a Record into a CSV File

Problem description & analysis

We have a text file json.txt. Below is part of the data:

{"name":"abc1","age":20}

{"name":"abc2","age":20}

{"name":"abc3","age":20...

{"name":"abc4","age":20}

{"name":"abc5","age":20}

Each row is a JSON string, but the file contains dirty data, such as row 3, where the braces are unmatched. We want to remove the dirty rows and export the rest of the data to a CSV file. Below is part of the desired result:

name,age

abc1,20

abc2,20

abc4,20

abc5,20
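Before looking at the SPL script, the dirty-row detection can be sketched in plain Python: a row is "dirty" exactly when it fails to parse as JSON. This is a minimal sketch with hypothetical inline sample data mirroring json.txt; it is not part of the original solution.

```python
import json

# Hypothetical sample mirroring json.txt; row 3 is truncated (dirty)
lines = [
    '{"name":"abc1","age":20}',
    '{"name":"abc2","age":20}',
    '{"name":"abc3","age":20',   # unmatched brace: dirty row
    '{"name":"abc4","age":20}',
    '{"name":"abc5","age":20}',
]

def parse_or_none(line):
    """Return the parsed record, or None when the JSON is malformed."""
    try:
        return json.loads(line)
    except json.JSONDecodeError:
        return None

# Keep only the rows that parse successfully
clean = [r for r in map(parse_or_none, lines) if r is not None]
```

Here `clean` holds four records; the truncated third row is dropped, matching the desired result above.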

Solution

Write the following script p1.dfx in esProc:

A1  =file("json.txt").cursor@si()

A2  =A1.(json(~)).select(ifr(~) && ~.fno()==2)

A3  =file("csv.csv").export@ct(A2)

Explanation:

A1  Create a cursor for the text file and return it; the cursor is closed automatically after all data is scanned.

A2  Loop through A1’s cursor to parse each JSON string into a record; json() returns null when the braces cannot be matched (the dirty data), and select() then keeps only the records that have exactly two fields.

A3  Export A2’s result to csv.csv.

Read How to Call an SPL Script in Java to learn how to integrate the SPL script into a Java application.

Q & A Collection

https://stackoverflow.com/questions/62703575/how-to-process-larger-files-from-json-to-csv-using-spring-batch