A Join Between CSV & JSON to Find Their Difference

Question

I have two CSV files: File 1 defines standard CSV data file and File 2 is the mapping file of File 1. There’s also a Json file. Now I want to map those two files and compare the data of File 1 with Json and output the mismatches as a CSV. So, how can I achieve that?

File 1:

key,id,value,name,number

 

1,322,re43,dasu,555-456

 

File 2:

Rsat_key,key

 

Retina_id,id

 

reason_value,value

 

real_name,name

 

work_ph,phone

 

Json

{

 

 "Rsat_key":1,

 

 "Retina_id":322,

 

 "reason_value":,

 

 "real_name":dasu,

 

 "work_ph":,555-456,

 

}

 

Answer

Both File1 and JSON are multiline structured data. To compare them you just need to join them by key words and then perform the conditional query. Since Java lacks a class library for handling structured data, it achieves the goal in rather complicated and difficult way. You can get it done in esProc SPL. The Structured Process Language generates intuitive and easy code:

 

A

1

=file("f1.csv").import@ct()

2

=file("f2.csv").import@ct()

3

=json(file("3.json").read())

4

=condition=A2.("_1."+_2+"!=_2."+_1).concat("||")

5

=join@1(A1,${csvKey};A3,${A2.select(_2==csvKey)._1})

6

=A5.select(${condition})

A1: Import column names and detailed records from f1.csv.

A2: Import records from f2.csv and use the default column names _1 and _2.

A3: Read in contents of Json and parse them into structured data.

A4: Compose strings of query conditions.

undefined

A5: Perform a left join. csvKey is a parameter that inputs File1’s primary key. By default join() function performs an inner join, but with @1 it performs a left join. The macro ${} dynamically parses the specified string as an expression.

undefined

A6: Perform a conditional query over A5’s result to return eligible records.

A Java program calls the SPL script via esproc JDBC. Details are explained in How to Call an SPL Script in Java.