Compare Two CSV Files
【Question】
Scenario: I want to compare two CSV files in Mule and find details such as which rows were added, deleted, or updated. I have searched, but there is no such feature or component. Can a Bash script be used in Mule? The only option I know of is a Java component, but I would like better suggestions or ideas. Please suggest some pointers to get started.
【Answer】
Finding the differences between two CSV files amounts to performing set operations over structured data. Java's standard library has no ready-to-use functions for this, so a hand-written solution takes a fair amount of code. Try using SPL (Structured Process Language) to do the comparison and return the result to the Java application. For example, the following SPL script finds newly added rows by the composite primary key (userName, date):
| | A | B |
|---|---|---|
| 1 | =file("D:\\old.csv").import@t(;",").sort(userName,date) | =file("D:\\new.csv").import@t(;",").sort(userName,date) |
| 2 | =[B1,A1].merge@d(userName,date) | |
You can also find the deleted or updated rows in SPL. SPL is available as a Java class library for processing structured data, so the script can be easily embedded into a Java application.
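For comparison, the three set operations (added, deleted, updated) can be sketched in plain Java. This is only an illustration of what the comparison computes, not the SPL approach described above. It assumes each file has a header row, that no field contains a quoted comma, and that userName and date together form the key. The class name `CsvDiff` and its methods are hypothetical names chosen for this sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares two CSV files by a composite key.
final class CsvDiff {

    /** Rows grouped by how they differ between the old and the new file. */
    static final class Result {
        final List<String> added = new ArrayList<>();
        final List<String> deleted = new ArrayList<>();
        final List<String> updated = new ArrayList<>();
    }

    // Maps composite key -> full row text. lines.get(0) must be the header.
    // Assumption: a plain split on "," is enough (no quoted commas).
    static Map<String, String> index(List<String> lines, String... keyCols) {
        List<String> header = Arrays.asList(lines.get(0).split(",", -1));
        int[] pos = new int[keyCols.length];
        for (int i = 0; i < keyCols.length; i++) {
            pos[i] = header.indexOf(keyCols[i]);
            if (pos[i] < 0) {
                throw new IllegalArgumentException("Missing column: " + keyCols[i]);
            }
        }
        Map<String, String> rows = new LinkedHashMap<>();
        for (String line : lines.subList(1, lines.size())) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(",", -1);
            StringBuilder key = new StringBuilder();
            for (int p : pos) {
                key.append(fields[p]).append('\u0001'); // separator unlikely to occur in data
            }
            rows.put(key.toString(), line);
        }
        return rows;
    }

    static Result compare(List<String> oldLines, List<String> newLines, String... keyCols) {
        Map<String, String> oldRows = index(oldLines, keyCols);
        Map<String, String> newRows = index(newLines, keyCols);
        Result r = new Result();
        for (Map.Entry<String, String> e : newRows.entrySet()) {
            String before = oldRows.get(e.getKey());
            if (before == null) {
                r.added.add(e.getValue());      // key exists only in the new file
            } else if (!before.equals(e.getValue())) {
                r.updated.add(e.getValue());    // same key, different row content
            }
        }
        for (Map.Entry<String, String> e : oldRows.entrySet()) {
            if (!newRows.containsKey(e.getKey())) {
                r.deleted.add(e.getValue());    // key exists only in the old file
            }
        }
        return r;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java CsvDiff <old.csv> <new.csv>");
            return;
        }
        Result r = compare(Files.readAllLines(Paths.get(args[0])),
                           Files.readAllLines(Paths.get(args[1])),
                           "userName", "date");
        System.out.println("added:   " + r.added);
        System.out.println("deleted: " + r.deleted);
        System.out.println("updated: " + r.updated);
    }
}
```

Unlike the merge-based SPL script, this sketch indexes rows in hash maps, so the inputs do not need to be sorted first. Rows are compared as raw text, which means a change in whitespace or number formatting counts as an update.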
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProc_SPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL