Group & Summarize a CSV File

Problem description & analysis

Below is CSV file sample.csv:

City,Year,Income

Delhi,2001,12345

Mumbai,2001,43211

Hyderabad,2001,54322

Delhi,2002,76543

Mumbai,2002,43211

Hyderabad,2002,54322

Bangalore,2001,43211

Bangalore,2001,54322

We are trying to group the file by City and calculate average of Income. Below is the desired result:

Bangalore - 48766

Delhi - 44444

Hyderabad - 54322

Mumbai - 43211

Solution

Write the script p1.dfx below in esProc:

A

1

=file("city.csv").import@ct()

2

=A1.groups(City;int(avg(Income))).(#1/" - "/#2)

Explanation

A1  Import the CSV file containing table headers.

A2  Group A1 by City and calculate average of Income, and output the grouped table sequence according to the desired format.

Read How to Call an SPL Script in Java to learn how to integrate the script code into a Java program.

Q & A Collection

https://stackoverflow.com/questions/60948224/approach-to-find-average-by-x-from-a-file