How to Calculate Some Specific Data Function from the Data of a Large CSV File

Question

Source: https://stackoverflow.com/questions/66753971/how-to-calculate-some-specific-data-function-from-the-data-of-a-large-csv-file

I'm trying to work out the most expensive county to rent a building in, using data from a CSV file. The data from each column I need has been put into a list. The user sets the price range, so the outermost for loop and if statement ensure that only buildings within that range are considered.

The price of a building is also slightly complicated, because the actual price is the minimum stay x the price.

In the code below I am trying to get the average property value of one county, just so I can get the basic structure right before I carry on, but I'm kind of lost at this point. Any help would be much appreciated.

    public int sampleMethod()
    {
        ArrayList<String> county = new ArrayList<String>();
        ArrayList<Integer> costOfBuildings = new ArrayList<Integer>();
        ArrayList<Integer> minimumStay = new ArrayList<Integer>();
        ArrayList<Integer> minimumBuildingCost = new ArrayList<Integer>();

        try {
            // Code to read data from the CSV and put the data in the lists.
        }
        catch (IOException | URISyntaxException e) {
            // Some code.
        }

        int count = 0;
        int avgCountyPrice = 0;
        int countyCount = 0;

        for (int cost : costOfBuildings) {
            if (costOfBuildings.get(count) >= controller.getMin() && costOfBuildings.get(count) <= controller.getMax()) {
                for (String currentCounty : county) {
                    for (int currentMinimumStay : minimumStay) {
                        if (currentCounty.equals("samplecounty")) {
                            countyCount++;
                            int temp = nightsPermitted * cost;
                            avgCountyPrice = avgCountyPrice + temp / countyCount;
                        }
                    }
                }
            }
            count++;
        }

        return avgCountyPrice;
    }

Here is a sample table to depict what the CSV looks like. Also, the CSV file has more than 50,000 rows.

    name      county    price   minStay
    Morgan    lydney    135     5
    John      sedury    34      1
    Patrick   newport   9901    7

Answer

Let's describe the algorithm for your task: group the CSV rows by county, calculate the average price in each group, and find the county with the highest average price. Written in plain Java, the code for this is rather long.
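For comparison, here is a minimal sketch of those steps in plain Java using streams. It is only an illustration under some assumptions: the class, record, and method names (CountyRentals, Listing, averageCostByCounty, mostExpensiveCounty) are hypothetical, the rows are assumed to be already parsed from the CSV, and, following the question, a building's cost is taken as price x minStay with the user's range applied to the price column. It needs Java 16 or later for records.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

class CountyRentals {

    // One CSV row; the field names mirror the columns of the sample table.
    record Listing(String name, String county, int price, int minStay) {
        // Cost as described in the question: price times minimum stay.
        long minimumCost() {
            return (long) price * minStay;
        }
    }

    // Average cost per county, for listings whose price lies in [min, max].
    static Map<String, Double> averageCostByCounty(List<Listing> rows, int min, int max) {
        return rows.stream()
                .filter(r -> r.price() >= min && r.price() <= max)
                .collect(Collectors.groupingBy(Listing::county,
                        Collectors.averagingLong(Listing::minimumCost)));
    }

    // County with the highest average cost; empty if no listing is in range.
    static Optional<String> mostExpensiveCounty(List<Listing> rows, int min, int max) {
        return averageCostByCounty(rows, min, max).entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey);
    }

    public static void main(String[] args) {
        // The three rows from the sample table in the question.
        List<Listing> rows = List.of(
                new Listing("Morgan", "lydney", 135, 5),
                new Listing("John", "sedury", 34, 1),
                new Listing("Patrick", "newport", 9901, 7));
        System.out.println(mostExpensiveCounty(rows, 0, 10_000).orElse("none"));
    }
}
```

Each row is visited once, so 50,000 rows are not a concern. If two counties tie for the highest average, which one is returned is not defined in this sketch.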

The task is simple to do in SPL, an open-source Java package. It takes only one line of code:

         A
    1    =file("data.csv").import@ct().groups(county;avg(price):price_avg).top(-1;price_avg).county

SPL provides a JDBC driver that Java can call. Store the SPL script above as mostExpensiveCounty.splx and invoke it from Java the same way you would call a stored procedure:

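    // Requires java.sql.Connection, java.sql.CallableStatement and java.sql.DriverManager.
    Class.forName("com.esproc.jdbc.InternalDriver");
    Connection con = DriverManager.getConnection("jdbc:esproc:local://");
    CallableStatement st = con.prepareCall("call mostExpensiveCounty()");
    st.execute();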
