Filtering and Aggregation over a CSV File

Question

In the Bookings.csv file, each line contains a name, surname, room type, check-in date and check-out date, separated by semicolons:

Name;Surname;roomtype1.2;2015-03-24;2015-03-26
Paul;Smith;roomtype1.1;2015-03-21;2015-03-23
Romas;Babajus;roomtype2.1;2015-03-26;2015-03-28
Bob;Alfredo;roomtype3.1;2015-03-24;2015-03-26
Edvard;Jogn;roomtype2.2.;2015-03-04;2015-03-25
Jonas;Amberto;roomtype3.2;2015-03-20;2015-03-23

In roomtype1.*, roomtype2.* and roomtype3.*, the "*" stands for an individual room of that type. For example, roomtype1.1 and roomtype1.2 are both considered to be roomtype1.

When there is a new booking for roomtype1, the program should find the check-out dates of all roomtype1 rooms (roomtype1.1 and roomtype1.2) and compare them in order to find the room whose check-out date is closest to the new booking date.

So far I can only read all of the dates stored in Bookings.csv, without knowing which room type each date belongs to.

How would you suggest reading only the roomtype1 check-out dates from the CSV file? Would the best way be to use a two-dimensional array and loop over the file?

So far my code looks like this if that helps:

import java.io.File;
import java.io.FileNotFoundException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Scanner;
import java.util.concurrent.TimeUnit;

public class Bookings {

    static long difv;

    public static void main(String[] args) throws Exception {
        SimpleDateFormat ft = new SimpleDateFormat("yyyy-MM-dd");
        Date checkIn = null;
        Date checkOut = null;
        Date test = ft.parse("2015-03-30");
        String fileName = "Bookings.csv";
        File file = new File(fileName);

        try {
            Scanner inputStream = new Scanner(file);
            while (inputStream.hasNext()) {
                // Each whitespace-separated token is one booking line
                String data = inputStream.next();
                String[] values = data.split(";");
                checkIn = ft.parse(values[3]);
                checkOut = ft.parse(values[4]);
                // System.out.println("Check in date");
                // System.out.println(checkIn);
                // System.out.println("Check out date");
                // System.out.println(checkOut);
            }
            inputStream.close();

            // interval(checkOut, test, TimeUnit.HOURS);
            // System.out.println("the difv is" + difv);
            // if (checkOut.compareTo(test) <= 0) { // or equal
            //     System.out.println("Date1 is after or equal to Date2");
            // } else if (checkOut.compareTo(test) < 0) {
            //     System.out.println("Date1 is before Date2");
            // } else if (checkOut.compareTo(test) == 0) {
            //     System.out.println("Date1 is equal to Date2");
            // } else {
            //     System.out.println("How to get here?");
            // }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        }
    }

    // Difference between the two dates (test minus checkOut) in the given unit
    public static long interval(Date checkOut, Date test, TimeUnit timeunit) {
        long diff = test.getTime() - checkOut.getTime();
        difv = timeunit.convert(diff, TimeUnit.MILLISECONDS);
        return difv;
    }
}

Answer

Your question involves filtering on a field and then picking the record with the maximum value of an expression, both of which are basic structured computations. Java has no built-in class library for this kind of tabular processing, so the code tends to be long and hard to read. Here we handle the question in SPL (Structured Process Language), where the code is short and easy to follow:

     A
1    =file("Bookings.csv").import@tc(;,";")
2    =A1.select(like(roomtype,argtype+".*"))
3    =A2.maxp(interval('check-out',argDate))

A1: Import the CSV file using the semicolon as the separator and read the first line as field names.

A2: Select the records whose roomtype field matches the query parameter argtype followed by the wildcard character *.

A3: Return the record for which the interval between the check-out value and the input parameter argDate is the largest.
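For comparison, the same three steps (load, filter by room type, pick one record by date interval) can be sketched in plain Java. This is only an illustration and not a translation of the SPL script: the class name ClosestCheckOut and the helpers closest and closestInFile are hypothetical, it uses java.time (Java 8+) rather than SimpleDateFormat, and it assumes "closest" means the smallest absolute number of days between the check-out date and the new booking date.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical helper class, for illustration only.
class ClosestCheckOut {

    // Steps 2 and 3: keep rows of the requested room type, then return the row
    // whose check-out date (field 4) is nearest to newBooking.
    static Optional<String[]> closest(List<String> lines, String roomType, LocalDate newBooking) {
        return lines.stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .map(line -> line.split(";"))
                // "roomtype1" matches roomtype1.1, roomtype1.2, ... but not roomtype10.x
                .filter(f -> f.length >= 5 && f[2].startsWith(roomType + "."))
                // Assumption: closest = smallest absolute difference in days
                .min(Comparator.comparingLong((String[] f) ->
                        Math.abs(ChronoUnit.DAYS.between(LocalDate.parse(f[4]), newBooking))));
    }

    // Step 1: load every line of the CSV file, then delegate to closest().
    static Optional<String[]> closestInFile(Path csv, String roomType, LocalDate newBooking)
            throws IOException {
        return closest(Files.readAllLines(csv), roomType, newBooking);
    }

    public static void main(String[] args) throws IOException {
        LocalDate newBooking = LocalDate.parse("2015-03-30");
        Path csv = Paths.get("Bookings.csv");

        // Fall back to a few rows from the question when the file is not present.
        List<String> sample = Arrays.asList(
                "Name;Surname;roomtype1.2;2015-03-24;2015-03-26",
                "Paul;Smith;roomtype1.1;2015-03-21;2015-03-23",
                "Romas;Babajus;roomtype2.1;2015-03-26;2015-03-28");

        Optional<String[]> hit = Files.exists(csv)
                ? closestInFile(csv, "roomtype1", newBooking)
                : closest(sample, "roomtype1", newBooking);

        if (hit.isPresent()) {
            System.out.println(hit.get()[2] + " checks out on " + hit.get()[4]);
        } else {
            System.out.println("No booking found for that room type");
        }
    }
}
```

With the sample rows from the question and a new booking date of 2015-03-30, this sketch selects the roomtype1.2 row, whose check-out date of 2015-03-26 is four days away, against seven days for roomtype1.1. No two-dimensional array is needed, because each line is split and filtered as it streams past.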

In this way, structured computations that are awkward in Java can be written concisely in SPL. The SPL script can then be called from a Java program through the esProc JDBC interface. For more details, see "How to Call an SPL Script in Java".