Find Missing Datetimes from a CSV-format Time Series with Fixed Interval

 

Problem description & analysis

Below is part of the data in CSV file csv.csv:

2021-02-02 00:00:00

2021-02-02 01:00:00

2021-02-02 02:00:00

2021-02-02 03:00:00

2021-02-02 04:00:00

2021-02-02 05:00:00

2021-02-02 06:00:00

2021-02-02 07:00:00

2021-02-02 08:00:00

2021-02-02 10:00:00

...

The whole data is from 2021-02-02 00:00:00 to 2021-03-03 23:00:00 and has been ordered at an interval of one hour. There are altogether 30 days of data and, in theory, 24 records each day. Actually, there are certain records are left out. Our task is to find those missing records. Below is the desired result:

2021-02-02 09:00:00

2021-02-02 14:00:00

2021-02-03 06:00:00

2021-02-03 09:00:00

2021-02-03 11:00:00

2021-02-03 17:00:00

2021-02-03 18:00:00

2021-02-03 19:00:00

2021-02-03 20:00:00

2021-02-03 23:00:00

...

Solution & explanation

Write the following script p1.dfx in esProc script:

A

1

=periods@s("2021-02-02   00:00:00","2021-03-03   23:00:00",3600)

2

=file("csv.csv").import@ci()

3

=A1\A2

Explanation:

A1  List all dates and times between 2021-02-02 00:00:00 and 2021-03-03 23:00:00 at the interval of one hour.

A2  Import the date and time data from the CSV file.

A3  Calculate difference between A1 and A2.

Read How to Call an SPL Script in Java to learn how to integrate the script code into a Java program.

Q & A Collection

https://stackoverflow.com/questions/63265424/java-loop-trough-a-csv-file-checking-for-missing-meter-readings-based-on-the-t