Unable to Parse Header from GitHub CSV URL Using Apache Commons

Question

Source: https://stackoverflow.com/questions/67898113/unable-to-parse-header-from-github-csv-url-using-apache-commons

I'm trying to read column values by header name for each record of a CSV file hosted on GitHub, using the Apache Commons CSV library.

This is my code:

@Service
public class CoronaVirusDataService {

    private static String virus_data_url = "https://raw.githubusercontent.com/CSSEGISandData/COVID-19/Aysen_Chile_07032021/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv";

    @PostConstruct
    public void getVirusData()
    {
        try
        {
            URL url = new URL(virus_data_url);
            HttpURLConnection con = (HttpURLConnection) url.openConnection();
            BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
            while((in.readLine()) != null)
            {
                StringReader csvReader = new StringReader(in.readLine());
                Iterable<CSVRecord> records = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(csvReader);
                for (CSVRecord record : records) {
                    String country = record.get("Country/Region");
                    System.out.println(country);
                }
            }
            in.close();
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }
}

When I run the application, I get this error:

java.lang.IllegalArgumentException: A header name is missing in [, Afghanistan, 33.93911, 67.709953, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 4, 4, 4, 4, 5, 7, 8, 11, 12, 13, 15, 16, 18, 20, 24, 25, 29, 30, 34, 41, 43, 76, 80, 91, 107, 118, 146, 175, 197, 240, 275, 300, 338, 368, 424, 445, 485, 532, 556, 608, 666, 715, 785, 841, 907, 934, 997, 1027, 1093]

    at org.apache.commons.csv.CSVParser.createHeaders(CSVParser.java:501)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:412)
    at org.apache.commons.csv.CSVParser.<init>(CSVParser.java:378)
    at org.apache.commons.csv.CSVFormat.parse(CSVFormat.java:1157)
    at com.p1.Services.CoronaVirusDataService.getVirusData(CoronaVirusDataService.java:34)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)

Answer

You want to parse a standard CSV file with a header row that is served over HTTP. Doing the parsing by hand in Java tends to make the code lengthy.
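For context on the exception in the question: the loop calls in.readLine() twice per pass. The call in the while condition consumes a line and throws it away (the real header row goes first), and the call in the body hands the following line to the parser on its own. With withFirstRecordAsHeader(), that single line is then treated as the header, and the Afghanistan row starts with an empty field, which is what "A header name is missing" reports. The sketch below reproduces only that reading pattern, using the JDK alone, on a small made-up sample shaped like the file. The class name, method name, and sample rows are illustrative and not from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadLineLoopDemo {

    // Mirrors the question's loop and returns the lines that the loop body
    // would pass to the CSV parser. The null check is added here only so the
    // demo does not fail on an odd number of lines; the question has no such check.
    static List<String> linesHandedToParser(Reader source) throws IOException {
        List<String> handed = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(source)) {
            while (in.readLine() != null) {      // consumes a line and discards it
                String line = in.readLine();     // the line the question parses alone
                if (line != null) {
                    handed.add(line);
                }
            }
        }
        return handed;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: a header row plus three data rows with an empty first field.
        String sample = "Province/State,Country/Region,Lat,Long\n"
                + ",Afghanistan,33.93911,67.709953\n"
                + ",Albania,41.0,20.0\n"
                + ",Algeria,28.0,1.0\n";
        for (String line : linesHandedToParser(new StringReader(sample))) {
            System.out.println(line);
        }
    }
}
```

On this sample only the Afghanistan and Algeria rows reach the parser; the header row and the Albania row are consumed by the loop condition. Passing the whole reader to the parser once, rather than one line at a time, avoids both the lost header and the skipped rows.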

It is simpler with SPL, an open-source Java package, where a single line of code is enough:

     A
1    =httpfile("https://raw.githubusercontent.com/CSSEGISandData/COVID-19/Aysen_Chile_07032021/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv").import@ct(Country/Region)

SPL provides a JDBC driver so that it can be called from Java. Save the SPL script above as httpcsv.splx and invoke it from Java the same way you would call a stored procedure:

Class.forName("com.esproc.jdbc.InternalDriver");
con = DriverManager.getConnection("jdbc:esproc:local://");
st = con.prepareCall("call httpcsv()");
st.execute();

Alternatively, execute the SPL expression as a string from Java, the same way you would execute a SQL statement:

st = con.prepareStatement("==httpfile(\"https://raw.githubusercontent.com/CSSEGISandData/COVID-19/Aysen_Chile_07032021/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv\").import@ct(Country/Region)");
st.execute();

View SPL source code.
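If you would rather stay with Apache Commons CSV, a minimal sketch could look like the following. It assumes commons-csv is on the classpath in a version that still offers withFirstRecordAsHeader() (the stack trace in the question suggests it does), and it gives the parser the whole reader so that the first line of the file becomes the header. The class name, the method name, and passing the URL as a command-line argument are illustrative choices, not part of the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

class CountryColumnSketch {

    // Parses the complete CSV stream once and collects the "Country/Region"
    // column. The first record of the stream is used as the header row.
    static List<String> countries(Reader source) throws IOException {
        List<String> result = new ArrayList<>();
        try (CSVParser parser = CSVFormat.DEFAULT.withFirstRecordAsHeader().parse(source)) {
            for (CSVRecord record : parser) {
                result.add(record.get("Country/Region"));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is expected to be the raw CSV URL from the question.
        URL url = new URL(args[0]);
        try (Reader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            countries(in).forEach(System.out::println);
        }
    }
}
```

Keeping the parsing in a method that takes a Reader makes it easy to try the logic on a StringReader before pointing it at the network.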