Retrieve Multiple Types of Data from a Huge Excel File

Problem description & analysis

Below is Excel file test-file.xlsx:

Number

Date(mostly)

Date(mostly)

1234567

2020/1/1

1234581

2020/1/15

2020/1/16

asdf

asdf

qwer

123

zxcv

456

123

qwer

Each column could contain data of multiple types, including string, date and numeric. Suppose there is a huge volume of data in the Excel file, we are trying to output the data retrieval result to the console. Below is the desired result:

Number,Date(mostly),Date(mostly)

1234567,2020-01-01,null

1234581,2020-01-15,2020-01-16

asdf,asdf,null

qwer,123,null

zxcv,456,null

123,qwer,null

Solution

We write the following script p1.dfx in esProc:

A

B

1

=file("test-file.xlsx").xlsimport@c()

2

for A1

>output(A2(1).array().concat@c())

Explanation:

A1  Import the Excel data, during which @c option enables returning result as a cursor.

A2   Loop through A1 to retrieve each row of data at a time.

B2   Connect columns of the current record by comma into a string to output to the console.

Read How to Call an SPL Script in Java to learn about the integration of an SPL script with a Java program.

Q & A Collection

https://stackoverflow.com/questions/63765556/read-very-large-excel-file-with-date-and-non-date-numbers