Use Open-source SPL to Run SQL on txt/csv/xls

 

The open-source esProc SPL is a specialized structured data computation engine. It supplies rich class libraries and all-around and database-independent computational capabilities. It can perform computations directly based on multiple or different types of data sources. Besides its native syntax, SPL provides SQL syntax for querying files like txt, csv and xls as it queries tables. This is simple and convenient.

Regular queries

It is easy to perform the basic filtering operations, null value judgment and complex conditional queries in SPL:
$select * from d:/Orders.csv 
where not Amount>=100 and Client like 'bro' or OrderDate is null

Effortless to achieve GROUP BY … HAVING:

$select year(OrderDate),Client ,sum(Amount),count(1) from d:/Orders.csv
group by year(OrderDate),Client
having sum(Amount)<=100

Smooth to perform Case…When:

$select case year(OrderDate) when 2021 then 'this year' 
when 2020 then 'last year' else 'previous years' end 
from d:/Orders.csv

Convenient to do all types of joins – left join, right join, full join or inner join:

$select o.OrderId,o.Client,e.Name e.Dept,e.EId from d:/Orders.txt o
left join d:/Employees.txt e on o.SellerId=e.Eid

Or an inner join containing the where clause:

$select o.OrderId,o.Client,e.Name e.Dept 
from d:/Orders.csv o,d:/Employees.csv e
where o.SellerId=e.Eid

Easy to perform subqueries, including those within in clause and within with clause:

$with t as (select Client ,sum(amount) s from d:/Orders.csv group by Client)
select t.Client, t.s, ct.Name, ct.address from t
left join ClientTable ct on t.Client=ct.Client

And supports AS alias, such as union, union all, intersect and minus, as well as into for outputting to file:

$select dept,count(1) c,sum(salary) s into deptResult.xlsx 
from employee.txt group by dept having s>100000

Special queries

SPL, by default, can directly use SQL to query comma- or tab-separated csv, txt and xls files. Sometimes the data format is not standard. The text is non-comma- or tab-separated, a file does not have a header row, a specified Excel sheet is to be queried, and so forth. SPL also provides methods to handle these special cases in SQL. SQL can work with the SPL extension function to handle a colon-separated file:
$select * from {file("d:/Orders.txt").import@t (;":")}
where Amount>=100 and Client like 'bro' or OrderDate is null

SPL allows using sequence numbers to represent column names for a file without a header row:

$select * from {file("d:/Orders.txt").import()} where \_4>=100 and \_2 like 'bro' or _5 is null

Read a specified sheet of an Excel file:

$select * from {file("D:/Orders.xlsx").xlsimport@t(;"sheet3")}
where Amount>=100 and Client like 'bro' or OrderDate is null

Handle a csv file downloaded from a remote website:

$select * from {httpfile("http://127.0.0.1:6868/Orders.csv).import@tc()}
where Amount>=100 and Client like 'bro' or OrderDate is null

Read a JSON file as a string for parsing:

$select * from {json(file("d:/data.json").read())}
where Amount>=100 and Client like 'bro' or OrderDate is null

And read RESTful json:

$select * from {json(httpfile("http://192.168.1.33:6868/api/getData").read())}
where Amount>=100 and Client like 'bro' or OrderDate is null

SPL offers support of SQL92 standard. Just enjoy using SQL!

Seamless integration into applications

With the support of SQL, SPL becomes more skilled in querying and computing files within an application. It is very simple to integrate the SPL code into an application. SPL provides the standard JDBC/ODBC/HTTP drivers through which SPL is integrated into the application quickly and conveniently. Invoke SPL code through JDBC:
…
Class.forName("com.esproc.jdbc.InternalDriver");

Connection conn = DriverManager.getConnection("jdbc:esproc:local://");

PrepareStatement st=con.prepareStatement("$select * from D:/Sales.csv where state=?");

st.setObject(1,"California");

st.execute();

ResultSet rs = st.getResultSet();
…

Being able to query files in SQL directly from an application. That’s cool!

Download SPL

SPL source code