SQL Query over File Examples

 

【Abstract】

This article illustrates scenarios of querying files directly in SQL and provides sample programs written with esProc SPL.

Its convenient and efficient to manipulate database data in SQL. Its natural to load data into the database and handle it in SQL when we trying to compute files. The problem is that data loading itself is not simple. It would be extremely convenient if there was a way to perform SQL query directly on files. Here Ill bring in such a tool, esProc SPL, list various SQL file query scenarios and offers esProc example programs. esProc is a specialized data computing engine, and SPL, abbreviation of Structured Process Language used by esProc, boasts a complete set of SQL file query methods.

This article uses text files in all examples. In fact the methods also apply in Excel files.

1.Filtering

Select records from a text file on a specified condition in SQL.

Example: Select scores of students in class 10 from students score table Students_scores.txt. In the file, the first row contains column names and detailed data begins from the second row:

undefined 


A

1

$select * from  E:/txt/Students_scores.txt where CLASS=10

2. Aggregation

Summarize data in a text file in SQL.

Example: Calculate the average Chinese score, the highest math score, and the total of English scores based on the student scores table.


A

1

$select  avg(Chinese),max(Math),sum(English) from E:/txt/Students_scores.txt

3. Inter-column calculations

Perform inter-column calculations in a text file in SQL.

Example: Calculate the total score of each student in the student score table.


A

1

$select  *,English+Chinese+Math as total_score from E:/txt/students_scores.txt

Below is A1s result, where a computed column total_score is added.

undefined 

4.CASE statement

We can perform complicated conditional queries in SQL using CASE statement.

Example: Find whether or not the English result of each student in the student scores table is a Pass.


A

1

$select  *, case when English>=60 then 'Pass' else 'Fail' end as English_evaluation  from E:/txt/students_scores.txt

Below is A1s result, where a computed column English_evaluation is added.

undefined 

5.Sorting

Sort data in a text file in ascending (or descending) order in SQL.

Example: Sort the student scores table by class in ascending order and by total score in descending order.


A

1

$select *  from E:/txt/students_scores.txt order by CLASS,English+Chinese+Math desc

6.TOP-N

Get Top-N over a text file in SQL.

Example: Get the records of the three students who have the highest English scores.


A

1

$select  top 3 * from E:/txt/students_scores.txt order by English desc

7. Grouping & aggregation

Perform grouping & aggregation on data in a text file in SQL.

Example: Find the lowest English score, the highest Chinese score and the total math score in each class.


A

1

$select  CLASS,min(English),max(Chinese),sum(Math) from E:/txt/students_scores.txt  group by CLASS

8. Post-grouping filtering

Group and summarize data in a text file and then perform filtering in SQL.

Example: Find the class where the average English score is below 70.


A

1

$select  CLASS,avg(English) as avg_En from E:/txt/students_scores.txt group by CLASS having  avg(English)<70

A1s result.

undefined 

9. Distinct

Perform distinct operation over data in a text file in SQL.

Example: Get the IDs of all classes.


A

1

$select  distinct CLASS from E:/txt/students_scores.txt

10. Distinct Count

Perform count after distinct operation on data in a text file in SQL.

Example: Based on the product information file, count the number of different kinds of products. Below is part of the file:

undefined 


A

1

$select  count(distinct PID) from E:/txt/PRODUCT_SALE.txt

11. Grouping & Count Distinct

Group data in text file and perform count distinct on data in text file in SQL.

Example: Based on sales table, count the days when there are sales records for each kind of product.


A

1

$select  PID,count(distinct DATE) as no_sdate from E:/txt/PRODUCT_SALE.txt group by  PID

 

12. Two-table join query

Perform a join query over two text files in SQL.

Example: The product information and the sales information are stored in two text files respectively. We want to calculate the total sales amount of the products whose quantity per order is less than 10. Below are the files data structures and their relationship:

undefined 


A

1

$select  sum(S.quantity*P.Price) as total
from E:/txt/Sales.txt as S  join E:/txt/Products.txt as P on S.productid=P.ID
where S.quantity<=10

 

13. Multi-table join query

Perform a join query over more than two text files in SQL.

Example: The state information, department information and employee information are stored in 3 different text files. We want to find the information of employees of California in HR department.


A

1

$select  e.NAME as NAME
from E:/txt/EMPLOYEE_J.txt as e
    join E:/txt/DEPARTMENT.txt as d on  e.DEPTID=d.DEPTID
    join E:/txt/STATE.txt as s on  e.STATEID=s.STATEID
where
   d.NAME='HR' and s.NAME='California'

14. Multi-table, multilevel join query

Perform a multilevel join query over more than two text files in SQL.

Example: The state information, department information and employee information are stored in 3 different text files. We want to find the information of employees of New York state whose managers come from California.


A

1

$select  e.NAME as ENAME
from  E:/txt/EMPLOYEE.txt as e
   join E:/txt/DEPARTMENT.txt as d on  e.DEPT=d.NAME
   join E:/txt/EMPLOYEE.txt as emp on d.MANAGER=emp.EID
where  e.STATE='New York' and emp.STATE='California'

 

15. Nested subquery

Use very complicated SQL in a nested subquery.

Example: The employee information and the department information are stored in two text files. We want to find the department that has the youngest manager. Below are parts of the two files:

undefined 

undefined 


A

1

$select  emp.BIRTHDAY as BIRTHDAY,emp.DEPT as DEPT
from E:/txt/DEPARTMENT.txt as dept
    join E:/txt/EMPLOYEE.txt emp
    on dept.MANAGER=emp.EID
where
  emp.BIRTHDAY=(select  max(BIRTHDAY)
       from ( select emp1.BIRTHDAY as BIRTHDAY
            from E:/txt/DEPARTMENT.txt as  dept1
             join E:/txt/EMPLOYEE.txt as  emp1
             on dept1.MANAGER=emp1.EID
           )
        )

 

16. Common table expression

Compute data in a text file using SQL WITH clause.

Example: Based on the same text files in the previous example, find the specified departments (HR, R&D, Sales) from the department table, and count the female employees and calculate the average salary in each of these departments.


A

1

$with A  as
(select  NAME as DEPT from E:/txt/DEPARTMENT.txt
where  NAME='HR' or NAME='R&D' or NAME='Sales')

select  A.DEPT DEPT,count(*) NUM,avg(B.SALARY) AVG_SALARY from
A left  join E:/txt/EMPLOYEE.txt B
on  A.DEPT=B.DEPT
where  B.GENDER='F' group by A.DEPT

 

Find more examples of agile computations in SPL CookBook