A Case of Structured Query
【Question】
I'm trying to check whether column 3 of a tab-delimited file contains a certain word. If it does not, it should continue reading. If it does contain the word, it should check column 4. Depending on whether there is content in column 4, the output should be "something found" or "nothing found".
I'm now stuck on the second part of this, i.e. checking column 4. My output gives me "something found" when there is in fact no content there.
    for line in f:
        if line.strip().split("\t")[2] == "word":
            print("word")
            if line.strip().split("\t")[3] is not None:
                print("something found")
            else:
                print("nothing found")
The file looks like this:
reference #1 reference #2 notword content ...(more columns)
reference #1 reference #2 word content ...
reference #1 reference #2 word noContent ...
【Answer】
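You can achieve what you want with a simple structured query. In the Python code, split() always returns strings, so an empty column 4 is the empty string rather than None and the "is not None" test is always true. Splitting each line and testing the columns by hand is also more code than the task needs. An easy alternative is SPL (Structured Process Language). Here’s the SPL script: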
  | A
1 | =file("d:\\data.csv").import()
2 | =A1.select(_3:"word", _4:"content")
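A1: Read the file. import() uses the tab as its default separator, and because no title row is read the columns get the default names _1, _2, and so on.
A2: Get the records where column 3 is "word" and column 4 is "content".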
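For readers who want to stay in Python, the check from the question can be sketched as follows. This is only a minimal illustration, not part of the SPL solution: the function name classify_rows and the sample rows are hypothetical, and it assumes that "no content" means column 4 is missing, empty, or whitespace only.

```python
import csv
import io

def classify_rows(lines, word="word"):
    """Return (row, status) pairs for rows whose column 3 equals `word`."""
    results = []
    # csv.reader keeps empty fields, so a blank column 4 shows up as "".
    # QUOTE_NONE treats quote characters as ordinary data.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3 or row[2] != word:
            continue  # column 3 does not match: keep reading
        # Assumption: a missing or whitespace-only column 4 means "no content".
        has_content = len(row) > 3 and row[3].strip() != ""
        status = "something found" if has_content else "nothing found"
        results.append((row, status))
    return results

# Hypothetical sample data in the same shape as the file in the question.
sample = io.StringIO(
    "ref1\tref2\tnotword\tcontent\n"
    "ref1\tref2\tword\tcontent\n"
    "ref1\tref2\tword\t\n"
)
for row, status in classify_rows(sample):
    print(row[2], status)
```

The test is against the empty string because split() and csv.reader both yield strings, never None.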
You can change the condition in the select function to perform different queries. You can also handle a query condition that is not fixed in advance through parameter passing and macro replacement. Modify A2 as follows, for example:
  | A
1 | =file("d:\\data.csv").import()
2 | =A1.select(${where})
A2: You can achieve a dynamic query by passing a condition to the select function through the parameter where.
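A rough Python analogue of a condition supplied at call time is to pass a predicate function instead of a condition string. This is a swapped-in technique, not how the SPL macro works: select_rows and col_equals are hypothetical helper names, and the sample rows are made up for illustration.

```python
import csv

def select_rows(lines, where):
    """Return the rows for which the caller-supplied predicate `where` is true."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if where(row)]

def col_equals(index, value):
    """Build a predicate that tests a 0-based column; short rows never match."""
    return lambda row: len(row) > index and row[index] == value

# Hypothetical sample rows in the same shape as the file in the question.
data = [
    "ref1\tref2\tnotword\tcontent",
    "ref1\tref2\tword\tcontent",
    "ref1\tref2\tword\t",
]

is_word = col_equals(2, "word")
has_content = col_equals(3, "content")

# The condition is decided by the caller, not hard-coded in select_rows.
matches = select_rows(data, lambda row: is_word(row) and has_content(row))
print(matches)
```

Passing a callable avoids evaluating arbitrary text with eval, at the cost of the condition no longer being a plain string that can come straight from a parameter.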
See How to Call an SPL Script in Java to learn how to invoke an SPL script in a Java application.