Get a Certain String from Each Row

Question

From a tab-delimited text file with a variable number of columns per row, I would like to extract the value that meets a specific condition. The text file looks like:

S1=dhs  Sb=skf  S3=ghw  QS=ghr

S1=dhf  QS=thg  S3=eiq

QS=bhf  S3=ruq  Gq=qpq  GW=tut

Sb=ruw  QS=ooe  Gq=qfj  GW=uvd

I would like to have a result like:

QS=ghr

QS=thg

QS=bhf

QS=ooe

Please excuse my naive question; I am a beginner trying to learn some basic bash scripting techniques for text manipulation.

 

Answer

The shell can do this, but the code is hard to read. It is easy to do with the set-based operations of SPL (Structured Process Language). Here is the SPL script:

     A
1    =file("/file.txt").import()
2    =A1.(~.array()).union()
3    =A2.select(pos(~,"QS"))

A1: Import the text file;

A2: Get field values of each record to form a sequence and union the sequences;

A3: Select from the combined sequence the members that contain the string “QS”.
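For readers without SPL, the same three steps can be sketched in Python. This is only an illustrative equivalent, not part of the SPL answer: the function name `extract_fields` and the inline sample rows are assumptions made for the example. It also differs from the script above in two ways: it keeps duplicate values instead of forming a union, and it matches fields that begin with "QS=" rather than any field containing "QS".

```python
def extract_fields(lines, key="QS"):
    """Return every tab-separated field whose name equals `key`.

    `lines` is any iterable of text rows, for example an open file.
    """
    prefix = key + "="
    matches = []
    for line in lines:
        # Steps A1/A2: split each row into its fields.
        for field in line.rstrip("\n").split("\t"):
            # Step A3: keep only the fields for the requested key.
            if field.startswith(prefix):
                matches.append(field)
    return matches


# Sample rows copied from the question, with real tab separators.
sample_rows = [
    "S1=dhs\tSb=skf\tS3=ghw\tQS=ghr\n",
    "S1=dhf\tQS=thg\tS3=eiq\n",
    "QS=bhf\tS3=ruq\tGq=qpq\tGW=tut\n",
    "Sb=ruw\tQS=ooe\tGq=qfj\tGW=uvd\n",
]

result = extract_fields(sample_rows)
print("\n".join(result))
```

To run it against the real file, pass an open file object instead of the sample list, for example `with open("/file.txt") as f: print("\n".join(extract_fields(f)))`.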