"【Question】 I have a file that looks like this: >Unc14086 AGAGUUUGAU >Unc3544 .."

blackduckie RaqForum 28 No.
562 View • 5 Years ago

Combine Matching Records from Two Files

【Question】
I have a file that looks like this:

>Unc14086

AGAGUUUGAU

>Unc35443

GCACGAGAAA

So, every n lines (n may vary), a line starts with “>”, which marks the beginning of a new block of information.

I have another tab-delimited file:

Unc14086 InformationalTextExample

Unc35443 InformationalTextExampleII

My goal is to look up, in the second file, the IDs found in the lines starting with “>” in the first file. Whenever a match occurs, I want to append the matching text (for example “InformationalTextExample”) to that line, possibly separated by “_”:

>Unc14086_InformationalTextExample

AGAGUUUGAU

>Unc35443_InformationalTextExampleII

GCACGAGAAA

How would that be possible?

Thank you!

【Answer】

A Perl solution would be clear but long. esProc SPL’s loop functions give a more concise solution. Here is the SPL script:

	A
1	=file("one.txt").read@n()
2	=file("another.txt").import()
3	=A1.(if(left(~,1)!=">",~,A2.select@1(mid(A1.~,2)==_1).(">"+_1+"_"+_2)))
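
As the script reads, A1 loads the first file as a list of lines, A2 imports the tab-delimited file (its columns are addressed as _1 and _2), and A3 walks through the lines, rewriting each “>” line that has a match in A2. For readers without esProc, the same idea can be sketched in Python. This is only an illustrative equivalent, not part of the SPL answer: the function names load_annotations and annotate_headers are hypothetical, and the sketch assumes that a “>” line with no match should be left unchanged.

```python
def load_annotations(lines):
    """Build an ID -> text lookup from tab-delimited lines (ID, then text)."""
    annotations = {}
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) >= 2:  # skip blank or malformed lines
            annotations[fields[0]] = fields[1]
    return annotations


def annotate_headers(lines, annotations, sep="_"):
    """Yield each line; '>' lines with a known ID get sep + text appended."""
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            key = line[1:].strip()
            if key in annotations:
                # Assumption: unmatched headers fall through unchanged.
                line = ">" + key + sep + annotations[key]
        yield line


# Sample data taken from the question, held in memory for the demo.
sequences = [">Unc14086", "AGAGUUUGAU", ">Unc35443", "GCACGAGAAA"]
table = [
    "Unc14086\tInformationalTextExample",
    "Unc35443\tInformationalTextExampleII",
]

for out_line in annotate_headers(sequences, load_annotations(table)):
    print(out_line)
```

To run this on real files, pass open file objects instead of the lists (for example the handles for one.txt and another.txt) and write the yielded lines to a new file. The dictionary gives a constant-time lookup per header, so the second file is read only once no matter how many “>” lines the first file contains.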

See “SQL Headaches Therapies - For Loop Operations” for more use cases of esProc loop functions.

SPL Official Website 👉 https://www.scudata.com

SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL

SPL Learning Material 👉 https://c.scudata.com

SPL Source Code and Package 👉 https://github.com/SPLWare/esProc

Discord 👉 https://discord.gg/2bkGwqTj

Youtube 👉 https://www.youtube.com/@esProc_SPL
