9.8 Split string using tab as the delimiter

 

Use tab as the delimiter to split a string into a sequence of strings.
Convert a log file into structured data: a table sequence with the fields USERID, UNAME, IP, TIME, URL, BROWSER, LOCATION and MODULE. In the log file, each record spans three lines: the first line contains IP, TIME, GET, URL and BROWSER; the second contains MODULE; and the third contains USERID, UNAME and LOCATION.


The s.split(d) function splits string s using delimiter d and returns the result as a sequence.

SPL script:

A
1 =file("log.txt").read@n()
2 =A1.group((#-1)\3)
3 =A2.(~.conj(~.split("\t")))
4 =A3.new(~(7):USERID,~(8):UNAME,~(1):IP,~(2):TIME,~(4):URL,~(5):BROWSER,~(9):LOCATION,left(~(6).split(":")(2),-1):MODULE)

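A1 Read the file line by line and import it as a sequence of strings.
A2 Use the group() function to put every three consecutive lines into one group, so that each group is one record.
A3 Use the s.split() function to split each line of a group by "\t" and concatenate the resulting fields into a single sequence per record.
A4 Generate structured data by picking fields by position. MODULE is the part of the second line after ":", with its last character removed.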

Execution result:

USERID  UNAME    IP            TIME                 URL                 BROWSER      LOCATION  MODULE
47356   Jessica  10.10.10.143  2013-04-01 21:14:44  /p/pt301/index.jsp  Mozilla/6.0  Chicago   production
419     Jacob    10.10.2.76    2013-04-01 21:18:50  /h/homepage.jsp     Chrome/35    Houston   homepage
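
For readers who want to cross-check the logic outside esProc, the four steps above can be sketched in Python. This is an illustration, not SPL. The names SAMPLE and parse_log are hypothetical, and the format of the MODULE line ("module:<name>;") is an assumption chosen only so that it matches the expression left(~(6).split(":")(2),-1); the real log file may look different.

```python
# Hypothetical sample built from the field layout described above.
# Assumption: the MODULE line looks like "module:<name>;" (one trailing character).
SAMPLE = (
    "10.10.10.143\t2013-04-01 21:14:44\tGET\t/p/pt301/index.jsp\tMozilla/6.0\n"
    "module:production;\n"
    "47356\tJessica\tChicago\n"
    "10.10.2.76\t2013-04-01 21:18:50\tGET\t/h/homepage.jsp\tChrome/35\n"
    "module:homepage;\n"
    "419\tJacob\tHouston\n"
)


def parse_log(text):
    """Turn three-line, tab-delimited log records into a list of dicts."""
    lines = text.splitlines()  # like A1: one string per line
    rows = []
    # Like A2: walk the lines three at a time; an incomplete trailing group is skipped.
    for i in range(0, len(lines) - 2, 3):
        fields = []
        for line in lines[i:i + 3]:
            fields.extend(line.split("\t"))  # like A3: split by tab, then concatenate
        # Like A4: pick fields by position (0-based here, 1-based in SPL).
        rows.append({
            "USERID": fields[6],
            "UNAME": fields[7],
            "IP": fields[0],
            "TIME": fields[1],
            "URL": fields[3],
            "BROWSER": fields[4],
            "LOCATION": fields[8],
            # Text after ":" on the MODULE line, minus its last character.
            "MODULE": fields[5].split(":")[1][:-1],
        })
    return rows


if __name__ == "__main__":
    for row in parse_log(SAMPLE):
        print(row)
```

Note that positions are 0-based in Python, so SPL's ~(7) corresponds to fields[6]. All values stay strings in this sketch.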