Structurize a Text File Containing Two Types of Separator
【Question】
I have a text file called readings that contains the following data:
-10,3NW,15cm,4:38
5,15SW,8mm,2:8
8,8ENE,2mm,:25
-5,0,7cm,1
-3,0,3mm
From left to right, the fields are temperature, speed, precipitation and time (hours and minutes).
I want to split the time with line.split(":"), but only if the fourth token exists. My current code, which splits on commas only and does not yet handle the : delimiter, is:
try {
    input = new BufferedReader(new FileReader("readings.txt"));
    line = input.readLine();
    while (line != null) {
        tokens = line.split(",");
        temperature = Integer.parseInt(tokens[0].trim());
        tokens[1] = tokens[1].trim();
        separation = firstNonNumericPosition(tokens[1]);
        if (separation == 0 || (separation < 0 && Integer.parseInt(tokens[1]) != 0)) {
            speed = -1;
        } else {
            if (separation < 0) {
                speed = 0;
                direction = "";
            } else {
                numeric = tokens[1].substring(0, separation);
                speed = Integer.parseInt(numeric.trim());
                direction = tokens[1].substring(separation).trim();
            }
            if (tokens.length > 2) {
                tokens[2] = tokens[2].trim();
                separation = firstNonNumericPosition(tokens[2]);
                if (separation <= 0) {
                    precipitation = -1;
                } else {
                    numeric = tokens[2].substring(0, separation);
                    precipitation = Integer.parseInt(numeric.trim());
                    unit = tokens[2].substring(separation).trim();
                }
            } else {
                precipitation = 0;
                unit = "";
            }
        }
        if (speed < 0 || precipitation < 0) {
            System.out.println("Error in input:" + line);
        } else {
            readings[size] = new Reading(temperature, speed, direction,
                    precipitation, unit.equalsIgnoreCase("cm"));
            size++;
        }
        line = input.readLine();
    }
    input.close();
} catch (NumberFormatException ex) {
    System.out.println(ex.getMessage());
} catch (IOException ioe) {
    System.out.println(ioe.getMessage());
} catch (ArrayIndexOutOfBoundsException ar) {
    System.out.println(ar.getMessage());
}
I tried the following logic, but it threw an ArrayIndexOutOfBoundsException for index 3:
if (tokens.length > 3) {
    tokens = line.split(":");
    hours = Integer.parseInt(tokens[3].trim());
    minutes = Integer.parseInt(tokens[4].trim());
}
How is it possible to split it if the fourth token exists?
【Answer】
Goal: structure the text data into a five-column table. The difficulty is that the fourth column (the time) is irregular and sometimes missing, and it has to be split into two columns at the colon. This involves order-based and structured computation. If there are no further requirements, one option is to handle it in SPL (Structured Process Language), where the script is short:
|   | A |
|---|---|
| 1 | =file("d:\\source.txt").import@c() |
| 2 | =A1.new(#1:Temperature, #2:speed, #3:precipitation, (t=#4.array(":")).m(1):hours, t.m(2):minutes) |
A1: Read in source.txt as a 4-column table sequence.
A2: Split the 4th column into two columns at the colon to generate a new table with 5 fields: temperature, speed, precipitation, hours, and minutes.
The SPL script can be easily integrated into a Java application. See How to Call an SPL Script in Java.
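If you would rather stay in plain Java, the same idea can be sketched directly. The attempt in the question fails because line.split(":") splits the whole line at the colon, which yields at most two elements, so tokens[3] is out of range; the colon split has to be applied to the fourth comma-separated token only. The class and method names below (TimeFieldSplitter, splitTime, parseOrZero) are hypothetical, and the sketch assumes that an empty hours part (as in ":25"), a missing minutes part (as in "1"), and a missing fourth token all count as 0. Adjust this if such rows should be rejected instead.

```java
public class TimeFieldSplitter {

    /** Parses one number, treating an empty string as 0 (an assumption for inputs like ":25"). */
    static int parseOrZero(String s) {
        s = s.trim();
        return s.isEmpty() ? 0 : Integer.parseInt(s);
    }

    /** Returns {hours, minutes} for one line; {0, 0} when there is no fourth field (assumption). */
    static int[] splitTime(String line) {
        String[] tokens = line.split(",");
        int hours = 0;
        int minutes = 0;
        if (tokens.length > 3) {
            // Split only the fourth token at the colon, not the whole line.
            String[] time = tokens[3].split(":", 2);
            hours = parseOrZero(time[0]);
            if (time.length > 1) {
                minutes = parseOrZero(time[1]);
            }
        }
        return new int[] {hours, minutes};
    }

    public static void main(String[] args) {
        String[] lines = {
            "-10,3NW,15cm,4:38", "5,15SW,8mm,2:8", "8,8ENE,2mm,:25", "-5,0,7cm,1", "-3,0,3mm"
        };
        for (String line : lines) {
            int[] t = splitTime(line);
            System.out.println(line + " -> hours=" + t[0] + ", minutes=" + t[1]);
        }
    }
}
```

In the original loop, the body of splitTime would replace the if (tokens.length > 3) block, keeping tokens as the comma-split array and using a separate array for the time parts.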