SPL Programming - 7.1 [String and time] String

 

Up to now, the data we have processed in our programs, apart from a small amount of text in output(), has been numerical values or sequences of numerical values. In fact, computers can also process text easily. In programming languages, such text is called a string. String is another data type, distinct from integers and floating-point numbers.

In SPL, text written directly in a cell is treated as a string constant by default if it cannot be identified as another meaningful data type or as a statement. In an expression, we can also enclose text in double quotation marks to represent a string constant.

    A            B
1   集算器       SPL
2   ="集算器"    ="SPL"
3   3+5          ="3+5"
4   =3*4         '=3*4
5   2020-1-1     '2020-1-1
6   for 10       'for 10

A1 is the string constant “集算器”. A2 is a normal calculation cell whose result is also “集算器”, the same as A1; similarly, the values of B1 and B2 are both “SPL”. A3 starts with a digit but cannot be interpreted as a numeric value, so it is also a string, “3+5”, the same value as B3.

A4 is a calculation cell that computes a numerical value, so it is not a string. To write a string constant that starts with =, add a single quotation mark ' at the beginning of the cell, as in B4. This extra quotation mark is not part of the string, so the cell value of B4 is “=3*4”. Note that only one single quotation mark is needed; do not write another one on the right. If you do, the one on the right will be treated as a character of the string.

Similarly, A5 is recognized as a date constant, not a string; to get the string instead, put a single quotation mark ' at the beginning of the cell (B5). Likewise, A6 is a legal statement and is not treated as a string; a leading single quotation mark turns it into a string constant (B6).

A string is a data type like any other and can take part in operations. The most common operation is concatenation, that is, joining two strings to form a longer string.

    A                   B
1   集算器
2   =A1+"2020"          集算器2020
3   =A1+2020            2020
4   =A1+string(2020)    集算器2020
5   =A1/2020            集算器2020
6   =A1/"2020"          集算器2020

The + sign concatenates two strings (A2). Note, however, that when a string and a numeric value are added, the string part is ignored (A3). This is a rule specific to SPL; most other programming languages either report an error or convert the value to a string and then concatenate. To concatenate a value with a string using +, first convert it with the string() function (A4). If the slash / is used instead, the value is converted to a string automatically (A5). The slash can also concatenate two strings directly (A6).
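For comparison, here is how one other language treats the same cases. This is a Python sketch, not SPL: Python is one of the languages that report an error when a string and a number are added, so the number has to be converted explicitly, much as string() does in A4.

```python
name = "集算器"

# Two strings concatenate with +, as in A2.
joined = name + "2020"

# Adding a string and an integer raises TypeError in Python
# (SPL would instead ignore the string part, as in A3).
try:
    mixed = name + 2020
except TypeError:
    mixed = None

# Explicit conversion, the counterpart of string() in A4.
converted = name + str(2020)

print(joined, mixed, converted)
```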

Now we can use strings to rework the prime factorization code so that it displays the result as a complete expression.

    A           B              C
1   7215        =2             =string(A1)+"=1"
2   for A1>1    if A1%B1==0    >C1=C1+"*"+string(B1)
3                              >A1=A1\B1
4                              next
5               >B1+=1

After the execution of this program, a string will be obtained in cell C1: 7215=1*3*5*13*37, and the result of prime factorization is written out. If you carefully interpret the execution process of the program, you can imagine how it is concatenated step by step.
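To follow the concatenation step by step, the same loop can be traced in another language. The following is a Python sketch of the logic, not SPL; the function name factorize and the variable names are made up for illustration.

```python
def factorize(n):
    """Build a string such as '7215=1*3*5*13*37' by repeated concatenation."""
    result = str(n) + "=1"      # corresponds to C1
    factor = 2                  # corresponds to B1
    while n > 1:                # corresponds to `for A1>1`
        if n % factor == 0:
            result = result + "*" + str(factor)
            n //= factor        # integer division, like A1\B1
            continue            # like `next`: try the same factor again
        factor += 1
    return result

print(factorize(7215))
```

Each time a factor divides n, one more "*factor" piece is appended, which is why the result starts from "=1".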

Where there is concatenation, there is also splitting. SPL provides the following functions for it:

    A                    B
1   集算器2020
2   =len(A1)             '7
3   =left(A1,2)          '集算
4   =right(A1,3)         '020
5   =mid(A1,3,2)         '器2
6   =A2.(mid(A1,~,1))    ["集","算","器","2","0","2","0"]

The len() function returns the number of characters in the string, also known as the length of the string. Note that this len() has the same name as the function that returns the length of a sequence, but it is written differently: the string is passed as a parameter, as in len(A1), rather than written as A1.len(). Also note that SPL uses Unicode, so a Chinese character, a digit, or an English letter each counts as one character. In some early programming languages, a Chinese character counted as two characters.

The left(), right() and mid() functions in A3, A4 and A5 each take part of the string and return it as a new string. The function names indicate which part is taken, and the meaning of the parameters is easy to see from the results, so we won't elaborate here. Most programming languages that handle strings offer similar functions, often with similar names and parameter rules.

A6 splits the string into a sequence of characters with a loop function, which is another chance to see how mid() works. A single character is also a string, namely a string of length 1.
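For readers who know a language without left(), right() and mid(), the same pieces can be taken by slicing. The following is a Python sketch, not SPL, applied to the same 7-character string; note that SPL counts positions from 1 while Python counts from 0.

```python
s = "集算器2020"

length = len(s)      # counterpart of len(A1)
left2 = s[:2]        # counterpart of left(A1,2)
right3 = s[-3:]      # counterpart of right(A1,3)
mid_3_2 = s[2:4]     # counterpart of mid(A1,3,2): 2 characters from the 3rd

# Counterpart of A6: take one character at every position.
chars = [s[i:i + 1] for i in range(length)]

print(length, left2, right3, mid_3_2, chars)
```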

The result of the prime factorization always begins with 1*, because in the loop we append a * and a factor to the result string every time. If we leave out the initial 1, the result becomes 7215=*3*5*13*37, which is wrong.

But in any case, this 1* is a little redundant. What can we do?

It can be solved by splitting the string.

    A           B              C
1   7215        =2             =string(A1)+"="
2   for A1>1    if A1%B1==0    >C1=C1+if(right(C1,1)=="=","","*")/B1
3                              >A1=A1\B1
4                              next
5               >B1+=1

When concatenating a factor, check whether the current string ends with =. If so, this is the first factor and no * is added; otherwise the * is added. Now we get the desired result 7215=3*5*13*37.

We can also do it later and remove the redundant 1* part:

    A                 B                               C
1   7215              =2                              =string(A1)+"=1"
2   for A1>1          if A1%B1==0                     >C1=C1+"*"+string(B1)
3                                                     >A1=A1\B1
4                                                     next
5                     >B1+=1
6   =pos(C1,"=1*")    >C1=left(C1,A6)+mid(C1,A6+3)

Here we meet another function, pos(), which searches a string for another string; the string being searched for is called a substring. If the substring is found, pos() returns its position, that is, the position in the original string where the substring begins. If it is not found, pos() returns null. This is very similar to the pos() function of sequences, but here, too, the string is passed as a parameter rather than written in object syntax.

We know for sure that the result string now contains the substring =1*, and exactly one. (Including the = in the search guarantees this; a search for 1* alone could also match a prime factor ending in 1 later in the string. pos() returns only the first match, so the result would still be correct, but we prefer to be rigorous here.) Once we have its position, we can remove the 1* part by splitting and concatenating.

We can also use the replace() function to replace it directly: =replace(C1,"=1*","="). For the same reason, replace =1* with =, rather than replacing 1* with an empty string.
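Both ways of removing the redundant 1* can be tried out in another language. The following is a Python sketch, not SPL; Python counts positions from 0, so the indices are not the same as in the SPL grid. The last line shows why the bare 1* should not be replaced: it also occurs inside factors that end in 1.

```python
expr = "7215=1*3*5*13*37"

# Locate "=1*", keep everything up to and including "=",
# then continue after the "1*".
i = expr.find("=1*")                    # 0-based position, -1 if absent
spliced = expr[:i + 1] + expr[i + 3:]

# The same result by replacing "=1*" with "=".
replaced = expr.replace("=1*", "=")

# Replacing the bare "1*" also damages factors ending in 1 (1331 = 11*11*11).
damaged = "1331=1*11*11*11".replace("1*", "")

print(spliced, replaced, damaged)
```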

SPL provides many string processing functions, which are not listed here one by one. You can check the help documents when necessary.

We just used == to compare strings. Of course, we can also use !=. Then, are the symbols like >, < meaningful to strings?

Yes.

In essence, computers can only process numerical values. Characters must therefore also be represented by numerical values, and the rule for doing so is called an encoding. As mentioned earlier, the encoding adopted by SPL is Unicode.

Since characters are numeric values, they can be compared like values. Strings are compared by character code, and the rule is very similar to that for sequences: the first characters of the two strings are compared first; if they differ, the string with the larger character is the larger one; if they are the same, the second characters are compared, and so on, until a difference is found or one of the strings runs out of characters. In fact, this is the real origin of the term dictionary order.
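The comparison rule can be written out explicitly. The following is a Python sketch, not SPL; the function name compare is made up for illustration, and it assumes, as in ordinary dictionary order, that a string that runs out of characters first is the smaller one.

```python
def compare(a, b):
    """Return -1, 0 or 1 by comparing character codes from left to right."""
    for x, y in zip(a, b):
        if ord(x) != ord(y):
            # The first differing character decides the result.
            return -1 if ord(x) < ord(y) else 1
    # All shared positions are equal: the shorter string is the smaller one.
    return (len(a) > len(b)) - (len(a) < len(b))

print(compare("abc", "abd"), compare("abc", "ab"), compare("x", "x"))
```

Python's built-in string comparison follows the same code-based rule, so compare(a, b) == -1 agrees with a < b.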

So, how do we know which of the two characters is bigger and which is smaller? Like “1” and “2”, or “1” and “A”?

We could simply write code to compare them. However, memorizing every code by rote is too tiresome, so let's see whether the codes follow any rules. SPL provides the asc() function, which returns the code of a character.

    A             B
1   =asc("0")     =9.(asc(string(~)))
2   =asc("A")     =asc("C")
3   =asc("a")     =asc("z")
4   =asc("集")    =asc("算")

We won’t explain them one by one; please execute the code to see the results. Here we state the coding rules directly:

1) The codes of “0”-“9” are continuous, from 48 to 57;

2) The codes of “A”-“Z” are continuous, from 65 to 90;

3) The codes of “a”-“z” are continuous, from 97 to 122;

4) The codes of Chinese characters are complex and follow no clear rule.

This coding standard was originally called ASCII, and Unicode extends ASCII, which is why the function is named asc().

As the reverse of asc(), SPL also has a char() function that converts a code into a character.
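The code ranges for digits and letters are the same in any Unicode-based language, so they can be checked outside SPL as well. The following is a Python sketch in which ord() and chr() stand in for asc() and char(); the variable names are made up for illustration.

```python
# Codes at both ends of each continuous range.
codes = {ch: ord(ch) for ch in "09AZaz"}

# Counterpart of 9.(asc(string(~))): the codes of "1" to "9".
digits = [ord(str(d)) for d in range(1, 10)]

# chr() reverses ord(), as char() reverses asc().
round_trip = chr(ord("C"))

print(codes, digits, round_trip)
```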

We can use the char() function and the rules just found to write a function: given two integers r and c, calculate the name of the cell in row r and column c in Excel.

    A              B                        C         D
1   >r=2,c=123
2   >rc=""         =1                       =26
3   for c>C2       >c-=C2                   >C2*=26   >B2+=1
4   >c-=1
5   for B2         >rc=char(65+c%26)+rc     >c\=26
6   return rc/r

The main difficulty in computing the cell name is computing the column name: first work out the number of letters needed (B2, computed in the loop in row 3), then compute the letter string for the column.
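The two steps can be ported almost line by line to another language. The following is a Python sketch, not SPL; the function name cell_name and the variable names are made up for illustration (letters plays the role of B2 and span the role of C2).

```python
def cell_name(r, c):
    """Return the Excel-style name of the cell in row r, column c (both from 1)."""
    letters, span = 1, 26
    # Step 1: find how many letters the column name needs,
    # skipping the 26 one-letter names, the 676 two-letter names, and so on.
    while c > span:
        c -= span
        span *= 26
        letters += 1
    c -= 1                                  # switch to counting from 0
    # Step 2: write c as a base-26 number with exactly `letters` digits.
    name = ""
    for _ in range(letters):
        name = chr(65 + c % 26) + name
        c //= 26
    return name + str(r)

print(cell_name(2, 123))
```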

Conversely, the asc() function can be used to inversely calculate the row number and column number from the cell name rc:

    A                                 B             C                      D
1   >rc="PG45"
2   =len(rc).(upper(mid(rc,~,1)))
3   >r=c=0                            =-1           =1
4   for A2                            if A4>="A"    >c=c*26+asc(A4)-65
5                                                   >B3+=C3                >C3*=26
6                                     else          >r=r*10+asc(A4)-48
7   return [r,c+B3+1]

The upper() function is used to capitalize letters.
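The reverse calculation can be sketched the same way. The following is a Python sketch, not SPL; the function name parse_cell and the variable names are made up for illustration (offset plays the role of B3 and span the role of C3).

```python
def parse_cell(name):
    """Return (row, column), both counted from 1, for a name such as 'PG45'."""
    r = c = 0
    offset, span = -1, 1
    for ch in name.upper():
        if ch >= "A":
            # A letter: extend the base-26 column value and count
            # the columns whose names are shorter than this one.
            c = c * 26 + ord(ch) - 65
            offset += span
            span *= 26
        else:
            # A digit: extend the decimal row number.
            r = r * 10 + ord(ch) - 48
    return r, c + offset + 1

print(parse_cell("PG45"))
```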

The rule for Excel cell names looks simple, but it is not easy to come up with the correct calculation logic. As we said earlier, program code does not solve the problem for us; it only helps us implement the solution.

