Global variable / Job variable / Lock

 

SPL variables have three scopes: local, global, and job. Variables are local by default, and a local variable is visible only within the SPL script that defines it (excluding the main sub thread code). Local variables are simple to use and are not covered in this article, which focuses on the other two scopes and the locks related to them.

Global variable

A global variable in SPL lives for the lifetime of the JVM (computing node / SPL service) on which it is set, and can be accessed by every SPL script that runs on that JVM.

The global variable is assigned with the env function:

>env(gv1,1) //gv1 is the name of the global variable

Access it directly by name in a script:

=gv1+2 // equal to 3

It can also be deleted:

>env(gv1)

Data shared by multiple scripts can be preloaded into memory when memory capacity permits, which avoids re-reading it on every access and improves performance.

When the service starts, the global variable is loaded in init.splx:


|   | A | B |
|---|---|---|
| 1 | =connect("orcl") | Connect to database |
| 2 | =A1.query@x("select OrderID,Client,SellerID,OrderDate,Amount from orders order by OrderID") | |
| 3 | =A2.index() | Establishing an index |
| 4 | >env(orders,A3) | Global variable assignment |

Other scripts can then reference it directly:


|   | A | B |
|---|---|---|
| 1 | =orders.select(OrderDate>=arg1 && OrderDate<arg2) | Direct reference to global variable |
| 2 | =A1.groups(Client; sum(Amount):s,count(1):c) | |
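For readers more familiar with general-purpose languages, the preload-then-query pattern can be sketched in Python. This is only an analogy, not SPL and not how esProc implements global variables. The sample rows and the names ORDERS, ORDERS_BY_ID, and summarize are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample rows standing in for the orders table.
ORDERS = [
    {"OrderID": 1, "Client": "ACME", "OrderDate": date(2024, 1, 5), "Amount": 100.0},
    {"OrderID": 2, "Client": "ACME", "OrderDate": date(2024, 2, 1), "Amount": 50.0},
    {"OrderID": 3, "Client": "Globex", "OrderDate": date(2024, 1, 20), "Amount": 70.0},
]

# Built once at startup and kept for the life of the process,
# like a global variable that holds an indexed table.
ORDERS_BY_ID = {row["OrderID"]: row for row in ORDERS}


def summarize(start, end):
    """Filter by date range [start, end), then total and count per client."""
    totals = defaultdict(lambda: {"s": 0.0, "c": 0})
    for row in ORDERS:
        if start <= row["OrderDate"] < end:
            entry = totals[row["Client"]]
            entry["s"] += row["Amount"]
            entry["c"] += 1
    # Return the groups ordered by client name.
    return dict(sorted(totals.items()))
```

Every call to summarize reuses the in-memory rows instead of querying the database again, which is the point of preloading.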


Job Variable

Sometimes a complex job is implemented by several cooperating scripts that need to share some information. Passing that information through script parameters and return values every time is cumbersome, and a simple alternative is to use a global variable.

However, a server may run many jobs concurrently. A global variable would be shared by all of them, which is often not what we want.

To this end, SPL provides job variables.

A job is a calculation initiated by one SPL script (or by SPL code that is not stored in a script) and carried out by multiple scripts executed through the call function. A single calculation carried out by multiple SPL scripts over the same JDBC connection is also considered a job.

Job variables are used in much the same way as global variables; add the @j option when assigning a value:

>env@j(jv1,1) //jv1 is the job variable name

Access it directly by name in a script:

=jv1+2 // equal to 3

It can also be deleted:

>env@j(jv1)
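The isolation between jobs can be illustrated with a rough Python analogy based on the standard contextvars module. This is not SPL and makes no claim about how SPL implements job variables. The names sub_script, main_script, and run_job are hypothetical.

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a job variable such as jv1.
jv1 = contextvars.ContextVar("jv1")


def sub_script():
    # Called by the main script; reads the job variable by name.
    return jv1.get() + 2


def main_script(arg):
    jv1.set(arg)         # plays the role of >env@j(jv1,arg)
    return sub_script()  # plays the role of a call to a sub script


def run_job(arg):
    # Each job runs in its own copy of the context, so its jv1 is private.
    return contextvars.copy_context().run(main_script, arg)


# Two jobs run concurrently with different parameters.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(run_job, [1, 10]))
```

Each job sees only the value it set itself, and code outside the jobs never sees jv1 at all. A plain module-level variable, by contrast, would be shared by every job.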

For example, the main script stores the user name it receives as a parameter in the job variable userID, and the sub scripts use userID in their calculations. In different jobs (main scripts executed with different parameters), userID holds different values that do not affect each other.

Main script


|   | A | B |
|---|---|---|
| 1 | | |
| 2 | >env@j(userID, arg) | Set userID as a job variable |
| 3 | =call("sub.splx") | Call the sub script |
| 4 | | |

sub.splx


|   | A | B |
|---|---|---|
| 1 | =connect("orcl") | |
| 2 | =A1.query@x("select deptID,deptName from account where userID=?",userID) | Use job variable |
| 3 | | |

Lock

When multiple threads read and write a shared resource at the same time, the results can be unpredictable. For example, a thread changes a shared variable from 1 to 2 and then adds 3, expecting 5. If another thread sets the variable to 4 just before the addition, the first thread adds 3 to 4 and gets the incorrect result 7. Locks are used to keep concurrent reads and writes correct. A script can lock a shared resource before writing to it. Any other script that tries to read or write the resource in the meantime must wait until the first script unlocks it, or give up after waiting for a period of time.
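The interleaving described above can be reproduced deterministically in a short Python sketch. It is an analogy using Python's threading module, not SPL. The function first_thread_result and the two events that force the unlucky ordering are hypothetical names introduced for the demonstration.

```python
import contextlib
import threading


def first_thread_result(use_lock):
    """Return what the first thread sees after 'set to 2, then add 3'."""
    shared = {"value": 1}
    lock = threading.Lock()
    # Without a lock, the guard is a do-nothing context manager.
    guard = lock if use_lock else contextlib.nullcontext()
    wrote_two = threading.Event()   # first thread has written 2
    wrote_four = threading.Event()  # second thread has written 4
    result = {}

    def first():
        with guard:
            shared["value"] = 2
            wrote_two.set()
            if not use_lock:
                # Force the bad interleaving: the other write slips in here.
                wrote_four.wait()
            shared["value"] += 3
            result["first"] = shared["value"]

    def second():
        wrote_two.wait()
        with guard:  # with a lock, this blocks until first() is done
            shared["value"] = 4
        wrote_four.set()

    threads = [threading.Thread(target=first), threading.Thread(target=second)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result["first"]
```

Calling first_thread_result(False) yields 7, the incorrect result from the text, while first_thread_result(True) yields the expected 5 because the second thread cannot write until the first has finished.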

For example, the global variable m records how many times a file has been accessed. The file is accessed concurrently by multiple scripts, so locks are needed to avoid read/write conflicts.


|   | A | B |
|---|---|---|
| 1 | =file("d:/Orders.csv") | Access file |
| 2 | >lock(10) | Lock, lock name is 10 |
| 3 | =m=m+1 | Access number plus 1 |
| 4 | >lock@u(10) | Unlock |
| 5 | =A1.import@t().select(Amount>arg1 && Amount<arg2) | |


Suppose two threads execute this script at the same time, with an initial value of m=0. Without the lock, both threads may read m=0 and compute 0+1=1 simultaneously, so the final value is 1, which is obviously incorrect. With the lock, the two threads must execute m+1 one after the other: the thread that locks first computes 0+1=1 and releases the lock, and the second thread then sees m=1 and computes m+1=2, which is correct.

Sometimes the previous thread fails to release the lock (or is even deadlocked), and a waiting thread cannot wait indefinitely. In this case, use the second parameter of the lock function, the timeout, which makes the thread wait at most N milliseconds. If the thread holding the lock unlocks within that time, lock returns the lock name, and the thread has locked the resource and can use it normally. If the lock is still held after N milliseconds, lock returns 0, and the thread should treat this as an exception.
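The timeout behaviour can be sketched in Python as an analogy. This is not SPL. The helpers lock and unlock below are hypothetical and only mimic the return convention described in the text (the lock name on success, 0 on timeout), and record_access is likewise an illustrative name.

```python
import threading

_locks = {}                         # lock name -> threading.Lock
_registry_guard = threading.Lock()  # protects the _locks dictionary


def lock(name, timeout_ms=None):
    """Acquire the named lock; return name on success, 0 on timeout."""
    with _registry_guard:
        named = _locks.setdefault(name, threading.Lock())
    if timeout_ms is None:
        named.acquire()             # wait indefinitely
        return name
    return name if named.acquire(timeout=timeout_ms / 1000) else 0


def unlock(name):
    """Release the named lock."""
    _locks[name].release()


counter = {"m": 0}


def record_access(timeout_ms=100):
    """Increment the access counter, giving up if the lock stays held."""
    if lock(10, timeout_ms) == 0:
        # Treat a timeout as an exception instead of updating unprotected.
        raise TimeoutError("lock 10 was not released in time")
    try:
        counter["m"] += 1
    finally:
        unlock(10)                  # always release, even on error
```

While another caller holds lock 10, record_access(timeout_ms=50) raises TimeoutError after about 50 milliseconds and leaves the counter unchanged. Once the lock is released, the next call increments the counter as usual.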