Hoo RaqForum 19 No.
1 Reply • 837 View • 4 Years ago

How to Call a Python Program from SPL

Though esProc is a powerful computing engine, it is not well suited to machine learning algorithms, which is where Python excels. esProc therefore offers the YM external library, which lets an esProc SPL program call a Python program.

We’ll illustrate how to call a Python program from SPL in three aspects:

1. Standards and requirements in Python module development;

2. Interface call using ym_exec;

3. Uses of model building algorithm module.

The diagram shows relationships between the SPL program, the interface and the Python program:

[Diagram: SPL program, ym_exec interface, Python apply() interface, Python program]

The SPL program calls the ym_exec interface, which passes its parameters to the Python apply() interface. apply() then runs the Python program and returns the result to SPL.
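To exercise this call chain without esProc, the Python side can be emulated locally. The following is a minimal sketch and not the actual ym_exec implementation: the helper name call_apply is hypothetical, and it assumes only what the text states, namely that the called file exposes apply() taking a single list of parameters.

```python
import importlib.util


def call_apply(pyfile, *params):
    """Load the module at pyfile and call its apply() with params as a list.

    Hypothetical local stand-in for the SPL-side call ym_exec(pyfile, p1, p2, ...).
    """
    # Build a module object directly from the file path, without touching sys.path
    spec = importlib.util.spec_from_file_location("spl_called_module", pyfile)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    # apply() receives all parameters as one list, as described above
    return module.apply(list(params))
```

Calling call_apply(path_to_module, "AAA", "BBB", 1000) then returns whatever that module's apply() returns, which makes it easy to check a module before wiring it into an SPL script.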

1. Standards and requirements in Python module development

A. The def apply(ls) interface runs the Python program and returns its result to the SPL program.

B. The list-type parameter ls plays the same role as the parameter argv in the Java entry point void main(String argv[]).
C. The return value is a DataFrame stored in a list-type variable; it can be viewed in SPL.
D. Below is a sample Python module (demo.py):

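import sys

import pandas as pd


def apply(lists):
    # Convert every incoming parameter to a string so the column has a single type
    cols = ["value"]
    ls = []
    for x in lists:
        ls.append("{}".format(x))
    df = pd.DataFrame(ls, columns=cols)
    # The DataFrame is returned to SPL inside a list
    lls = []
    lls.append(df)
    return lls


if __name__ == "__main__":
    # Local test: pass the command-line arguments to apply()
    res = apply(sys.argv[1:])
    print('res={}'.format(res))
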
Execution: python demo.py "AAA" "BBB" 1000
Output:
res=[   value
0    AAA
1    BBB
2   1000]

The apply() interface appends each passed-in parameter to the list ls, builds a DataFrame from ls, and places that DataFrame in the list lls, which it returns. Test apply() in Python first to make sure it works; after that it can be called from the SPL program.

Note: The DataFrame is returned in msgpack format. This requires all data in the same column to be of the same type; otherwise msgpack serialization fails and SPL does not receive the DataFrame.
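The same-type requirement can be checked before a result is handed back. Below is a minimal sketch that uses plain lists of dicts as a stand-in for a DataFrame so that it runs without pandas; the function names mixed_type_columns and stringify_columns are hypothetical, and converting offending columns to strings is one possible remedy (the one demo.py itself uses), not a rule from esProc.

```python
def mixed_type_columns(rows):
    """Return names of columns whose non-None values have more than one type.

    rows is a list of dicts, one dict per record (a stand-in for a DataFrame).
    """
    seen = {}
    for row in rows:
        for name, value in row.items():
            if value is not None:
                seen.setdefault(name, set()).add(type(value))
    return sorted(name for name, types in seen.items() if len(types) > 1)


def stringify_columns(rows, names):
    """Return a copy of rows with the given columns converted to str."""
    return [
        {k: (str(v) if k in names else v) for k, v in row.items()}
        for row in rows
    ]


rows = [{"id": 1, "value": "AAA"}, {"id": 2, "value": 1000}]
bad = mixed_type_columns(rows)        # the "value" column mixes str and int
fixed = stringify_columns(rows, bad)  # every "value" entry is now a str
```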

2. Interface call using ym_exec
Format: ym_exec(pyfile, p1, p2, …)

This esProc interface function calls and executes the py file pyfile with the passed-in parameters p1, p2 and so on. The number of parameters varies according to what the file's apply() interface expects.
The interface works with the esProc external library pythonCli. The external library connects to a Python program through userconfig.xml, whose configuration is explained below.

A. Install Python:
Download Python 3 (Python 3.7 in this example) and install it in, for example, c:\Program Files\raqsoft\yimming\Python37.
B. Install esProc external library:
By default the external library is installed in esProc\extlib\pythonCli. After installation, select pythonCli on the Select external libraries tab.

C. Configuring parameters:

Configure parameters in userconfig.xml under esProc’s external library directory (esProc\extlib\pythonCli\userconfig.xml):

Parameter	Value	Description
sAppHome	C:\Program Files\raqsoft\yimming	Application directory
sPythonHome	c:\Program Files\raqsoft\yimming\Python37\python.exe	Python executable
sPythonHost	localhost	IP address
iPythonScriptPort	8512	Port number

The application is the Python service-side application.

Once configuration is complete, restart esProc to use the ym_exec() interface.
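If ym_exec() does not respond after the restart, a quick connectivity check against the configured host and port can help narrow the problem down. This is a minimal sketch under the assumption that the Python service accepts plain TCP connections on sPythonHost:iPythonScriptPort; the source only lists the host and port, so the protocol is an assumption, and the function name service_reachable is hypothetical.

```python
import socket


def service_reachable(host="localhost", port=8512, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Defaults mirror the sPythonHost and iPythonScriptPort values shown above.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names
        return False
```

A False result only says that nothing accepted a connection on that port; it does not show whether userconfig.xml itself is correct.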

To call demo.py, for example:

	A
1	=ym_env()
2	=ym_exec("d:/demo.py", false, 12345, 10737418240, 123.45, decimal(1234567890123456), "aaa 123")
3	>ym_close(A1)

Result:

	value
1	False
2	12345
3	10737418240
4	123.45
5	1234567890123456
6	aaa 123
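The result table follows directly from the string conversion in demo.py. The sketch below reproduces it in plain Python; the Python-side types chosen for the SPL arguments (bool, int, float, Decimal, str) are an assumption made for illustration, since the text does not state how ym_exec maps SPL values.

```python
from decimal import Decimal

# Assumed Python-side equivalents of the SPL arguments passed in cell A2
params = [False, 12345, 10737418240, 123.45,
          Decimal("1234567890123456"), "aaa 123"]

# Same conversion as in demo.py's apply(): every value becomes a string,
# so the single "value" column has one type
values = ["{}".format(x) for x in params]
print(values)
```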

3. Uses of model building algorithm module

To call a Python Partial Least Squares algorithm (PLS, which esProc doesn’t offer) in SPL, first install the Yimming External Library. The configuration guide can be found in SPL Smart Modeling and Scoring.

The PLS algorithm takes many parameters. To make the call convenient, we use the following invocation format:

ym_exec(pyfile, data, jsonstr)

The SPL program calls and executes pyfile; data is the table sequence on which the model is built; the algorithm’s many parameters are written as a JSON string and passed as the parameter jsonstr. Make sure the parameters match what pyfile’s apply() interface handles so that they are parsed correctly.

data: A table sequence with column headers, or the name of a data file, on which the model is built and scoring is performed. It includes the column holding the target variable (target).

jsonstr: A JSON string. For example:
{target:0,n_components:3,deflation_mode:'regression',
mode:'A',norm_y_weights:False,
scale:False,algorithm:'nipals',
max_iter:500,tol:0.000001,copy:True}
target is required; it specifies the column holding the target variable.
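The parameter string above is not strict JSON: keys are unquoted, strings use single quotes, and booleans are written True/False. The article's own module decodes it with demjson; as an alternative, here is a minimal standard-library sketch. The function name parse_params is hypothetical, and the key-quoting regex is a simplification that assumes no string value contains a comma followed by a word and a colon.

```python
import ast
import re

# Matches an unquoted key that follows "{" or "," and precedes ":"
_KEY = re.compile(r'([{,]\s*)([A-Za-z_]\w*)\s*:')


def parse_params(jsonstr):
    """Parse a relaxed parameter string such as {target:0,mode:'A',copy:True}."""
    # Quote the keys so the text becomes a valid Python dict literal
    quoted = _KEY.sub(r'\1"\2":', jsonstr)
    params = ast.literal_eval(quoted)
    if not isinstance(params, dict):
        raise ValueError("parameter string must describe an object")
    if "target" not in params:
        raise ValueError("param target is not set")
    return params


params = parse_params(
    "{target:0,n_components:3,deflation_mode:'regression',"
    "mode:'A',norm_y_weights:False,scale:False,algorithm:'nipals',"
    "max_iter:500,tol:0.000001,copy:True}"
)
```

Unlike lowercasing the whole string before decoding, this keeps the case of values such as 'A' and yields real booleans rather than strings.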

SPL script (pls_demo.dfx):

	A	B
1	=ym_env()
2	="d:/script/pls_demo.py"
3	=file("d:/script/data_test.csv").import@cqt()	//Data file
4	{target:0,n_components:3,deflation_mode:'regression', mode:'A',norm_y_weights:False }	//The first column is the target variable and parameters are written in JSON format
5	=ym_exec(A2, A3, A4)
6	>ym_close(A1)

The data file (data_test.csv) where the first column is the target variable:

0	1	2	3	4	5	6	7	8	9
181.6	-0.00182	-0.00796	-0.00748	-0.00286	0.004846	0.015545	0.028104	0.039865	0.046408
154.5	-0.00102	-0.00789	-0.00795	-0.00361	0.004065	0.015055	0.028321	0.041063	0.048227
195	0.001206	-0.00464	-0.00404	0.000681	0.008794	0.020834	0.036321	0.051656	0.059063
150.8	-0.00154	-0.00802	-0.00768	-0.0028	0.00554	0.01712	0.03072	0.043453	0.050239
…

A sample Python algorithm module (pls_demo.py):

from scipy.linalg import pinv2  # removed in SciPy 1.9; newer versions provide scipy.linalg.pinv
import numpy as np
import pandas as pd
import demjson


# Algorithm class pls_demo
class pls_demo():
    . . . . . . .
    pass


# Interface implementation
def apply(lists):
    if len(lists) < 2:
        return None
    data = lists[0]  # data parameter
    val = lists[1]   # jsonstr string parameter
    if isinstance(data, str):
        # A file name was passed instead of a table: load it
        data = pd.read_csv(data)

    # 1. Handle special values in JSON strings
    # Note: lower() also lowercases string values such as mode:'A',
    # and False/True/None are turned into quoted strings
    # print(val)
    val = val.lower().replace("false", "'False'")
    val = val.replace("true", "'True'")
    val = val.replace("none", "'None'")
    dic = demjson.decode(val)
    if 'target' not in dic:
        print("param target is not set")
        return None

    # 2. Handle parameter target, which is either a column index or a column name
    targ = dic['target']
    if isinstance(targ, int):
        colname = data.columns.tolist()[targ]
    else:
        colname = targ
    Y = data[colname]
    X = data.drop(colname, axis=1)

    # 3. Handle model building parameters; set defaults for those not passed in
    n_components = dic.get('n_components', 15)
    deflation_mode = dic.get('deflation_mode', "regression")
    mode = dic.get('mode', "A")
    …….

    # 4. Load algorithm module
    # print("n_components={}".format(n_components))
    pls_model = pls_demo(n_components,
                         deflation_mode,
                         mode,…)
    # Training data
    pls_model.fit(X, Y)
    # Scoring
    y_pred = pls_model.predict(X)

    # 5. Append return value
    f = ["value"]
    df = pd.DataFrame(y_pred, columns=f)
    # print(y_pred)
    lls = []
    lls.append(df)
    return lls


# 6. Test
if __name__ == '__main__':
    ls = []
    ls.append("a2ef764c53ec1fbc_X.new.csv")
    val = "{target:0,n_components:3,deflation_mode:'regression'," \
          "mode:'a',norm_y_weights:False," \
          "scale:False,algorithm:'nipals'," \
          "max_iter:500,tol:0.000001,copy:True}"
    ls.append(val)
    apply(ls)
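Steps 2 and 3 of the listing (resolving target and filling in defaults) can be tested on their own, without pandas or the elided algorithm class. The following is a minimal sketch: the helper names with_defaults and resolve_target are hypothetical, and DEFAULTS contains only the three defaults visible in the listing, since the remaining parameters are elided there.

```python
# Defaults taken from the listing above; further parameters are elided there
DEFAULTS = {"n_components": 15, "deflation_mode": "regression", "mode": "A"}


def with_defaults(params, defaults=DEFAULTS):
    """Return params with missing keys filled in from defaults."""
    return {**defaults, **params}


def resolve_target(columns, target):
    """Map target (a column position or a column name) to a column name."""
    if isinstance(target, bool):
        # bool is a subclass of int in Python, so reject it explicitly
        raise TypeError("target must be an int position or a column name")
    if isinstance(target, int):
        return columns[target]
    if target not in columns:
        raise KeyError(target)
    return target


merged = with_defaults({"target": 0, "n_components": 3})
colname = resolve_target(["0", "1", "2"], merged["target"])
```

Merging dictionaries keeps all defaults in one place, so adding a parameter means adding one entry rather than another if/else pair.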
