Privileged Data Retrieval from HTTP-based Websites
SPL (Structured Process Language), the language esProc is based on, provides the httpfile function for reading data from HTTP-based websites. For security reasons, some servers verify and authenticate a user’s identity before granting access. Most verification/authentication methods fall into two categories.
In the first category, after a user enters the required information on the verification/authentication page, the server records the verified/authenticated user information in a Session and sends the Session ID back to be stored in Cookies on the client side, or it sends the user information to Cookies along with the Session ID. When the user later visits a web page to obtain privileged data, the information stored in Cookies is read and placed in the request header so that the server can verify/authenticate the user’s identity and decide whether to allow access.
In the second category, after a user enters the required information on the verification/authentication page, the server returns an access token. The client passes this token back whenever it accesses privileged data on a web page during the token’s validity period.
Now let’s look at how SPL performs verification/authentication and enables users to access privileged data.
1. The server saves identity information in Session or Cookie
For example, RaqForum only allows verified/authenticated Raqsoft employees to access a certain internal section.
| | A |
|---|---|
| 1 | =httpfile("http://c.raqsoft.com.cn/article/1628656263716") |
| 2 | =A1.read() |
The above script reads a post directly, but A2 returns a message saying that you are not authorized to access it, so the data retrieval fails. The script below performs verification/authentication before retrieving data:
| | A |
|---|---|
| 1 | =httpfile("http://c.raqsoft.com.cn/login4get?nameOrEmail=tom&userPassword=900150983CD24FB0D6963F7D28E17F72&rememberLogin=true") |
| 2 | =A1.read() |
| 3 | =A1.property("Set-Cookie") |
| 4 | =httpfile("http://c.raqsoft.com.cn/article/1628656263716";"Cookie":A3) |
| 5 | =A4.read() |
A1 Define an httpfile object that visits the RaqForum login interface. The password passed in is the MD5 hash of the original password, written as a hexadecimal string.
A2 Read the content returned from the verification/authentication page to complete the verification/authentication process. Generally, the returned content indicates whether the verification/authentication succeeded or failed.
A3 Read the value of the Set-Cookie property from the response header of the verification/authentication request. This is the value that would be written to the client-side Cookie.
A4 Define the httpfile object for accessing posts in the internal section and place A3’s content in the Cookie request header.
A5 Read a post from the internal section and return the desired data.
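The HTTP mechanics behind A1 to A4 can also be sketched outside SPL. The Python sketch below is not how httpfile is implemented; it only illustrates, with the standard library, how the login query string in A1 is formed and how Set-Cookie response values relate to a Cookie request header. The helper names and the sample cookie values are hypothetical. The plain-text password abc is used because its MD5 hash is the value that appears in A1.

```python
import hashlib
from http.cookies import SimpleCookie
from urllib.parse import urlencode


def md5_hex_upper(password):
    """Return the uppercase hexadecimal MD5 digest of a password."""
    return hashlib.md5(password.encode("utf-8")).hexdigest().upper()


def login_url(base, name, password):
    """Build a GET login URL with the parameter names used in A1."""
    query = urlencode({
        "nameOrEmail": name,
        "userPassword": md5_hex_upper(password),
        "rememberLogin": "true",
    })
    return f"{base}?{query}"


def cookie_header(set_cookie_values):
    """Keep only name=value pairs; attributes such as Path are dropped."""
    pairs = []
    for raw in set_cookie_values:
        jar = SimpleCookie()
        jar.load(raw)
        for name, morsel in jar.items():
            pairs.append(f"{name}={morsel.value}")
    return "; ".join(pairs)


url = login_url("http://c.raqsoft.com.cn/login4get", "tom", "abc")

# Hypothetical Set-Cookie values, as a server might return them after login.
header = cookie_header([
    "SESSION=abc123; Path=/; HttpOnly",
    "rememberMe=tom; Path=/",
])
```

A browser sends back only the name=value pairs, not attributes such as Path or HttpOnly. Whether a given server also tolerates the raw Set-Cookie string in the Cookie header, as passed in A4, depends on that server.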
The visitor needs to know the login interface in advance; the data provider normally describes it in a dedicated document. In this case the data is submitted using the GET method. Some interfaces require submission using the POST method or in JSON format. The SPL httpfile function supports both.
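To make the difference between these submission styles concrete, here is a minimal Python sketch that builds, but does not send, the three kinds of request. The endpoint URL and the parameter values are hypothetical, and the sketch shows only where the parameters travel in each case, not how httpfile handles them internally.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and parameters, for illustration only.
LOGIN = "https://example.com/login"
params = {"userId": "abc", "password": "secret"}

# GET: parameters travel in the query string; there is no request body.
get_req = Request(f"{LOGIN}?{urlencode(params)}")

# POST form: parameters travel URL-encoded in the request body.
form_req = Request(
    LOGIN,
    data=urlencode(params).encode("utf-8"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

# POST JSON: parameters travel as a JSON document in the request body.
json_req = Request(
    LOGIN,
    data=json.dumps(params).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
```

In all three cases the parameters are the same; only their location and encoding change, which is why the data provider’s documentation has to state which style the login interface expects.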
2. The server returns a Token
Below is the general verification/authentication process for this type of method:
| | A |
|---|---|
| 1 | =httpfile("https://xxxxxx","{\"userId\":\"abc\",\"password\":\"sdfikje87kd908\"}";"Content-Type":"application/json") |
| 2 | =A1.read() |
| 3 | =json(A2).accessToken |
| 4 | =httpfile("https://xxxxxx","{\"accessToken\":\""+A3+"\",\"other\":\"xxx\"}";"Content-Type":"application/json") |
| 5 | =A4.read() |
A1 Define an httpfile object that visits the login interface. The example script above submits the username, password and other parameters in JSON format. In real-world scenarios, use whatever submission method the server requires.
A2 Read the content returned from the verification/authentication page to complete the verification/authentication process. Generally, the returned content indicates whether the verification/authentication succeeded or failed. If it succeeded, the content also includes the accessToken, its validity period, and other related information.
A3 Assuming the returned content is in JSON format, parse it into a JSON object and get the accessToken value.
A4 Define the httpfile object for a privileged web page. Here we assume that the server requires parameters to be passed in JSON format. Assign A3’s content to the accessToken parameter and pass any other necessary parameters.
A5 Read the desired data.
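The token handling in A3 and A4 can be illustrated with a short Python sketch. It assumes, as the script does, that the login response is JSON with an accessToken field; the sample response, the expiresIn field, and the function names are hypothetical. No request is sent.

```python
import json


def extract_token(response_text):
    """Parse a JSON login response and return its accessToken field."""
    payload = json.loads(response_text)
    token = payload.get("accessToken")
    if not token:
        raise ValueError("login response contains no accessToken")
    return token


def build_request_body(token, **other):
    """Serialize the token and any other parameters as a JSON body."""
    return json.dumps({"accessToken": token, **other})


# Hypothetical login response; real field names depend on the server.
sample = '{"accessToken": "t-123", "expiresIn": 3600}'

token = extract_token(sample)
body = build_request_body(token, other="xxx")
```

Building the body with a JSON serializer rather than string concatenation means that quotes or backslashes in a token would be escaped correctly. Checking for a missing accessToken separates a failed login from a successful one before the privileged page is requested.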
Other verification/authentication methods, if any, will be discussed in subsequent essays.