SPL: recursively union field values

 

We may use data file with multiple layers apart from common two-dimension data table, for example, the JSON data file which stores sales records. It may be constructed with multiple layers by time, place, and other dimensions, and each layer of the structure has sales fields. Recursion helps to get all the field values of sales.

How to recursively union field values with multi-layer data is shown in the specific examples.

 

[e.g.1] Calculate the gross sales of 2014 based on the sales data in JSON format, and some of the data are as follows:

   [

      {"YEAR":2013,"MONTH":7,"SALES": [

         {"ID":10248,"CUSTOMERID":"VINET", … ,"SALES":2440},

         {"ID":10249,"CUSTOMERID":"TOMSP", … ,"SALES":1863.4},

         {"ID":10250,"CUSTOMERID":"HANAR", … ,"SALES":1813.0},

         …

      ]},

      {"YEAR":2013,"MONTH":8,"SALES": [

         {"ID":10270,"CUSTOMERID":"WARTH", … ,"SALES":1376.0},

         {"ID":10271,"CUSTOMERID":"SPLIR", … ,"SALES":48.0},

         {"ID":10272,"CUSTOMERID":"RATTC", … ,"SALES":1456.0},

         …

      ]},

      …

   ]

 

The A.field@r() function in SPL gets the field values recursively.

The SPL script is as follows:

IMG_256

A

1

=json(file("sales.json").read())

2

=A1.select(YEAR==2014)

3

=A2.field@r("SALES")

4

=A3.sum(~.sum())

A1: import the JSON data file.

A2: select the sales records of 2014.

A3: use A.field@r() function to recursively get all of the sales field values.

A4: loop to calculate the sales of each month in 2014, and then calculate the gross sales.

 

[e.g.2] Count the number of confirmed cases worldwide based on the JSON data of confirmed COVID-19 cases around the world at some point. Some of the data are as follows:

   [

      {Region:"USA",Confirmed:[

         {Region:"California",Confirmed:3671902},

         {Region:"New York",Confirmed:1139740},

         {Region:"Virginia",Confirmed:620801},

         {Region:"Florida",Confirmed:2064525},

         …

      ]},

      {Region:"Brazil",Confirmed:[…]},

      {Region:"India",Confirmed: […]},

      {Region:"France",Confirmed: […]}

      …

   ]

In this example, some nations count the number of country, some count the number of every state or province, and some of every city. In this case, recursive union should be adopted rather than simply looping the field values to get a sum. SPL offers A.conj@r() function to recursively union the sequence members.

 

The SPL script is as follows:

IMG_256

A

1

=json(file("COVID-19.json").read())

2

=A1.field@r("Confirmed")

3

=A2.conj@r().sum()

A1: import the JSON data file.

A2: use A.field@r() function to recursively get all of the confirmed field values.

A3: use A.conj@r() function to recursively union all confirmed numbers and then get the sum.