Excel Post-grouping Calculations
Excel supports grouping & aggregation on data, as well as offering pivot tables to perform multidimensional analysis. But Excel methods and operations are too simple to handle complicated tasks or manage them conveniently. Such tasks include getting ranks on each subset, and filtering and sorting by aggregates. In this article, we’ll give examples of post-grouping calculations, analyze it and provide SPL code solution. SPL (Structured Process Language) is the language used by esProc, the specialized data computing engine. It’s convenient to use for handling structured computations.
I Intra-grouping sorting
Below is the student scores table. We want to get rankings for each subject. To do this we need to first group records by subject and then perform sorting on each group.
Below is the Excel data:
The expected result:
A SPL script interacts with Excel through clipboard. We edit an SPL script in esProc designer, paste the to-be-analyzed Excel data to the clipboard, execute the script, and then paste result to Excel.
SPL provides stepwise coding in an easy way:
A |
B |
|
1 |
=clipboard().import@t() |
/ Import student scores table with column headers |
2 |
=A1.group(Subject) |
/ Group records by Subject |
3 |
=A2.(~.sort(Score:-1)) |
/ Sort each group by score in ascending order |
4 |
=A3.(~.derive(#:Rank)) |
/ Add a Rank field for each group of records |
5 |
=A4.conj() |
/ Concatenate groups of sorted records |
6 |
=A5.new(Subject,Rank,Score,Name) |
/ Create a new table sequence with fields of different orders |
7 |
=clipboard(A6.export@t()) |
/ Paste the sorting result to clipboard |
Then we just paste the result to Excel to get the desired table.
To get records containing only the top 3 scores only for each subject, we just use top function to get the eligible records from each group.
The expected result:
SPL uses top function to get records of top 3 scores:
A |
B |
|
1 |
=clipboard().import@t() |
/ Import student scores table with column headers |
2 |
=A1.group(Subject) |
/ Group records by Subject |
3 |
=A2.(~.top(-3;Score)) |
/ Get the top 3 scores by Score from each group |
4 |
=A3.(~.derive(#:Rank)) |
/ Add a Rank field for each group of records |
5 |
=A4.conj() |
/ Concatenate groups of sorted records |
6 |
=A5.new(Subject,Rank,Score,Name) |
/ Create a new table sequence with fields of different orders |
7 |
=clipboard(A6.export@t()) |
/ Paste the result to clipboard |
Then we just paste the result to Excel to get the desired table.
II Sorting by sub-aggregate
Another scenario is to sort groups according to the sub-aggregates, such as ranking students by their total scores based on the student scores table. The expected result is as follows:
To do this, SPL uses groups function to group records while calculating total for each group:
A |
B |
|
1 |
=clipboard().import@t() |
/ Import student scores table with column headers |
2 |
=A1.groups(Name;sum(Score):Total) |
/ Group records by name and calculate each student’s total score |
3 |
=A2.sort(Total:-1) |
/ Sort total scores in descending order |
4 |
=A3.derive(#:Rank) |
/ Add a Rank field |
5 |
=clipboard(A4.export@t()) |
/ Export the sorting result and paste it to clipboard |
III Intra-group filtering
This time we want to find students whose performances are not satisfactory. Since exams of subjects are different in difficulty levels, it’s unsuitable to get them by Fail. Instead, we need to find those whose scores are below the average for each subject.
The expected result:
So we group records by subject, calculate the average of each subject, and find students whose scores are below the average.
SPL script:
A |
B |
|
1 |
=clipboard().import@t() |
/ Import student scores table with column headers |
2 |
=A1.group(Subject) |
/ Group records by Subject |
3 |
=A2.((AVG=~.avg(Score),~.select(Score<AVG))) |
/ Calculate average of each group, and find students whose scores are below the average |
4 |
=A3.conj() |
/ Concatenate records of eligible students |
5 |
=clipboard(A4.export@t()) |
/ Export the result and paste it to clipboard |
IV Filtering by sub-aggregate
In another scenario, we want to group records by name, calculate total of each group, and find students whose total scores are below the average total.
The expected result:
To perform an operation, filtering here, over the subtotals, we use groups function to group records, calculate the subtotals, get the average total and find the eligible students.
SPL script:
A |
B |
|
1 |
=clipboard().import@t() |
/ Import student scores table with column headers |
2 |
=A1.groups(Name;sum(Score):Total) |
/ Group records by name and calculate total score for each student |
3 |
=A2.select(Total<A2.avg(Total)) |
/ Calculate average total score and find students whose scores are below the average |
4 |
=clipboard(A3.export@t()) |
/ Export the result and paste it to clipboard |
V Intra-group percentage calculation
The following Excel sheet stores GDP values of some countries in 2019. The task is to group records by continent and calculate the percentage of each country’s GDP in its continent.
Below is the 2019 GDP table (Unit: $billion):
The expected result:
We can group records by Area, sum the total GDP of each continent, and then calculate the percentage of each country’s GDP in every group.
SPL script:
A |
B |
|
1 |
=clipboard().import@t() |
/ Import 2019 GDP table with column headers from clipboard |
2 |
=A1.group(Area) |
/ Group records by Area |
3 |
=A2.((SUM=~.sum(GDP),~.derive(string(GDP/SUM,"0.00%"):Percentage))) |
/ Sum the total GDP of each group and calculate percentage of each country in it |
4 |
=A3.conj() |
/ Concatenate result records of all groups |
5 |
=clipboard(A4.export@t()) |
/ Export the result and paste it to clipboard |
VI Calculating percentage of each group
Let’s look at how to calculate the percentage of each continent’s GDP in the global total.
The expected result:
Here we also use groups function to group records, calculate GDP of each continent, and then get the percentage of each continent’s GDP in the global total.
SPL script:
A |
B |
|
1 |
=clipboard().import@t() |
/ Import 2019 GDP table with column headers from clipboard |
2 |
=A1.groups(Area;sum(GDP):Total) |
/ Group records by Area and calculate total GDP of each continent |
3 |
=A2.derive(string(Total/A2.sum(Total),"0.00%"):Percentage) |
/ Calculate percentage of each continent’s GDP in the global total |
4 |
=A3.run(Total=string(Total,"0.00")) |
/ Adjust the display accuracy of Total value |
5 |
=clipboard(A4.export@t()) |
/ Export the result and paste it to clipboard |
Read Excel Grouping & Aggregation Examples to learn more analyses and solutions about Excel grouping & aggregation.
Find more agile computing examples in SPL Cookbook.
SPL Official Website 👉 https://www.scudata.com
SPL Feedback and Help 👉 https://www.reddit.com/r/esProcSPL
SPL Learning Material 👉 https://c.scudata.com
SPL Source Code and Package 👉 https://github.com/SPLWare/esProc
Discord 👉 https://discord.gg/cFTcUNs7
Youtube 👉 https://www.youtube.com/@esProc_SPL
Chinese version