Intra-group Inter-row Calculation

Problem description & analysis

data.xlsx records dates when events under different users happen. The data is ordered by USER ID and DATE. Below is part of the file:

A

B

C

1

USER   ID

EVENT   ID

DATE       

2

1

4

2020-01-01

3

1

5

2020-01-05

4

1

6

2020-01-13

5

2

7

2020-01-03

6

2

8

2020-01-05

7

2

9

2020-01-06

We want to add a calculated column isTrue according to a specific rule. For the same USER ID, the first event is always true; from the second event on, if the interval between this event and the last true event is 10 or more days, the event is true, otherwise it is false. The expected result should be as follows:

A

B

C

D

1

USER   ID

EVENT   ID

DATE       

isTrue

2

1

4

2020-01-01

1

3

1

5

2020-01-05

0

4

1

6

2020-01-13

1

5

2

7

2020-01-03

1

6

2

8

2020-01-05

0

7

2

9

2020-01-06

0

The task requires inter-row calculations on each group, which involves value assignment to the temporary variable and its change.

Solution

Below is the esProc script:

A

1

=file("data.xlsx").xlsimport@t()

2

=A2.derive(if('USER   ID'!='USER   ID'[-1],(t=DATE,1),if(DATE-t>=10,(t=DATE,1),0)):isTrue)

3

=file("result.xlsx").xlsexport@t(A2)

A2: If USER ID in the current row is different from that in the last row, the current event is the first one in its group. In this case, assign 1 to isTrue and the temporary value t is equal to the current DATE. If USER ID in the current row is the same as that in the last row, an intra-group calculation is needed, which uses the temporary variable and the current DATE to calculate the interval and assign 1 to isTrue if the result is equal to or greater than 10, and modify the value of temporary variable t to the current DATE.

Q & A Collection

 https://stackoverflow.com/questions/63838362/date-diff-between-many-rows-in-excel