Store accumulator-stage lookups directly #10645
Open
charlesmyu wants to merge 3 commits into master
Benchmarks

Startup
Summary: Found 0 performance improvements and 0 performance regressions. Performance is the same for 66 metrics; 5 metrics are unstable.
[Gantt charts, candidate=1.61.0-SNAPSHOT~4c54b6b488 vs. baseline=1.61.0-SNAPSHOT~cc122288e5: insecure-bank global startup overhead; insecure-bank breakdown per module; petclinic global startup overhead; petclinic breakdown per module]
Load
Summary: Found 0 performance improvements and 2 performance regressions. Performance is the same for 18 metrics; 16 metrics are unstable.
[Gantt charts, candidate=1.61.0-SNAPSHOT~4c54b6b488 vs. baseline=1.61.0-SNAPSHOT~cc122288e5: petclinic request duration (CI 0.99); insecure-bank request duration (CI 0.99)]
Dacapo
Summary: Found 1 performance improvement and 0 performance regressions. Performance is the same for 11 metrics; 0 metrics are unstable.
[Gantt charts, candidate=1.61.0-SNAPSHOT~4c54b6b488 vs. baseline=1.61.0-SNAPSHOT~cc122288e5: tomcat execution time (CI 0.99); biojava execution time (CI 0.99)]
Contributor
Author
This stack of pull requests is managed by Graphite.
pawel-big-lebowski
approved these changes
Mar 5, 2026
Contributor
pawel-big-lebowski
left a comment
It’s always better to keep the primary types in memory. Great enhancement.
Base automatically changed from charles.yu/djm-0000/fix-spark-plan-metrics to master, March 10, 2026 19:06
mhlidd
approved these changes
Mar 10, 2026
- // stage ID -> accumulator ID? Put this behind some FF
- private final Map<Long, SparkSQLUtils.AccumulatorWithStage> accumulators =
-     new RemoveEldestHashMap<>(MAX_ACCUMULATOR_SIZE);
+ private final Map<Long, Integer> acc2stage = new HashMap<>();
Contributor
Suggested change
- private final Map<Long, Integer> acc2stage = new HashMap<>();
+ private final Map<Long, Integer> accumulatorToStageID = new HashMap<>();
nit: I just find this easier to understand
- private static Set<Integer> stageIdsForPlan(
-     SparkPlanInfo info, Map<Long, AccumulatorWithStage> accumulators) {
+ private static Set<Integer> stageIdsForPlan(SparkPlanInfo info, Map<Long, Integer> accumulators) {
Contributor
Suggested change
- private static Set<Integer> stageIdsForPlan(SparkPlanInfo info, Map<Long, Integer> accumulators) {
+ private static Set<Integer> stageIdsForPlan(SparkPlanInfo info, Map<Long, Integer> accumulatorToStageID) {
nit: Same as above
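For readers without the full diff, here is a minimal sketch of the kind of lookup that a Map<Long, Integer> parameter makes possible. It is not the PR's implementation: PlanNode and stageIdsForNode are hypothetical stand-ins (the real method takes Spark's SparkPlanInfo), and skipping unknown accumulator IDs is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class StageLookupSketch {
    /** Hypothetical, simplified stand-in for a plan node; NOT Spark's SparkPlanInfo. */
    static final class PlanNode {
        final List<Long> metricAccumulatorIds;

        PlanNode(List<Long> metricAccumulatorIds) {
            this.metricAccumulatorIds = metricAccumulatorIds;
        }
    }

    /**
     * Resolves each metric's accumulator ID to a stage ID through the ID-to-ID map.
     * Assumption: accumulator IDs missing from the map are simply skipped.
     */
    static Set<Integer> stageIdsForNode(PlanNode node, Map<Long, Integer> accumulatorToStageID) {
        Set<Integer> stageIds = new HashSet<>();
        for (Long accumulatorId : node.metricAccumulatorIds) {
            Integer stageId = accumulatorToStageID.get(accumulatorId);
            if (stageId != null) {
                stageIds.add(stageId);
            }
        }
        return stageIds;
    }

    public static void main(String[] args) {
        Map<Long, Integer> accumulatorToStageID = new HashMap<>();
        accumulatorToStageID.put(1L, 0);
        accumulatorToStageID.put(2L, 1);

        // Accumulator 99 is not in the map, so it contributes no stage ID.
        PlanNode node = new PlanNode(Arrays.asList(1L, 2L, 99L));
        System.out.println(stageIdsForNode(node, accumulatorToStageID));
    }
}
```

Only the stage ID is read from the map here, which is why the value type can shrink from a full accumulator wrapper to an Integer.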
What Does This Do
Stores the mapping from accumulator ID to stage ID directly in a Long-to-Integer map, rather than storing the stage-level accumulators themselves.

Motivation
The stage-level accumulators naively roll up metrics from the task level by summing all values. This means they are not accurate in all cases, particularly for metrics that are better visualized as a distribution of values across all tasks than as a single sum.
In #10553 we roll up task-level metrics ourselves and encode them into the Spark SQL metrics as distributions to improve the granularity of the information collected. This left one remaining use of the stage-level accumulators: mapping operations in Spark SQL plans to their respective stages.
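To illustrate why a single sum can mislead, here is a small sketch comparing a summed rollup with a distribution-style rollup. The class name, method names, and per-task values are made up for illustration, and LongSummaryStatistics is only a stand-in for whatever distribution encoding #10553 actually uses.

```java
import java.util.Arrays;
import java.util.LongSummaryStatistics;

public class TaskMetricRollup {
    /** Stage-level style rollup: one sum over all task values. */
    static long sum(long[] taskValues) {
        return Arrays.stream(taskValues).sum();
    }

    /** Distribution-style rollup: keeps count, min, max, and sum across tasks. */
    static LongSummaryStatistics distribution(long[] taskValues) {
        return Arrays.stream(taskValues).summaryStatistics();
    }

    public static void main(String[] args) {
        // Hypothetical per-task values for a "peak memory"-like metric.
        long[] peakMemoryPerTask = {100, 120, 4000};

        // The sum hides that one task is an outlier.
        System.out.println("sum = " + sum(peakMemoryPerTask)); // prints "sum = 4220"

        // The distribution exposes the spread across tasks.
        LongSummaryStatistics stats = distribution(peakMemoryPerTask);
        System.out.println("min = " + stats.getMin() + ", max = " + stats.getMax());
    }
}
```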
Since we do not need the entire accumulator to accomplish this, we should simplify that to an ID-to-ID map to save space. We can also remove the 50k limit since we are no longer storing the entire accumulator, which lets us avoid creating orphaned operations (these previously occurred when the RemoveEldestHashMap overflowed).

Additional Notes
I didn't feel an explicit need for a FF here, since the change is fairly straightforward and shouldn't negatively impact perf. However, if we feel strongly otherwise, I'm happy to make that happen.
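As a rough illustration of the orphaned-operation problem described in the Motivation, the sketch below contrasts a size-capped map with a plain HashMap. BoundedMap is a hypothetical stand-in built on LinkedHashMap.removeEldestEntry; it is not the repository's RemoveEldestHashMap, and the cap of 2 is chosen only to keep the example small.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class AccumulatorLookupSketch {
    /** Hypothetical stand-in for a size-capped map that evicts its eldest entry. */
    static final class BoundedMap<K, V> extends LinkedHashMap<K, V> {
        private final int maxSize;

        BoundedMap(int maxSize) {
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Evict the oldest insertion once the cap is exceeded.
            return size() > maxSize;
        }
    }

    public static void main(String[] args) {
        Map<Long, Integer> bounded = new BoundedMap<>(2);
        Map<Long, Integer> unbounded = new HashMap<>();

        // Register three accumulator ID -> stage ID pairs in both maps.
        for (long accumulatorId = 1; accumulatorId <= 3; accumulatorId++) {
            int stageId = (int) accumulatorId * 10;
            bounded.put(accumulatorId, stageId);
            unbounded.put(accumulatorId, stageId);
        }

        // The capped map has evicted accumulator 1, so an operation that
        // references it can no longer be tied to a stage.
        System.out.println(bounded.get(1L));   // prints "null"
        System.out.println(unbounded.get(1L)); // prints "10"
    }
}
```

The trade-off is that the uncapped map grows with the number of accumulators, which is acceptable only because each entry is now a pair of IDs rather than a full accumulator.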
Contributor Checklist
Assign the type: and (comp: or inst:) labels in addition to any other useful labels
Avoid using close, fix, or any linking keywords when referencing an issue
Use solves instead, and assign the PR milestone to the issue
Jira ticket: [PROJ-IDENT]
Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR level. For more information, see this doc.