Add Claude skill to create instrumentations by PerfectSlayer · Pull Request #10774 · DataDog/dd-trace-java

PerfectSlayer · 2026-03-09T16:56:04Z

What Does This Do

This PR introduces a Claude skill to create APM integrations.

Motivation

This is part of the experiment to integrate the APM Instrumentation Toolkit with dd-trace-java.

Additional Notes

I tried to include upgrade and feedback steps directly in the skill. I expect it to improve itself over time with usage.

Contributor Checklist

Format the title according to the contribution guidelines
Assign the type: and (comp: or inst:) labels in addition to any other useful labels
Avoid using close, fix, or any linking keywords when referencing an issue
Use solves instead, and assign the PR milestone to the issue
Update the CODEOWNERS file on source file addition, migration, or deletion
Update public documentation with any new configuration flags or behaviors

Jira ticket: [PROJ-IDENT]

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

pr-commenter · 2026-03-09T17:37:05Z

Benchmarks

Startup

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	bbujon/ai-toolkit
git_commit_date	1773234317	1773323658
git_commit_sha	`7be2605`	`37136b7`
release_version	1.61.0-SNAPSHOT~7be26056d4	1.61.0-SNAPSHOT~37136b760d

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1773325568	1773325568
ci_job_id	1500365983	1500365983
ci_pipeline_id	102132963	102132963
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-vmmvs6bv 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-vmmvs6bv 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module	Agent	Agent
parent	None	None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 62 metrics, 9 unstable metrics.

Startup time reports for insecure-bank

gantt
    title insecure-bank - global startup overhead: candidate=1.61.0-SNAPSHOT~37136b760d, baseline=1.61.0-SNAPSHOT~7be26056d4

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.057 s) : 0, 1057371
Total [baseline] (8.818 s) : 0, 8817871
Agent [candidate] (1.056 s) : 0, 1056098
Total [candidate] (8.796 s) : 0, 8796163
section iast
Agent [baseline] (1.243 s) : 0, 1243022
Total [baseline] (9.569 s) : 0, 9569313
Agent [candidate] (1.228 s) : 0, 1227629
Total [candidate] (9.534 s) : 0, 9533873

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.057 s	-
Agent	iast	1.243 s	185.651 ms (17.6%)
Total	tracing	8.818 s	-
Total	iast	9.569 s	751.442 ms (8.5%)
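For readers checking the report, the "Δ tracing" column can be recomputed from the raw microsecond values in the gantt data above. The sketch below is only an illustration: the class and method names are hypothetical, and it assumes the column is the difference to the tracing variant, expressed both in milliseconds and as a percentage of the tracing duration.

```java
import java.util.Locale;

// Hypothetical helper: recomputes "Δ tracing" from raw microsecond durations.
class StartupOverhead {
    // Absolute overhead of a variant over the tracing-only run, in milliseconds.
    static double deltaMs(long variantMicros, long tracingMicros) {
        return (variantMicros - tracingMicros) / 1000.0;
    }

    // Relative overhead as a percentage of the tracing-only duration.
    static double deltaPercent(long variantMicros, long tracingMicros) {
        return 100.0 * (variantMicros - tracingMicros) / tracingMicros;
    }

    // Formats the result the way the report tables show it.
    static String format(long variantMicros, long tracingMicros) {
        return String.format(Locale.ROOT, "%.3f ms (%.1f%%)",
            deltaMs(variantMicros, tracingMicros),
            deltaPercent(variantMicros, tracingMicros));
    }

    public static void main(String[] args) {
        // Agent, iast vs tracing (baseline): prints "185.651 ms (17.6%)"
        System.out.println(format(1_243_022L, 1_057_371L));
        // Total, iast vs tracing (baseline): prints "751.442 ms (8.5%)"
        System.out.println(format(9_569_313L, 8_817_871L));
    }
}
```

Both values match the baseline rows for insecure-bank, which suggests the percentages are relative to the tracing variant rather than to the total startup time.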

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.056 s	-
Agent	iast	1.228 s	171.531 ms (16.2%)
Total	tracing	8.796 s	-
Total	iast	9.534 s	737.71 ms (8.4%)

gantt
    title insecure-bank - break down per module: candidate=1.61.0-SNAPSHOT~37136b760d, baseline=1.61.0-SNAPSHOT~7be26056d4

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.191 ms) : 0, 1191
crashtracking [candidate] (1.198 ms) : 0, 1198
BytebuddyAgent [baseline] (628.018 ms) : 0, 628018
BytebuddyAgent [candidate] (627.637 ms) : 0, 627637
AgentMeter [baseline] (29.201 ms) : 0, 29201
AgentMeter [candidate] (29.038 ms) : 0, 29038
GlobalTracer [baseline] (257.009 ms) : 0, 257009
GlobalTracer [candidate] (256.441 ms) : 0, 256441
AppSec [baseline] (31.576 ms) : 0, 31576
AppSec [candidate] (31.512 ms) : 0, 31512
Debugger [baseline] (58.589 ms) : 0, 58589
Debugger [candidate] (58.689 ms) : 0, 58689
Remote Config [baseline] (583.838 µs) : 0, 584
Remote Config [candidate] (600.871 µs) : 0, 601
Telemetry [baseline] (8.661 ms) : 0, 8661
Telemetry [candidate] (8.659 ms) : 0, 8659
Flare Poller [baseline] (6.495 ms) : 0, 6495
Flare Poller [candidate] (6.383 ms) : 0, 6383
section iast
crashtracking [baseline] (1.231 ms) : 0, 1231
crashtracking [candidate] (1.191 ms) : 0, 1191
BytebuddyAgent [baseline] (808.782 ms) : 0, 808782
BytebuddyAgent [candidate] (796.79 ms) : 0, 796790
AgentMeter [baseline] (11.801 ms) : 0, 11801
AgentMeter [candidate] (11.329 ms) : 0, 11329
GlobalTracer [baseline] (249.677 ms) : 0, 249677
GlobalTracer [candidate] (247.551 ms) : 0, 247551
IAST [baseline] (25.506 ms) : 0, 25506
IAST [candidate] (25.229 ms) : 0, 25229
AppSec [baseline] (26.797 ms) : 0, 26797
AppSec [candidate] (26.505 ms) : 0, 26505
Debugger [baseline] (62.828 ms) : 0, 62828
Debugger [candidate] (62.81 ms) : 0, 62810
Remote Config [baseline] (531.083 µs) : 0, 531
Remote Config [candidate] (524.809 µs) : 0, 525
Telemetry [baseline] (14.942 ms) : 0, 14942
Telemetry [candidate] (14.778 ms) : 0, 14778
Flare Poller [baseline] (4.656 ms) : 0, 4656
Flare Poller [candidate] (4.701 ms) : 0, 4701

Startup time reports for petclinic

gantt
    title petclinic - global startup overhead: candidate=1.61.0-SNAPSHOT~37136b760d, baseline=1.61.0-SNAPSHOT~7be26056d4

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.059 s) : 0, 1058646
Total [baseline] (11.059 s) : 0, 11058542
Agent [candidate] (1.06 s) : 0, 1060335
Total [candidate] (11.025 s) : 0, 11024891
section appsec
Agent [baseline] (1.247 s) : 0, 1246696
Total [baseline] (11.147 s) : 0, 11147238
Agent [candidate] (1.249 s) : 0, 1248881
Total [candidate] (11.246 s) : 0, 11245540
section iast
Agent [baseline] (1.226 s) : 0, 1226281
Total [baseline] (11.228 s) : 0, 11228326
Agent [candidate] (1.228 s) : 0, 1228067
Total [candidate] (11.252 s) : 0, 11251993
section profiling
Agent [baseline] (1.179 s) : 0, 1178872
Total [baseline] (10.995 s) : 0, 10994567
Agent [candidate] (1.183 s) : 0, 1183458
Total [candidate] (10.969 s) : 0, 10969457

baseline results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.059 s	-
Agent	appsec	1.247 s	188.05 ms (17.8%)
Agent	iast	1.226 s	167.635 ms (15.8%)
Agent	profiling	1.179 s	120.227 ms (11.4%)
Total	tracing	11.059 s	-
Total	appsec	11.147 s	88.695 ms (0.8%)
Total	iast	11.228 s	169.784 ms (1.5%)
Total	profiling	10.995 s	-63.975 ms (-0.6%)

candidate results

Module	Variant	Duration	Δ tracing
Agent	tracing	1.06 s	-
Agent	appsec	1.249 s	188.546 ms (17.8%)
Agent	iast	1.228 s	167.731 ms (15.8%)
Agent	profiling	1.183 s	123.123 ms (11.6%)
Total	tracing	11.025 s	-
Total	appsec	11.246 s	220.649 ms (2.0%)
Total	iast	11.252 s	227.103 ms (2.1%)
Total	profiling	10.969 s	-55.434 ms (-0.5%)

gantt
    title petclinic - break down per module: candidate=1.61.0-SNAPSHOT~37136b760d, baseline=1.61.0-SNAPSHOT~7be26056d4

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.194 ms) : 0, 1194
crashtracking [candidate] (1.193 ms) : 0, 1193
BytebuddyAgent [baseline] (628.682 ms) : 0, 628682
BytebuddyAgent [candidate] (629.781 ms) : 0, 629781
AgentMeter [baseline] (29.183 ms) : 0, 29183
AgentMeter [candidate] (29.244 ms) : 0, 29244
GlobalTracer [baseline] (257.34 ms) : 0, 257340
GlobalTracer [candidate] (257.722 ms) : 0, 257722
AppSec [baseline] (31.626 ms) : 0, 31626
AppSec [candidate] (31.601 ms) : 0, 31601
Debugger [baseline] (59.589 ms) : 0, 59589
Debugger [candidate] (59.732 ms) : 0, 59732
Remote Config [baseline] (599.636 µs) : 0, 600
Remote Config [candidate] (585.116 µs) : 0, 585
Telemetry [baseline] (8.686 ms) : 0, 8686
Telemetry [candidate] (8.658 ms) : 0, 8658
Flare Poller [baseline] (5.721 ms) : 0, 5721
Flare Poller [candidate] (5.763 ms) : 0, 5763
section appsec
crashtracking [baseline] (1.204 ms) : 0, 1204
crashtracking [candidate] (1.208 ms) : 0, 1208
BytebuddyAgent [baseline] (658.901 ms) : 0, 658901
BytebuddyAgent [candidate] (659.915 ms) : 0, 659915
AgentMeter [baseline] (12.033 ms) : 0, 12033
AgentMeter [candidate] (12.025 ms) : 0, 12025
GlobalTracer [baseline] (258.004 ms) : 0, 258004
GlobalTracer [candidate] (258.711 ms) : 0, 258711
IAST [baseline] (23.975 ms) : 0, 23975
IAST [candidate] (23.963 ms) : 0, 23963
AppSec [baseline] (177.617 ms) : 0, 177617
AppSec [candidate] (177.679 ms) : 0, 177679
Debugger [baseline] (65.539 ms) : 0, 65539
Debugger [candidate] (65.779 ms) : 0, 65779
Remote Config [baseline] (573.695 µs) : 0, 574
Remote Config [candidate] (574.116 µs) : 0, 574
Telemetry [baseline] (8.98 ms) : 0, 8980
Telemetry [candidate] (9.052 ms) : 0, 9052
Flare Poller [baseline] (3.607 ms) : 0, 3607
Flare Poller [candidate] (3.603 ms) : 0, 3603
section iast
crashtracking [baseline] (1.187 ms) : 0, 1187
crashtracking [candidate] (1.189 ms) : 0, 1189
BytebuddyAgent [baseline] (796.05 ms) : 0, 796050
BytebuddyAgent [candidate] (796.636 ms) : 0, 796636
AgentMeter [baseline] (11.332 ms) : 0, 11332
AgentMeter [candidate] (11.353 ms) : 0, 11353
GlobalTracer [baseline] (247.582 ms) : 0, 247582
GlobalTracer [candidate] (247.274 ms) : 0, 247274
IAST [baseline] (25.08 ms) : 0, 25080
IAST [candidate] (25.128 ms) : 0, 25128
AppSec [baseline] (26.32 ms) : 0, 26320
AppSec [candidate] (26.447 ms) : 0, 26447
Debugger [baseline] (62.848 ms) : 0, 62848
Debugger [candidate] (64.339 ms) : 0, 64339
Remote Config [baseline] (520.963 µs) : 0, 521
Remote Config [candidate] (531.991 µs) : 0, 532
Telemetry [baseline] (14.831 ms) : 0, 14831
Telemetry [candidate] (14.719 ms) : 0, 14719
Flare Poller [baseline] (4.653 ms) : 0, 4653
Flare Poller [candidate] (4.453 ms) : 0, 4453
section profiling
crashtracking [baseline] (1.164 ms) : 0, 1164
crashtracking [candidate] (1.166 ms) : 0, 1166
BytebuddyAgent [baseline] (679.911 ms) : 0, 679911
BytebuddyAgent [candidate] (683.002 ms) : 0, 683002
AgentMeter [baseline] (8.603 ms) : 0, 8603
AgentMeter [candidate] (8.623 ms) : 0, 8623
GlobalTracer [baseline] (215.366 ms) : 0, 215366
GlobalTracer [candidate] (215.837 ms) : 0, 215837
AppSec [baseline] (31.932 ms) : 0, 31932
AppSec [candidate] (32.035 ms) : 0, 32035
Debugger [baseline] (65.106 ms) : 0, 65106
Debugger [candidate] (62.059 ms) : 0, 62059
Remote Config [baseline] (578.177 µs) : 0, 578
Remote Config [candidate] (574.626 µs) : 0, 575
Telemetry [baseline] (8.131 ms) : 0, 8131
Telemetry [candidate] (10.644 ms) : 0, 10644
Flare Poller [baseline] (3.499 ms) : 0, 3499
Flare Poller [candidate] (4.28 ms) : 0, 4280
ProfilingAgent [baseline] (93.753 ms) : 0, 93753
ProfilingAgent [candidate] (94.388 ms) : 0, 94388
Profiling [baseline] (94.319 ms) : 0, 94319
Profiling [candidate] (94.958 ms) : 0, 94958

Load

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	bbujon/ai-toolkit
git_commit_date	1773234317	1773323658
git_commit_sha	`7be2605`	`37136b7`
release_version	1.61.0-SNAPSHOT~7be26056d4	1.61.0-SNAPSHOT~37136b760d

See matching parameters

	Baseline	Candidate
application	insecure-bank	insecure-bank
ci_job_date	1773326049	1773326049
ci_job_id	1500365985	1500365985
ci_pipeline_id	102132963	102132963
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-0-zmycl00v 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-0-zmycl00v 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 2 performance improvements and 4 performance regressions! Performance is the same for 11 metrics, 19 unstable metrics.

scenario	Δ mean agg_http_req_duration_p50	Δ mean agg_http_req_duration_p95	Δ mean throughput	candidate mean agg_http_req_duration_p50	candidate mean agg_http_req_duration_p95	candidate mean throughput	baseline mean agg_http_req_duration_p50	baseline mean agg_http_req_duration_p95	baseline mean throughput
scenario:load:insecure-bank:iast:high_load	better [-279.232µs; -125.948µs] or [-10.542%; -4.755%]	unstable [-1056.993µs; -149.788µs] or [-13.254%; -1.878%]	unstable [-54.577op/s; +295.139op/s] or [-4.103%; +22.189%]	2.446ms	7.372ms	1450.406op/s	2.649ms	7.975ms	1330.125op/s
scenario:load:insecure-bank:iast_GLOBAL:high_load	worse [+77.753µs; +225.969µs] or [+2.901%; +8.431%]	unstable [+268.568µs; +1131.914µs] or [+3.520%; +14.836%]	unstable [-229.144op/s; +94.706op/s] or [-17.343%; +7.168%]	2.832ms	8.330ms	1254.000op/s	2.680ms	7.630ms	1321.219op/s
scenario:load:insecure-bank:iast_FULL:high_load	better [-558.971µs; -113.234µs] or [-10.364%; -2.100%]	unstable [-1285.243µs; +108.582µs] or [-10.094%; +0.853%]	unstable [-103.006op/s; +101.705op/s] or [-13.510%; +13.339%]	5.057ms	12.145ms	761.818op/s	5.393ms	12.733ms	762.469op/s
scenario:load:petclinic:code_origins:high_load	worse [+377.532µs; +1042.448µs] or [+2.210%; +6.103%]	unsure [+177.513µs; +1179.699µs] or [+0.627%; +4.168%]	unstable [-35.883op/s; +17.133op/s] or [-13.435%; +6.415%]	17.790ms	28.983ms	257.719op/s	17.080ms	28.305ms	267.094op/s
scenario:load:petclinic:appsec:high_load	worse [+424.992µs; +1254.475µs] or [+2.297%; +6.780%]	unsure [+0.455ms; +1.681ms] or [+1.496%; +5.532%]	unstable [-33.759op/s; +15.384op/s] or [-13.643%; +6.217%]	19.342ms	31.457ms	238.250op/s	18.502ms	30.389ms	247.438op/s
scenario:load:petclinic:no_agent:high_load	worse [+1.619ms; +3.036ms] or [+9.525%; +17.868%]	unstable [+1.356ms; +4.953ms] or [+4.719%; +17.240%]	unstable [-56.582op/s; -2.043op/s] or [-21.172%; -0.765%]	19.320ms	31.881ms	237.938op/s	16.992ms	28.727ms	267.250op/s

Request duration reports for petclinic

gantt
    title petclinic - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~37136b760d, baseline=1.61.0-SNAPSHOT~7be26056d4
    dateFormat X
    axisFormat %s
section baseline
no_agent (17.454 ms) : 17279, 17630
.   : milestone, 17454,
appsec (18.86 ms) : 18666, 19054
.   : milestone, 18860,
code_origins (17.47 ms) : 17297, 17642
.   : milestone, 17470,
iast (17.539 ms) : 17363, 17715
.   : milestone, 17539,
profiling (18.645 ms) : 18460, 18830
.   : milestone, 18645,
tracing (17.832 ms) : 17656, 18008
.   : milestone, 17832,
section candidate
no_agent (19.617 ms) : 19417, 19818
.   : milestone, 19617,
appsec (19.597 ms) : 19396, 19798
.   : milestone, 19597,
code_origins (18.107 ms) : 17925, 18290
.   : milestone, 18107,
iast (17.678 ms) : 17502, 17854
.   : milestone, 17678,
profiling (18.916 ms) : 18729, 19104
.   : milestone, 18916,
tracing (17.744 ms) : 17565, 17923
.   : milestone, 17744,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	17.454 ms [17.279 ms, 17.63 ms]	-
appsec	18.86 ms [18.666 ms, 19.054 ms]	1.405 ms (8.1%)
code_origins	17.47 ms [17.297 ms, 17.642 ms]	15.249 µs (0.1%)
iast	17.539 ms [17.363 ms, 17.715 ms]	84.332 µs (0.5%)
profiling	18.645 ms [18.46 ms, 18.83 ms]	1.191 ms (6.8%)
tracing	17.832 ms [17.656 ms, 18.008 ms]	378.023 µs (2.2%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	19.617 ms [19.417 ms, 19.818 ms]	-
appsec	19.597 ms [19.396 ms, 19.798 ms]	-20.543 µs (-0.1%)
code_origins	18.107 ms [17.925 ms, 18.29 ms]	-1.51 ms (-7.7%)
iast	17.678 ms [17.502 ms, 17.854 ms]	-1.939 ms (-9.9%)
profiling	18.916 ms [18.729 ms, 19.104 ms]	-700.876 µs (-3.6%)
tracing	17.744 ms [17.565 ms, 17.923 ms]	-1.873 ms (-9.5%)

Request duration reports for insecure-bank

gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.61.0-SNAPSHOT~37136b760d, baseline=1.61.0-SNAPSHOT~7be26056d4
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.188 ms) : 1176, 1199
.   : milestone, 1188,
iast (3.446 ms) : 3397, 3494
.   : milestone, 3446,
iast_FULL (6.066 ms) : 6004, 6129
.   : milestone, 6066,
iast_GLOBAL (3.469 ms) : 3409, 3528
.   : milestone, 3469,
profiling (2.213 ms) : 2192, 2234
.   : milestone, 2213,
tracing (1.817 ms) : 1800, 1833
.   : milestone, 1817,
section candidate
no_agent (1.172 ms) : 1161, 1183
.   : milestone, 1172,
iast (3.154 ms) : 3113, 3194
.   : milestone, 3154,
iast_FULL (5.887 ms) : 5828, 5946
.   : milestone, 5887,
iast_GLOBAL (3.659 ms) : 3601, 3716
.   : milestone, 3659,
profiling (1.982 ms) : 1965, 1999
.   : milestone, 1982,
tracing (1.824 ms) : 1809, 1839
.   : milestone, 1824,

baseline results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.188 ms [1.176 ms, 1.199 ms]	-
iast	3.446 ms [3.397 ms, 3.494 ms]	2.258 ms (190.1%)
iast_FULL	6.066 ms [6.004 ms, 6.129 ms]	4.879 ms (410.7%)
iast_GLOBAL	3.469 ms [3.409 ms, 3.528 ms]	2.281 ms (192.0%)
profiling	2.213 ms [2.192 ms, 2.234 ms]	1.025 ms (86.3%)
tracing	1.817 ms [1.8 ms, 1.833 ms]	628.877 µs (52.9%)

candidate results

Variant	Request duration [CI 0.99]	Δ no_agent
no_agent	1.172 ms [1.161 ms, 1.183 ms]	-
iast	3.154 ms [3.113 ms, 3.194 ms]	1.982 ms (169.1%)
iast_FULL	5.887 ms [5.828 ms, 5.946 ms]	4.715 ms (402.3%)
iast_GLOBAL	3.659 ms [3.601 ms, 3.716 ms]	2.487 ms (212.2%)
profiling	1.982 ms [1.965 ms, 1.999 ms]	810.165 µs (69.1%)
tracing	1.824 ms [1.809 ms, 1.839 ms]	651.982 µs (55.6%)

Dacapo

Parameters

	Baseline	Candidate
baseline_or_candidate	baseline	candidate
git_branch	master	bbujon/ai-toolkit
git_commit_date	1773234317	1773323658
git_commit_sha	`7be2605`	`37136b7`
release_version	1.61.0-SNAPSHOT~7be26056d4	1.61.0-SNAPSHOT~37136b760d

See matching parameters

	Baseline	Candidate
application	biojava	biojava
ci_job_date	1773325778	1773325778
ci_job_id	1500365987	1500365987
ci_pipeline_id	102132963	102132963
cpu_model	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz	Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version	Linux runner-zfyrx7zua-project-304-concurrent-1-7pw35xg4 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux	Linux runner-zfyrx7zua-project-304-concurrent-1-7pw35xg4 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.

Execution time for tomcat

gantt
    title tomcat - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~37136b760d, baseline=1.61.0-SNAPSHOT~7be26056d4
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.471 ms) : 1460, 1483
.   : milestone, 1471,
appsec (2.516 ms) : 2461, 2572
.   : milestone, 2516,
iast (2.253 ms) : 2184, 2322
.   : milestone, 2253,
iast_GLOBAL (2.296 ms) : 2226, 2366
.   : milestone, 2296,
profiling (2.082 ms) : 2027, 2136
.   : milestone, 2082,
tracing (2.077 ms) : 2023, 2131
.   : milestone, 2077,
section candidate
no_agent (1.475 ms) : 1463, 1486
.   : milestone, 1475,
appsec (2.516 ms) : 2461, 2572
.   : milestone, 2516,
iast (2.265 ms) : 2195, 2334
.   : milestone, 2265,
iast_GLOBAL (2.295 ms) : 2225, 2366
.   : milestone, 2295,
profiling (2.497 ms) : 2332, 2663
.   : milestone, 2497,
tracing (2.067 ms) : 2013, 2120
.   : milestone, 2067,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.471 ms [1.46 ms, 1.483 ms]	-
appsec	2.516 ms [2.461 ms, 2.572 ms]	1.045 ms (71.0%)
iast	2.253 ms [2.184 ms, 2.322 ms]	781.78 µs (53.1%)
iast_GLOBAL	2.296 ms [2.226 ms, 2.366 ms]	824.631 µs (56.1%)
profiling	2.082 ms [2.027 ms, 2.136 ms]	610.556 µs (41.5%)
tracing	2.077 ms [2.023 ms, 2.131 ms]	605.499 µs (41.2%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	1.475 ms [1.463 ms, 1.486 ms]	-
appsec	2.516 ms [2.461 ms, 2.572 ms]	1.042 ms (70.6%)
iast	2.265 ms [2.195 ms, 2.334 ms]	789.798 µs (53.5%)
iast_GLOBAL	2.295 ms [2.225 ms, 2.366 ms]	820.594 µs (55.6%)
profiling	2.497 ms [2.332 ms, 2.663 ms]	1.023 ms (69.3%)
tracing	2.067 ms [2.013 ms, 2.12 ms]	591.605 µs (40.1%)

Execution time for biojava

gantt
    title biojava - execution time [CI 0.99] : candidate=1.61.0-SNAPSHOT~37136b760d, baseline=1.61.0-SNAPSHOT~7be26056d4
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.324 s) : 15324000, 15324000
.   : milestone, 15324000,
appsec (14.951 s) : 14951000, 14951000
.   : milestone, 14951000,
iast (18.31 s) : 18310000, 18310000
.   : milestone, 18310000,
iast_GLOBAL (17.799 s) : 17799000, 17799000
.   : milestone, 17799000,
profiling (15.409 s) : 15409000, 15409000
.   : milestone, 15409000,
tracing (14.89 s) : 14890000, 14890000
.   : milestone, 14890000,
section candidate
no_agent (14.984 s) : 14984000, 14984000
.   : milestone, 14984000,
appsec (14.533 s) : 14533000, 14533000
.   : milestone, 14533000,
iast (18.36 s) : 18360000, 18360000
.   : milestone, 18360000,
iast_GLOBAL (17.774 s) : 17774000, 17774000
.   : milestone, 17774000,
profiling (14.686 s) : 14686000, 14686000
.   : milestone, 14686000,
tracing (15.149 s) : 15149000, 15149000
.   : milestone, 15149000,

baseline results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	15.324 s [15.324 s, 15.324 s]	-
appsec	14.951 s [14.951 s, 14.951 s]	-373.0 ms (-2.4%)
iast	18.31 s [18.31 s, 18.31 s]	2.986 s (19.5%)
iast_GLOBAL	17.799 s [17.799 s, 17.799 s]	2.475 s (16.2%)
profiling	15.409 s [15.409 s, 15.409 s]	85.0 ms (0.6%)
tracing	14.89 s [14.89 s, 14.89 s]	-434.0 ms (-2.8%)

candidate results

Variant	Execution Time [CI 0.99]	Δ no_agent
no_agent	14.984 s [14.984 s, 14.984 s]	-
appsec	14.533 s [14.533 s, 14.533 s]	-451.0 ms (-3.0%)
iast	18.36 s [18.36 s, 18.36 s]	3.376 s (22.5%)
iast_GLOBAL	17.774 s [17.774 s, 17.774 s]	2.79 s (18.6%)
profiling	14.686 s [14.686 s, 14.686 s]	-298.0 ms (-2.0%)
tracing	15.149 s [15.149 s, 15.149 s]	165.0 ms (1.1%)

wconti27 · 2026-03-10T12:15:16Z

.claude/skills/add-apm-integrations/SKILL.md

+## Step 2 – Clarify the task
+
+If the user has not already provided all of the following, ask before proceeding:
+
+- **Framework name** and **minimum supported version** (e.g. `okhttp-3.0`)
+- **Target class(es) and method(s)** to instrument (fully qualified class names preferred)
+- **Target system**: one of `Tracing`, `Profiling`, `AppSec`, `Iast`, `CiVisibility`, `Usm`, `ContextTracking`
+- **Whether this is a bootstrap instrumentation** (affects allowed imports)


I'm curious (genuine question): do you know whether asking the user a question works in the current state of the skill, given that AskUserQuestion is not in allowed-tools?

I think that if it is not in the allowed tools, it comes down to the security rules and the user's allowed tools; otherwise it asks for permission to use it. It's not "allowed by default", but it might be useful to add it nonetheless 🤔 Similarly, it will need web search, but I don't want to enable that by default for security reasons.

wconti27 · 2026-03-10T12:17:33Z

.claude/skills/apm-integrations/SKILL.md

+1. `docs/how_instrumentations_work.md` — full reference (types, methods, advice, helpers, context stores, decorators)
+2. `docs/add_new_instrumentation.md` — step-by-step walkthrough
+3. `docs/how_to_test.md` — test types and how to run them


Generally, for reference files, I advise using proper markdown linking. It doesn't help the LLM, but it does help engineers quickly navigate to the files. Just a suggestion 😄

> I advise using proper markdown linking

So you would use something like that?

1. [docs/how_instrumentations_work.md](full reference (types, methods, advice, helpers, context stores, decorators))
2. [docs/add_new_instrumentation.md](step-by-step walkthrough)
3. [docs/how_to_test.md](test types and how to run them)

wconti27 · 2026-03-10T12:23:57Z

.claude/skills/apm-integrations/SKILL.md

+
+Before writing any code, read all three files in full:
+
+1. `docs/how_instrumentations_work.md` — full reference (types, methods, advice, helpers, context stores, decorators)


I wonder how it will perform given this reference is almost 1k lines, I'm not sure tbh.

Good question... I'm not sure either. It ingests many documentation files and instrumentations before starting the implementation, but it appears to do so through subagents, so we might be in the clear on context management.

Here is a report about creating (again) the Feign instrumentation:

Direct reads (Read tool)

File	Lines
google-http-client-1.19/GoogleHttpClientInstrumentation.java	121
google-http-client-1.19/GoogleHttpClientDecorator.java	68
google-http-client-1.19/HeadersInjectAdapter.java	16
google-http-client-1.19/build.gradle	21
google-http-client-1.19/AbstractGoogleHttpClientTest.groovy	53
google-http-client-1.19/GoogleHttpClientTest.groovy	21
pekko-http-1.0/HttpHeaderSubclassesInstrumentation.java	60 (partial)
javax-websocket-1.0/SessionInstrumentation.java	60 (partial)
apache-httpclient-4.0/HelperMethods.java	76
apache-httpclient-4.0/ApacheHttpClientInstrumentation.java	277
settings.gradle.kts	10 (partial)
Total	783 lines across 11 files

Via grep/bash (content snippets)

- HttpClientDecorator.java: abstract method signatures (~15 lines)
- HttpClientTest.groovy: abstract method signatures (~10 lines)
- Various directory listings and module path lookups

Via subagents (delegated research)

- Feign API research agent: 13 tool calls, web searches on Feign's API, class hierarchy, Maven coordinates, and version history
- HTTP client patterns agent: 34 tool calls, read OkHttp, Apache HttpClient, and Google HTTP Client instrumentation files in full (~1,500 estimated lines across ~12 files)

Summary

Source	Files	~Lines
Direct reads	11	783
Subagent reads	~12	~1,500
Web/docs research	-	-
Total	~23 files	~2,300 lines

The subagents did the bulk of the pattern research, freeing the main context for writing the actual implementation.

mcculls · 2026-03-10T12:43:34Z

.claude/skills/add-apm-integrations/SKILL.md

+  - `@Advice.Return` — the return value (exit only)
+  - `@Advice.Thrown` — the thrown exception (exit only)
+  - `@Advice.Enter` — the return value of the enter method (exit only)
+- Use `CallDepthThreadLocalMap` to guard against recursive instrumentation of the same method
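As an illustration of the call-depth guard mentioned in the last bullet, here is a minimal, self-contained sketch. `CallDepth` is a simplified, hypothetical stand-in and not the real `CallDepthThreadLocalMap`; `Client`, `enter`, and `exit` are likewise invented for the example. It only shows the general idea: when one instrumented method delegates to another, only the outermost call should start a span.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a per-thread call-depth counter keyed by class.
class CallDepth {
    private static final ThreadLocal<Map<Class<?>, Integer>> DEPTHS =
        ThreadLocal.withInitial(HashMap::new);

    // Returns the depth before incrementing: 0 means this is the outermost call.
    static int increment(Class<?> key) {
        Map<Class<?>, Integer> depths = DEPTHS.get();
        int current = depths.getOrDefault(key, 0);
        depths.put(key, current + 1);
        return current;
    }

    // Clears the counter once the outermost call exits.
    static void reset(Class<?> key) {
        DEPTHS.get().remove(key);
    }
}

// Hypothetical client whose overloads delegate to each other.
class Client {
    static int spansStarted = 0;

    static String execute(int id) {
        boolean outermost = enter();
        try {
            return execute(id, "default"); // nested call hits the guard again
        } finally {
            exit(outermost);
        }
    }

    static String execute(int id, String option) {
        boolean outermost = enter();
        try {
            return id + ":" + option;
        } finally {
            exit(outermost);
        }
    }

    // What an enter advice would do: skip nested calls.
    private static boolean enter() {
        if (CallDepth.increment(Client.class) > 0) {
            return false;
        }
        spansStarted++; // stands in for starting a span
        return true;
    }

    // What an exit advice would do: only the outermost call resets the depth.
    private static void exit(boolean outermost) {
        if (outermost) {
            CallDepth.reset(Client.class);
        }
    }
}
```

Calling `Client.execute(7)` goes through both overloads but increments `spansStarted` only once.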


Add: "- Do not use lambdas in advice methods"

EDIT: this should go in the "Must NOT do" section below...
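The thread does not spell out why lambdas are a problem in advice methods. One plausible reason, stated here as an assumption, is that a compiler such as javac moves a lambda body into a synthetic method on the advice class, and that method would not exist in the instrumented class once the advice body is inlined there. The sketch below uses only reflection to make that synthetic method visible; all class names are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical advice-like class that uses a lambda.
class AdviceWithLambda {
    static String onEnter() {
        Supplier<String> name = () -> "span";
        return name.get();
    }
}

// Same behavior without a lambda.
class AdviceWithoutLambda {
    static String onEnter() {
        return "span";
    }
}

class LambdaProbe {
    // Lists the compiler-generated (synthetic) methods declared on a class.
    static List<String> syntheticMethods(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            if (method.isSynthetic()) {
                names.add(method.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // With javac, the first list contains one generated lambda method.
        System.out.println(syntheticMethods(AdviceWithLambda.class));
        System.out.println(syntheticMethods(AdviceWithoutLambda.class));
    }
}
```

The lambda-free variant declares no extra methods, so its body is self-contained.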

mcculls · 2026-03-10T12:46:39Z

.claude/skills/add-apm-integrations/SKILL.md

+Enter method:
+1. `AgentSpan span = startSpan(DECORATE.operationName(), ...)`
+2. `DECORATE.afterStart(span)` + set domain-specific tags
+3. `AgentScope scope = activateSpan(span)` — return or store via `@Advice.Local`


Should we push it towards the Context API as that will be preferred going forwards?

ContextScope scope = span.attach()

I think we should revisit our docs (/docs) first, and then reflect the upgrade in the skill. WDYT?
Upgrading the codebase would also help, since the skill relies heavily on reading other instrumentations as examples because it has no reference document or reference codebase.

I added the files it reads to gain the knowledge to build (again) the Feign instrumentation here: #10774 (comment)
You can see it relies on other instrumentations to know how to proceed, so cleaning up our codebase or providing references to the skill would help more, I guess.

jordan-wong · 2026-03-10T13:07:42Z

.claude/skills/add-apm-integrations/SKILL.md

+
+## Step 12 – Retrospective: update this skill with what was learned
+
+After the instrumentation is complete (or abandoned), review the full session and improve this skill for future use.


I haven't seen this type of instruction before and I'm curious how it'll perform.

My one concern with this is that we are instructing it to update the instrumentation with lessons learned before any human review is in the loop. Could that be too early?

I like the idea though and would like to see it in action, especially as we are in prototyping stages.

> My one concern with this is that we are instructing it to update the instrumentation with lessons learned before any human review is in the loop. Could that be too early?

It's interesting to see the changes it makes according to the instrumentation challenges it faces.
I have not included its discoveries and changes so far because it feels too early, especially without golden instrumentations and an easy way to compare the output against them.

wconti27 · 2026-03-10T13:43:43Z

.claude/skills/add-apm-integrations/SKILL.md

+- [ ] `settings.gradle.kts` entry added in alphabetical order
+- [ ] `build.gradle` has `compileOnly` deps and `muzzle` directives with `assertInverse = true`
+- [ ] `@AutoService(InstrumenterModule.class)` annotation present on the module class
+- [ ] `helperClassNames()` lists ALL referenced helpers (including inner, anonymous, and enum synthetic classes)
+- [ ] Advice methods are `static` with `@Advice.OnMethodEnter` / `@Advice.OnMethodExit` annotations
+- [ ] `suppress = Throwable.class` on enter/exit (unless the hooked method is a constructor)
+- [ ] No logger field in the Advice class or InstrumenterModule class
+- [ ] No `inline=false` left in production code
+- [ ] No `java.util.logging.*` / `java.nio.*` / `javax.management.*` in bootstrap path
+- [ ] Span lifecycle order is correct: startSpan → afterStart → activateSpan (enter); onError → beforeFinish → finish → close (exit)
+- [ ] Muzzle passes
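To make the lifecycle-order item in the checklist concrete, here is a minimal sketch that records the calls in the order the checklist states. Every type here (`Span`, `Scope`, `Decorator`) is a hypothetical stand-in that only logs its name; none of it is the tracer's actual API, and real advice would use the annotations and helper classes described in the skill.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleSketch {
    // Records each lifecycle call so the order can be inspected.
    static final List<String> EVENTS = new ArrayList<>();

    // Hypothetical stand-in for a span.
    static final class Span {
        void finish() { EVENTS.add("finish"); }
    }

    // Hypothetical stand-in for an active scope holding its span.
    static final class Scope implements AutoCloseable {
        final Span span;
        Scope(Span span) { this.span = span; }
        @Override public void close() { EVENTS.add("close"); }
    }

    // Hypothetical stand-in for a decorator.
    static final class Decorator {
        void afterStart(Span span) { EVENTS.add("afterStart"); }
        void onError(Span span, Throwable thrown) {
            if (thrown != null) EVENTS.add("onError");
        }
        void beforeFinish(Span span) { EVENTS.add("beforeFinish"); }
    }

    static final Decorator DECORATE = new Decorator();

    static Span startSpan(String operationName) {
        EVENTS.add("startSpan");
        return new Span();
    }

    static Scope activateSpan(Span span) {
        EVENTS.add("activateSpan");
        return new Scope(span);
    }

    // Enter: startSpan, then afterStart, then activateSpan.
    static Scope onEnter() {
        Span span = startSpan("example.request");
        DECORATE.afterStart(span);
        return activateSpan(span);
    }

    // Exit: onError, then beforeFinish, then finish, then close.
    static void onExit(Scope scope, Throwable thrown) {
        if (scope == null) {
            return; // enter did not start anything
        }
        Span span = scope.span;
        DECORATE.onError(span, thrown);
        DECORATE.beforeFinish(span);
        span.finish();
        scope.close();
    }

    // Runs one enter/exit pair and returns the recorded order.
    static List<String> run(Throwable thrown) {
        EVENTS.clear();
        onExit(onEnter(), thrown);
        return new ArrayList<>(EVENTS);
    }
}
```

`LifecycleSketch.run(null)` yields the six steps of the success path; passing an exception adds `onError` right after `activateSpan`.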


Can we mention the new Context API and its reference, with notes that the Context API must be used, that examples may be limited, and that new integrations can be based on reference integrations but still should use the new Context API?

> but still should use the new Context API.

To clarify: using the new Context API in an instrumentation that depends on other instrumentations still using the legacy approach may make the generated instrumentation fail. It is not a matter of always applying it to make things work; it depends on how the instrumentations interact with each other. In this case, it feels like the LLM does a good job, on average, of finding the most relevant, working API to use.

wconti27 · 2026-03-10T13:59:58Z

.claude/skills/add-apm-integrations/SKILL.md

+- [ ] Instrumentation tests pass
+- [ ] `latestDepTest` passes
+- [ ] `spotlessCheck` passes


Can we mention the new Context API and its reference, with notes that the Context API must be used, that examples may be limited, and that new integrations can be based on reference integrations but still should use the new Context API?

PerfectSlayer requested a review from a team as a code owner March 9, 2026 16:56

PerfectSlayer added the tag: no release notes Changes to exclude from release notes label Mar 9, 2026

PerfectSlayer requested a review from manuel-alvarez-alvarez March 9, 2026 16:56

PerfectSlayer added tag: experimental Experimental changes tag: ai generated Largely based on code generated by an AI or LLM labels Mar 9, 2026

PerfectSlayer requested review from jordan-wong and wconti27 and removed request for manuel-alvarez-alvarez March 9, 2026 16:57

wconti27 reviewed Mar 10, 2026

mcculls reviewed Mar 10, 2026

mcculls reviewed Mar 10, 2026

jordan-wong reviewed Mar 10, 2026

wconti27 reviewed Mar 10, 2026

PerfectSlayer added 3 commits March 11, 2026 15:27

feat(ai): Add skill to create instrumentations

6ae02a1

feat(ai): Add lambda rule for advice

dfb1972

feat(ai): Use markdown link format

e851657

PerfectSlayer force-pushed the bbujon/ai-toolkit branch from b56d918 to e851657 Compare March 11, 2026 14:27

feat(ai): Improve skill name and trigger

37136b7

