feat: implement Java tracer agent for function profiling and ranking by aseembits93 · Pull Request #1648 · codeflash-ai/codeflash

aseembits93 · 2026-02-24T15:58:38Z

Summary

Adds a Java agent (com.codeflash.agent.*) using ASM bytecode instrumentation that profiles method calls during test execution, recording call counts, timing, caller-callee relationships, and serialized arguments to SQLite
The SQLite output matches the Python tracer's schema exactly, so ProfileStats and FunctionRanker work unchanged for ranking Java functions by addressable time
Adds JavaTracer Python class that attaches the agent via -javaagent in Maven Surefire's argLine, and integrates with the existing tracer/optimizer pipeline
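
A minimal sketch of how the third point could look, assuming the agent is passed to the forked test JVM through Surefire's `argLine` user property. The `-javaagent:<jar>=<options>` form is standard JVM syntax, but the function name `build_maven_command`, the jar path, and the `key=value,key=value` option encoding are hypothetical; the option format that `CodeflashAgent.premain()` actually parses is not shown in this description.

```python
from pathlib import Path


def build_maven_command(agent_jar: Path, agent_options: dict[str, str]) -> list[str]:
    """Build a `mvn test` command that attaches a Java agent through Surefire's argLine."""
    # JVM syntax: -javaagent:<jar path>[=<options string>]
    # The key=value,key=value encoding of the options string is an assumption.
    options = ",".join(f"{key}={value}" for key, value in agent_options.items())
    javaagent = f"-javaagent:{agent_jar}"
    if options:
        javaagent += f"={options}"
    # Surefire forwards the argLine property to the forked test JVM.
    return ["mvn", "test", f"-DargLine={javaagent}"]


command = build_maven_command(Path("agent.jar"), {"output": "trace.sqlite"})
print(command)
```

A project that already sets `argLine` in its `pom.xml` would likely need the existing value merged with the agent flag rather than replaced.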

New Java files (6 source + 4 test)

File	Purpose
`CodeflashAgent.java`	`premain()` entry point — parses args, registers transformer, shutdown hook
`CodeflashTransformer.java`	ASM `ClassFileTransformer` — instruments methods with `enter()`/`exit()` calls
`CallTracker.java`	Thread-safe singleton — ThreadLocal call stacks, ConcurrentHashMap timings
`MethodKey.java`	Immutable value type matching Python's `(filename, lineno, funcname, classname)`
`MethodStats.java`	Thread-safe timing accumulator with recursion handling
`PstatsWriter.java`	SQLite writer producing `pstats`, `function_calls`, `metadata`, `total_time` tables
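
As a rough illustration of the schema-compatibility claim, the sketch below checks that a trace file contains the four tables `PstatsWriter` is described as producing. It only compares table names; the column layout is not given in this description, so it is not checked. The helper name `missing_trace_tables` is hypothetical.

```python
import sqlite3

# Table names taken from the PstatsWriter row above.
REQUIRED_TABLES = {"pstats", "function_calls", "metadata", "total_time"}


def missing_trace_tables(db_path: str) -> set[str]:
    """Return the required table names that are absent from the trace database."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    return REQUIRED_TABLES - {name for (name,) in rows}
```

An empty result means all four tables are present, which is a necessary but not sufficient condition for `ProfileStats` to read the file.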

Modified files

File	Change
`pom.xml`	Added ASM 9.7 deps, `Premain-Class`/`Can-Retransform-Classes` manifest attrs
`codeflash/tracer.py`	Routes Java files to `JavaTracer` instead of directly to optimizer
`codeflash/optimization/optimizer.py`	Falls back to `args.trace_file` when discovery returns no trace path
`codeflash/languages/java/tracer.py`	New `JavaTracer` class for running Maven tests with the agent

Test plan

All 227 existing Java runtime tests pass (no regressions)
49 new agent tests pass (MethodKey, CallTracker, PstatsWriter, CodeflashAgent)
Tests cover: thread safety, recursion handling, caller tracking, maxFunctionCount, SQLite schema compatibility with Python's ProfileStats, argument capture via Kryo
Shaded JAR builds successfully with correct manifest (Premain-Class, Can-Retransform-Classes)
Python linting (ruff) passes on all modified/new Python files
Integration test: run codeflash --file SomeJavaFile.java on a real Java project and verify agent attaches, trace SQLite produced, functions ranked

🤖 Generated with Claude Code

Add a Java agent using ASM bytecode instrumentation that profiles method calls during test execution. The agent records call counts, timing data, caller-callee relationships, and serialized arguments to SQLite, matching the Python tracer's schema so ProfileStats and FunctionRanker work unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Use rsplit instead of split when parsing dotted function names so that "com.testapp.Calculator.fibonacci" correctly extracts class_name="Calculator" and base_function_name="fibonacci" instead of class_name="com".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
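
The difference between splitting from the left and from the right can be sketched as follows. The helper name `parse_qualified_name` is hypothetical and the actual parsing code in the PR may be structured differently; the point is only that the method name is always the last dotted segment, and the class is the segment before it.

```python
from typing import Optional, Tuple


def parse_qualified_name(qualified_name: str) -> Tuple[Optional[str], str]:
    """Split a dotted name into (class_name, base_function_name)."""
    if "." not in qualified_name:
        return None, qualified_name
    # Split from the right so the last segment is always the method name.
    owner, base_function_name = qualified_name.rsplit(".", 1)
    # The class is the last segment of whatever precedes the method name.
    class_name = owner.rsplit(".", 1)[-1]
    return class_name, base_function_name


name = "com.testapp.Calculator.fibonacci"
print(name.split(".", 1)[0])       # com  (the package root, not the class)
print(parse_qualified_name(name))  # ('Calculator', 'fibonacci')
```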

claude · 2026-02-24T16:16:13Z

PR Review Summary

Prek Checks

ruff check: Passed
ruff format: 1 file reformatted (codeflash/languages/java/tracer.py) — auto-fixed and pushed in commit 2c336249
mypy: 31 errors across 3 files — all pre-existing (none introduced by this PR). Errors are in function_ranker.py (3), optimizer.py (22), and tracer.py (6), all on lines unchanged by this PR.

Code Review

2 inline comments posted:

CallTracker.java reset() — missing onStack/callStack cleanup (HIGH)
reset() clears timings, functionCallCount, capturedArgs, and timestamps, but does NOT clear the thread-local onStack and callStack. This means keys from a previous run persist after reset, causing incorrect recursion detection. This directly affects tests where @BeforeEach calls reset().
tracer.py:126 — overly broad exception catch (MEDIUM)
except (IndexError, OSError, Exception) is equivalent to except Exception since Exception is the parent of both. This silently swallows all errors during language detection, which could hide real bugs (e.g., broken imports, configuration errors).

Other observations (not blocking):

JavaTracer.run() declares a test_command parameter that is never used
The _handle_java_tracing() function manually parses sys.argv for --max-function-count and --tracer-timeout instead of reusing the existing ArgumentParser, which could drift out of sync
.github/workflows/duplicate-code-detector.yml correctly includes *.java in file filters and languages/java/ in cross-module checks
No stale path references found in CLAUDE.md or .claude/rules/
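
One way to address the `sys.argv` observation is to declare the two flags on a small parser and let `parse_known_args` pass everything else through. This is a sketch only: the flag names come from the observation above, while the types, defaults, and the function name `parse_tracer_flags` are assumptions, and the PR's real `ArgumentParser` may define these options differently.

```python
import argparse
from typing import List, Tuple


def parse_tracer_flags(argv: List[str]) -> Tuple[argparse.Namespace, List[str]]:
    """Parse the tracer-specific flags and return the untouched remainder."""
    parser = argparse.ArgumentParser(add_help=False)
    # Types and defaults here are assumptions, not the project's definitions.
    parser.add_argument("--max-function-count", type=int, default=None)
    parser.add_argument("--tracer-timeout", type=float, default=None)
    # parse_known_args leaves unrecognized arguments in a separate list
    # instead of exiting with an error.
    return parser.parse_known_args(argv)


flags, remaining = parse_tracer_flags(
    ["--file", "Foo.java", "--max-function-count", "100"]
)
print(flags.max_function_count, remaining)
```

Keeping the flag definitions in one parser means a renamed or retyped option only has to change in one place.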

Test Coverage

File	PR	main	Delta
`codeflash/benchmarking/function_ranker.py`	84.9%	84.7%	+0.2%
`codeflash/optimization/optimizer.py`	19.5%	19.1%	+0.4%
`codeflash/languages/java/tracer.py`	0.0%	N/A (new)	⚠️ Below 75% threshold
`codeflash/tracer.py`	0.0%*	0.0%*	—
Overall	79.3%	78.4%	+0.9%

* tracer.py is the tracer entry point invoked via subprocess, so coverage tooling does not capture it in either branch.

Coverage notes:

⚠️ codeflash/languages/java/tracer.py is a new 182-line file with 0% test coverage. It contains the JavaTracer class with Maven integration, package prefix detection, and agent argument construction — all of which would benefit from unit tests.
Overall coverage increased (+0.9%), so no regression.
35 tests failed on the PR branch (mostly pre-existing test_tracer.py failures), 3221 passed.

Optimization PRs

17 open codeflash-ai[bot] PRs found — all target feature branches (add/support_react or omni-java), none target main. No PRs to merge.

Last updated: 2026-02-24T16:37:00Z

claude · 2026-02-24T16:18:27Z

codeflash-java-runtime/src/main/java/com/codeflash/agent/CallTracker.java

+    public void reset() {
+        timings.clear();
+        functionCallCount.clear();
+        capturedArgs.clear();
+        globalStartNs.set(0);
+        globalEndNs.set(0);
+    }


Bug: reset() doesn't clear the onStack and callStack ThreadLocals. After reset(), any method keys still in the thread-local onStack set will persist, causing incorrect recursion detection on subsequent uses. Similarly, stale frames on callStack will remain.

This matters for tests (where reset() is called in @BeforeEach) and for any scenario where the tracker is reused after reset.

Suggested change

     public void reset() {
         timings.clear();
         functionCallCount.clear();
         capturedArgs.clear();
         globalStartNs.set(0);
         globalEndNs.set(0);
+        callStack.get().clear();
+        onStack.get().clear();
     }
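
The failure mode can be reproduced in a few lines. The sketch below is a Python analogy using `threading.local`, not the Java `CallTracker` itself; the class name `CallTrackerSketch` and its methods are hypothetical and only model a shared timings map plus a per-thread on-stack set.

```python
import threading


class CallTrackerSketch:
    """Toy stand-in for a tracker with shared timings and a per-thread on-stack set."""

    def __init__(self):
        self.timings = {}
        self._local = threading.local()

    def _on_stack(self):
        # Lazily create this thread's set on first access.
        if not hasattr(self._local, "on_stack"):
            self._local.on_stack = set()
        return self._local.on_stack

    def enter(self, key):
        """Record entry into key; return True if key was already on this thread's stack."""
        on_stack = self._on_stack()
        is_recursive = key in on_stack
        on_stack.add(key)
        self.timings.setdefault(key, 0)
        return is_recursive

    def reset(self, clear_thread_state):
        self.timings.clear()
        if clear_thread_state:
            # Only affects the thread that calls reset().
            self._on_stack().clear()


tracker = CallTrackerSketch()
tracker.enter("fibonacci")
tracker.reset(clear_thread_state=False)
print(tracker.enter("fibonacci"))  # True: stale entry is misread as recursion
tracker.reset(clear_thread_state=True)
print(tracker.enter("fibonacci"))  # False: per-thread state was cleared
```

In both this sketch and Java's `ThreadLocal`, clearing through `get()` reaches only the calling thread's copy, so state held by other threads would survive a reset.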

claude · 2026-02-24T16:18:29Z

codeflash/tracer.py

                        optimizer.run_with_args(full_args)
-                        return ArgumentParser()  # Return dummy parser since we're done
+                        return ArgumentParser()
        except (IndexError, OSError, Exception):


Exception is a superclass of both IndexError and OSError, so this is equivalent to except Exception. Catching all exceptions silently can hide real bugs (e.g., import errors, broken module initialization). Consider narrowing to just the expected failures:

Suggested change

-        except (IndexError, OSError, Exception):
+        except (IndexError, OSError, ImportError, ValueError):
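
The practical difference between the two clauses can be shown with a small experiment. The names `run_guarded` and `unexpected_bug` are hypothetical and stand in for the language-detection code, which is not reproduced here.

```python
BROAD = (IndexError, OSError, Exception)
NARROW = (IndexError, OSError, ImportError, ValueError)


def run_guarded(action, handled):
    """Run action; return 'swallowed' if it raises one of the handled exception types."""
    try:
        action()
    except handled:
        return "swallowed"
    return "ok"


def unexpected_bug():
    # Stands in for a real defect, e.g. a misspelled attribute name.
    raise AttributeError("object has no attribute 'lanugage'")


print(run_guarded(unexpected_bug, BROAD))  # swallowed: the defect is hidden
try:
    run_guarded(unexpected_bug, NARROW)
except AttributeError as exc:
    print(f"surfaced: {exc}")
```

With the narrowed tuple, expected failures such as an `IndexError` are still handled, while an unrelated `AttributeError` propagates to the caller.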

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

aseembits93 and others added 3 commits February 24, 2026 21:28

updating workflows to mains

a496558

github-actions bot added the workflow-modified label ("This PR modifies GitHub Actions workflows") Feb 24, 2026

style: auto-fix ruff formatting in java tracer

2c33624

claude bot reviewed Feb 24, 2026


aseembits93 and others added 2 commits February 24, 2026 21:50

chore: sync workflow files from main

6a3b977

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

reverting workflows to original state

92fa01c

aseembits93 wants to merge 6 commits into omni-java from java-tracer-agent
