@shashjar
Member

This PR uses an EndpointTraceItemTable RPC query to fetch error counts from EAP. The counts are used in the calculation of suspect tags.

Tested locally to ensure that the count aggregation is coming back correctly from EAP:

sentry.issues.suspect_tags: >>>>> query_error_counts called (organization_id=1 project_id=2 start='2025-12-13 00:00:00+00:00' end='2025-12-13 01:00:00+00:00' environments=[])

sentry.utils.snuba_rpc: Running a EndpointTraceItemTable RPC query (rpc_query={'meta': {'organizationId': '1', 'cogsCategory': 'issues', 'referrer': 'issues.suspect_tags.query_error_counts', 'projectIds': ['2'], 'startTimestamp': '2025-12-13T00:00:00Z', 'endTimestamp': '2025-12-13T01:00:00Z', 'traceItemType': 'TRACE_ITEM_TYPE_OCCURRENCE'}, 'columns': [{'aggregation': {'aggregate': 'FUNCTION_COUNT', 'key': {'type': 'TYPE_INT', 'name': 'group_id'}}, 'label': 'count'}], 'limit': 1} referrer='issues.suspect_tags.query_error_counts' organization_id=1 trace_item_type=7)

sentry.issues.suspect_tags: >>>>> EAP response received (response_count=1 has_column_values=True raw_response='column_values {\n  attribute_name: "count"\n  results {\n    val_double: 6\n  }\n}\npage_token {\n  offset: 1\n}\nmeta {\n  request_id: "aa3ed9bf554a40099ab2a70cc9b097a9"\n  query_info {\n    stats {\n      progress_bytes: 486\n    }\n    metadata {\n    }\n  }\n  downsampled_storage_meta {\n  }\n}\n')

sentry.issues.suspect_tags: >>>>> Snuba vs EAP comparison (snuba_count=6 eap_count=6 match=True)
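
The response logged above carries the aggregate as a double (`val_double: 6`) in the first result of the `count` column, while the comparison line reports integer counts. A minimal sketch of that extraction and comparison, assuming a plain dict that mirrors the logged response shape instead of the real protobuf message; `extract_count` is a hypothetical helper for illustration, not the function in `src/sentry/issues/suspect_tags.py`:

```python
from typing import Any


def extract_count(response: dict[str, Any], label: str = "count") -> int:
    """Return the aggregate labelled `label` as an int, or 0 if it is absent.

    Hypothetical helper: `response` is a dict stand-in for the RPC response.
    """
    for column in response.get("column_values", []):
        if column.get("attribute_name") == label and column.get("results"):
            # The aggregate arrives as a double; counts are whole numbers.
            return int(column["results"][0].get("val_double", 0))
    return 0


# Dict stand-in for the response shown in the log above.
eap_response = {
    "column_values": [
        {"attribute_name": "count", "results": [{"val_double": 6.0}]}
    ],
    "page_token": {"offset": 1},
}

snuba_count = 6  # value reported by the legacy read path in the log
eap_count = extract_count(eap_response)
match = snuba_count == eap_count
```

Returning 0 when the column is missing is a choice made for this sketch; the real code may handle an empty response differently.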

To roll this out, we can:

  1. Merge in this PR (no change in behavior in any region)
  2. Flip the eap.occurrences.should_double_read option on for the S4S region only
  3. At this point, we'll be calculating both values in S4S but sticking with the Snuba value as the source of truth
  4. We should see the eap.occurrences.validate_reads metric start getting recorded and then converge to 0 mismatches over time (as EAP gets up to speed on all the groups within retention)
  5. At that point (01/21/2026, 90 days after 10/23/2025), we should be able to remove the legacy read path and use EAP only
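
The double-read phase in steps 2 through 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `read_error_count`, the `metrics` counter, and the metric key names are hypothetical stand-ins, and the two read paths are passed in as plain callables.

```python
from collections import Counter
from typing import Callable

# Hypothetical stand-in for a metrics backend.
metrics = Counter()


def read_error_count(
    snuba_read: Callable[[], int],
    eap_read: Callable[[], int],
    should_double_read: bool,
) -> int:
    """Return the Snuba count; optionally validate it against EAP."""
    snuba_count = snuba_read()
    if not should_double_read:
        # Option off: behavior is unchanged, EAP is never queried.
        return snuba_count

    try:
        eap_count = eap_read()
    except Exception:
        # A failing EAP read must not affect the legacy result.
        metrics["validate_reads.error"] += 1
        return snuba_count

    outcome = "match" if eap_count == snuba_count else "mismatch"
    metrics[f"validate_reads.{outcome}"] += 1
    # Snuba stays the source of truth during the double-read phase.
    return snuba_count
```

For example, `read_error_count(lambda: 6, lambda: 5, True)` returns 6 and records one mismatch. Once mismatches converge to 0, the legacy path can be removed and the EAP value returned directly.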

@github-actions github-actions bot added the "Scope: Backend" label (automatically applied to PRs that change backend components) Dec 13, 2025
@shashjar shashjar requested review from a team and thetruecpaul December 13, 2025 00:31
@shashjar shashjar marked this pull request as ready for review December 13, 2025 00:32
@shashjar shashjar requested a review from a team as a code owner December 13, 2025 00:32
codecov bot commented Dec 13, 2025

Codecov Report

❌ Patch coverage is 91.30435% with 2 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines Patch % Lines
src/sentry/issues/suspect_tags.py 91.30% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master   #104916      +/-   ##
===========================================
- Coverage   80.57%    80.49%   -0.08%     
===========================================
  Files        9431      9416      -15     
  Lines      404261    403719     -542     
  Branches    25655     25756     +101     
===========================================
- Hits       325718    324990     -728     
- Misses      78075     78295     +220     
+ Partials      468       434      -34     

@shashjar shashjar marked this pull request as draft December 13, 2025 19:25
@shashjar shashjar marked this pull request as ready for review December 16, 2025 20:14
Contributor

@thetruecpaul thetruecpaul left a comment

LGTM!

@shashjar shashjar merged commit d45aaca into master Dec 23, 2025
67 checks passed
@shashjar shashjar deleted the read-suspect-tag-error-counts-from-eap branch December 23, 2025 19:18
3 participants