avoid unnecessary COUNT queries in slice endpoint#64054
avoid unnecessary COUNT queries in slice endpoint#64054henry3260 wants to merge 1 commit intoapache:mainfrom
Conversation
63db595 to
d008c44
Compare
d008c44 to
0eae1d5
Compare
SameerMesiah97
left a comment
There was a problem hiding this comment.
Solid performance improvement. But this needs more coverage for E2E behavior.
| pytest.param(slice(-1, -3, -1), id="-1:-3:-1"), | ||
| ], | ||
| ) | ||
| def test_xcom_get_with_slice(self, client, dag_maker, session, key): |
There was a problem hiding this comment.
Right now, your new test only covers the performance impact of your changes (i.e. when COUNT is called). It doesn’t validate the actual slicing behavior.
We already have E2E coverage in test_xcom_get_with_slice, but the dataset there is very small (effectively length 3), so some of the more complex cases introduced here don’t really get exercised (e.g. mixed-sign slicing that requires COUNT normalization, or early return paths).
I think a simple approach would be:
- Modify this test to includde a slightly larger (but still sparse) dataset, for example:
xcom_values = [0, None, 2, None, 4, 5, None, 7, 8, 9]
- Add this line right above the assertions:
expected = [v for v in xcom_values if v is not None][key]
- Assert against that instead of a hardcoded list.
I would also add a few more targeted cases to cover the different scenarios (same-sign, mixed-sign, early return, etc).
Why
The
get_mapped_xcom_by_sliceendpoint previously issued multipleCOUNTqueries in several negative slicing scenarios start or stop were negative. Eachget_query_count() call performs an extra database query, which is unnecessary and hurts performance for common slice operations.This change moves the total count calculation to the top of the function and only executes it when both start and stop are explicitly provided and they have different signs (one positive, one negative). In all other cases involving negative indices we can correctly compute the slice bounds without knowing the total length, avoiding the
COUNTquery entirely.Was generative AI tooling used to co-author this PR?
{pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.