perf: short-circuit and collect_bool for IN list with column references#20694
perf: short-circuit and collect_bool for IN list with column references#20694zhangxffff wants to merge 3 commits intoapache:mainfrom
Conversation
|
run benchmark in_list |
|
🤖 Hi @zhangxffff, thanks for the request (#20694 (comment)). |
|
@Dandandan @adriangb This PR adds a short-circuit optimization that breaks early when all rows already match, and incorporates the suggestions from #20428 (BooleanBuffer::collect_bool and first-expr initialization). Would appreciate your review when you have a chance. |
|
run benchmark in_list |
|
🤖 |
|
🤖: Benchmark completed Details
|
🚀 |
Co-authored-by: Adrian Garcia Badaracco <1755071+adriangb@users.noreply.github.com>
|
I'm curious if the |
Benchmark result (before vs after vs after_null_count): For nulls=20% cases: after version showed ~3-5% regressions due to calling true_count() on every iteration. after_null_count eliminates this, matching before (e.g. list=28/match=100%/nulls=20%: 100.9µs vs 104.5µs). For the in_list_cols/Utf8 cases: the benchmark implicitly contains ~20% nulls, so the |
Which issue does this PR close?
Rationale for this change
Third PR in the IN list optimization series (split from #20428):
What changes are included in this PR?
try_foldtoforloop; when all non-null rows are alreadytrue, skip remaining list items (up to 27x faster for match=100%/nulls=0%)BooleanBuffer::collect_bool: use inmake_comparatorfallback path for nested types instead(0..n).map().collect()(suggested by @Dandandan in perf: Optimize IN list with column references evaluation #20428 )or_kleene(all_false, rhs)(suggested by @Dandandan in perf: Optimize IN list with column references evaluation #20428 )Are these changes tested?
Yes, and add test to cover short-circuit, short-circuit with nulls, and struct column references (make_comparator fallback path)
Benchmark result:
Are there any user-facing changes?