Labels
bug (Something isn't working)
Description
Describe the bug
max_distinct_count in datafusion/physical-plan/src/joins/utils.rs panics with "attempt to subtract with overflow" in the Precision::Exact branch (line 725):
    Precision::Exact(count) => {
        let count = count - stats.null_count.get_value().unwrap_or(&0); // <-- panic
This happens when num_rows (Exact) is smaller than null_count. That became possible after #20228, which added fetch support to HashJoinExec. When a limit is pushed down, HashJoinExec::partition_statistics() calls stats.with_fetch(self.fetch, 0, 1), which reduces num_rows to Exact(fetch_value) but leaves null_count in the column statistics unchanged.
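The arithmetic behind the panic can be reproduced without DataFusion. The sketch below is only an illustration: the `Precision` enum is a simplified stand-in for the real type, and `non_null_rows` is a hypothetical helper, not a function in the codebase. It shows that a plain subtraction underflows once a stale null_count exceeds the reduced num_rows, and that a saturating subtraction is one possible guard (not necessarily the fix adopted upstream).

```rust
/// Simplified stand-in for DataFusion's `Precision` (assumption: only the
/// parts needed to illustrate the subtraction are modeled here).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum Precision {
    Exact(usize),
    Inexact(usize),
    Absent,
}

impl Precision {
    /// Returns the wrapped value, if any, regardless of exactness.
    fn get_value(&self) -> Option<&usize> {
        match self {
            Precision::Exact(v) | Precision::Inexact(v) => Some(v),
            Precision::Absent => None,
        }
    }
}

/// Hypothetical helper: rows left after excluding nulls, clamped at zero
/// so that a null_count larger than num_rows cannot underflow.
fn non_null_rows(num_rows: usize, null_count: &Precision) -> usize {
    let nulls = *null_count.get_value().unwrap_or(&0);
    num_rows.saturating_sub(nulls)
}

fn main() {
    // num_rows was reduced to the fetch value (10), null_count is stale (25).
    let stale_nulls = Precision::Exact(25);

    // A plain `10 - 25` on usize underflows; checked_sub makes that visible.
    assert_eq!(10usize.checked_sub(25), None);

    // The saturating variant clamps to zero instead of panicking.
    assert_eq!(non_null_rows(10, &stale_nulls), 0);
    // When num_rows is large enough, the result is the ordinary difference.
    assert_eq!(non_null_rows(100, &stale_nulls), 75);
    // Missing null statistics are treated as zero nulls.
    assert_eq!(non_null_rows(10, &Precision::Absent), 10);
}
```

Clamping at the subtraction site only hides the inconsistency. Whether the statistics themselves should instead be kept consistent when a fetch is applied is a separate design question that this sketch does not answer.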
Example failing pipeline:
https://github.com/datafusion-contrib/datafusion-distributed/actions/runs/22798285744/job/66136064932?pr=366
To Reproduce
git clone https://github.com/datafusion-contrib/datafusion-distributed
cd datafusion-distributed
git checkout branch-53
cargo test --test tpcds_plans_test tests::test_tpcds_19 --all-features
Expected behavior
No subtraction overflow: max_distinct_count should not panic when null_count exceeds num_rows.
Additional context
No response