@samruddhibaviskar11

Summary

Fixes a bug where DataFrame arithmetic on pyarrow-backed dtypes converted columns present in only one operand to float64 filled with NaN, instead of preserving the original ExtensionDtype filled with pd.NA.

Changes

  • Update DataFrame._arith_method_with_reindex to pre-populate missing columns with the appropriate dtype before reindexing
  • Preserve ExtensionDtypes (e.g. int64[pyarrow]) by creating NA-filled columns with the original dtype
  • Maintain existing behavior for NumPy-backed dtypes
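
To illustrate the idea behind these changes, here is a minimal user-level sketch. It is not the code in this PR: `add_preserving_extension_dtypes` is a hypothetical helper, and it uses the nullable `Int64` dtype so that it runs without pyarrow installed (a pyarrow-backed dtype such as `int64[pyarrow]` could presumably be substituted where pyarrow is available). It pre-populates each missing column with an NA-filled array of the other operand's dtype before performing the addition.

```python
import pandas as pd


def add_preserving_extension_dtypes(left: pd.DataFrame, right: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper (not pandas internals): add two frames while
    keeping ExtensionDtypes for columns that exist in only one operand."""
    left, right = left.copy(), right.copy()
    for frame, other in ((left, right), (right, left)):
        # Columns the other operand has but this one lacks.
        for col in other.columns.difference(frame.columns):
            dtype = other[col].dtype
            if isinstance(dtype, pd.api.extensions.ExtensionDtype):
                # Pre-populate with an all-NA column of the original dtype.
                frame[col] = pd.array([pd.NA] * len(frame), dtype=dtype)
            # NumPy-backed columns are left alone, so pandas' existing
            # alignment behavior applies to them unchanged.
    return left + right


left = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, dtype="Int64")
right = pd.DataFrame({"a": [10, 20]}, dtype="Int64")

result = add_preserving_extension_dtypes(left, right)
print(result.dtypes)
```

In this sketch, column `b` exists only in `left`, so the helper gives `right` an all-NA `Int64` column before adding. The sum for `b` is then computed between two `Int64` columns rather than being produced by reindexing afterwards.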

Tests

Adds test_arith_reindex_with_pyarrow_dtype.

Thanks to @mattharrison, @jorisvandenbossche, and @jbrockmendel for the discussion and guidance.

Fixes #63288

Development

Successfully merging this pull request may close these issues.

BUG: Adding pyarrow column to missing column changes to NaN in DataFrame addition