ENH: Vectorization (e.g. SIMD) in pandas' Hashtable Operations for Performance Improvement #63374

@113xiaoji

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

I’ve seen related issues around alignment and vectorization (e.g., #3146) and understand that pandas prioritizes index alignment and general-purpose data structures. However, this question is focused specifically on whether SIMD or other low-level vectorization techniques have been considered or intentionally avoided in the internal hashtable engine.

Feature Description

Hi pandas team,

I'm wondering what the community's current stance is on introducing vectorized (e.g., SIMD-based) operations into pandas' hashtable infrastructure, which is used in groupby, factorize, categorical operations, and similar code paths.
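To make the hot path concrete, here is a minimal pure-Python sketch of the kind of work a factorize-style operation does: one hash lookup (and possibly one insert) per input value. The function name `factorize_sketch` is hypothetical, and a Python `dict` stands in for the table; pandas' actual hashtable code is compiled and is not shown here.

```python
def factorize_sketch(values):
    """Map each value to a dense integer code in order of first appearance.

    Illustrative only: a dict plays the role of the hash table.
    """
    table = {}      # value -> code
    codes = []      # one code per input element
    uniques = []    # distinct values, in first-seen order
    for v in values:
        code = table.get(v)          # one hash-table probe per element
        if code is None:
            code = len(uniques)      # next unused code
            table[v] = code          # one hash-table insert per new value
            uniques.append(v)
        codes.append(code)
    return codes, uniques


codes, uniques = factorize_sketch(["b", "a", "b", "c"])
print(codes, uniques)
```

The loop handles one key at a time, which is the pattern a SIMD-based design would try to batch.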

Hashtable lookups and insertions are often performance-critical paths, especially when dealing with large, high-cardinality data. With modern CPUs supporting SIMD instructions (e.g., AVX2, AVX-512), has there been any past discussion or interest in exploring:

  • SIMD acceleration for probing and inserting into hash tables?

  • Potential trade-offs in code complexity, portability, and maintainability?

  • Alignment with pandas’ reliance on NumPy, PyArrow, or other external backends for low-level performance?
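As a rough illustration of what "SIMD acceleration for probing" could mean, the sketch below compares a whole group of slot tags against a key's tag in one step instead of checking slots one by one. It is only a sketch under stated assumptions: the names `GroupProbeTable`, `match_byte`, and `lanes` are hypothetical, a 64-bit Python integer holding 8 control bytes stands in for a SIMD register (a SWAR trick), resizing and deletion are omitted, and none of this reflects how pandas' hashtables are implemented.

```python
GROUP = 8                               # slots per group (8 bytes in one 64-bit word)
ONES = 0x0101010101010101
LOW7 = 0x7F7F7F7F7F7F7F7F
MASK64 = (1 << 64) - 1
EMPTY = 0x80                            # control byte of an empty slot; tags are 7-bit


def match_byte(word, byte):
    """Return a mask with 0x80 set in every byte lane of `word` equal to `byte`."""
    x = word ^ (byte * ONES)            # matching lanes become 0x00
    y = ((x & LOW7) + LOW7) | x | LOW7  # high bit set in every non-zero lane
    return ~y & MASK64                  # high bit set only in zero (matching) lanes


def lanes(mask):
    """Yield the lane indices (0..7) whose high bit is set in `mask`."""
    while mask:
        low = mask & -mask              # isolate the lowest set bit
        yield (low.bit_length() - 1) // 8
        mask ^= low


class GroupProbeTable:
    """Fixed-size table mapping keys to dense codes, probed one group at a time."""

    def __init__(self, n_groups=8):     # n_groups is assumed to be a power of two
        self.n_groups = n_groups
        self.ctrl = [EMPTY * ONES] * n_groups   # 8 control bytes per group
        self.keys = [None] * (n_groups * GROUP)
        self.codes = [-1] * (n_groups * GROUP)
        self.size = 0

    def get_or_insert(self, key):
        h = hash(key) & MASK64
        tag = h & 0x7F                  # 7-bit tag stored in the control byte
        g = (h >> 7) & (self.n_groups - 1)
        for _ in range(self.n_groups):
            word = self.ctrl[g]
            # One comparison finds every candidate slot in the group.
            for lane in lanes(match_byte(word, tag)):
                slot = g * GROUP + lane
                if self.keys[slot] == key:
                    return self.codes[slot]
            empties = match_byte(word, EMPTY)
            if empties:                 # key is absent: claim the first empty lane
                lane = next(lanes(empties))
                slot = g * GROUP + lane
                shift = 8 * lane
                self.ctrl[g] = (word & ~(0xFF << shift)) | (tag << shift)
                self.keys[slot] = key
                self.codes[slot] = self.size
                self.size += 1
                return self.codes[slot]
            g = (g + 1) & (self.n_groups - 1)   # group full: try the next one
        raise RuntimeError("table full; resizing is omitted in this sketch")


table = GroupProbeTable()
print([table.get_or_insert(k) for k in [3, 7, 3, 9, 7]])
```

A native implementation would replace `match_byte` with a vector byte-compare over a wider group; the trade-offs raised above (per-architecture code paths, portability, maintenance) come mostly from that step.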

Would love to know the core team’s view — especially if this is considered an area for experimentation, or whether existing architectural decisions rule this out.

Thanks for the amazing work on pandas!

Alternative Solutions

.

Additional Context

.

Labels: Enhancement, Needs Triage (issue that has not been reviewed by a pandas team member)
