Use runtime CPU feature detection to select which SIMD instruction set to use #125

@james7132

There are options like std::arch::is_x86_feature_detected! which can detect at runtime which instruction sets are available. Unfortunately, the check cannot be repeated inside every function call because of the cost of feature detection.

One potential way around this is to run feature detection once during initialization and store the result in a tagged pointer. Every SIMD-supporting platform is at least 32 bits wide, so a pointer to a backing allocation of word-sized blocks is at least 4-byte aligned, and its bottom two bits are always zero. If the default block size is increased to somewhere between 8 and 64 bytes, and the allocation is aligned to that size, the number of available tag bits grows accordingly (3 bits at 8 bytes, up to 6 bits at 64 bytes). An example mapping for x86 may include:

  • 00 - Default, none detected.
  • 01 - SSE2 detected
  • 10 - SSE4.1 detected
  • 11 - AVX detected
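The mapping above could be computed once at initialization. The following is only a minimal sketch: the names SimdTag, from_bits, and detect_simd_tag are hypothetical and not part of the crate, and picking the highest detected level first is an assumption about how the levels would be prioritized.

```rust
/// Hypothetical two-bit tag mirroring the example mapping above.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(usize)]
pub enum SimdTag {
    None = 0b00,
    Sse2 = 0b01,
    Sse41 = 0b10,
    Avx = 0b11,
}

impl SimdTag {
    /// Decode a tag from the low two bits of an address-sized value.
    pub fn from_bits(bits: usize) -> SimdTag {
        match bits & 0b11 {
            0b01 => SimdTag::Sse2,
            0b10 => SimdTag::Sse41,
            0b11 => SimdTag::Avx,
            _ => SimdTag::None,
        }
    }
}

/// Run feature detection once and return the highest level found.
/// On non-x86 targets this sketch always reports `SimdTag::None`.
pub fn detect_simd_tag() -> SimdTag {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("avx") {
            return SimdTag::Avx;
        }
        if std::arch::is_x86_feature_detected!("sse4.1") {
            return SimdTag::Sse41;
        }
        if std::arch::is_x86_feature_detected!("sse2") {
            return SimdTag::Sse2;
        }
    }
    SimdTag::None
}
```

With only two bits available, a level such as AVX2 cannot be encoded separately; that would need the larger block alignment mentioned earlier.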

These bits can then be masked off on access in a branchless way. This should have a slight negative performance impact on point queries (contains, insert, etc.), but it allows the most performant instructions to be used without explicitly compiling for a particular target feature set.
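To make the masking concrete, here is a rough sketch of a tagged pointer over usize blocks. The type TaggedBlocks and its methods are hypothetical illustrations, not the crate's actual layout, and the sketch assumes a two-bit tag and a backing allocation aligned to at least 4 bytes.

```rust
/// Low bits of the stored address that hold the feature tag (assumed: 2 bits).
const TAG_MASK: usize = 0b11;

/// Hypothetical bitset storage whose pointer carries a feature tag in its low bits.
pub struct TaggedBlocks {
    tagged: usize, // block address OR-ed with the tag
    len: usize,    // number of usize blocks
}

impl TaggedBlocks {
    pub fn new(len: usize, tag: usize) -> Self {
        assert!(tag <= TAG_MASK);
        // usize is at least 4-byte aligned on 32-bit and 64-bit targets.
        assert!(std::mem::align_of::<usize>() > TAG_MASK);
        let boxed: Box<[usize]> = vec![0usize; len].into_boxed_slice();
        let ptr = Box::into_raw(boxed) as *mut usize;
        debug_assert_eq!(ptr as usize & TAG_MASK, 0);
        TaggedBlocks { tagged: ptr as usize | tag, len }
    }

    /// Read the tag without touching the allocation.
    pub fn tag(&self) -> usize {
        self.tagged & TAG_MASK
    }

    /// Branchless untagging: a single AND clears the tag bits.
    fn ptr(&self) -> *mut usize {
        (self.tagged & !TAG_MASK) as *mut usize
    }

    pub fn contains(&self, bit: usize) -> bool {
        let block = bit / usize::BITS as usize;
        let offset = bit % usize::BITS as usize;
        if block >= self.len {
            return false;
        }
        // SAFETY: block < len, and ptr() is the start of a live allocation.
        let word = unsafe { *self.ptr().add(block) };
        (word & (1 << offset)) != 0
    }

    pub fn insert(&mut self, bit: usize) {
        let block = bit / usize::BITS as usize;
        let offset = bit % usize::BITS as usize;
        assert!(block < self.len, "bit index out of range");
        // SAFETY: block < len, and we hold a unique borrow of self.
        unsafe { *self.ptr().add(block) |= 1 << offset };
    }
}

impl Drop for TaggedBlocks {
    fn drop(&mut self) {
        // Rebuild the boxed slice from the untagged pointer so it is freed correctly.
        let slice = std::ptr::slice_from_raw_parts_mut(self.ptr(), self.len);
        unsafe { drop(Box::from_raw(slice)) };
    }
}
```

Every point query pays for one extra AND in ptr(), which is the slight cost described above. Bulk operations would read tag() once and then dispatch to the matching SIMD routine.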
