[SPARK-54731][SQL] Fix Reverse expression to handle BinaryType by performing raw byte-level reversal#55051
Open
xiaoxuandev wants to merge 1 commit intoapache:masterfrom
Open
[SPARK-54731][SQL] Fix Reverse expression to handle BinaryType by performing raw byte-level reversal#55051xiaoxuandev wants to merge 1 commit intoapache:masterfrom
xiaoxuandev wants to merge 1 commit intoapache:masterfrom
Conversation
…forming raw byte-level reversal ### What changes were proposed in this pull request? When Spark's `reverse` function is called on a BinaryType column, it previously threw a type mismatch error because BinaryType was not in the accepted input types. This patch adds BinaryType support to the Reverse expression. The changes: 1. Add BinaryType to the inputTypes TypeCollection in Reverse. 2. Add a BinaryType match case in the interpreted execution path (doReverse) using Scala's Array.reverse. 3. Add a BinaryType match case in the code generation path (doGenCode) with a dedicated binaryCodeGen method that performs byte-level reversal via a for loop. 4. Update @ExpressionDescription to document binary support (since 4.2.0), fix parameter name from `array` to `expr`. 5. Update data type mismatch error test expectations to include BINARY in the accepted types. 6. Add end-to-end SQL test for reverse on BinaryType in DataFrameFunctionsSuite covering both interpreted and codegen paths. ### Why are the changes needed? Without this fix, calling `reverse` on a BinaryType column fails with a type error. This is inconsistent with other Spark functions like `length` and `concat` that already support BinaryType. ### Does this PR introduce _any_ user-facing change? Yes. The `reverse` function now accepts BinaryType input and returns the byte-reversed binary value. Previously this would throw an AnalysisException. ### How was this patch tested? - Unit tests in CollectionExpressionsSuite: multi-byte reversal, empty byte array, single byte, byte array containing 0x00, null input. - End-to-end test in DataFrameFunctionsSuite: reverse on BinaryType column through both interpreted and codegen (cached) paths. - Updated data type mismatch tests to reflect the new accepted type set. - Scalastyle and ExpressionsSchemaSuite pass. ### Was this patch authored or co-authored using generative AI tooling? Yes, co-authored with Kiro.
Member
|
Are there any reference that supports binary with reverse? |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changes were proposed in this pull request?
When Spark's
reversefunction is called on a BinaryType column, it previously threw a type mismatch error because BinaryType was not in the accepted input types. This patch adds BinaryType support to the Reverse expression.The changes:
Why are the changes needed?
Without this fix, calling
reverseon a BinaryType column fails with a type error. This is inconsistent with other Spark functions likelengthandconcatthat already support BinaryType.Does this PR introduce any user-facing change?
Yes. The
reversefunction now accepts BinaryType input and returns the byte-reversed binary value. Previously this would throw an AnalysisException.How was this patch tested?
Was this patch authored or co-authored using generative AI tooling?
Yes, co-authored with Kiro.