feat: implement cast from whole numbers to binary format and bool to decimal #3083
Conversation
@parthchandra, @andygrove, these are the changes to support the int-to-binary and bool-to-decimal cast operations. I also had to change the base test to make sure we use the DSL, since Int -> Binary casts are only supported on the DSL side of Spark and would raise an encoding exception through Spark SQL. Also, Int -> Binary casts do not support Try or ANSI eval modes in Spark, and I have followed the same convention here as well.
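For context on the DSL-only restriction mentioned above, a minimal sketch of the two paths (session setup and query shape are illustrative, not taken from the PR):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.BinaryType

val spark = SparkSession.builder().master("local[1]").getOrCreate()

// DSL cast: Spark accepts integral -> binary here (LEGACY mode only).
val ok = spark.range(3).select(col("id").cast(BinaryType))

// The SQL form, e.g. spark.sql("SELECT CAST(id AS BINARY) FROM range(3)"),
// raises an exception instead, per the author's note above, which is why
// the base test was changed to exercise the DSL path.
```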
Fixed format issue with
Codecov Report ❌ Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##              main     #3083      +/-   ##
============================================
+ Coverage     56.12%    59.98%     +3.85%
- Complexity      976      1464       +488
============================================
  Files           119       175        +56
  Lines         11743     16171      +4428
  Branches       2251      2685       +434
============================================
+ Hits           6591      9700      +3109
- Misses         4012      5114      +1102
- Partials       1140      1357       +217
andygrove
left a comment
Thanks for adding int→binary and boolean→decimal cast support!
Issue: Dead code
The function canCastToBinary (lines 354-358 in CometCast.scala) is defined but never called anywhere. This appears to be unused code that should be removed.
private def canCastToBinary(fromType: DataType): SupportLevel = fromType match {
  case DataTypes.ByteType | DataTypes.ShortType | DataTypes.IntegerType | DataTypes.LongType =>
    Compatible()
  case _ => Unsupported(Some(s"Cast from BinaryType to $fromType is not supported"))
}

The actual logic for int→binary casts is correctly handled in the canCastFromByte, canCastFromShort, canCastFromInt, and canCastFromLong functions.
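The SupportLevel-based gating the reviewer describes can be sketched in plain Scala. The names Compatible and Unsupported follow the quoted snippet, but the hierarchy and the string-keyed canCastFromInt below are illustrative stand-ins, not Comet's actual code:

```scala
// Minimal stand-ins for Comet's SupportLevel hierarchy (illustrative only).
sealed trait SupportLevel
case class Compatible() extends SupportLevel
case class Unsupported(reason: Option[String]) extends SupportLevel

object CastSupport {
  // Sketch of how a per-source-type check like canCastFromInt might
  // declare binary as a supported target (hypothetical target names).
  def canCastFromInt(toType: String): SupportLevel = toType match {
    case "binary" | "long" | "string" => Compatible()
    case other => Unsupported(Some(s"Cast from int to $other is not supported"))
  }
}
```

The design keeps one function per source type, so adding a target (as this PR does for binary) is a local change to the relevant canCastFromX function.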
Otherwise, the implementation looks good:
- Correctly limits int→binary to LEGACY mode only (matching Spark)
- Boolean→Decimal implementation handles precision/scale correctly
- Tests use column data from parquet (not just literals)
- Compatibility docs updated
- ANSI and Try modes correctly disabled for int→binary tests
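The boolean→decimal behavior in the checklist above can be sketched in plain Scala, under the assumption that true/false map to 1/0 at the target scale and overflow when the target precision is exceeded; boolToDecimal is a hypothetical helper, not Comet's implementation:

```scala
import java.math.{BigDecimal => JBigDecimal}

// Sketch of boolean-to-decimal semantics: true maps to 1 and false to 0,
// rendered at the target scale; None models an overflow when the value
// does not fit the target precision (hypothetical helper, not Comet code).
def boolToDecimal(b: Boolean, precision: Int, scale: Int): Option[JBigDecimal] = {
  val d = new JBigDecimal(if (b) 1 else 0).setScale(scale)
  if (d.precision() <= precision) Some(d) else None
}
```

For example, true does not fit DecimalType(2, 2) (1.00 needs three digits), while false does (0.00), which is the precision/scale handling the review calls out.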
This review was generated with AI assistance.
There is a test failure:
Thank you @andygrove, I am working on the test failures (I believe there is a bug on the DSL side vs using Spark SQL).
Force-pushed from 910fe4f to 69d8290
There is a mismatch in the error message thrown by Spark vs Comet for a cast test from StringType to DecimalType(2,2). I will raise a new PR to fix that error message. Steps to reproduce:
cc: @andygrove, @parthchandra
I also found that the tests are not running on truly deterministic inputs. For example, this test fails if I disable or change the inputs for a preceding test:
Related PR to fix the test in decimal cast: #3248
Force-pushed from 8ccd20a to fb8814f
Force-pushed from d1f1512 to 05e7e50
@andygrove, @parthchandra, I rebased the feature branch on main and CometCastSuite finishes successfully on my local machine. Please review and kick off CI whenever you get a chance.
parthchandra
left a comment
lgtm. (minor comment)
Thank you for the approval, @parthchandra
@andygrove, @parthchandra, could you please review and potentially merge the branch whenever you get a chance?
Fixed comment indentation issues
Force-pushed from bc83230 to b49b178
Rebased with main and fixed merge conflicts.
Which issue does this PR close?
(try_cast is only supported starting with Spark 4.x). Relevant Spark code:
https://github.com/apache/spark/blob/f89fbd49c45045b44259341c6c73e30ee9c46dd0/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala#L682
(Note that only LEGACY mode routes the code here, while TRY and ANSI are not supported.)
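The LEGACY-mode semantics referenced above amount to emitting the integral value as its fixed-width big-endian byte representation; that behavior can be sketched in plain Scala (illustrative helpers, not the code Spark or Comet actually uses):

```scala
import java.nio.ByteBuffer

// Sketch of LEGACY-mode integral-to-binary semantics: the value becomes
// its fixed-width big-endian byte representation (ByteBuffer's default
// byte order is big-endian).
def intToBinary(i: Int): Array[Byte] =
  ByteBuffer.allocate(4).putInt(i).array()

def longToBinary(l: Long): Array[Byte] =
  ByteBuffer.allocate(8).putLong(l).array()
```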
Closes #.
Rationale for this change
What changes are included in this PR?
How are these changes tested?