Conversation
|
need to merge #19821 first |
| } | ||
| _ => { | ||
| return internal_err!( | ||
| "First argument of `DATE_PART` must be non-null scalar Utf8" |
There was a problem hiding this comment.
same as DF date_part, part is a literal
| use datafusion_expr::planner::{ExprPlanner, PlannerResult}; | ||
|
|
||
| #[derive(Default, Debug)] | ||
| pub struct SparkFunctionPlanner; |
There was a problem hiding this comment.
If we're including this planner now, I feel we should update the lib docs with an example of using this
https://github.com/apache/datafusion/blob/main/datafusion/spark/src/lib.rs
There was a problem hiding this comment.
yes, I can do that. I also think it would be nice to provide a way to register the expr planner and the udfs at the same time with something like
.we could do a
with_spark_features ? could track that in a separate issue/PR
There was a problem hiding this comment.
That sounds good as a followup 👍
| internal_err!("spark date_part should have been simplified to standard date_part") | ||
| } | ||
|
|
||
| fn simplify( |
There was a problem hiding this comment.
I like that we're using simplify here 👍
| let date_part_expr = Expr::ScalarFunction(ScalarFunction::new_udf( | ||
| datafusion_functions::datetime::date_part(), | ||
| vec![part_expr, date_expr], | ||
| )); |
There was a problem hiding this comment.
One concern is if the nullability of the output field will match here?
There was a problem hiding this comment.
ah you're right, should we update
to be nullable depending on the inputs ?There was a problem hiding this comment.
Yes, I think that would be the easiest/most consistent way to fix this
bcf8a96 to
3405613
Compare
Merged! |
3405613 to
20e0b96
Compare
|
Thanks @cht42 |
Which issue does this PR close?
datafusion-sparkSpark Compatible Functions #15914Rationale for this change
The current date_part function in datafusion have a few differences with the spark implementation:
Full list of spark supported aliases: https://github.com/apache/spark/blob/a03bedb6c1281c5263a42bfd20608d2ee005ab05/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala#L3356-L3371
What changes are included in this PR?
New date_part function in spark crate.
Are these changes tested?
Yes with SLT
Are there any user-facing changes?
yes