[FLINK-39244][table] Support precision 0-9 for TO_TIMESTAMP_LTZ function #27757
twalthr merged 4 commits into apache:master
@davidradl Thanks for taking a look and raising this! To address your point about PR #27770 rounding to 3, 6, or 9: the discrepancy comes down to the difference between the SQL function layer (this PR) and the serialization layer (PR #27770). They do not collide; they simply operate at different scopes.

The Serialization Layer (Avro Limitations)

PR #27770 deals with Avro, and the Avro 1.12.0 specification only defines the following timestamp logical types:

There is no Avro logical type for intermediate precisions like 1, 2, 4, 5, 7, or 8. These are Avro spec limitations, not Flink limitations. If a user executes

The SQL Function Layer (Standard Database Behavior)

For the SQL layer, supporting all intermediate precisions (0-9) is the correct and standard behavior for timestamp types. Restricting Flink's SQL precision strictly to multiples of 3 (3, 6, 9) would be an artificial limitation. Here is how other major databases handle this, which Flink SQL aligns with by supporting 0-9:

By supporting 0-9 here, we bring Flink completely in line with standard ANSI SQL database behavior! Let me know what you think.
twalthr left a comment
LGTM, thanks for tackling this time problem!
@flinkbot run azure
…eral serialization for TO_TIMESTAMP_LTZ
- Output type now follows input precision: TIMESTAMP_LTZ(3) for precision
0-3, TIMESTAMP_LTZ(precision) for 4-9. String-based variants infer
precision from the format pattern's 'S' count.
- Literal serialization uses string-based TO_TIMESTAMP_LTZ('timestamp',
'format', 'UTC') instead of numeric epoch to avoid long overflow beyond
year 2262.
- Add function version 2 for plan serialization compatibility.
- Update docs (sql_functions.yml, sql_functions_zh.yml, Python docstrings)
with precision-dependent output types, parameter descriptions, and examples.
- Update tests to match new output types and display formats.
What is the purpose of the change

This pull request extends the TO_TIMESTAMP_LTZ function to support all precision values from 0 to 9, where precision specifies the unit of the epoch value as 10^(-precision) seconds. Previously only 0 (seconds) and 3 (milliseconds) were supported, which limited users working with higher-precision epoch values such as microseconds (6) or nanoseconds (9).

The output type is now precision-aware: TIMESTAMP_LTZ(3) for precision 0-3, and TIMESTAMP_LTZ(precision) for precision 4-9. The output type for precision 0-3 is kept at TIMESTAMP_LTZ(3) to preserve backward compatibility: previously TO_TIMESTAMP_LTZ always returned TIMESTAMP_LTZ(3), and changing it for existing precision values (0 and 3) could break downstream schemas and queries that depend on the output type. For the string-based variant, the output precision is inferred from the format pattern's S count (e.g., SSSSSS → TIMESTAMP_LTZ(6)), with a minimum of 3 for the same backward compatibility reason.

Brief change log

- ToTimestampLtzTypeStrategy: Output type follows input precision: TIMESTAMP_LTZ(3) for precision 0-3, TIMESTAMP_LTZ(precision) for 4-9. For string variants, precision is inferred from the format pattern's trailing S count. Validates that precision is between 0 and 9.
- DateTimeUtils: Replaced per-precision switch cases with a generic epochToTimestampData method using Math.pow for factor computation. The double overload now preserves fractional parts by converting to nanoseconds before truncation. Added toEpochValue(Instant, precision) for the inverse conversion and precisionFromFormat(String) as shared logic for format-based precision inference.
- DateTimeUtils, ToTimestampLtzFunction: Added a parseTimestampData(String, String, int) overload that accepts precision. The string+format variant of TO_TIMESTAMP_LTZ now passes the format's precision to the parser, preserving sub-millisecond fractional digits. The existing 2-arg overload remains hardcoded to precision 3 for TO_TIMESTAMP compatibility (FLINK-14925).
- ToTimestampLtzInputTypeStrategy: Added compile-time range validation for numeric literal arguments, so out-of-range epoch values fail at validation time rather than silently returning null at runtime.
- ValueLiteralExpression: TIMESTAMP_LTZ literals now serialize using the string-based TO_TIMESTAMP_LTZ('timestamp', 'format', 'UTC') variant with a precision-matching format pattern. Previously it used numeric epoch millis, which would overflow long for dates after April 11, 2262 at nanosecond precision.
- BuiltInFunctionDefinitions: Bumped TO_TIMESTAMP_LTZ to version 2 for plan serialization compatibility.
- Updated sql_functions.yml, sql_functions_zh.yml, and Python expressions.py docstrings with precision-dependent output types, parameter descriptions, and examples.

Verifying this change
This change added tests and can be verified as follows:

- ExpressionTest: verifies that TIMESTAMP_LTZ literal serialization round-trips correctly at every precision, including edge cases beyond the long nanosecond overflow (year 2262, year 9999).
- ToTimestampLtzTypeStrategyTest: covers precision 0-9 (numeric and format-based), out-of-range precision, and nullable input combinations.
- TimeFunctionsITCase: covers string-based parsing at precision 4/6/8/9, format-based precision inference, and the case where the input has fewer fractional digits than the format pattern.
- LiteralExpressionsSerializationITCase: covers precisions 0, 3, 6, and 9.
- TemporalTypesTest: adds tests for precision 1 (deciseconds), precision 9 (nanoseconds), string+format variants at different precisions, DOUBLE/DECIMAL at higher precisions, and the fewer-input-digits-than-format case.
- select.q, select_batch.q for CliClientITCase: updated to match the new TIMESTAMP_LTZ(3) display format.

Does this pull request potentially affect one of the following parts:

- @Public(Evolving): no
- DateTimeUtils.epochToTimestampData is called per-record, but the change replaces a switch statement with Math.pow computations, which has negligible performance impact.

Documentation
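
For readers who want to see what "precision specifies the unit of the epoch value as 10^(-precision) seconds" means in practice, here is a rough Java sketch of such a conversion. It is not the implementation of DateTimeUtils.epochToTimestampData: the class and method names are hypothetical, and it swaps the Math.pow factor computation mentioned in the change log for an exact lookup table of powers of ten. It also shows where a nanosecond epoch stored in a long runs out.

```java
import java.time.Instant;

final class EpochPrecisionSketch {

    // Exact powers of ten, 10^0 through 10^9 (assumption: a lookup table
    // instead of Math.pow, to keep the arithmetic in long).
    private static final long[] POW10 = {
        1L, 10L, 100L, 1_000L, 10_000L, 100_000L,
        1_000_000L, 10_000_000L, 100_000_000L, 1_000_000_000L
    };

    // Hypothetical helper: interprets epoch as a count of 10^(-precision)
    // seconds since 1970-01-01T00:00:00Z.
    static Instant toInstant(long epoch, int precision) {
        if (precision < 0 || precision > 9) {
            throw new IllegalArgumentException(
                    "precision must be between 0 and 9, got " + precision);
        }
        long unitsPerSecond = POW10[precision];
        long nanosPerUnit = POW10[9 - precision];
        // floorDiv/floorMod keep the sub-second part non-negative for
        // instants before the epoch.
        long seconds = Math.floorDiv(epoch, unitsPerSecond);
        long fraction = Math.floorMod(epoch, unitsPerSecond);
        return Instant.ofEpochSecond(seconds, fraction * nanosPerUnit);
    }

    public static void main(String[] args) {
        System.out.println(toInstant(1_700_000_000L, 0));         // 2023-11-14T22:13:20Z
        System.out.println(toInstant(1_700_000_000_123_456L, 6)); // 2023-11-14T22:13:20.123456Z
        System.out.println(toInstant(-1L, 3));                    // 1969-12-31T23:59:59.999Z
        // Largest instant a long can express at nanosecond precision:
        System.out.println(toInstant(Long.MAX_VALUE, 9));         // 2262-04-11T23:47:16.854775807Z
    }
}
```

The last line illustrates the limit behind the switch to string-based literal serialization: a long holding nanoseconds cannot represent any instant after April 11, 2262.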