[SPARK-57490][SQL] Support CAST between nanosecond timestamp types of different precision #56544
Open
MaxGekk wants to merge 3 commits into
… different precision

### What changes were proposed in this pull request?

This PR adds support for `CAST` between same-family nanosecond-precision timestamp types of different precision:

- `TIMESTAMP_NTZ(p1)` <-> `TIMESTAMP_NTZ(p2)`
- `TIMESTAMP_LTZ(q1)` <-> `TIMESTAMP_LTZ(q2)`

where the precisions are in [7, 9].

A cross-precision cast only re-floors the sub-microsecond part of the value; `epochMicros` and the time zone are untouched. Widening (target precision >= source) is lossless; narrowing floors the sub-microsecond digits toward the past, consistent with the existing nanos -> micros narrowing rule.

The store-assignment / up-cast contract follows the established micros <-> nanos precedent: widening is allowed as an ANSI store assignment but is not an up-cast, narrowing is explicit-CAST only, and equal precision is the identity cast.

### Why are the changes needed?

Casting between two nanosecond timestamps of the same family but different precision was previously absent from `Cast.canCast` / `Cast.canAnsiCast` and failed type checking, even though casts to/from the microsecond counterparts, `DATE` and `STRING` were already supported.

### Does this PR introduce _any_ user-facing change?

Yes, but only behind the preview flag `spark.sql.timestampNanosTypes.enabled`. With the flag enabled, `CAST(... AS TIMESTAMP_NTZ(p))` / `CAST(... AS TIMESTAMP_LTZ(p))` from another nanosecond timestamp of the same family is now supported.

### How was this patch tested?

Added unit tests in `CastSuiteBase` covering all NTZ/LTZ precision pairs (widening, narrowing with floor semantics, pre-epoch values, nulls) and the store-assignment / up-cast contract.

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor
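The narrowing rule described above can be sketched in a few lines of Scala. This is only an illustration under stated assumptions, not the code in this PR: `NanosValSketch` and `NanosPrecisionSketch.truncateToPrecision` are hypothetical names, and the actual signature of `truncateTimestampNanosToPrecision` is not shown in this conversation. The sketch assumes the value is a pair `(epochMicros, nanosWithinMicro)` with `nanosWithinMicro` in `[0, 999]`, so that precisions 7, 8 and 9 correspond to steps of 100, 10 and 1 nanoseconds.

```scala
// Hypothetical stand-in for the physical value described in the PR;
// this is NOT Spark's own class.
final case class NanosValSketch(epochMicros: Long, nanosWithinMicro: Int)

object NanosPrecisionSketch {
  // Distance in nanoseconds between representable values at each precision.
  private def stepFor(precision: Int): Int = precision match {
    case 7 => 100
    case 8 => 10
    case 9 => 1
    case other =>
      throw new IllegalArgumentException(s"precision must be in [7, 9], got $other")
  }

  // Floors the sub-microsecond part to the target precision step.
  // epochMicros is never modified. Because nanosWithinMicro is a
  // non-negative offset, flooring it can only move the value toward the past.
  def truncateToPrecision(v: NanosValSketch, precision: Int): NanosValSketch = {
    require(v.nanosWithinMicro >= 0 && v.nanosWithinMicro <= 999,
      s"nanosWithinMicro must be in [0, 999], got ${v.nanosWithinMicro}")
    val step = stepFor(precision)
    v.copy(nanosWithinMicro = v.nanosWithinMicro - v.nanosWithinMicro % step)
  }
}
```

Under these assumptions, narrowing `NanosValSketch(-1L, 987)` to precision 7 yields `NanosValSketch(-1L, 900)`, and applying a wider precision to a value that already sits on a coarser step leaves it unchanged, which matches the "widening is lossless" claim.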
Format the newly added truncateTimestampNanosToPrecision method signature in SparkDateTimeUtils to satisfy scalafmt in the sql/api module.
Add SQLQuery test coverage in cast.sql for TIMESTAMP_NTZ(p1) <-> TIMESTAMP_NTZ(p2) and TIMESTAMP_LTZ(q1) <-> TIMESTAMP_LTZ(q2) using typed literals and :: syntax, while avoiding duplication with CastSuite value-semantics assertions.
cb53c9c to 30dee82
### What changes were proposed in this pull request?

This PR adds support for `CAST` between same-family nanosecond-precision timestamp types of different precision:

- `TIMESTAMP_NTZ(p1)` <-> `TIMESTAMP_NTZ(p2)`
- `TIMESTAMP_LTZ(q1)` <-> `TIMESTAMP_LTZ(q2)`

where the precisions are in `[7, 9]`.

Both `TimestampNTZNanosType` and `TimestampLTZNanosType` share the same physical value `TimestampNanosVal(epochMicros, nanosWithinMicro)` (with `nanosWithinMicro` in `[0, 999]`). A cross-precision cast only re-floors the sub-microsecond part of the value; `epochMicros` and the time zone are untouched:

- Widening (target precision >= source) is lossless.
- Narrowing floors `nanosWithinMicro` toward the past to the target precision step (drops the lowest `9 - precision` sub-microsecond digits), consistent with the existing nanos -> micros narrowing rule.

The store-assignment / up-cast contract follows the established micros <-> nanos precedent (SPARK-57293):

- Widening is allowed as an ANSI store assignment but is not an up-cast.
- Narrowing is explicit-`CAST` only.
- Equal precision is the identity cast.

Concretely, this adds the `canCast` / `canAnsiCast` / `canANSIStoreAssign` rules, the interpreted and codegen eval paths in `Cast`, and a `DateTimeUtils.truncateTimestampNanosToPrecision` helper reused by both paths.

In addition, this PR adds SQLQuery coverage in `cast.sql` for the SQL parser / typed-literal / `::` cast surface of cross-precision nanos casts, while avoiding value-semantics duplication with `CastSuite*`.

### Why are the changes needed?

The nanosecond-capable timestamp types `TIMESTAMP_NTZ(p)` and `TIMESTAMP_LTZ(p)` (p in `[7, 9]`) are gated behind `spark.sql.timestampNanosTypes.enabled`. Casts between these types and their microsecond counterparts (`TIMESTAMP_NTZ` / `TIMESTAMP`), as well as to/from `DATE` and `STRING`, are already supported. However, casting between two nanosecond timestamps of the same family but different precision was not allowed: such a cast was absent from `Cast.canCast` / `Cast.canAnsiCast` and failed type checking.

### Does this PR introduce _any_ user-facing change?

Yes, but only behind the preview flag `spark.sql.timestampNanosTypes.enabled`. With the flag enabled, `CAST(... AS TIMESTAMP_NTZ(p))` / `CAST(... AS TIMESTAMP_LTZ(p))` from another nanosecond timestamp of the same family is now supported instead of failing type checking.

### How was this patch tested?

Added unit tests in `CastSuiteBase` covering:

- all NTZ/LTZ precision pairs (widening, narrowing with floor semantics, pre-epoch values, nulls);
- the store-assignment / up-cast contract.

Added SQLQuery tests in `cast.sql` (and regenerated goldens) for parser + typed-literal + `::` cross-precision cast paths.

Ran:

- `build/sbt 'catalyst/testOnly *CastSuite *CastWithAnsiOnSuite *CastWithAnsiOffSuite'` (355 tests passed)
- `build/sbt "sql/testOnly org.apache.spark.sql.SQLQueryTestSuite -- -z cast.sql"` (6 tests passed)
- `build/sbt "sql-api/scalastyle" "catalyst/scalastyle" "catalyst/Test/scalastyle"`

### Was this patch authored or co-authored using generative AI tooling?

Generated-by: Cursor
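
For readers who want the store-assignment / up-cast contract from the description in one place, it can be restated as three predicates over the source and target precision. This is a sketch for illustration only: `NanosCastContractSketch` and its methods are hypothetical and do not mirror the actual `canCast` / `canANSIStoreAssign` code in `Cast`. It assumes both types belong to the same family (both NTZ or both LTZ) and that the identity cast is accepted in every context.

```scala
// Hypothetical summary of the contract; NOT the rules as written in Cast.
object NanosCastContractSketch {
  private def check(p: Int): Unit =
    require(p >= 7 && p <= 9, s"precision must be in [7, 9], got $p")

  // Explicit CAST: every pair of precisions within the same family.
  def canExplicitCast(from: Int, to: Int): Boolean = {
    check(from); check(to)
    true
  }

  // ANSI store assignment: identity or widening, never narrowing.
  def canAnsiStoreAssign(from: Int, to: Int): Boolean = {
    check(from); check(to)
    to >= from
  }

  // Up-cast: widening is excluded, so only the identity cast qualifies
  // (assumption: an equal-precision cast counts as a valid up-cast).
  def canUpCast(from: Int, to: Int): Boolean = {
    check(from); check(to)
    from == to
  }
}
```

Read this way, a `TIMESTAMP_NTZ(7)` -> `TIMESTAMP_NTZ(9)` cast passes the store-assignment check but not the up-cast check, while `TIMESTAMP_NTZ(9)` -> `TIMESTAMP_NTZ(7)` is reachable only through an explicit `CAST`.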