feat: decimal support for gcd and lcm by theirix · Pull Request #22655 · apache/datafusion

theirix · 2026-05-30T20:15:31Z

Which issue does this PR close?

Closes feat: Decimal support for gcd and lcm #19057.

Rationale for this change

A binary gcd and lcm UDF in the datafusion-functions crate supports only Int64, but not Decimals. Adding missing support for decimals.

What changes are included in this PR?

Updated gcd and lcm functions to add decimal support. The integer path is more performant and stays intact. For decimals, the Euclidean algorithm is used for GCD
Added coercion rules: casting to decimals if any argument is decimal; otherwise, stay with ints as before
Common functionality extracted to common.rs to avoid inter-UDF dependency
In order to use calculate_binary_math for Decimals, updated it to accept a target type instead of raw Decimal128Type::DATA_TYPE - it causes scaling issues for these UDFs, see Improve scale support for binary decimal operations #19621

A bit more on (4). The driving force is this failing example:

query R
select gcd(2::decimal(38, 0), 3::decimal(38, 0));
----
1

Previously in #19874, I suggested a more complicated solution to extend calculate_binary_math. However, it only affected gcd/lcm and could be considered overkill. This PR extends these functions with an extra parameter cast_target for calculate_binary_decimal_math to perform a proper cast to the actual type used, rather than to the default Decimal128Type::DATA_TYPE - it is much lighter.

Are these changes tested?

Added unit test for UDFs with decimals for array and scalar paths
Added unit tests for the gcd/lcm math itself
Added new SLT tests for decimals

Are there any user-facing changes?

No

github-actions · 2026-05-30T20:23:22Z

Thank you for opening this pull request!

Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch).

Details

     Cloning apache/main
    Building datafusion-functions v54.0.0 (current)
       Built [  30.573s] (current)
     Parsing datafusion-functions v54.0.0 (current)
      Parsed [   0.084s] (current)
    Building datafusion-functions v54.0.0 (baseline)
       Built [  29.103s] (baseline)
     Parsing datafusion-functions v54.0.0 (baseline)
      Parsed [   0.086s] (baseline)
    Checking datafusion-functions v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   0.576s] 223 checks: 221 pass, 2 fail, 0 warn, 30 skip

--- failure function_marked_deprecated: function #[deprecated] added ---

Description:
A function is now #[deprecated]. Downstream crates will get a compiler warning when using this function.
        ref: https://doc.rust-lang.org/reference/attributes/diagnostics.html#the-deprecated-attribute
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.48.0/src/lints/function_marked_deprecated.ron

Failed in:
  function datafusion_functions::utils::calculate_binary_decimal_math in /home/runner/work/datafusion/datafusion/datafusion/functions/src/utils.rs:157

--- failure function_missing: pub fn removed or renamed ---

Description:
A publicly-visible function cannot be imported by its prior path. A `pub use` may have been removed, or the function itself may have been renamed or removed entirely.
        ref: https://doc.rust-lang.org/cargo/reference/semver.html#item-remove
       impl: https://github.com/obi1kenobi/cargo-semver-checks/tree/v0.48.0/src/lints/function_missing.ron

Failed in:
  function datafusion_functions::math::gcd::compute_gcd, previously in file /home/runner/work/datafusion/datafusion/target/semver-checks/git-apache_main/c37ba366f681d4072190d15e2308e91220f34258/datafusion/functions/src/math/gcd.rs:175

     Summary semver requires new major version: 1 major and 1 minor checks failed
    Finished [  61.906s] datafusion-functions
    Building datafusion-sqllogictest v54.0.0 (current)
       Built [ 179.887s] (current)
     Parsing datafusion-sqllogictest v54.0.0 (current)
      Parsed [   0.027s] (current)
    Building datafusion-sqllogictest v54.0.0 (baseline)
       Built [ 179.980s] (baseline)
     Parsing datafusion-sqllogictest v54.0.0 (baseline)
      Parsed [   0.025s] (baseline)
    Checking datafusion-sqllogictest v54.0.0 -> v54.0.0 (no change; assume patch)
     Checked [   0.117s] 223 checks: 223 pass, 30 skip
     Summary no semver update required
    Finished [ 363.498s] datafusion-sqllogictest

theirix · 2026-05-31T13:22:59Z

@neilconway , @Jefffrey I'd appreciate hearing your thoughts on this PR

Jefffrey · 2026-06-10T23:54:17Z

 query error DataFusion error: Arrow error: Compute error: Signed integer overflow in GCD\(0, \-9223372036854775808\)
 select gcd(0, -9223372036854775808);

+# gcd decimal


could we add test cases for:

negative number inputs

decimals with different scale/precision inputs

decimal point 1.23 inputs

I think we're still missing cases for decimals like 1.23 (non whole numbers)

A test for non-integers is already here https://github.com/apache/datafusion/pull/22655/changes#diff-a3dd4ed33ba3cb7e6f13d3d50f84f93d654a13ef250dcb53e75373a8567b4847R712 . Am I missing something special about 1.23 ?

I wanted to see a case where we calculate gcd on non-integers; the test case you linked still operates on whole numbers since it cases to decimals with 0 scale

e.g. from postgres

postgres=# select gcd(15.3, 2.9); gcd ----- 0.1 (1 row)

Ah, that's true. Introduced scale, now it matches the Postgres test case

Jefffrey · 2026-06-10T23:55:52Z

+
+/// Computes a binary math function for input arrays using a specified function
+/// and applies rescaling to given precision and scale.
+/// Deprecated, use [`calculate_binary_decimal_math_cast`] instead.


instead of just stating its deprecated might be better to explicitly mark it as so #[deprecated]

Since the original function calculate_binary_math_cast is public, we cannot track its usage outside, and removal is only possible with deprecation. It shouldn't have been declared as public in the first place - so the new function, too.

What I can see as a plan:

Deprecate it explicitly with the macro, as you suggest

Port some usages in the datafusion repo in the next PR

Mark the new function as pub(crate) as it is intended to be used locally only

A second thought - marking it as deprecated fails a clippy check. Rolled back to an informational comment until the PR removes its usage in (2)

I think we should still deprecate it, and for the clippy check just throw a #[expect(deprecated)] to suppress them (and hopefully fix them later)

Makes sense. Since there are only a few instances in a single file, I introduced the change now - anyway, it will be changed. Sorry for the PR creeping into more files.

Jefffrey · 2026-06-16T06:18:20Z

thanks @theirix

do we still want #19874 after this?

theirix · 2026-06-16T06:26:09Z

Thank you! This PR is more than enough.

Support decimals for gcd and lcm

ceb7593

github-actions Bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels May 30, 2026

github-actions Bot added the auto detected api change Auto detected API change label May 30, 2026

theirix marked this pull request as ready for review May 30, 2026 20:55

Jefffrey reviewed Jun 11, 2026

View reviewed changes

theirix added 11 commits June 11, 2026 13:04

Use plan_err

e7fa598

Remove duplicated unit tests

1c56990

Replace gcd unit tests with SLT

e3cdd83

Tests for decimal and non-integer inputs

ec41c47

Change deprecations

4b73901

Avoid hard-deprecation until internal usage is migrated

e998de8

Made calculate_binary_math_cast private

a65d1b4

Merge branch 'main' into gcd-lcm-decimal

fbb0b0e

doc: fixup comment placement

323721f

Deprecate calculate_binary_decimal_math and port usages in round UDF

a26eb47

Update non-whole number case with scales to match Postgres

07401f7

Jefffrey approved these changes Jun 14, 2026

View reviewed changes

Jefffrey added this pull request to the merge queue Jun 16, 2026

theirix mentioned this pull request Jun 16, 2026

Support different scales for decimal binary math functions #19874

Closed

theirix mentioned this pull request Jun 16, 2026

Improve scale support for binary decimal operations #19621

Closed

Merged via the queue into apache:main with commit a1e88e2 Jun 16, 2026
35 checks passed

Uh oh!

Conversation

theirix commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Uh oh!

github-actions Bot commented May 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

theirix commented May 31, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Jefffrey commented Jun 16, 2026

Uh oh!

theirix commented Jun 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

theirix commented May 30, 2026 •

edited

Loading

github-actions Bot commented May 30, 2026 •

edited

Loading