OTLP nanosecond timestamp overflow in webapp event repository #3292

@nicktrn

Several places in the webapp multiply epoch milliseconds by 1,000,000 before converting to BigInt. Since epoch-ms * 1e6 is ~1.7e18, the product exceeds Number.MAX_SAFE_INTEGER (~9e15) and is rounded by IEEE 754 double arithmetic before BigInt ever sees it (~256ns errors in ~0.2% of cases).

The correct pattern is BigInt(ms) * BigInt(1_000_000) (multiplication after BigInt conversion). This already exists in convertDateToNanoseconds() in the same file - the buggy functions just didn't use it.
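As a minimal sketch of the difference between the two patterns, the snippet below compares them on a fixed epoch-ms value. The helper names `msToNanosFloat` and `msToNanosExact` are hypothetical and exist only for this illustration; they are not functions in the webapp.

```typescript
// Hypothetical helper names, for illustration only.
const NANOS_PER_MS = BigInt(1_000_000);

// Buggy pattern: the product is computed as a double and rounded
// before BigInt ever sees it.
function msToNanosFloat(ms: number): bigint {
  return BigInt(ms * 1_000_000);
}

// Fixed pattern: convert to BigInt first, then multiply with exact
// integer arithmetic.
function msToNanosExact(ms: number): bigint {
  return BigInt(ms) * NANOS_PER_MS;
}

// A fixed epoch-ms value (instead of Date.now()) so the run is reproducible.
const sampleMs = 1_700_000_000_001;
const drift = msToNanosFloat(sampleMs) - msToNanosExact(sampleMs);

console.log(`float: ${msToNanosFloat(sampleMs)}`);
console.log(`exact: ${msToNanosExact(sampleMs)}`);
console.log(`drift: ${drift}ns`);
```

For this input the drift is nonzero, because the exact product is not representable as a double. Inputs whose product happens to land on a representable double show no drift, which is why the bug is easy to miss in casual testing.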

Affected locations:

  1. apps/webapp/app/v3/eventRepository/common.server.ts:24 - getNowInNanoseconds()

    // Bug: multiplication in float-land
    return BigInt(new Date().getTime() * 1_000_000);
    // Fix:
    return BigInt(new Date().getTime()) * BigInt(1_000_000);

  2. apps/webapp/app/v3/eventRepository/common.server.ts:38 - calculateDurationFromStart()

    // Bug:
    const duration = Number(BigInt($endtime.getTime() * 1_000_000) - startTime);
    // Fix:
    const duration = Number(BigInt($endtime.getTime()) * BigInt(1_000_000) - startTime);

  3. apps/webapp/app/v3/eventRepository/index.server.ts:217 - recordRunDebugLog()

  4. apps/webapp/app/v3/runEngineHandlers.server.ts:431 - retry event recording

Impact: Low. The errors are ±256 nanoseconds, which is unlikely to cause visible issues, but the fix is easy and rules out edge cases with span ordering.

Context: Found during compute workload manager PR review (#3114). The supervisor's msToNano() was fixed in that PR using a split approach that also preserves sub-ms precision from performance.now() arithmetic.
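One way a split approach could look is sketched below. This is an assumption about the general technique, not the supervisor's actual msToNano() from that PR: the name `msToNanoSplit` is hypothetical, and the input is assumed to be a millisecond value that may carry a fractional part (as arithmetic involving performance.now() can produce).

```typescript
// Hypothetical sketch; NOT the supervisor's actual msToNano().
function msToNanoSplit(ms: number): bigint {
  // Whole milliseconds: converted to BigInt first, then scaled exactly.
  const wholeMs = Math.trunc(ms);
  const wholeNs = BigInt(wholeMs) * BigInt(1_000_000);

  // Fractional milliseconds: small enough (|frac| < 1) that scaling
  // to nanoseconds in float arithmetic stays well within safe integers.
  const fracMs = ms - wholeMs;
  const fracNs = BigInt(Math.round(fracMs * 1_000_000));

  return wholeNs + fracNs;
}

// Usage: an epoch-ms value with a half-millisecond fraction.
console.log(msToNanoSplit(1_700_000_000_000.5));
```

Splitting keeps the large whole-millisecond part in exact integer arithmetic and confines floating-point math to the sub-millisecond remainder, where the values are small enough to scale safely.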

Linear: TRI-8269

Labels: bug