Skip to content

test: fix flaky test-watch-mode-inspect timeout#63361

Open
mcollina wants to merge 1 commit into
nodejs:mainfrom
mcollina:fix/flaky-test-watch-mode-inspect
Open

test: fix flaky test-watch-mode-inspect timeout#63361
mcollina wants to merge 1 commit into
nodejs:mainfrom
mcollina:fix/flaky-test-watch-mode-inspect

Conversation

@mcollina
Copy link
Copy Markdown
Member

Description

This PR fixes a persistent flaky timeout in test-watch-mode-inspect (marked FLAKY in , issue #44898).

Root Cause

A race condition between child-process restart and inspector-session connection:

  1. The test spawns a parent process with --inspect=0 and --watch
  2. It connects an inspector session and retrieves the child pid
  3. It touches the watched file → the child restarts
  4. It tries to connect a second inspector session immediately after resetPort()
  5. The WebSocket upgrade via /json/list + upgrade races against the restart:
    • /json/list may return an empty array (no targets yet) → response[0] throws → promise hangs
    • Or it gets the old (dying) target → WebSocket connects to a session that immediately ends → hangs
  6. Test runner kills after 120s → TIMEOUT

Fix

Replace the interval-based restart (write every 500ms, paused by a gettingDebuggedPid flag) with a single write + readiness gate:

  • restartAndWaitForReady(file, instance) writes the file once and returns a promise that resolves when the restarted child prints safe to debug now on stdout
  • The second getDebuggedPid call only fires after that promise resolves — guaranteeing the new child and its inspector session are fully ready

Removes the now-unused gettingDebuggedPid flag and the setTimeout backstop.

Validation

Both tests pass locally with the fix. The ordering is now deterministic:

Closes: #44898

@nodejs-github-bot nodejs-github-bot added needs-ci PRs that need a full CI run. test Issues and PRs related to the tests. labels May 16, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 90.05%. Comparing base (5750802) to head (49b6ea5).
⚠️ Report is 13 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #63361      +/-   ##
==========================================
- Coverage   90.06%   90.05%   -0.01%     
==========================================
  Files         714      714              
  Lines      225367   225522     +155     
  Branches    42593    42645      +52     
==========================================
+ Hits       202966   203103     +137     
- Misses      14182    14197      +15     
- Partials     8219     8222       +3     

see 38 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

This test randomly times out (~120s) on CI due to a race condition
between child-process restart (triggered by touching the watched file)
and the second inspector-session connection.

The old code used an interval-based restart (write every 500ms) and
a 'gettingDebuggedPid' flag to pause writes during a session. This
still left a race window where getDebuggedPid() would attempt to
connect the inspector via HTTP GET /json/list + WebSocket upgrade
either before the new child was ready (empty target list) or after
the old session was being destroyed, causing the promise to hang.

Fix: Replace the interval with a single write that triggers exactly
one restart, then wait for the restarted child's 'safe to debug now'
stdout line before connecting the second inspector session. This
eliminates the race by ensuring the new child process and its
inspector session are fully ready before any connection attempt.

Removes the now-unused gettingDebuggedPid flag and the pending
setTimeout delay that was needed as a backstop for the interval.

Fixes: nodejs#44898
Signed-off-by: Matteo Collina <hello@matteocollina.com>
PR-URL: nodejs#63361
@mcollina mcollina force-pushed the fix/flaky-test-watch-mode-inspect branch from cad4400 to 49b6ea5 Compare May 16, 2026 10:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-ci PRs that need a full CI run. test Issues and PRs related to the tests.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Investigate flaky test-watch-mode (no. 3)

2 participants