Skip to content

WIP: start issue #240 implementation for skipped downstream node status#249

Open
Pyrotechnic679 wants to merge 2 commits into
microsoft:mainfrom
Pyrotechnic679:issue-240-skipped-status-wip
Open

WIP: start issue #240 implementation for skipped downstream node status#249
Pyrotechnic679 wants to merge 2 commits into
microsoft:mainfrom
Pyrotechnic679:issue-240-skipped-status-wip

Conversation

@Pyrotechnic679

Copy link
Copy Markdown

Summary

This PR is an initial WIP for issue #240.

It starts implementation work to make downstream, unexecuted nodes observable after workflow failure (instead of remaining ambiguous as pending).

Current state

  • Initial schema/runtime/test/doc changes are included.
  • Work is not finished yet.
  • Full validation/testing has not been completed yet.

Intent

Opening early to signal active work and allow maintainers to review direction before finalizing implementation details.

@Pyrotechnic679

Copy link
Copy Markdown
Author

@microsoft-github-policy-service agree

@Pyrotechnic679 Pyrotechnic679 force-pushed the issue-240-skipped-status-wip branch from a33d705 to 8641868 Compare June 23, 2026 01:26
@Pyrotechnic679 Pyrotechnic679 marked this pull request as ready for review June 23, 2026 01:28
@pinodeca pinodeca linked an issue Jun 23, 2026 that may be closed by this pull request
…tions

- Widen df.nodes.nodes_status_chk in the 0.2.3->0.2.4 upgrade script to allow
  'skipped', matching the fresh-install DDL in src/lib.rs so upgraded installs
  also get the new terminal status (without it the runtime no-ops forever).
- Use canonical df.await_instance in test 49 (consistent with other e2e tests),
  fix branch-filter indentation, and add trailing newline.

Copy link
Copy Markdown
Contributor

Reviewed and pushed a follow-up commit (8d104c9) addressing the main blocker plus some cleanup, and ran the local validation matrix.

What changed

  • Upgrade-script DDL (blocking fix): the WIP added skipped to nodes_status_chk in the fresh-install DDL (src/lib.rs) but not to the 0.2.3 → 0.2.4 upgrade script. Without it, upgraded installs keep the old constraint, so the runtime's schema probe finds skipped unsupported and silently no-ops forever — the feature would only work on fresh installs. Added a DROP/ADD CONSTRAINT nodes_status_chk ... 'skipped' ... NOT VALID block to sql/pg_durable--0.2.3--0.2.4.sql that matches the install DDL exactly (pure widening, no data migration, Scenario B2 safe).
  • E2E test cleanup (tests/e2e/sql/49_...): switched to the canonical df.await_instance (every other e2e test uses it; df.wait_for_completion is the deprecated alias), fixed branch-filter indentation, and added the missing trailing newline.

Validation run locally

  • cargo fmt -p pg_durable -- --check — clean
  • cargo clippy --features pg17 — clean
  • ./scripts/test-unit.sh — 183 passed, 0 failed
  • ./scripts/test-e2e-local.sh 49_failed_downstream_nodes_skipped — PASS (both the THEN-chain and IF-condition scenarios)

Still open / discussion

The orchestration change also adds a scheduled activity on the failure path; note this alters recorded history for any orchestration that fails across a binary upgrade (narrow window) — flagging for awareness.

pinodeca added a commit that referenced this pull request Jun 23, 2026
The proposal previously framed the #240 skipped-node work as part of
'this PR'. That work landed earlier in PR #249; this PR (#263) defines
the consolidated state model and adds the remaining implementation
(cancelled, status_reason, race-loser reconciliation, loop reset).
pinodeca added a commit that referenced this pull request Jun 23, 2026
Adds docs/node-state-model.md, a design proposal consolidating the
node-status work for issues #240 (skipped downstream nodes) and #171
(race-loser nodes left running/pending). Defines instance and node
lifecycle states, legal transitions, a coarse status set plus nullable
status_reason, loop iteration-scoping/reset semantics, and prior-art
alignment with Airflow/Temporal/BPMN/Step Functions.

The #240 skipped-node work shipped earlier in PR #249; this PR (#263)
defines the consolidated model and will add the remaining implementation
(cancelled, status_reason, race-loser reconciliation, loop reset) after
review.

Design only; implementation to follow on this PR.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

No "skipped" indication for downstream steps after a step fails

2 participants