Skip to content

Weight Window Edge Case Fixes#3968

Open
jtramm wants to merge 18 commits into
openmc-dev:developfrom
jtramm:ww_edge_cases
Open

Weight Window Edge Case Fixes#3968
jtramm wants to merge 18 commits into
openmc-dev:developfrom
jtramm:ww_edge_cases

Conversation

@jtramm

@jtramm jtramm commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Fix two weight window application edge cases

Note: This PR requires commits from #3965, so definitely wait to review this PR until #3965 is merged first!

Summary

Two small fixes to how weight windows are applied during transport, both found while auditing apply_weight_window() during the MGXS bootstrapping work (#3965):

  1. Zero lower bounds turn the weight window game off instead of demanding unbounded splitting.
  2. The weight cutoff is normalized with the window bounds instead of being applied in absolute units inside an otherwise frame-normalized game.

Neither change affects any existing test results.

Change 1 — zero-bound cells play no game

A weight window lower bound of zero means no weight window information exists for that cell. OpenMC's weight window generators already encode this convention by marking cells with no tally information as -1 (which fails WeightWindow::is_valid()), and MCNP uses zero to disable the game per cell — from the MCNP 6.2 manual (WWN card, note 2): "If wij=0, the weight-window game is turned off in cell j for energy or time bin i and the weight cutoff game is turned on with a 1-for-2 roulette limit." Files imported with WeightWindowsList.from_wwinp pass those zeros through unchanged.

Previously a zero lower bound passed is_valid() and was treated as a real window: any particle entering the cell sat above the zero upper bound and split by max_split at every weight window checkpoint (weight/0 = inf), multiplying the population until terminated by absorption/leakage, the weight cutoff, or max_history_splits (default 10^7). Besides wwinp imports, zero-bound cells are theoretically (if extremely unlikely) reachable from FW-CADIS generation when a cell has a nonzero forward tally but a zero adjoint flux in some energy group (the inverse step maps those to exactly 0.0 rather than -1). A separate pathology affects particles born under a zero window: wgt_ww_born is set to the window midpoint of zero, producing an infinite normalization factor and (with the cutoff scaled, see Change 2) killing such particles at birth. Both misbehaviors are cured by playing no game.

The fix is to treat any non-positive lower bound as "no information": is_valid() now requires a strictly positive lower bound, making the zero case consistent with the generator's -1 convention and with MCNP's semantics.

Change 2 — weight cutoff normalized with the window

In PR #3459 - I added the ability for apply_weight_window() to rescale the window into the particle's birth-weight frame — lower, upper, and survival_weight are all multiplied by wgt_born / wgt_ww_born (and by ww_factor where applicable) — but weight_cutoff was compared in absolute units. The cutoff kill is the one biased operation in the weight window game (weight is deleted without roulette), justified by the discarded weight being negligible — and "negligible" is a statement relative to the window bounds, not an absolute scale. For particles born with weights far from the window normalization (e.g., strongly biased sources or unusual window normalizations), an absolute cutoff can sit inside the normalized windows and silently cull in-window particles. WeightWindow::scale() now scales the cutoff with the bounds, keeping it at the same relative depth below the window in every frame.

Testing

The fix is pinned by two invariant tests built on the existing weightwindows test problem, each expressing one consequence of "a non-positive lower bound means no information." Both are self-comparing (two runs of the same model with identical seeds must agree), so they need no reference files and no statistical tolerances, and each was verified to fail without Change 1. They cover the two distinct code paths a zero bound corrupts:

  • test_zero_bound_windows_play_no_game: all-zero windows must transport identically to weight windows disabled. This covers particles born under a zero window, where the birth normalization (wgt_ww_born = window midpoint = 0) previously produced an infinite scale factor. Without the fix the tallies differ; with it the runs match. ~1.5 s.
  • test_zero_and_negative_bounds_equivalent: windows with a zero-bound region must transport identically to the same windows with a -1 region, with particles born under valid windows encountering the region in flight. This covers the splitting pathology: without the fix this test does not complete within ten minutes (the zero region splits every particle by max_split at every checkpoint, even at the test model's max_history_splits of 200); with it the runs match. ~1.9 s.

The full weight window test family and the MGXS bootstrap regression test pass with no reference changes: the existing test window files use -1 (not 0) for no-information cells, and at the default cutoff of 1e-38 no particle weight approaches the cutoff in either frame.

Change 2 has no observable effect in any existing test by construction; it is a consistency fix whose behavior difference only appears for cutoffs or normalization frames far from the defaults. No test is added for it — constructing one would require a contrived frame, and the runtime consequence of the old behavior (a biased kill) is exactly the kind of silent statistical effect a fixed-seed test cannot meaningfully pin.

Checklist

  • I have performed a self-review of my own code
  • I have run clang-format (version 18) on any C++ source files (if applicable)
  • I have followed the style guidelines for Python source files (if applicable)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

John Tramm and others added 18 commits June 9, 2026 17:08
…bugs

Add a `weight_windows_file` argument to Model.convert_to_multigroup() and fix
two random ray solver bugs found while developing it.

Bootstrapping machinery:
- The "material_wise" MGXS generation method can now load and apply weight
  windows during its continuous energy solve via the new `weight_windows_file`
  argument. Materials far behind shielding -- which an analog solve cannot
  reach -- are then tallied and receive nonzero cross sections. The argument is
  only valid for "material_wise"; a warning is issued and it is ignored for the
  "stochastic_slab" and "infinite_medium" methods.

Random ray miss-rate / intersection reporting fix:
- RandomRaySimulation::simulate() now resets avg_miss_rate_ and
  total_geometric_intersections_ at the start of each solve. Previously a second
  solve on the same object (e.g. the adjoint solve during FW-CADIS weight window
  generation) accumulated onto the first solve's running totals, inflating the
  reported miss rate and intersection count. Reporting only; no effect on results.

Random ray weight-window application fix:
- apply_weight_windows() is now a no-op in random ray mode. Random ray rays
  carry unit weight and are not Monte Carlo particles, so subjecting them to
  weight-window roulette/splitting is meaningless and corrupted the transport
  sweep. This occurred when surface checkpoints were enabled and an on-the-fly
  weight window generator auto-enabled weight_windows_on, killing rays during
  the adjoint sweep.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add openmc.examples.sphere_with_shielded_pocket: a continuous energy
deep-shielding model with a steel pocket behind ~1 m of concrete, reached by
analog histories with probability ~4e-5. The concrete sphere is offset from
the source so the deep shielding (and thus the wide weight window range)
exists only in the small solid angle toward the pocket, keeping weight window
splitting cheap and the model fast enough for regression testing (~8 s serial
for the full three-stage bootstrap workflow). Intended for weight window and
variance reduction feature tests.

The new regression test runs the full bootstrap workflow on this model:
stochastic_slab MGXS -> random ray FW-CADIS weight windows -> material_wise
MGXS with weight_windows_file. The analog material_wise solve must produce
zero cross sections for the steel (negative control); the bootstrapped solve
must cover all materials. Verdicts verified robust over 10 Monte Carlo seeds
and 3 weight-window-generation seeds.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test as a TolerantPyAPITestHarness with
reference files so that subtle changes to the random ray or weight window
machinery are detected, not just binary coverage flips. The compared results
contain the FW-CADIS weight window bounds and the analog material-wise MGXS
data, both measured reproducible across runs and thread counts (bounds to
~5e-15; analog MGXS bitwise). The bootstrapped MGXS values are deliberately
not value-compared: a Monte Carlo solve with weight window splitting is not
run-to-run reproducible when multithreaded (same seed and thread count gave
few-percent differences, with tail energy groups flipping between zero and
nonzero), so only its binary material coverage -- the feature guarantee -- is
asserted. The coverage asserts are hard asserts inside the harness run so a
broken feature cannot be baked into references by an --update run.

Reference files generated with OPENMC_ENABLE_STRICT_FP=on (RelWithDebInfo) at
OMP_NUM_THREADS=2, matching CI; the test passes unchanged at 1, 2, and 4
threads and runs in ~8 s serial.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Correct and strengthen the MGXS bootstrap regression test. Measurement showed
the weight-window-split Monte Carlo solve IS reproducible across runs and
thread counts given a bit-identical weight window file (<= 4e-16, i.e. tally
summation ulps) -- the secondary particle sorting and PRNG stream control work
as designed. The instability previously attributed to it actually originates
upstream: the multithreaded random ray solve carries last-ulp lock-order
jitter in the window bounds (~6e-16 run-to-run at 2 threads, bitwise zero
serially), and weight window split/roulette decisions amplify even a one-ulp
bound perturbation into order-unity tally changes (measured 78%), because
particle weights are set from the bound values themselves and comparisons can
sit exactly on decision boundaries.

The window-producing stages (stochastic_slab MGXS and the random ray FW-CADIS
solve) are therefore pinned to a single thread, making the entire workflow
bit-reproducible at any ambient thread count. This allows the bootstrapped
MGXS library -- and with it the weight window application machinery -- to be
reference-compared as well: the results digest now contains the slab, analog,
and bootstrapped MGXS libraries plus the weight window bounds.

Reference files regenerated against the official NNDC HDF5 data library
(matching the regression suite requirement) with OPENMC_ENABLE_STRICT_FP=on at
OMP_NUM_THREADS=2; the test passes unchanged at 1, 2, and 4 threads (~10 s
serial). Analog pocket-reach probability re-measured with NNDC data: 4.6e-5,
i.e. <1% chance per seed of a false analog cover.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Weight window roulette survivors are assigned weights derived from the window
bounds themselves (survival_ratio x lower bound), and subsequent integer
splitting can land a particle's weight exactly back on a bound value -- e.g.
(3 x lower)/3 compared against lower. The split/roulette branch taken is then
decided by the last ulp of the bound. Since weight window data carries
ulp-level noise (for example from non-associative parallel reductions in the
random ray or Monte Carlo solves that generated it), transport results were
chaotically sensitive to bit-level differences in the weight window file: a
one-ulp perturbation of the bounds was measured to change deep-tally results
by 78%, because a single flipped decision changes a secondary generation
count, which shifts the global track counter used to seed subsequent
secondaries and decorrelates the remainder of the simulation.

Comparing against the window with a small relative dead band (1e-9) removes
the sensitivity structurally: the thresholds no longer coincide with the
small-rational lattice of weights the algorithm produces, and a bound
perturbation moves a pinned weight and its threshold together. The same
one-ulp window perturbation now changes results by <1e-15. The band is
statistically negligible (1e-9 of a window spanning a factor of several) and
weight window games are unbiased regardless of threshold placement.

Two regression test references regenerated for trajectory changes at former
exact ties: weightwindows[False-local] (single value shift of ~2e-5) and the
bootstrap test's MGXS digest.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Move the weight window comparison tolerance to constants.h as
WEIGHT_WINDOW_REL_TOL alongside its rationale, rather than burying it as a
local in apply_weight_window().

Stabilize the split count as well: ceil(weight/upper) has tie points at every
integer ratio, and the weight window arithmetic itself manufactures exact
integer ratios through a second pathway -- a roulette survivor assigned
weight * max_split (e.g. 10x a weight that sits on the bound lattice), later
split in a cell whose upper bound is an exact multiple of the same lower bound
(10/5 = 2). Computing ceil((1 - tol) * ratio), clamped to at least 2, makes
ratios within the dead band of an integer round down consistently.

With both protections, a weight window file perturbed at the ulp level now
changes bootstrap-style deep-tally results by <1e-15 in both observed tie
pathways (previously order unity). The bootstrap regression test no longer
pins its window-generation stages to a single thread: all stages run at the
ambient thread count and the test passes unchanged at 1, 2, and 4 threads
against one set of reference files, so it now also guards the robustness of
weight window transport against the ulp-level reduction noise inherent to
multithreaded solves. Full weight window test suite passes without further
reference changes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Add the versionadded directive for the new weight_windows_file argument,
tighten the WEIGHT_WINDOW_REL_TOL comment, and apply clang-format to the
touched region of weight_windows.cpp.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test to follow the random_ray_auto_convert
pattern: generate weight windows (stochastic_slab MGXS + random ray FW-CADIS),
bootstrap the material-wise library via weight_windows_file, then run a random
ray solve with the bootstrapped library and compare a per-region, per-group
flux tally through the standard statepoint digest. The pocket flux depends
directly on the bootstrapped steel cross sections, and every upstream stage
feeds the compared tallies, so the workflow is tested implicitly without
custom result digests, coverage asserts, or an analog control run. The
reference file shrinks from ~240 kB to ~300 bytes. The test still runs all
stages at the ambient thread count and passes unchanged at 1, 2, and 4
threads against one set of references.

Also express the example sphere's center and radius in terms of its x extents
(surface at x = 120 flush with the pocket, x = -12 behind the source cavity)
so the geometry parameters document themselves.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test as the complete five-run workflow:
stochastic_slab MGXS generation, random ray FW-CADIS weight window generation,
material_wise MGXS generation bootstrapped with those windows, a second
FW-CADIS generation from the higher-fidelity library, and a final continuous
energy Monte Carlo solve using the improved windows, whose per-region flux
tallies are compared against the reference results through the standard
harness. Every stage of the workflow feeds the compared tallies, including
the weight window application in the final continuous energy solve.

Also center the example sphere at the origin with the source cavity offset
toward the -x surface (a pure translation of the previous geometry), which
describes the asymmetric-shielding design more conventionally.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Random ray requires ray origins sampled uniformly in space over the entire
domain, and ray sampling does not honor source domain constraints with the
halton sample method. Enclose the example's concrete sphere in a vacuum-bounded
box with a void gap so that the default box ray source from
convert_to_random_ray is uniform over the whole domain, and drop the test's
ray source override (which had restricted sampling to a box inscribed in the
sphere, biasing the ray density).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Drop the monkeypatch-based plumbing test, which duplicated coverage now
provided end-to-end by the bootstrap regression test while coupling the test
suite to a private helper. The remaining unit tests cover only the documented
validation paths (wrong-method warning and missing-file error), which the
regression test does not exercise and which require no nuclear data.

Also fix the grammar of the weight window split comment that was extended by
an earlier commit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The path resolution behavior is exercised by the bootstrap regression test
(which passes a relative path while generation runs in a temporary directory),
a nonexistent path errors at runtime regardless, and the wrong-method warning
is a documented one-line courtesy that does not warrant a test fixture.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
A weight window lower bound of zero indicates that no weight window
information exists for a cell. OpenMC's weight window generators already mark
such cells with -1 (treated as invalid), and MCNP wwinp files use zero to
turn the weight window game off in a cell, with files imported from MCNP
passing those zeros through unchanged. Previously a zero lower bound was
treated as a valid window: any particle entering the cell sat above the zero
upper bound and was split by max_split at every weight window checkpoint
(weight/0 = inf), multiplying the particle population until terminated only
by the weight cutoff or the maximum split count -- up to millions of tracks
per entering history at the default limits. Zero-bound cells are also
reachable from FW-CADIS generation when a cell has a forward tally but a zero
adjoint flux in some energy group.

Treat a non-positive lower bound as having no information: is_valid() now
requires a strictly positive lower bound, so such cells play no game,
consistent with the generator's -1 convention and MCNP's wwinp semantics. A
regression test verifies that transport through all-zero windows is identical
to transport with weight windows disabled (verified to fail without the fix).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
apply_weight_window() rescales the window bounds and survival weight into the
particle's birth-weight frame (wgt_born / wgt_ww_born), but the weight cutoff
was applied in absolute units. The cutoff kill is the one biased operation in
the weight window game -- weight is removed without roulette -- and its
justification is that the discarded weight is negligible, which is a
statement relative to the window bounds rather than an absolute scale. For
particles born with weights far from the window normalization (e.g., strongly
biased sources), an absolute cutoff can sit inside the normalized windows and
cull in-window particles. Scaling the cutoff with the bounds keeps it at the
same relative depth below the window in every frame. No existing test results
change: at the default cutoff of 1e-38, particle weights never approach the
cutoff in either frame.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Add a second invariant test for non-positive weight window bounds covering
the in-flight case: particles born under valid windows that encounter the
no-information region during transport. Zero and negative lower bounds must
produce identical transport, since both mean that no weight window
information exists in the cell. Without the zero-bound fix this test hangs:
the zero region demands a split at every checkpoint and the resulting
population multiplication does not complete within ten minutes even at the
test model's max_history_splits of 200.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant