Weight Window Edge Case Fixes#3968
Open
jtramm wants to merge 18 commits into
Open
Conversation
…bugs Add a `weight_windows_file` argument to Model.convert_to_multigroup() and fix two random ray solver bugs found while developing it. Bootstrapping machinery: - The "material_wise" MGXS generation method can now load and apply weight windows during its continuous energy solve via the new `weight_windows_file` argument. Materials far behind shielding -- which an analog solve cannot reach -- are then tallied and receive nonzero cross sections. The argument is only valid for "material_wise"; a warning is issued and it is ignored for the "stochastic_slab" and "infinite_medium" methods. Random ray miss-rate / intersection reporting fix: - RandomRaySimulation::simulate() now resets avg_miss_rate_ and total_geometric_intersections_ at the start of each solve. Previously a second solve on the same object (e.g. the adjoint solve during FW-CADIS weight window generation) accumulated onto the first solve's running totals, inflating the reported miss rate and intersection count. Reporting only; no effect on results. Random ray weight-window application fix: - apply_weight_windows() is now a no-op in random ray mode. Random ray rays carry unit weight and are not Monte Carlo particles, so subjecting them to weight-window roulette/splitting is meaningless and corrupted the transport sweep. This occurred when surface checkpoints were enabled and an on-the-fly weight window generator auto-enabled weight_windows_on, killing rays during the adjoint sweep. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add openmc.examples.sphere_with_shielded_pocket: a continuous energy deep-shielding model with a steel pocket behind ~1 m of concrete, reached by analog histories with probability ~4e-5. The concrete sphere is offset from the source so the deep shielding (and thus the wide weight window range) exists only in the small solid angle toward the pocket, keeping weight window splitting cheap and the model fast enough for regression testing (~8 s serial for the full three-stage bootstrap workflow). Intended for weight window and variance reduction feature tests. The new regression test runs the full bootstrap workflow on this model: stochastic_slab MGXS -> random ray FW-CADIS weight windows -> material_wise MGXS with weight_windows_file. The analog material_wise solve must produce zero cross sections for the steel (negative control); the bootstrapped solve must cover all materials. Verdicts verified robust over 10 Monte Carlo seeds and 3 weight-window-generation seeds. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test as a TolerantPyAPITestHarness with reference files so that subtle changes to the random ray or weight window machinery are detected, not just binary coverage flips. The compared results contain the FW-CADIS weight window bounds and the analog material-wise MGXS data, both measured reproducible across runs and thread counts (bounds to ~5e-15; analog MGXS bitwise). The bootstrapped MGXS values are deliberately not value-compared: a Monte Carlo solve with weight window splitting is not run-to-run reproducible when multithreaded (same seed and thread count gave few-percent differences, with tail energy groups flipping between zero and nonzero), so only its binary material coverage -- the feature guarantee -- is asserted. The coverage asserts are hard asserts inside the harness run so a broken feature cannot be baked into references by an --update run. Reference files generated with OPENMC_ENABLE_STRICT_FP=on (RelWithDebInfo) at OMP_NUM_THREADS=2, matching CI; the test passes unchanged at 1, 2, and 4 threads and runs in ~8 s serial. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Correct and strengthen the MGXS bootstrap regression test. Measurement showed the weight-window-split Monte Carlo solve IS reproducible across runs and thread counts given a bit-identical weight window file (<= 4e-16, i.e. tally summation ulps) -- the secondary particle sorting and PRNG stream control work as designed. The instability previously attributed to it actually originates upstream: the multithreaded random ray solve carries last-ulp lock-order jitter in the window bounds (~6e-16 run-to-run at 2 threads, bitwise zero serially), and weight window split/roulette decisions amplify even a one-ulp bound perturbation into order-unity tally changes (measured 78%), because particle weights are set from the bound values themselves and comparisons can sit exactly on decision boundaries. The window-producing stages (stochastic_slab MGXS and the random ray FW-CADIS solve) are therefore pinned to a single thread, making the entire workflow bit-reproducible at any ambient thread count. This allows the bootstrapped MGXS library -- and with it the weight window application machinery -- to be reference-compared as well: the results digest now contains the slab, analog, and bootstrapped MGXS libraries plus the weight window bounds. Reference files regenerated against the official NNDC HDF5 data library (matching the regression suite requirement) with OPENMC_ENABLE_STRICT_FP=on at OMP_NUM_THREADS=2; the test passes unchanged at 1, 2, and 4 threads (~10 s serial). Analog pocket-reach probability re-measured with NNDC data: 4.6e-5, i.e. <1% chance per seed of a false analog cover. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Weight window roulette survivors are assigned weights derived from the window bounds themselves (survival_ratio x lower bound), and subsequent integer splitting can land a particle's weight exactly back on a bound value -- e.g. (3 x lower)/3 compared against lower. The split/roulette branch taken is then decided by the last ulp of the bound. Since weight window data carries ulp-level noise (for example from non-associative parallel reductions in the random ray or Monte Carlo solves that generated it), transport results were chaotically sensitive to bit-level differences in the weight window file: a one-ulp perturbation of the bounds was measured to change deep-tally results by 78%, because a single flipped decision changes a secondary generation count, which shifts the global track counter used to seed subsequent secondaries and decorrelates the remainder of the simulation. Comparing against the window with a small relative dead band (1e-9) removes the sensitivity structurally: the thresholds no longer coincide with the small-rational lattice of weights the algorithm produces, and a bound perturbation moves a pinned weight and its threshold together. The same one-ulp window perturbation now changes results by <1e-15. The band is statistically negligible (1e-9 of a window spanning a factor of several) and weight window games are unbiased regardless of threshold placement. Two regression test references regenerated for trajectory changes at former exact ties: weightwindows[False-local] (single value shift of ~2e-5) and the bootstrap test's MGXS digest. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Move the weight window comparison tolerance to constants.h as WEIGHT_WINDOW_REL_TOL alongside its rationale, rather than burying it as a local in apply_weight_window(). Stabilize the split count as well: ceil(weight/upper) has tie points at every integer ratio, and the weight window arithmetic itself manufactures exact integer ratios through a second pathway -- a roulette survivor assigned weight * max_split (e.g. 10x a weight that sits on the bound lattice), later split in a cell whose upper bound is an exact multiple of the same lower bound (10/5 = 2). Computing ceil((1 - tol) * ratio), clamped to at least 2, makes ratios within the dead band of an integer round down consistently. With both protections, a weight window file perturbed at the ulp level now changes bootstrap-style deep-tally results by <1e-15 in both observed tie pathways (previously order unity). The bootstrap regression test no longer pins its window-generation stages to a single thread: all stages run at the ambient thread count and the test passes unchanged at 1, 2, and 4 threads against one set of reference files, so it now also guards the robustness of weight window transport against the ulp-level reduction noise inherent to multithreaded solves. Full weight window test suite passes without further reference changes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Add the versionadded directive for the new weight_windows_file argument, tighten the WEIGHT_WINDOW_REL_TOL comment, and apply clang-format to the touched region of weight_windows.cpp. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test to follow the random_ray_auto_convert pattern: generate weight windows (stochastic_slab MGXS + random ray FW-CADIS), bootstrap the material-wise library via weight_windows_file, then run a random ray solve with the bootstrapped library and compare a per-region, per-group flux tally through the standard statepoint digest. The pocket flux depends directly on the bootstrapped steel cross sections, and every upstream stage feeds the compared tallies, so the workflow is tested implicitly without custom result digests, coverage asserts, or an analog control run. The reference file shrinks from ~240 kB to ~300 bytes. The test still runs all stages at the ambient thread count and passes unchanged at 1, 2, and 4 threads against one set of references. Also express the example sphere's center and radius in terms of its x extents (surface at x = 120 flush with the pocket, x = -12 behind the source cavity) so the geometry parameters document themselves. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test as the complete five-run workflow: stochastic_slab MGXS generation, random ray FW-CADIS weight window generation, material_wise MGXS generation bootstrapped with those windows, a second FW-CADIS generation from the higher-fidelity library, and a final continuous energy Monte Carlo solve using the improved windows, whose per-region flux tallies are compared against the reference results through the standard harness. Every stage of the workflow feeds the compared tallies, including the weight window application in the final continuous energy solve. Also center the example sphere at the origin with the source cavity offset toward the -x surface (a pure translation of the previous geometry), which describes the asymmetric-shielding design more conventionally. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Random ray requires ray origins sampled uniformly in space over the entire domain, and ray sampling does not honor source domain constraints with the halton sample method. Enclose the example's concrete sphere in a vacuum-bounded box with a void gap so that the default box ray source from convert_to_random_ray is uniform over the whole domain, and drop the test's ray source override (which had restricted sampling to a box inscribed in the sphere, biasing the ray density). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Drop the monkeypatch-based plumbing test, which duplicated coverage now provided end-to-end by the bootstrap regression test while coupling the test suite to a private helper. The remaining unit tests cover only the documented validation paths (wrong-method warning and missing-file error), which the regression test does not exercise and which require no nuclear data. Also fix the grammar of the weight window split comment that was extended by an earlier commit. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The path resolution behavior is exercised by the bootstrap regression test (which passes a relative path while generation runs in a temporary directory), a nonexistent path errors at runtime regardless, and the wrong-method warning is a documented one-line courtesy that does not warrant a test fixture. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
A weight window lower bound of zero indicates that no weight window information exists for a cell. OpenMC's weight window generators already mark such cells with -1 (treated as invalid), and MCNP wwinp files use zero to turn the weight window game off in a cell, with files imported from MCNP passing those zeros through unchanged. Previously a zero lower bound was treated as a valid window: any particle entering the cell sat above the zero upper bound and was split by max_split at every weight window checkpoint (weight/0 = inf), multiplying the particle population until terminated only by the weight cutoff or the maximum split count -- up to millions of tracks per entering history at the default limits. Zero-bound cells are also reachable from FW-CADIS generation when a cell has a forward tally but a zero adjoint flux in some energy group. Treat a non-positive lower bound as having no information: is_valid() now requires a strictly positive lower bound, so such cells play no game, consistent with the generator's -1 convention and MCNP's wwinp semantics. A regression test verifies that transport through all-zero windows is identical to transport with weight windows disabled (verified to fail without the fix). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
apply_weight_window() rescales the window bounds and survival weight into the particle's birth-weight frame (wgt_born / wgt_ww_born), but the weight cutoff was applied in absolute units. The cutoff kill is the one biased operation in the weight window game -- weight is removed without roulette -- and its justification is that the discarded weight is negligible, which is a statement relative to the window bounds rather than an absolute scale. For particles born with weights far from the window normalization (e.g., strongly biased sources), an absolute cutoff can sit inside the normalized windows and cull in-window particles. Scaling the cutoff with the bounds keeps it at the same relative depth below the window in every frame. No existing test results change: at the default cutoff of 1e-38, particle weights never approach the cutoff in either frame. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Add a second invariant test for non-positive weight window bounds covering the in-flight case: particles born under valid windows that encounter the no-information region during transport. Zero and negative lower bounds must produce identical transport, since both mean that no weight window information exists in the cell. Without the zero-bound fix this test hangs: the zero region demands a split at every checkpoint and the resulting population multiplication does not complete within ten minutes even at the test model's max_history_splits of 200. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…negative tallies.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix two weight window application edge cases
Note: This PR requires commits from #3965, so definitely wait to review this PR until #3965 is merged first!
Summary
Two small fixes to how weight windows are applied during transport, both found while auditing
apply_weight_window()during the MGXS bootstrapping work (#3965):Neither change affects any existing test results.
Change 1 — zero-bound cells play no game
A weight window lower bound of zero means no weight window information exists for that cell. OpenMC's weight window generators already encode this convention by marking cells with no tally information as -1 (which fails
WeightWindow::is_valid()), and MCNP uses zero to disable the game per cell — from the MCNP 6.2 manual (WWN card, note 2): "If wij=0, the weight-window game is turned off in cell j for energy or time bin i and the weight cutoff game is turned on with a 1-for-2 roulette limit." Files imported withWeightWindowsList.from_wwinppass those zeros through unchanged.Previously a zero lower bound passed
is_valid()and was treated as a real window: any particle entering the cell sat above the zero upper bound and split bymax_splitat every weight window checkpoint (weight/0 = inf), multiplying the population until terminated by absorption/leakage, the weight cutoff, ormax_history_splits(default 10^7). Besides wwinp imports, zero-bound cells are theoretically (if extremely unlikely) reachable from FW-CADIS generation when a cell has a nonzero forward tally but a zero adjoint flux in some energy group (the inverse step maps those to exactly 0.0 rather than -1). A separate pathology affects particles born under a zero window:wgt_ww_bornis set to the window midpoint of zero, producing an infinite normalization factor and (with the cutoff scaled, see Change 2) killing such particles at birth. Both misbehaviors are cured by playing no game.The fix is to treat any non-positive lower bound as "no information":
is_valid()now requires a strictly positive lower bound, making the zero case consistent with the generator's -1 convention and with MCNP's semantics.Change 2 — weight cutoff normalized with the window
In PR #3459 - I added the ability for
apply_weight_window()to rescale the window into the particle's birth-weight frame —lower,upper, andsurvival_weightare all multiplied bywgt_born / wgt_ww_born(and byww_factorwhere applicable) — butweight_cutoffwas compared in absolute units. The cutoff kill is the one biased operation in the weight window game (weight is deleted without roulette), justified by the discarded weight being negligible — and "negligible" is a statement relative to the window bounds, not an absolute scale. For particles born with weights far from the window normalization (e.g., strongly biased sources or unusual window normalizations), an absolute cutoff can sit inside the normalized windows and silently cull in-window particles.WeightWindow::scale()now scales the cutoff with the bounds, keeping it at the same relative depth below the window in every frame.Testing
The fix is pinned by two invariant tests built on the existing
weightwindowstest problem, each expressing one consequence of "a non-positive lower bound means no information." Both are self-comparing (two runs of the same model with identical seeds must agree), so they need no reference files and no statistical tolerances, and each was verified to fail without Change 1. They cover the two distinct code paths a zero bound corrupts:test_zero_bound_windows_play_no_game: all-zero windows must transport identically to weight windows disabled. This covers particles born under a zero window, where the birth normalization (wgt_ww_born= window midpoint = 0) previously produced an infinite scale factor. Without the fix the tallies differ; with it the runs match. ~1.5 s.test_zero_and_negative_bounds_equivalent: windows with a zero-bound region must transport identically to the same windows with a -1 region, with particles born under valid windows encountering the region in flight. This covers the splitting pathology: without the fix this test does not complete within ten minutes (the zero region splits every particle bymax_splitat every checkpoint, even at the test model'smax_history_splitsof 200); with it the runs match. ~1.9 s.The full weight window test family and the MGXS bootstrap regression test pass with no reference changes: the existing test window files use -1 (not 0) for no-information cells, and at the default cutoff of 1e-38 no particle weight approaches the cutoff in either frame.
Change 2 has no observable effect in any existing test by construction; it is a consistency fix whose behavior difference only appears for cutoffs or normalization frames far from the defaults. No test is added for it — constructing one would require a contrived frame, and the runtime consequence of the old behavior (a biased kill) is exactly the kind of silent statistical effect a fixed-seed test cannot meaningfully pin.
Checklist