MGXS Bootstrapping#3965
Open
jtramm wants to merge 14 commits into
Open
Conversation
…bugs Add a `weight_windows_file` argument to Model.convert_to_multigroup() and fix two random ray solver bugs found while developing it. Bootstrapping machinery: - The "material_wise" MGXS generation method can now load and apply weight windows during its continuous energy solve via the new `weight_windows_file` argument. Materials far behind shielding -- which an analog solve cannot reach -- are then tallied and receive nonzero cross sections. The argument is only valid for "material_wise"; a warning is issued and it is ignored for the "stochastic_slab" and "infinite_medium" methods. Random ray miss-rate / intersection reporting fix: - RandomRaySimulation::simulate() now resets avg_miss_rate_ and total_geometric_intersections_ at the start of each solve. Previously a second solve on the same object (e.g. the adjoint solve during FW-CADIS weight window generation) accumulated onto the first solve's running totals, inflating the reported miss rate and intersection count. Reporting only; no effect on results. Random ray weight-window application fix: - apply_weight_windows() is now a no-op in random ray mode. Random ray rays carry unit weight and are not Monte Carlo particles, so subjecting them to weight-window roulette/splitting is meaningless and corrupted the transport sweep. This occurred when surface checkpoints were enabled and an on-the-fly weight window generator auto-enabled weight_windows_on, killing rays during the adjoint sweep. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add openmc.examples.sphere_with_shielded_pocket: a continuous energy deep-shielding model with a steel pocket behind ~1 m of concrete, reached by analog histories with probability ~4e-5. The concrete sphere is offset from the source so the deep shielding (and thus the wide weight window range) exists only in the small solid angle toward the pocket, keeping weight window splitting cheap and the model fast enough for regression testing (~8 s serial for the full three-stage bootstrap workflow). Intended for weight window and variance reduction feature tests. The new regression test runs the full bootstrap workflow on this model: stochastic_slab MGXS -> random ray FW-CADIS weight windows -> material_wise MGXS with weight_windows_file. The analog material_wise solve must produce zero cross sections for the steel (negative control); the bootstrapped solve must cover all materials. Verdicts verified robust over 10 Monte Carlo seeds and 3 weight-window-generation seeds. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test as a TolerantPyAPITestHarness with reference files so that subtle changes to the random ray or weight window machinery are detected, not just binary coverage flips. The compared results contain the FW-CADIS weight window bounds and the analog material-wise MGXS data, both measured reproducible across runs and thread counts (bounds to ~5e-15; analog MGXS bitwise). The bootstrapped MGXS values are deliberately not value-compared: a Monte Carlo solve with weight window splitting is not run-to-run reproducible when multithreaded (same seed and thread count gave few-percent differences, with tail energy groups flipping between zero and nonzero), so only its binary material coverage -- the feature guarantee -- is asserted. The coverage asserts are hard asserts inside the harness run so a broken feature cannot be baked into references by an --update run. Reference files generated with OPENMC_ENABLE_STRICT_FP=on (RelWithDebInfo) at OMP_NUM_THREADS=2, matching CI; the test passes unchanged at 1, 2, and 4 threads and runs in ~8 s serial. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Correct and strengthen the MGXS bootstrap regression test. Measurement showed the weight-window-split Monte Carlo solve IS reproducible across runs and thread counts given a bit-identical weight window file (<= 4e-16, i.e. tally summation ulps) -- the secondary particle sorting and PRNG stream control work as designed. The instability previously attributed to it actually originates upstream: the multithreaded random ray solve carries last-ulp lock-order jitter in the window bounds (~6e-16 run-to-run at 2 threads, bitwise zero serially), and weight window split/roulette decisions amplify even a one-ulp bound perturbation into order-unity tally changes (measured 78%), because particle weights are set from the bound values themselves and comparisons can sit exactly on decision boundaries. The window-producing stages (stochastic_slab MGXS and the random ray FW-CADIS solve) are therefore pinned to a single thread, making the entire workflow bit-reproducible at any ambient thread count. This allows the bootstrapped MGXS library -- and with it the weight window application machinery -- to be reference-compared as well: the results digest now contains the slab, analog, and bootstrapped MGXS libraries plus the weight window bounds. Reference files regenerated against the official NNDC HDF5 data library (matching the regression suite requirement) with OPENMC_ENABLE_STRICT_FP=on at OMP_NUM_THREADS=2; the test passes unchanged at 1, 2, and 4 threads (~10 s serial). Analog pocket-reach probability re-measured with NNDC data: 4.6e-5, i.e. <1% chance per seed of a false analog cover. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Weight window roulette survivors are assigned weights derived from the window bounds themselves (survival_ratio x lower bound), and subsequent integer splitting can land a particle's weight exactly back on a bound value -- e.g. (3 x lower)/3 compared against lower. The split/roulette branch taken is then decided by the last ulp of the bound. Since weight window data carries ulp-level noise (for example from non-associative parallel reductions in the random ray or Monte Carlo solves that generated it), transport results were chaotically sensitive to bit-level differences in the weight window file: a one-ulp perturbation of the bounds was measured to change deep-tally results by 78%, because a single flipped decision changes a secondary generation count, which shifts the global track counter used to seed subsequent secondaries and decorrelates the remainder of the simulation. Comparing against the window with a small relative dead band (1e-9) removes the sensitivity structurally: the thresholds no longer coincide with the small-rational lattice of weights the algorithm produces, and a bound perturbation moves a pinned weight and its threshold together. The same one-ulp window perturbation now changes results by <1e-15. The band is statistically negligible (1e-9 of a window spanning a factor of several) and weight window games are unbiased regardless of threshold placement. Two regression test references regenerated for trajectory changes at former exact ties: weightwindows[False-local] (single value shift of ~2e-5) and the bootstrap test's MGXS digest. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Move the weight window comparison tolerance to constants.h as WEIGHT_WINDOW_REL_TOL alongside its rationale, rather than burying it as a local in apply_weight_window(). Stabilize the split count as well: ceil(weight/upper) has tie points at every integer ratio, and the weight window arithmetic itself manufactures exact integer ratios through a second pathway -- a roulette survivor assigned weight * max_split (e.g. 10x a weight that sits on the bound lattice), later split in a cell whose upper bound is an exact multiple of the same lower bound (10/5 = 2). Computing ceil((1 - tol) * ratio), clamped to at least 2, makes ratios within the dead band of an integer round down consistently. With both protections, a weight window file perturbed at the ulp level now changes bootstrap-style deep-tally results by <1e-15 in both observed tie pathways (previously order unity). The bootstrap regression test no longer pins its window-generation stages to a single thread: all stages run at the ambient thread count and the test passes unchanged at 1, 2, and 4 threads against one set of reference files, so it now also guards the robustness of weight window transport against the ulp-level reduction noise inherent to multithreaded solves. Full weight window test suite passes without further reference changes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Add the versionadded directive for the new weight_windows_file argument, tighten the WEIGHT_WINDOW_REL_TOL comment, and apply clang-format to the touched region of weight_windows.cpp. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test to follow the random_ray_auto_convert pattern: generate weight windows (stochastic_slab MGXS + random ray FW-CADIS), bootstrap the material-wise library via weight_windows_file, then run a random ray solve with the bootstrapped library and compare a per-region, per-group flux tally through the standard statepoint digest. The pocket flux depends directly on the bootstrapped steel cross sections, and every upstream stage feeds the compared tallies, so the workflow is tested implicitly without custom result digests, coverage asserts, or an analog control run. The reference file shrinks from ~240 kB to ~300 bytes. The test still runs all stages at the ambient thread count and passes unchanged at 1, 2, and 4 threads against one set of references. Also express the example sphere's center and radius in terms of its x extents (surface at x = 120 flush with the pocket, x = -12 behind the source cavity) so the geometry parameters document themselves. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test as the complete five-run workflow: stochastic_slab MGXS generation, random ray FW-CADIS weight window generation, material_wise MGXS generation bootstrapped with those windows, a second FW-CADIS generation from the higher-fidelity library, and a final continuous energy Monte Carlo solve using the improved windows, whose per-region flux tallies are compared against the reference results through the standard harness. Every stage of the workflow feeds the compared tallies, including the weight window application in the final continuous energy solve. Also center the example sphere at the origin with the source cavity offset toward the -x surface (a pure translation of the previous geometry), which describes the asymmetric-shielding design more conventionally. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Random ray requires ray origins sampled uniformly in space over the entire domain, and ray sampling does not honor source domain constraints with the halton sample method. Enclose the example's concrete sphere in a vacuum-bounded box with a void gap so that the default box ray source from convert_to_random_ray is uniform over the whole domain, and drop the test's ray source override (which had restricted sampling to a box inscribed in the sphere, biasing the ray density). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Drop the monkeypatch-based plumbing test, which duplicated coverage now provided end-to-end by the bootstrap regression test while coupling the test suite to a private helper. The remaining unit tests cover only the documented validation paths (wrong-method warning and missing-file error), which the regression test does not exercise and which require no nuclear data. Also fix the grammar of the weight window split comment that was extended by an earlier commit. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The path resolution behavior is exercised by the bootstrap regression test (which passes a relative path while generation runs in a temporary directory), a nonexistent path errors at runtime regardless, and the wrong-method warning is a documented one-line courtesy that does not warrant a test fixture. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…negative tallies.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Bootstrap material-wise MGXS generation with weight windows
Problem / Motivation
OpenMC currently has the ability to take a continuous energy model and convert it to multigroup model with the
model.convert_to_multigroup()function (more details here).Notably, the function supports three different methods for generating the needed MGXS data, all of which utilize a short continuous energy MC solve. The highest fidelity method is the
material_wiseoption, which runs a regular MC simulation on the provided geometry and tallies cross sections for each material. This works well in cases where an analog MC particles can easily visit all areas of the problem (as in a fission reactor simulation). However, it fails in shielding scenarios where some materials (e.g., a detector), require weight windows to receive any tallies.To allow for MGXS generation for hard to reach materials, we also have the
stochastic_slabmethod, which retains the material definition of the original problem but uses a surrogate geometry (a slab with many layers of randomly placed materials) in place of the real one and a source that is present throughout the entire slab. This still retains a lot of resonant self shielding effects but obviously is lower fidelity that thematerial_wisemethod given that spatial information is lost.Thus the idea has come about to allow for a "bootstrapping" workflow for shielding problems. In this workflow, lower fidelity
stochastic_slabMGXS are first generated and used to produce a first generation of weight windows with the random ray solver. These weight windows can then be used to produce a second generation of weight windows with thematerial_wisemethod, as with global FW-CADIS the MC solve should now be able to allow particles to reach all locations/materials. These improved second generation MGXS are then passed back in again to the random ray solver to produce a final set of weight windows that is potentially of a higher efficiency than the first.Summary of Changes
This PR adds the ability to "bootstrap" the
"material_wise"multigroup cross section (MGXS) generation method with weight windows, and fixes three weight window / random ray issues that were uncovered while developing and validating that feature. While the discovered issues might have been broken into other PRs, fixing them ended up being useful for validating that the current feature's regression test was running in a reproducible manner so I found it was best to include them here.Changes:
weight_windows_fileargument toModel.convert_to_multigroup().RandomRaySimulation::simulate().apply_weight_windows()a no-op in random ray mode.It also adds a continuous-energy deep-shielding example (
openmc.examples.sphere_with_shielded_pocket) designed for fast weight window testing, and an end-to-end regression test of the bootstrap workflow built on it. We didn't have a great fixed source test case for weight windows that was both (1) small/cheap to run and (2) actually needed weight windows, so I spent some time with Claude Fable cooking one up. Hopefully it should be useful for weight window testing as we add more features down the line as well.Change 1 —
weight_windows_filebootstrapping machineryModel.convert_to_multigroup(..., weight_windows_file=...)loads and applies a weight windows file during the CE generation solve, pushing particles into far regions so that all materials are tallied. The intended workflow is:"stochastic_slab"method together with the random ray solver and FW-CADIS. ("stochastic_slab"tallies all materials regardless of location.)weight_windows.h5to a higher-fidelity"material_wise"generation viaweight_windows_file=....The argument is only valid for
"material_wise"; a warning is issued and it is ignored for"stochastic_slab"and"infinite_medium"(their surrogate geometries are incompatible with a weight-window mesh defined over the real geometry, and they already tally all materials). The path is resolved to absolute up front because generation runs in a temporary directory.Documented in the random ray user guide under Bootstrapping Material-Wise MGXS with Weight Windows.
Change 2 — random ray miss-rate / intersection reporting fix
RandomRaySimulation::simulate()accumulatesavg_miss_rate_andtotal_geometric_intersections_over the batch loop but did not reset them at the start of a solve. Whensimulate()runs twice on the same object — e.g. the forward then adjoint solves of an FW-CADIS weight window generation — the second solve added onto the first solve's running totals while still dividing by a single solve's batch count. This inflated the reported miss rate (sometimes above 100%) and the reported intersection count.This is a reporting bug only — it has no effect on simulation results. Fixed by zeroing both accumulators at the top of
simulate().Change 3 — random ray weight-window application fix
apply_weight_windows()is now a no-op whensolver_type == RANDOM_RAY.Random ray "rays" are deterministic probes carrying unit statistical weight; they are not Monte Carlo particles. The random ray solver generates weight windows (FW-CADIS) but must never apply them to its own rays — rouletting or splitting a ray is meaningless and corrupts the transport sweep.
This could previously happen through an unfortunate interaction:
settings::weight_windows_on.weight_window_checkpointsalso enables thesurface(orcollision) checkpoint, the inheritedParticle::event_cross_surfacecallsapply_weight_windows()on a ray.WeightWindowsGenerator::update()fills the weight window bounds on the last batch of a solve. So the forward sweep transports against empty windows (no effect), but the subsequent adjoint sweep transports against the forward-flux windows and its rays get rouletted/killed.The visible symptoms were a ~10x drop in the reported adjoint geometric intersection count and a runaway miss rate — but more importantly, the adjoint flux (and therefore the generated FW-CADIS weight windows) were silently corrupted. This is a latent bug independent of the bootstrapping feature: any random ray FW-CADIS run with surface weight-window checkpoints enabled hits it.
Change 4 — weight window comparison dead band
This was a pretty interesting problem I noticed when adding the new regression test. Even after just a few hundred particles I was getting macro scale divergence due to changing the thread counts which was prohibitive for regression tests to pass reliably. Basically, it turns out that even a 1 ulp change (a single tiny rounding difference) in the random ray solve (which happens due to floating point non-associativity when rays pass and update the scalar flux estimates in a source region in different orders depending on parallelism) is enough to cause frequent divergence of the subsequent MC solve that uses the weight windows. It turns out the that 1 ulp difference doesn't just cause divergence when a "one in a quadrillion" probability particle happens to have a weight that falls on a boundary.
In reality, weight window roulette survivors are assigned weights derived from the window bounds themselves (
survival_ratio × lower), and subsequent integer splitting can land a particle's weight exactly back on a bound value — e.g.(3 × lower)/3later compared againstlowerwhen the particle re-enters the same mesh cell (common, since the mean free path is often on the scale of the mesh size). The split/roulette branch taken is then decided by the last ulp of the bound. Window data inherently carries ulp-level noise (non-associative parallel reductions in whatever solve generated it), and a single flipped decision changes a secondary-generation count, shifting the global track counter that seeds subsequent secondaries — decorrelating the rest of the simulation.So: to be clear - the proposed fix here is not just handling a rare edge case, but actually addresses a common case that the weight window routine forces on particles quite commonly by design.
apply_weight_window()now compares against the window with a small relative dead band (WEIGHT_WINDOW_REL_TOL = 1e-9inconstants.h): split aboveupper × (1 + ε), roulette belowlower × (1 − ε). The split count is stabilized the same way —ceil(weight/upper)has tie points at every integer ratio, and the arithmetic manufactures exact integer ratios through a second pathway (a roulette survivor assignedweight × max_split, later split in a cell whose upper bound is an exact multiple of the same lower bound, e.g. 10/5 = 2), so the count is computed asceil((1 − ε) × ratio)clamped to ≥2. This removes the sensitivity structurally — the thresholds no longer coincide with the small-rational lattice of weights the algorithm manufactures, and a bound perturbation moves a pinned weight and its threshold together. The band is statistically negligible and weight window games are unbiased regardless of threshold placement. One existing regression test reference shifted and was regenerated (weightwindows[False-local]; see notes below).Testing
tests/regression_tests/random_ray_auto_convert_bootstrap/): runs the complete bootstrap workflow on a new continuous-energy example,openmc.examples.sphere_with_shielded_pocket— a concrete sphere with a 2 MeV source in a cavity near the −x surface and a small steel pocket flush with the surface on the +x axis, ~1 m of concrete between them, the whole sphere enclosed in a vacuum-bounded box so ray-based solvers can sample uniformly over a rectangular domain. The workflow is five OpenMC runs: stochastic_slab MGXS generation → random ray FW-CADIS weight window generation → material_wise MGXS generation bootstrapped with those windows viaweight_windows_file→ a second FW-CADIS generation from the higher-fidelity library → a final continuous energy Monte Carlo solve using the improved windows, whose per-region flux tallies are compared against reference results with the standardTolerantPyAPITestHarnessstatepoint digest. An analog history reaches the pocket with probability ~4e-5 (measured), so the material-wise generation solve only produces nonzero steel cross sections because of the weight windows — and every stage of the workflow, including the weight window application in the final continuous energy solve, feeds the compared tallies. The example geometry is purpose-built for fast weight window testing: placing the cavity near the surface confines the deep shielding (and hence the wide weight-window dynamic range) to the small solid angle toward the pocket, so splitting stays cheap and convergent and the full five-run test takes ~15 s serial (~9 s at 2 threads).At 15 seconds serial this is definitely an expensive regression test, but it seems useful to have a full workout of the weight window and MGXS generation bootstrapping pipeline. If
Notes for reviewers
tests/regression_tests/weightwindows/local/results_true.datis a hashed tally digest and was regenerated. Cause (verified by reverting the dead band and rerunning): particles in that problem whose weights sat exactly on window-bound lattice values (the tie class described in Change 4) now classify as inside the window rather than playing roulette/split, so a few of the 1000 histories take different trajectories. Comparing tallies with and without the dead band: 246 of 3000 mesh bins change, predominantly low-statistics deep bins shifting at their per-history noise scale (this test deliberately runs only 500 particles x 2 batches with up to 200 splits per history). The change is a different statistical realization, not a bias; the rest of the weight window test family passes unmodified.Checklist