Skip to content

MGXS Bootstrapping#3965

Open
jtramm wants to merge 14 commits into
openmc-dev:developfrom
jtramm:bootsrap_mgxs
Open

MGXS Bootstrapping#3965
jtramm wants to merge 14 commits into
openmc-dev:developfrom
jtramm:bootsrap_mgxs

Conversation

@jtramm

@jtramm jtramm commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Bootstrap material-wise MGXS generation with weight windows

Problem / Motivation

OpenMC currently has the ability to take a continuous energy model and convert it to multigroup model with the model.convert_to_multigroup() function (more details here).

Notably, the function supports three different methods for generating the needed MGXS data, all of which utilize a short continuous energy MC solve. The highest fidelity method is the material_wise option, which runs a regular MC simulation on the provided geometry and tallies cross sections for each material. This works well in cases where an analog MC particles can easily visit all areas of the problem (as in a fission reactor simulation). However, it fails in shielding scenarios where some materials (e.g., a detector), require weight windows to receive any tallies.

To allow for MGXS generation for hard to reach materials, we also have the stochastic_slab method, which retains the material definition of the original problem but uses a surrogate geometry (a slab with many layers of randomly placed materials) in place of the real one and a source that is present throughout the entire slab. This still retains a lot of resonant self shielding effects but obviously is lower fidelity that the material_wise method given that spatial information is lost.

Thus the idea has come about to allow for a "bootstrapping" workflow for shielding problems. In this workflow, lower fidelity stochastic_slab MGXS are first generated and used to produce a first generation of weight windows with the random ray solver. These weight windows can then be used to produce a second generation of weight windows with the material_wise method, as with global FW-CADIS the MC solve should now be able to allow particles to reach all locations/materials. These improved second generation MGXS are then passed back in again to the random ray solver to produce a final set of weight windows that is potentially of a higher efficiency than the first.

Summary of Changes

This PR adds the ability to "bootstrap" the "material_wise" multigroup cross section (MGXS) generation method with weight windows, and fixes three weight window / random ray issues that were uncovered while developing and validating that feature. While the discovered issues might have been broken into other PRs, fixing them ended up being useful for validating that the current feature's regression test was running in a reproducible manner so I found it was best to include them here.

Changes:

  1. Bootstrapping machinery — a new weight_windows_file argument to Model.convert_to_multigroup().
  2. Random ray miss-rate / intersection reporting fix — reset the per-solve accumulators in RandomRaySimulation::simulate().
  3. Random ray weight-window application fix — make apply_weight_windows() a no-op in random ray mode.
  4. Weight window comparison dead band — remove the chaotic sensitivity of weight-window transport to bit-level differences in the window data.

It also adds a continuous-energy deep-shielding example (openmc.examples.sphere_with_shielded_pocket) designed for fast weight window testing, and an end-to-end regression test of the bootstrap workflow built on it. We didn't have a great fixed source test case for weight windows that was both (1) small/cheap to run and (2) actually needed weight windows, so I spent some time with Claude Fable cooking one up. Hopefully it should be useful for weight window testing as we add more features down the line as well.

Change 1 — weight_windows_file bootstrapping machinery

Model.convert_to_multigroup(..., weight_windows_file=...) loads and applies a weight windows file during the CE generation solve, pushing particles into far regions so that all materials are tallied. The intended workflow is:

  1. Cheaply generate problem-wide weight windows with the "stochastic_slab" method together with the random ray solver and FW-CADIS. ("stochastic_slab" tallies all materials regardless of location.)
  2. Pass the resulting weight_windows.h5 to a higher-fidelity "material_wise" generation via weight_windows_file=....
# 1. Generate weight windows (stochastic_slab + random ray FW-CADIS) -> weight_windows.h5
# 2. Bootstrap a high-fidelity material-wise library using them:
model.convert_to_multigroup(
    method="material_wise",
    weight_windows_file="weight_windows.h5",
    overwrite_mgxs_library=True,
)

The argument is only valid for "material_wise"; a warning is issued and it is ignored for "stochastic_slab" and "infinite_medium" (their surrogate geometries are incompatible with a weight-window mesh defined over the real geometry, and they already tally all materials). The path is resolved to absolute up front because generation runs in a temporary directory.

Documented in the random ray user guide under Bootstrapping Material-Wise MGXS with Weight Windows.

Change 2 — random ray miss-rate / intersection reporting fix

RandomRaySimulation::simulate() accumulates avg_miss_rate_ and total_geometric_intersections_ over the batch loop but did not reset them at the start of a solve. When simulate() runs twice on the same object — e.g. the forward then adjoint solves of an FW-CADIS weight window generation — the second solve added onto the first solve's running totals while still dividing by a single solve's batch count. This inflated the reported miss rate (sometimes above 100%) and the reported intersection count.

This is a reporting bug only — it has no effect on simulation results. Fixed by zeroing both accumulators at the top of simulate().

Change 3 — random ray weight-window application fix

apply_weight_windows() is now a no-op when solver_type == RANDOM_RAY.

Random ray "rays" are deterministic probes carrying unit statistical weight; they are not Monte Carlo particles. The random ray solver generates weight windows (FW-CADIS) but must never apply them to its own rays — rouletting or splitting a ray is meaningless and corrupts the transport sweep.

This could previously happen through an unfortunate interaction:

  • An on-the-fly weight window generator auto-enables settings::weight_windows_on.
  • If weight_window_checkpoints also enables the surface (or collision) checkpoint, the inherited Particle::event_cross_surface calls apply_weight_windows() on a ray.
  • In random ray mode WeightWindowsGenerator::update() fills the weight window bounds on the last batch of a solve. So the forward sweep transports against empty windows (no effect), but the subsequent adjoint sweep transports against the forward-flux windows and its rays get rouletted/killed.

The visible symptoms were a ~10x drop in the reported adjoint geometric intersection count and a runaway miss rate — but more importantly, the adjoint flux (and therefore the generated FW-CADIS weight windows) were silently corrupted. This is a latent bug independent of the bootstrapping feature: any random ray FW-CADIS run with surface weight-window checkpoints enabled hits it.

Change 4 — weight window comparison dead band

This was a pretty interesting problem I noticed when adding the new regression test. Even after just a few hundred particles I was getting macro scale divergence due to changing the thread counts which was prohibitive for regression tests to pass reliably. Basically, it turns out that even a 1 ulp change (a single tiny rounding difference) in the random ray solve (which happens due to floating point non-associativity when rays pass and update the scalar flux estimates in a source region in different orders depending on parallelism) is enough to cause frequent divergence of the subsequent MC solve that uses the weight windows. It turns out the that 1 ulp difference doesn't just cause divergence when a "one in a quadrillion" probability particle happens to have a weight that falls on a boundary.

In reality, weight window roulette survivors are assigned weights derived from the window bounds themselves (survival_ratio × lower), and subsequent integer splitting can land a particle's weight exactly back on a bound value — e.g. (3 × lower)/3 later compared against lower when the particle re-enters the same mesh cell (common, since the mean free path is often on the scale of the mesh size). The split/roulette branch taken is then decided by the last ulp of the bound. Window data inherently carries ulp-level noise (non-associative parallel reductions in whatever solve generated it), and a single flipped decision changes a secondary-generation count, shifting the global track counter that seeds subsequent secondaries — decorrelating the rest of the simulation.

So: to be clear - the proposed fix here is not just handling a rare edge case, but actually addresses a common case that the weight window routine forces on particles quite commonly by design.

apply_weight_window() now compares against the window with a small relative dead band (WEIGHT_WINDOW_REL_TOL = 1e-9 in constants.h): split above upper × (1 + ε), roulette below lower × (1 − ε). The split count is stabilized the same way — ceil(weight/upper) has tie points at every integer ratio, and the arithmetic manufactures exact integer ratios through a second pathway (a roulette survivor assigned weight × max_split, later split in a cell whose upper bound is an exact multiple of the same lower bound, e.g. 10/5 = 2), so the count is computed as ceil((1 − ε) × ratio) clamped to ≥2. This removes the sensitivity structurally — the thresholds no longer coincide with the small-rational lattice of weights the algorithm manufactures, and a bound perturbation moves a pinned weight and its threshold together. The band is statistically negligible and weight window games are unbiased regardless of threshold placement. One existing regression test reference shifted and was regenerated (weightwindows[False-local]; see notes below).

Testing

  • Regression test (tests/regression_tests/random_ray_auto_convert_bootstrap/): runs the complete bootstrap workflow on a new continuous-energy example, openmc.examples.sphere_with_shielded_pocket — a concrete sphere with a 2 MeV source in a cavity near the −x surface and a small steel pocket flush with the surface on the +x axis, ~1 m of concrete between them, the whole sphere enclosed in a vacuum-bounded box so ray-based solvers can sample uniformly over a rectangular domain. The workflow is five OpenMC runs: stochastic_slab MGXS generation → random ray FW-CADIS weight window generation → material_wise MGXS generation bootstrapped with those windows via weight_windows_file → a second FW-CADIS generation from the higher-fidelity library → a final continuous energy Monte Carlo solve using the improved windows, whose per-region flux tallies are compared against reference results with the standard TolerantPyAPITestHarness statepoint digest. An analog history reaches the pocket with probability ~4e-5 (measured), so the material-wise generation solve only produces nonzero steel cross sections because of the weight windows — and every stage of the workflow, including the weight window application in the final continuous energy solve, feeds the compared tallies. The example geometry is purpose-built for fast weight window testing: placing the cavity near the surface confines the deep shielding (and hence the wide weight-window dynamic range) to the small solid angle toward the pocket, so splitting stays cheap and convergent and the full five-run test takes ~15 s serial (~9 s at 2 threads).

At 15 seconds serial this is definitely an expensive regression test, but it seems useful to have a full workout of the weight window and MGXS generation bootstrapping pipeline. If

Notes for reviewers

  • tests/regression_tests/weightwindows/local/results_true.dat is a hashed tally digest and was regenerated. Cause (verified by reverting the dead band and rerunning): particles in that problem whose weights sat exactly on window-bound lattice values (the tie class described in Change 4) now classify as inside the window rather than playing roulette/split, so a few of the 1000 histories take different trajectories. Comparing tallies with and without the dead band: 246 of 3000 mesh bins change, predominantly low-statistics deep bins shifting at their per-history noise scale (this test deliberately runs only 500 particles x 2 batches with up to 200 splits per history). The change is a different statistical realization, not a bias; the rest of the weight window test family passes unmodified.

Checklist

  • I have performed a self-review of my own code
  • I have run clang-format (version 18) on any C++ source files (if applicable)
  • I have followed the style guidelines for Python source files (if applicable)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

John Tramm and others added 13 commits June 9, 2026 17:08
…bugs

Add a `weight_windows_file` argument to Model.convert_to_multigroup() and fix
two random ray solver bugs found while developing it.

Bootstrapping machinery:
- The "material_wise" MGXS generation method can now load and apply weight
  windows during its continuous energy solve via the new `weight_windows_file`
  argument. Materials far behind shielding -- which an analog solve cannot
  reach -- are then tallied and receive nonzero cross sections. The argument is
  only valid for "material_wise"; a warning is issued and it is ignored for the
  "stochastic_slab" and "infinite_medium" methods.

Random ray miss-rate / intersection reporting fix:
- RandomRaySimulation::simulate() now resets avg_miss_rate_ and
  total_geometric_intersections_ at the start of each solve. Previously a second
  solve on the same object (e.g. the adjoint solve during FW-CADIS weight window
  generation) accumulated onto the first solve's running totals, inflating the
  reported miss rate and intersection count. Reporting only; no effect on results.

Random ray weight-window application fix:
- apply_weight_windows() is now a no-op in random ray mode. Random ray rays
  carry unit weight and are not Monte Carlo particles, so subjecting them to
  weight-window roulette/splitting is meaningless and corrupted the transport
  sweep. This occurred when surface checkpoints were enabled and an on-the-fly
  weight window generator auto-enabled weight_windows_on, killing rays during
  the adjoint sweep.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add openmc.examples.sphere_with_shielded_pocket: a continuous energy
deep-shielding model with a steel pocket behind ~1 m of concrete, reached by
analog histories with probability ~4e-5. The concrete sphere is offset from
the source so the deep shielding (and thus the wide weight window range)
exists only in the small solid angle toward the pocket, keeping weight window
splitting cheap and the model fast enough for regression testing (~8 s serial
for the full three-stage bootstrap workflow). Intended for weight window and
variance reduction feature tests.

The new regression test runs the full bootstrap workflow on this model:
stochastic_slab MGXS -> random ray FW-CADIS weight windows -> material_wise
MGXS with weight_windows_file. The analog material_wise solve must produce
zero cross sections for the steel (negative control); the bootstrapped solve
must cover all materials. Verdicts verified robust over 10 Monte Carlo seeds
and 3 weight-window-generation seeds.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test as a TolerantPyAPITestHarness with
reference files so that subtle changes to the random ray or weight window
machinery are detected, not just binary coverage flips. The compared results
contain the FW-CADIS weight window bounds and the analog material-wise MGXS
data, both measured reproducible across runs and thread counts (bounds to
~5e-15; analog MGXS bitwise). The bootstrapped MGXS values are deliberately
not value-compared: a Monte Carlo solve with weight window splitting is not
run-to-run reproducible when multithreaded (same seed and thread count gave
few-percent differences, with tail energy groups flipping between zero and
nonzero), so only its binary material coverage -- the feature guarantee -- is
asserted. The coverage asserts are hard asserts inside the harness run so a
broken feature cannot be baked into references by an --update run.

Reference files generated with OPENMC_ENABLE_STRICT_FP=on (RelWithDebInfo) at
OMP_NUM_THREADS=2, matching CI; the test passes unchanged at 1, 2, and 4
threads and runs in ~8 s serial.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Correct and strengthen the MGXS bootstrap regression test. Measurement showed
the weight-window-split Monte Carlo solve IS reproducible across runs and
thread counts given a bit-identical weight window file (<= 4e-16, i.e. tally
summation ulps) -- the secondary particle sorting and PRNG stream control work
as designed. The instability previously attributed to it actually originates
upstream: the multithreaded random ray solve carries last-ulp lock-order
jitter in the window bounds (~6e-16 run-to-run at 2 threads, bitwise zero
serially), and weight window split/roulette decisions amplify even a one-ulp
bound perturbation into order-unity tally changes (measured 78%), because
particle weights are set from the bound values themselves and comparisons can
sit exactly on decision boundaries.

The window-producing stages (stochastic_slab MGXS and the random ray FW-CADIS
solve) are therefore pinned to a single thread, making the entire workflow
bit-reproducible at any ambient thread count. This allows the bootstrapped
MGXS library -- and with it the weight window application machinery -- to be
reference-compared as well: the results digest now contains the slab, analog,
and bootstrapped MGXS libraries plus the weight window bounds.

Reference files regenerated against the official NNDC HDF5 data library
(matching the regression suite requirement) with OPENMC_ENABLE_STRICT_FP=on at
OMP_NUM_THREADS=2; the test passes unchanged at 1, 2, and 4 threads (~10 s
serial). Analog pocket-reach probability re-measured with NNDC data: 4.6e-5,
i.e. <1% chance per seed of a false analog cover.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Weight window roulette survivors are assigned weights derived from the window
bounds themselves (survival_ratio x lower bound), and subsequent integer
splitting can land a particle's weight exactly back on a bound value -- e.g.
(3 x lower)/3 compared against lower. The split/roulette branch taken is then
decided by the last ulp of the bound. Since weight window data carries
ulp-level noise (for example from non-associative parallel reductions in the
random ray or Monte Carlo solves that generated it), transport results were
chaotically sensitive to bit-level differences in the weight window file: a
one-ulp perturbation of the bounds was measured to change deep-tally results
by 78%, because a single flipped decision changes a secondary generation
count, which shifts the global track counter used to seed subsequent
secondaries and decorrelates the remainder of the simulation.

Comparing against the window with a small relative dead band (1e-9) removes
the sensitivity structurally: the thresholds no longer coincide with the
small-rational lattice of weights the algorithm produces, and a bound
perturbation moves a pinned weight and its threshold together. The same
one-ulp window perturbation now changes results by <1e-15. The band is
statistically negligible (1e-9 of a window spanning a factor of several) and
weight window games are unbiased regardless of threshold placement.

Two regression test references regenerated for trajectory changes at former
exact ties: weightwindows[False-local] (single value shift of ~2e-5) and the
bootstrap test's MGXS digest.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Move the weight window comparison tolerance to constants.h as
WEIGHT_WINDOW_REL_TOL alongside its rationale, rather than burying it as a
local in apply_weight_window().

Stabilize the split count as well: ceil(weight/upper) has tie points at every
integer ratio, and the weight window arithmetic itself manufactures exact
integer ratios through a second pathway -- a roulette survivor assigned
weight * max_split (e.g. 10x a weight that sits on the bound lattice), later
split in a cell whose upper bound is an exact multiple of the same lower bound
(10/5 = 2). Computing ceil((1 - tol) * ratio), clamped to at least 2, makes
ratios within the dead band of an integer round down consistently.

With both protections, a weight window file perturbed at the ulp level now
changes bootstrap-style deep-tally results by <1e-15 in both observed tie
pathways (previously order unity). The bootstrap regression test no longer
pins its window-generation stages to a single thread: all stages run at the
ambient thread count and the test passes unchanged at 1, 2, and 4 threads
against one set of reference files, so it now also guards the robustness of
weight window transport against the ulp-level reduction noise inherent to
multithreaded solves. Full weight window test suite passes without further
reference changes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Add the versionadded directive for the new weight_windows_file argument,
tighten the WEIGHT_WINDOW_REL_TOL comment, and apply clang-format to the
touched region of weight_windows.cpp.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test to follow the random_ray_auto_convert
pattern: generate weight windows (stochastic_slab MGXS + random ray FW-CADIS),
bootstrap the material-wise library via weight_windows_file, then run a random
ray solve with the bootstrapped library and compare a per-region, per-group
flux tally through the standard statepoint digest. The pocket flux depends
directly on the bootstrapped steel cross sections, and every upstream stage
feeds the compared tallies, so the workflow is tested implicitly without
custom result digests, coverage asserts, or an analog control run. The
reference file shrinks from ~240 kB to ~300 bytes. The test still runs all
stages at the ambient thread count and passes unchanged at 1, 2, and 4
threads against one set of references.

Also express the example sphere's center and radius in terms of its x extents
(surface at x = 120 flush with the pocket, x = -12 behind the source cavity)
so the geometry parameters document themselves.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Restructure the bootstrap regression test as the complete five-run workflow:
stochastic_slab MGXS generation, random ray FW-CADIS weight window generation,
material_wise MGXS generation bootstrapped with those windows, a second
FW-CADIS generation from the higher-fidelity library, and a final continuous
energy Monte Carlo solve using the improved windows, whose per-region flux
tallies are compared against the reference results through the standard
harness. Every stage of the workflow feeds the compared tallies, including
the weight window application in the final continuous energy solve.

Also center the example sphere at the origin with the source cavity offset
toward the -x surface (a pure translation of the previous geometry), which
describes the asymmetric-shielding design more conventionally.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Random ray requires ray origins sampled uniformly in space over the entire
domain, and ray sampling does not honor source domain constraints with the
halton sample method. Enclose the example's concrete sphere in a vacuum-bounded
box with a void gap so that the default box ray source from
convert_to_random_ray is uniform over the whole domain, and drop the test's
ray source override (which had restricted sampling to a box inscribed in the
sphere, biasing the ray density).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Drop the monkeypatch-based plumbing test, which duplicated coverage now
provided end-to-end by the bootstrap regression test while coupling the test
suite to a private helper. The remaining unit tests cover only the documented
validation paths (wrong-method warning and missing-file error), which the
regression test does not exercise and which require no nuclear data.

Also fix the grammar of the weight window split comment that was extended by
an earlier commit.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The path resolution behavior is exercised by the bootstrap regression test
(which passes a relative path while generation runs in a temporary directory),
a nonexistent path errors at runtime regardless, and the wrong-method warning
is a documented one-line courtesy that does not warrant a test fixture.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jtramm jtramm requested a review from pshriwise as a code owner June 10, 2026 16:44
@jtramm jtramm mentioned this pull request Jun 10, 2026
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant