Physics-Based Machine Learning Methods for U-235 Forensics Signatures

Year
2024
Author(s)
Nageswara Rao - Oak Ridge National Laboratory
Caleb Redding - Oak Ridge National Laboratory
David Abrecht - Oak Ridge National Laboratory
David Hooper - Oak Ridge National Laboratory
Jennifer Ladd-Lively - Oak Ridge National Laboratory
Abstract
Signatures of low-intensity U-235 sources have recently been studied using a variety of machine learning (ML) classifiers whose features are extracted from NaI gamma spectral measurements. Their performance is tested using measurements collected with NaI detectors placed at various distances from the source in a configuration of two concentric circles and one spiral. These measurements were collected at the Low Scatter Irradiator facility at Savannah River National Laboratory. The U-235 source is introduced via a shielded conduit into the facility, where 21 NaI detectors are deployed over a 6 x 6 meter area. Multiple independent experimental runs were conducted, providing data sets for training and testing the ML classifiers. The counts in gamma spectral regions associated with U-235 are estimated at 1-second intervals and used as classifier features. Previously, eight well-known ML classifiers, based on five basic properties (smooth, non-smooth, statistical, structural, and hyper-parameter tuning methods), were trained and tested using measurements collected over multiple runs with and without the source. In addition, four fusers, each utilizing a subset of these classifiers, were trained and tested. These ML methods revealed complex classification performance; in particular, some classifiers and fusers overfit the training data, yielding an overly optimistic training error that is contradicted by the much higher error on independent test measurements. Furthermore, their performance is not directly explainable or relatable to the physical properties of the source, since their designs are mainly data-driven and opaque.
We present a novel regression-based ML method that first estimates the inverse of the distance to the source, wherein the background is represented as a source located at infinite distance. Then, a lower-bound threshold on the inverse distance estimate is used to infer the presence of a source, since the estimate for background (with no source) is near zero. We study the Ensemble of Trees (EOT) and Gaussian Process Regression (GPR) methods, which fit non-smooth and smooth regression functions, respectively, and a hyper-parameter auto-tuning and selection method (denoted by AUTO) that employs regression tree, neural network, and support vector machine methods in addition to EOT and GPR. Our results show that these ML methods avoid the overfitting observed with several previous ML classifiers and fusers, such as EOT and classification trees, and provide comparable classification error on independent test measurements; for example, the GPR method's error is within the [-0.01, 0.08] range relative to the best ML classifiers. Their error is directly related to the accuracy of the estimate of the physical distance to the source. Furthermore, the precision of the estimates determines the separability of the source and background signatures, which in turn determines the false alarm and missed detection rates of these methods. The monotonic decrease of the source strength with detector distance, combined with the Poisson distribution of the measurements, is used to analytically validate these methods by deriving their ML generalization equations, which mathematically characterize their performance on measurements beyond those used for training.
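The detection rule described above (regress counts onto inverse distance, then threshold the estimate) can be illustrated with a minimal sketch on synthetic data. Everything specific here is an assumption made for illustration and is not taken from the measurements: the background rate, the source strength, the inverse-square falloff, the single-detector setup, the test distances, and the 95th-percentile threshold choice. An ordinary least-squares line is used as a simple stand-in for the EOT and GPR regressors, so the numbers it produces say nothing about the reported results.

```python
import math
import random

# Hypothetical values chosen only for illustration; not from the measurements.
BACKGROUND_RATE = 5.0    # mean background counts per 1 s interval
SOURCE_STRENGTH = 40.0   # mean extra counts per 1 s interval at 1 m


def poisson(rng, lam):
    """Draw one Poisson(lam) sample with Knuth's method (adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate(rng, inv_distance, n):
    """Simulate n one-second counts; inv_distance = 0 encodes background."""
    # Assumed inverse-square falloff of the source contribution.
    lam = BACKGROUND_RATE + SOURCE_STRENGTH * inv_distance ** 2
    return [poisson(rng, lam) for _ in range(n)]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def fraction(flags):
    """Fraction of True values in an iterable of booleans."""
    flags = list(flags)
    return sum(flags) / len(flags)


def run_demo(seed=0):
    rng = random.Random(seed)
    distances = [1.0, 1.5, 2.0, 3.0]  # metres, hypothetical
    n_bkg, n_src = 800, 200

    # Training set: background first (label 0), then each source distance.
    counts = simulate(rng, 0.0, n_bkg)
    labels = [0.0] * n_bkg
    for d in distances:
        counts += simulate(rng, 1.0 / d, n_src)
        labels += [1.0 / d] * n_src
    slope, intercept = fit_line(counts, labels)

    def estimate(count):
        """Estimated inverse distance for one count value."""
        return slope * count + intercept

    # Lower-bound threshold: 95th percentile of training background estimates.
    bkg_estimates = sorted(estimate(c) for c in counts[:n_bkg])
    threshold = bkg_estimates[int(0.95 * n_bkg)]

    # Independent test draws: false alarms on background, misses per distance.
    false_alarm = fraction(estimate(c) > threshold
                           for c in simulate(rng, 0.0, 2000))
    missed = {d: fraction(estimate(c) <= threshold
                          for c in simulate(rng, 1.0 / d, 2000))
              for d in distances}
    return {"slope": slope, "threshold": threshold,
            "false_alarm": false_alarm, "missed": missed}


if __name__ == "__main__":
    result = run_demo()
    print("false alarm rate:", result["false_alarm"])
    for d, rate in result["missed"].items():
        print(f"missed detection rate at {d} m:", rate)
```

In this toy setting the missed detection rate grows with distance because the Poisson spread of the counts makes far-source estimates overlap with background estimates, which is the separability effect the abstract attributes to the precision of the inverse distance estimate.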