Feature Importance And Classification From Disparate Data Streams For Facility Pattern-of-life Characterization In The Context Of Nuclear Safeguards

Year

2021

Author(s)

Emily Casleton - Los Alamos National Laboratory

Paul M Mendoza - Los Alamos National Laboratory

Rosalyn Rael - Los Alamos National Laboratory

Jonathan Woodring - Los Alamos National Laboratory

Vlad Henzl - Los Alamos National Laboratory

File Attachment

a409.pdf (615.81 KB)

Abstract

Current practices for safeguards verification rely largely on manual, independent interpretation of data streams from separate sensors, followed by expert-driven contextual interpretation. The work presented here develops a framework to integrate and exploit the wealth of information contained in large, heterogeneous datasets. The approach is demonstrated on persistent, disparate data collected from a nuclear training facility at Los Alamos National Laboratory. Features are first extracted from the individual data streams; various feature-importance metrics are then used to down-select the feature matrix to the subset that best characterizes facility operations, i.e., the facility's "pattern of life". The selected features are then input into supervised learning methods to classify modes of facility operation. Fusing these disparate data streams yields a more accurate characterization of facility operations than any individual data stream provides, with a rather high degree of confidence. In addition, the feature-importance criterion allows us to rank the sensor modalities by the information they contribute to characterizing facility operations. Such a ranking can be useful when only a limited number of sensors can be deployed, or when planning the implementation of safeguards measures. The framework can be adapted and applied to other types of facilities and sensors and their associated data streams.
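The workflow described in the abstract (rank features, down-select, then classify operating modes) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the importance metrics or classifiers used, so a Fisher score and a nearest-centroid classifier stand in for them here. The feature names (`radiation_rate`, `door_events`, `ambient_temp`), the two operating modes, and the synthetic data are all hypothetical.

```python
import random
from statistics import mean, pvariance

def fisher_score(values, labels):
    """Between-class variance of class means divided by pooled within-class variance."""
    classes = sorted(set(labels))
    groups = [[v for v, y in zip(values, labels) if y == c] for c in classes]
    overall = mean(values)
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups) / len(values)
    within = sum(len(g) * pvariance(g) for g in groups) / len(values)
    return between / within if within > 0 else float("inf")

def rank_features(X, y, names):
    """Return feature names sorted from most to least informative, plus the scores."""
    scores = {n: fisher_score([row[j] for row in X], y) for j, n in enumerate(names)}
    return sorted(names, key=lambda n: scores[n], reverse=True), scores

def fit_centroids(X, y, cols):
    """Per-class mean vector over the selected feature columns."""
    centroids = {}
    for c in set(y):
        rows = [row for row, label in zip(X, y) if label == c]
        centroids[c] = [mean(r[j] for r in rows) for j in cols]
    return centroids

def predict(centroids, row, cols):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def dist(c):
        return sum((row[j] - m) ** 2 for j, m in zip(cols, centroids[c]))
    return min(centroids, key=dist)

def make_data(n, rng):
    """Synthetic stand-in for features extracted from three sensor streams (assumed)."""
    X, y = [], []
    for _ in range(n):
        mode = rng.choice(["idle", "operating"])
        shift = 3.0 if mode == "operating" else 0.0
        X.append([
            rng.gauss(shift, 1.0),        # radiation_rate: strongly informative
            rng.gauss(0.5 * shift, 1.0),  # door_events: weakly informative
            rng.gauss(0.0, 1.0),          # ambient_temp: uninformative
        ])
        y.append(mode)
    return X, y

NAMES = ["radiation_rate", "door_events", "ambient_temp"]  # hypothetical features
rng = random.Random(0)
X_train, y_train = make_data(400, rng)
X_test, y_test = make_data(200, rng)

# Rank features, keep the top two, then classify held-out samples.
ranking, scores = rank_features(X_train, y_train, NAMES)
top_cols = [NAMES.index(n) for n in ranking[:2]]
centroids = fit_centroids(X_train, y_train, top_cols)
hits = sum(predict(centroids, r, top_cols) == t for r, t in zip(X_test, y_test))
accuracy = hits / len(y_test)
print("ranking:", ranking)
print("held-out accuracy:", round(accuracy, 3))
```

Because each feature here maps to one hypothetical sensor, the feature ranking doubles as a ranking of sensor modalities. With several features per sensor, the scores would need to be aggregated per modality, and the aggregation rule would be a further assumption.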