A primary limitation of modern probabilistic risk assessment (PRA) is that, because risk scenarios and system vulnerabilities are developed manually, the results critically depend on the analysts’ qualifications, on the available information about the system, and on the analysts’ ability to understand and “discover” the system vulnerabilities (as well as to properly describe them using Boolean logic). In other words, modern PRA is a method of documenting analysts’ discoveries rather than suggesting new, previously unknown risks. This paper describes a method for auto-detecting possible vulnerabilities in system designs, thus revealing previously unseen issues and reducing human error and cost by enabling analysts to focus on critical areas via intelligent, efficient sampling of the system’s parameter space.
For existing systems with available fault trees, we developed a proof-of-principle methodology: it first stochastically generates large volumes of training data by “rewiring” fault trees of the target system and then learns the most important features of that data. Rewiring involves randomly changing gate logic and the occurrence of fundamental events (i.e., basic or initiating events) in a fault tree. Because the training data are derived by rewiring existing target trees, they remain skewed toward the existing system while still providing the needed variation, since millions of training examples can be generated. The combination of fault tree gate logic and the Boolean variables representing initiating events constitutes a configuration usable as an input vector for support vector machine (SVM) training.
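To make the rewiring step concrete, the following Python sketch illustrates one possible realization. The nested tree representation, the flip probabilities, and the flattening-based encoding are illustrative assumptions, not the exact implementation used in this work.

```python
import random

# A minimal sketch of the rewiring step, assuming a nested fault-tree
# representation: leaves are basic-event indices, internal nodes are
# (gate_type, [children]). Structure and encoding are illustrative.

def rewire(node, n_events, p_gate=0.1, p_event=0.1):
    """Randomly flip gate logic (AND <-> OR) and swap basic events."""
    if isinstance(node, int):                      # leaf: a basic event
        return random.randrange(n_events) if random.random() < p_event else node
    gtype, children = node
    if random.random() < p_gate:                   # flip this gate's logic
        gtype = "OR" if gtype == "AND" else "AND"
    return (gtype, [rewire(c, n_events, p_gate, p_event) for c in children])

def evaluate(node, states):
    """Evaluate the top event for given Boolean basic-event states."""
    if isinstance(node, int):
        return states[node]
    gtype, children = node
    vals = [evaluate(c, states) for c in children]
    return all(vals) if gtype == "AND" else any(vals)

def encode(node, states):
    """Flatten a configuration into one input vector: pre-order gate types
    (0=AND, 1=OR) followed by Boolean event occurrences. Rewiring preserves
    topology, so all vectors have the same length."""
    def gate_bits(n):
        if isinstance(n, int):
            return []
        g, cs = n
        return [0 if g == "AND" else 1] + [b for c in cs for b in gate_bits(c)]
    return gate_bits(node) + [int(s) for s in states]

# Example: generate one labeled training sample from a two-gate tree.
tree = ("AND", [("OR", [0, 1]), ("OR", [2, 3])])
states = [random.random() < 0.5 for _ in range(4)]
variant = rewire(tree, n_events=4)
x = encode(variant, states)              # input vector (the "configuration")
y = int(evaluate(variant, states))       # label: 1 = failure, 0 = non-failure
```

Evaluating the top event of each rewired tree supplies the failure/non-failure label, so large labeled training sets can be generated without any additional annotation.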
During training, the SVM algorithm both classifies each input vector and identifies the support vectors (SVs) in the training data. By the very nature of the training algorithm, the SVM focuses only on the points that are most difficult to tell apart. Because in our case the points are realizations of fault trees, the SVM discovers the most similar fault trees from the two classes, thereby pointing to the most “vulnerable” configurations. The SVs are the inputs most important for separating the two classes (i.e., failure vs. non-failure) and, most notably, represent only a very small portion of the input data. Because the inputs are fault tree characterizations, SVM training yields the system configurations that form the borderline between failure and non-failure scenarios. These “support trees” are then scrutinized for insights into the system’s logical vulnerabilities and risks.
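The following sketch shows how the borderline configurations might be extracted, assuming scikit-learn as the SVM implementation and the rewiring/encoding helpers from the previous sketch; the kernel choice and sample count are illustrative, not the settings used in this work.

```python
import random
import numpy as np
from sklearn.svm import SVC

# Assumes rewire(), evaluate(), encode(), and tree from the sketch above.
# Generate a large volume of labeled, rewired configurations.
X, y = [], []
for _ in range(10_000):
    states = [random.random() < 0.5 for _ in range(4)]
    variant = rewire(tree, n_events=4)
    X.append(encode(variant, states))
    y.append(int(evaluate(variant, states)))

clf = SVC(kernel="rbf").fit(np.array(X), np.array(y))

# The support vectors are the borderline configurations: the rewired
# trees that are hardest to separate into failure vs. non-failure.
border_configs = clf.support_vectors_      # encoded "support trees"
border_indices = clf.support_              # indices into X, for inspection
print(f"{len(border_indices)} support trees out of {len(X)} samples")
```

Only the samples indexed by `clf.support_` matter for the decision boundary; decoding these vectors back into fault trees identifies the specific gate-logic and event combinations that sit between failure and non-failure.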
The primary outcome of this research is a new, broadly applicable methodology in which intelligently guided sampling of the parameter space drastically reduces the number of system configurations that must be analyzed. This methodology will enable researchers to auto-detect possible vulnerabilities in system designs, devices, and networks, revealing previously unseen issues and reducing human error and cost by focusing analyst effort on critical areas.