Subspace-based multivariable anomaly detection with interpretability


2022,30(11):38-45

Song Runkui (宋润葵), Zheng Yangfei (郑扬飞), Guo Hongyu (郭红钰), Li Qian (李倩)

North China Institute of Computing Technology (华北计算技术研究所)

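Abstract: In multidimensional feature data, some anomalous points go undetected because they are masked by the scattered feature space, and current anomaly detection methods produce results that are poorly interpretable or not interpretable at all. To address both problems, a subspace-based interpretable multivariate anomaly detection algorithm is proposed. First, a distribution test is performed on each feature dimension of the multidimensional feature space. On this basis, a set of feature spaces is selected for each object, and an outlier score is then computed for each object. Throughout this process, the intermediate results of the algorithm are reused to explain its output and to improve the user's understanding of the data. The algorithm was validated on real-world datasets; the experimental results show that it achieves good accuracy and running time and explains well why the detected points are anomalous.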

Keywords: anomaly detection; interpretability; point anomalies; statistical data; machine learning; unsupervised learning

Subspace-based multivariable anomaly detection with interpretability

Song Runkui (宋润葵), Li Qian (李倩)

Abstract: In multidimensional feature data, some outliers cannot be detected because they are masked by the scattered feature space, and the results of current anomaly detection methods are poorly interpretable or not interpretable at all. To address these problems, a subspace-based multivariate anomaly detection algorithm with interpretability is proposed. First, a distribution test is carried out on the features of each dimension in the multidimensional feature space. On this basis, a set of feature spaces is selected for each object, and an outlier score is then calculated for each object. In this process, the intermediate products of the algorithm are reused to interpret its results and to help users understand the data. The experimental results show that the algorithm has good accuracy and running time and explains the abnormality of the outliers well.

Keywords: anomaly detection; interpretability; point anomalies; statistical data; machine learning; unsupervised learning
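
The pipeline outlined in the abstract (a per-feature check, a per-object choice of features, a per-object score, and an explanation built from the intermediate results) can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. The sketch assumes a robust z-score (median and MAD) as a stand-in for the distribution test, takes each object's `k` most deviating features as its subspace, and uses the mean deviation over that subspace as the score. The names `robust_deviation`, `subspace_scores`, and `k` are hypothetical.

```python
import numpy as np

def robust_deviation(X):
    """Per-feature robust z-score: |x - median| / (1.4826 * MAD).

    Assumption: this replaces the paper's per-dimension distribution test.
    """
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    scale = np.where(mad > 0, 1.4826 * mad, 1.0)  # guard against zero MAD
    return np.abs(X - med) / scale

def subspace_scores(X, k=2):
    """Score each object in its own k most deviating features.

    Returns (scores, subspaces). scores[i] is the mean deviation of object i
    over its selected features. subspaces[i] lists those feature indices,
    most deviating first, and doubles as the explanation of the score.
    """
    dev = robust_deviation(np.asarray(X, dtype=float))
    subspaces = np.argsort(-dev, axis=1)[:, :k]
    scores = np.take_along_axis(dev, subspaces, axis=1).mean(axis=1)
    return scores, subspaces

# Synthetic data: 200 objects, 6 features, one planted point anomaly.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
X[0, 3] = 20.0  # object 0 is anomalous only in feature 3

scores, subspaces = subspace_scores(X, k=2)
top = int(np.argmax(scores))
print("highest-scoring object:", top)
print("features explaining it:", subspaces[top].tolist())
```

Because the selected feature indices are a by-product of scoring, the explanation costs no extra computation, which mirrors the abstract's point about reusing intermediate results.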

Received: 2022-04-18

Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0105100)
