Abstract:
Anomaly detection on multidimensional or high-dimensional traffic data is an important application scenario in research on advanced persistent threat (APT) attacks. Traditional methods for processing high-dimensional traffic data pay insufficient attention to the internal structure of the data and have low detection efficiency. To address this, a semi-supervised anomaly detection method (VAE-GMM) that combines a variational autoencoder with a Gaussian mixture model is proposed. The method feeds traffic data into a generative network to obtain the corresponding low-dimensional representations and reconstruction errors, which serve as the input to a subsequent estimation network. The estimation network uses the Gaussian mixture model to predict the likelihood energy of each sample, and abnormal traffic is identified based on these energy values. The model redesigns the loss function and the reconstruction error, and trains the two networks jointly end to end, to escape local optima and improve the detection results. Experimental results show that the model outperforms other anomaly detection models and improves the recall rate by 13% compared with the traditional DAEGMM method.
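
The energy-based decision step can be illustrated with a minimal sketch. The abstract does not give the exact energy formula, so the code below assumes the common choice of the negative log-likelihood of a sample under the mixture, and it further assumes diagonal covariances for brevity. The names `gmm_energy` and `flag_anomalies`, the mixture parameters, and the threshold are hypothetical and are not taken from the paper; in the described method the input vector would be the low-dimensional representation joined with the reconstruction error.

```python
import math


def gmm_energy(z, weights, means, variances):
    """Negative log-likelihood ("energy") of z under a diagonal-covariance GMM.

    Assumption: the paper's "likelihood energy" is taken here to be
    -log(sum_k w_k * N(z; mu_k, diag(var_k))); the abstract does not confirm this.
    """
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log density of a diagonal Gaussian evaluated at z.
        log_pdf = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            for x, m, v in zip(z, mu, var)
        )
        log_terms.append(math.log(w) + log_pdf)
    # Log-sum-exp over components for numerical stability.
    top = max(log_terms)
    log_likelihood = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return -log_likelihood


def flag_anomalies(samples, weights, means, variances, threshold):
    """Mark a sample as anomalous when its energy exceeds a chosen threshold."""
    return [gmm_energy(z, weights, means, variances) > threshold for z in samples]


# Hypothetical single-component mixture over a 2-D input
# (for example, one latent dimension plus one reconstruction error).
weights = [1.0]
means = [[0.0, 0.0]]
variances = [[1.0, 1.0]]
samples = [[0.0, 0.0], [5.0, 5.0]]
flags = flag_anomalies(samples, weights, means, variances, threshold=5.0)
```

In this toy setup the sample far from the component mean receives a much higher energy than the sample at the mean, so only the former is flagged. In practice the threshold would have to be chosen from data, for instance from the energy distribution of known-normal traffic.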