Abstract:
Big data technology enables new approaches to password security analysis and has revealed typical semantic patterns in passwords, such as surnames, common words, and common digit strings. However, a large share of password content still cannot be assigned to any known semantic category, which limits the success rate of password guessing. This paper proposes a password modeling method based on context semantics, which consists of three main steps: reducing invalid semantics in password segmentation, establishing the length distribution of unknown semantics, and introducing unknown semantics into the context-free model. Attack experiments were carried out on CSDN, a large password dataset. The results show that the proposed semantic password modeling method can improve the efficiency of the existing password guessing model to some extent.
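
To make the second and third steps concrete, the following is a minimal illustrative sketch, not the method evaluated in the paper. It assumes a toy lexicon (`KNOWN`), a greedy longest-match segmenter, and the tag names `SURNAME`, `WORD`, `DIGITS`, and `UNK`; all of these names and the function names are hypothetical. The step that reduces invalid semantics during segmentation is not reproduced here.

```python
from collections import Counter

# Hypothetical toy lexicon; a real system would draw on large lists of
# surnames, common words, and common digit strings.
KNOWN = {"wang": "SURNAME", "love": "WORD", "123456": "DIGITS", "1234": "DIGITS"}


def segment(password, lexicon=KNOWN):
    """Greedy longest-match segmentation (an assumption of this sketch).

    Characters that match no lexicon entry are merged into a single
    ("UNK", text) segment, standing in for "unknown semantics".
    """
    segments, unknown, i = [], "", 0
    while i < len(password):
        match = None
        for j in range(len(password), i, -1):  # try the longest candidate first
            if password[i:j] in lexicon:
                match = password[i:j]
                break
        if match:
            if unknown:  # close the pending unknown run before the known segment
                segments.append(("UNK", unknown))
                unknown = ""
            segments.append((lexicon[match], match))
            i += len(match)
        else:
            unknown += password[i]
            i += 1
    if unknown:
        segments.append(("UNK", unknown))
    return segments


def unknown_length_distribution(passwords):
    """Relative frequency of each unknown-segment length in the training set."""
    counts = Counter(
        len(text) for pw in passwords for tag, text in segment(pw) if tag == "UNK"
    )
    total = sum(counts.values())
    return {length: n / total for length, n in sorted(counts.items())}


def structure_counts(passwords):
    """Count password structures, with unknown segments kept as length-tagged
    symbols (e.g. "UNK2") so they can sit beside known tags in a
    context-free-style structure table."""
    return Counter(
        tuple(tag if tag != "UNK" else f"UNK{len(text)}" for tag, text in segment(pw))
        for pw in passwords
    )
```

For example, `segment("wangxy1234")` splits the password into a surname, a two-character unknown segment, and a digit string. Tagging unknown segments by length is one simple way to let them take part in structure probabilities without assigning them a semantic category.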