AN EMPLOYEE ATTRITION PREDICTION METHOD BASED ON IMPROVED SMOTE AND ADABOOST

  • Abstract: Predicting employees' intention to leave is the basis for reducing an enterprise's attrition rate. Employee attrition data are imbalanced: they usually contain far fewer separated employees (instances) than active employees. This paper proposes SMOTE-AdaBoost, an employee attrition prediction method for imbalanced data based on SMOTE and AdaBoost. First, an improved SMOTE algorithm is constructed to balance the data. Because employee attrition data contain both continuous and discrete features, the improved SMOTE uses a new distance measure and a new strategy for generating synthetic instances. Then, an ensemble model for employee attrition prediction is built with the AdaBoost strategy, using decision trees as the base classifiers. The method is verified on two employee attrition datasets. The experimental results indicate that the improved SMOTE algorithm greatly improves the model's prediction performance for separated employees, and that the proposed SMOTE-AdaBoost algorithm performs significantly better at employee attrition prediction than several classical classification algorithms.
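The abstract does not specify the new distance measure or the instance-generating strategy, so the following is only a minimal sketch of the general idea of oversampling minority instances that mix continuous and discrete features. It assumes a distance that adds squared differences on continuous features to a mismatch count on discrete features, interpolation for continuous features, and a majority vote among neighbours for discrete features. The function names, parameters, and toy rows are hypothetical and are not the paper's method.

```python
import random
from collections import Counter


def mixed_distance(a, b, cont_idx, disc_idx):
    """Assumed mixed distance: squared gaps on continuous features
    plus one unit per mismatching discrete feature."""
    cont = sum((a[i] - b[i]) ** 2 for i in cont_idx)
    disc = sum(1 for i in disc_idx if a[i] != b[i])
    return (cont + disc) ** 0.5


def smote_mixed(minority, cont_idx, disc_idx, n_new, k=3, seed=0):
    """Generate n_new synthetic minority rows (illustrative sketch only)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [x for x in minority if x is not base]
        # k nearest minority neighbours under the assumed mixed distance
        neighbours = sorted(
            others,
            key=lambda x: mixed_distance(base, x, cont_idx, disc_idx),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()
        new = list(base)
        # Continuous features: interpolate between base and the chosen neighbour
        for i in cont_idx:
            new[i] = base[i] + gap * (mate[i] - base[i])
        # Discrete features: most frequent value among the neighbours
        for i in disc_idx:
            new[i] = Counter(n[i] for n in neighbours).most_common(1)[0][0]
        synthetic.append(new)
    return synthetic


# Hypothetical separated-employee rows:
# [scaled age, scaled income, department, overtime]
minority = [
    [0.20, 0.10, "sales", "yes"],
    [0.25, 0.15, "sales", "yes"],
    [0.30, 0.12, "rnd", "yes"],
    [0.22, 0.30, "sales", "no"],
]
new_rows = smote_mixed(minority, cont_idx=[0, 1], disc_idx=[2, 3], n_new=4)
```

In this sketch the continuous features are assumed to be scaled to comparable ranges beforehand, so that no single feature dominates the distance. Once the discrete features are encoded numerically, the balanced data could be passed to a boosted ensemble of decision trees, for example scikit-learn's `AdaBoostClassifier`.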

     
