A CODE CLONE SEARCH METHOD BASED ON CODE FEATURES
Abstract: Current research on code clone search focuses mainly on clones whose implementations are identical or differ only slightly, and existing methods perform poorly on semantic code clones. To improve the accuracy of semantic code clone search, this paper proposes a code clone search method based on code features. The method builds a code graph for each code fragment, extracts key nodes from the graph to construct a semantic feature representation, and searches using an inverted index together with a TF-IDF-based scoring algorithm. Experimental results show that the proposed method considerably outperforms existing methods in semantic code clone search.
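The retrieval step named in the abstract (an inverted index scored with TF-IDF) can be illustrated with a minimal sketch. It assumes each code fragment has already been reduced to a list of feature tokens. The feature names (`loop`, `add`, `array_read`, and so on), the fragment identifiers, and the exact weighting (square-root term frequency times a smoothed logarithmic IDF) are hypothetical choices made for illustration; the abstract does not specify them, and the graph construction and key-node extraction are not shown.

```python
import math
from collections import Counter, defaultdict


def build_index(corpus):
    """Build an inverted index: feature -> {fragment_id: term frequency}.

    `corpus` maps a fragment id to its list of feature tokens
    (hypothetical features standing in for extracted key nodes).
    """
    index = defaultdict(dict)
    for frag_id, features in corpus.items():
        for feat, count in Counter(features).items():
            index[feat][frag_id] = count
    return index


def search(index, n_docs, query_features, top_k=5):
    """Rank fragments by a TF-IDF score summed over the query's features.

    The weighting below is an assumed variant, not the paper's formula:
    score = sum over matched features of sqrt(tf) * log(1 + N / df).
    """
    scores = defaultdict(float)
    for feat in set(query_features):
        postings = index.get(feat)
        if not postings:
            continue  # feature never seen in the corpus
        idf = math.log(1 + n_docs / len(postings))
        for frag_id, tf in postings.items():
            scores[frag_id] += math.sqrt(tf) * idf
    # Sort by descending score, breaking ties by fragment id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]


# Toy corpus: two fragments that sum an array in different ways,
# plus an unrelated file-copy fragment.
corpus = {
    "sum_loop": ["loop", "add", "array_read", "return_int"],
    "sum_stream": ["stream", "add", "array_read", "return_int"],
    "file_copy": ["loop", "io_read", "io_write"],
}
index = build_index(corpus)
query = ["add", "array_read", "loop", "return_int"]
results = search(index, len(corpus), query)
for frag_id, score in results:
    print(f"{frag_id}: {score:.3f}")
```

In this toy corpus, `sum_stream` shares three of the four query features with the loop-based query even though its control structure differs, so it ranks above the unrelated `file_copy` fragment. This is the kind of match that feature-level indexing is meant to surface.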