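Query result: Wang Shuai, Cai Leixin, Gu Ti, Lü Qiang. Fitting the RNA secondary structure of scoring function with bidirectional LSTM [J]. Computer Applications and Software, 2017, 34(9): 232-239.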
Chinese title
运用双向LSTM拟合RNA二级结构打分函数
Section
Algorithms
Abstract views
765
English title
FITTING THE RNA SECONDARY STRUCTURE OF SCORING FUNCTION WITH BIDIRECTIONAL LSTM
Authors
王帅 (Wang Shuai), 蔡磊鑫 (Cai Leixin), 顾倜 (Gu Ti), 吕强 (Lü Qiang)
Affiliations (Chinese)
苏州大学计算机科学与技术学院 江苏 苏州 215006;苏州大学江苏省计算机信息处理技术重点实验室 江苏 苏州 215006
Affiliations (English)
School of Computer Science and Technology, Soochow University, Suzhou 215006, Jiangsu, China; Provincial Key Laboratory for Computer Information Processing Technology, Soochow University, Suzhou 215006, Jiangsu, China
Keywords (Chinese)
RNA;打分函数;二级结构;双向LSTM
Keywords
RNA; Scoring function; Secondary structure; Bidirectional LSTM
Funding
National Natural Science Foundation of China (61170125)
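Author information
Wang Shuai, master's student, main research area: bioinformatics computing. Cai Leixin, master's student. Gu Ti, master's student. Lü Qiang, professor.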
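Abstract (translated from the Chinese)
Scoring functions for RNA secondary structure play an increasingly important role in RNA secondary structure prediction. Existing scoring functions for RNA secondary structure do not capture the RNA folding mechanism well. We argue that the way information is passed between the layers of a recurrent neural network resembles the way RNA folds, and we propose using a bidirectional LSTM (Long Short-Term Memory) neural network to score RNA secondary structures. Three experiments were carried out on the ASE dataset (lengths below 500) and the CRW dataset (most lengths above 1,000). Fitting the SEN (sensitivity) and PPV (specificity) scoring functions showed that the fit is best when the objective function is mean_squared_error; the more complex MCC (Matthews correlation coefficient) scoring function was then fitted; finally, the experiments showed that a two-layer bidirectional LSTM model outperforms a single-layer bidirectional LSTM model. The scoring function obtained in these experiments incorporates global properties of the base sequence. The experimental results show that a deep LSTM neural network model can fit RNA secondary structure scoring functions well.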
Abstract
Scoring functions play an increasingly important role in RNA secondary structure prediction. At present, scoring functions for RNA secondary structure do not capture the RNA folding mechanism well. We believe that this mechanism resembles the way information is passed between the layers of a recurrent neural network. We therefore used a bidirectional Long Short-Term Memory (LSTM) neural network to score RNA secondary structures. We conducted three experiments on the datasets ASE (lengths below 500) and CRW (most lengths above 1,000). By fitting the sensitivity (SEN) and specificity (PPV, positive predictive value) scoring functions, we determined that the fit was best when the objective function was mean_squared_error. We then fitted the more complex Matthews correlation coefficient (MCC) scoring function. Finally, the two-layer bidirectional LSTM model gave better results than the single-layer bidirectional LSTM model. The scoring function obtained in these experiments incorporates global properties of the base sequence. Our results show that an LSTM neural network model can fit the scoring function of RNA secondary structure well.
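The three quantities named in the abstract (SEN, PPV and MCC) are standard base-pair accuracy measures, and a small worked example may help make them concrete. The sketch below is not the authors' code, and this record does not say how their training targets were computed. It assumes structures are given as pseudoknot-free dot-bracket strings, and it counts true negatives over all n(n-1)/2 position pairs, which is one common convention and an assumption here. The function names `parse_pairs` and `score_structure` are hypothetical.

```python
import math


def parse_pairs(dot_bracket):
    """Return the set of (i, j) base pairs encoded by a dot-bracket string."""
    stack, pairs = [], set()
    for j, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(j)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % j)
            pairs.add((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced '('")
    return pairs


def score_structure(predicted, reference):
    """Compute SEN, PPV and MCC of a predicted structure against a reference."""
    if len(predicted) != len(reference):
        raise ValueError("structures must have the same length")
    n = len(reference)
    pred, ref = parse_pairs(predicted), parse_pairs(reference)

    tp = len(pred & ref)   # pairs present in both structures
    fp = len(pred - ref)   # predicted pairs absent from the reference
    fn = len(ref - pred)   # reference pairs that were missed
    # Assumption: every other pair of positions counts as a true negative.
    tn = n * (n - 1) // 2 - tp - fp - fn

    sen = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"SEN": sen, "PPV": ppv, "MCC": mcc}


if __name__ == "__main__":
    # The prediction recovers one of the two reference pairs.
    print(score_structure("(....)", "((..))"))
```

In this example the prediction finds one of the two reference pairs and adds no wrong pairs, so SEN is 0.5 and PPV is 1.0. MCC combines all four counts into one value, which is consistent with the abstract calling it the more complex function to fit.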