ENCODER-DECODER SPEECH RECOGNITION BASED ON A TIME-SYNCHRONOUS RECURSIVE ATTENTION MECHANISM

Abstract: To achieve both high accuracy and real-time operation in speech recognition, an encoder-decoder speech recognition method based on a time-synchronous recursive attention mechanism is proposed. A windowless attention mechanism is introduced that does not require multiple rounds of training, which shortens model preparation time. The context vector is obtained with a time-synchronous recursive update rule instead of a formula based on a kernel smoother, and the trade-off between latency and recognition performance is controlled by adjusting a scalar threshold associated with the attention endpoint decision. Experiments show that the proposed method maintains recognition accuracy while also enabling online recognition.
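The contrast between a kernel-smoother formula and a recursive, frame-by-frame update can be illustrated with a minimal sketch. It is not the formulation used in this paper. It assumes that each frame carries an unnormalized weight exp(e_t) and a per-frame endpoint probability that is compared against the scalar threshold. The names `streaming_context`, `scores`, `frames`, and `stop_probs` are hypothetical.

```python
import math


def streaming_context(scores, frames, stop_probs, threshold=0.5):
    """Accumulate a context vector one encoder frame at a time.

    scores:     attention energies e_t, one per frame (assumed input)
    frames:     encoder outputs h_t, a list of equal-length lists
    stop_probs: hypothetical per-frame endpoint probabilities in [0, 1]
    threshold:  scalar; a lower value ends attention earlier (less latency)
    Returns (context_vector, frames_consumed).
    """
    context = [0.0] * len(frames[0])
    denom = 0.0  # running sum of weights seen so far
    used = 0
    for e_t, h_t, p_stop in zip(scores, frames, stop_probs):
        w = math.exp(e_t)
        denom += w
        # Recursive form of the weighted mean: no window over past frames
        # is stored, only the running context and the running denominator.
        step = w / denom
        context = [c + step * (h - c) for c, h in zip(context, h_t)]
        used += 1
        # Endpoint decision: stop consuming frames once the endpoint
        # probability reaches the threshold.
        if p_stop >= threshold:
            break
    return context, used


if __name__ == "__main__":
    ctx, n = streaming_context([0.0, 0.0], [[1.0], [3.0]], [0.0, 0.0])
    print(ctx, n)
```

When the endpoint is never triggered, the running update gives the same result as the usual normalized weighted average over all frames. Lowering `threshold` makes the loop stop on earlier frames, which trades context for latency.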