RNSQL: TEXT2SQL GENERATION BASED ON REVERSE NORMALIZATION

Abstract: Text2SQL is an important task in natural language processing and plays a key role in research on intelligent question answering systems. Its core task is to automatically convert questions posed in natural language into SQL queries. Current research focuses on improving the matching accuracy of individual SQL clauses but overlooks the syntactic correctness of the generated SQL, and SQL generation involving multi-table joins still produces a large number of errors. This paper therefore proposes a neural network-based Text2SQL method, called the Reverse Normalization SQL network (RNSQL), which restructures the database schema through reverse normalization and focuses on the syntactic correctness of the generated SQL. Theoretical analysis and experiments on the public Spider dataset show that RNSQL effectively improves the quality of Text2SQL results.
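
The abstract does not describe how the schema is restructured. As a purely illustrative sketch of the general idea behind reverse normalization (denormalization), and not the RNSQL implementation, the Python example below uses the standard sqlite3 module to merge two related tables into one wide view, so that a question needing a JOIN on the original schema can be answered with a single-table query. The tables singer and concert, the view concert_wide, and both helper functions are hypothetical names invented for this illustration.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny normalized schema plus a denormalized (wide) view.

    All table, column, and view names are hypothetical examples.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (
            singer_id INTEGER PRIMARY KEY,
            name      TEXT
        );
        CREATE TABLE concert (
            concert_id INTEGER PRIMARY KEY,
            singer_id  INTEGER REFERENCES singer(singer_id),
            year       INTEGER
        );
        INSERT INTO singer VALUES (1, 'Ann'), (2, 'Bo');
        INSERT INTO concert VALUES (10, 1, 2014), (11, 1, 2015), (12, 2, 2014);

        -- Reverse normalization: fold the foreign-key join into one wide relation.
        CREATE VIEW concert_wide AS
        SELECT c.concert_id, c.year, s.singer_id, s.name AS singer_name
        FROM concert AS c
        JOIN singer AS s ON c.singer_id = s.singer_id;
        """
    )
    return conn


def concerts_per_singer(conn: sqlite3.Connection) -> tuple[list, list]:
    """Answer 'how many concerts per singer?' on both schema variants."""
    # Query against the original schema: the generator must produce a correct JOIN.
    joined = conn.execute(
        """
        SELECT s.name, COUNT(*)
        FROM concert AS c
        JOIN singer AS s ON c.singer_id = s.singer_id
        GROUP BY s.name
        ORDER BY s.name
        """
    ).fetchall()
    # Query against the wide view: a single-table SELECT with no JOIN clause.
    flat = conn.execute(
        """
        SELECT singer_name, COUNT(*)
        FROM concert_wide
        GROUP BY singer_name
        ORDER BY singer_name
        """
    ).fetchall()
    return joined, flat


if __name__ == "__main__":
    joined_rows, flat_rows = concerts_per_singer(build_demo_db())
    print(joined_rows)
    print(flat_rows)
```

Both queries return the same rows. The second one contains no JOIN clause, which is the kind of simplification a schema restructuring step could offer a SQL generator, under the assumptions stated above.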

     
