Abstract:
The large number of university rules and regulations confuses faculty and students and makes it difficult for the regulations to serve their intended purpose. To address this problem, a question-answering large language model for regulatory documents was developed. Campus regulations were collected into a knowledge base, and a retriever and a generator were built on it using retrieval-augmented generation (RAG), yielding a vertical-domain model for campus regulations. On purpose-built evaluation datasets, the model achieved a semantic similarity score of 0.9221, an answer relevance score of 0.8060, and an answer correctness score of 0.6006, outperforming the baseline model by 0.0271, 0.0868, and 0.1137 points, respectively. The model mitigated problems that the base model exhibits in vertical domains, such as weak domain-specific semantic comprehension, ineffective responses, and factual inaccuracies or hallucinations. This research contributes to the study of university regulation governance and of more human-centered interactive question answering, and it offers an innovative approach for promoting the digital transformation and intelligent management of universities.