Research and Application Practice of Large Language Model Private Deployment in Knowledge Base Question Answering Scenarios
Abstract: To address the urgent need for private deployment of large language models (LLMs) in enterprise knowledge base question answering (KBQA) scenarios, this study explores in detail the methods and workflow for building a localized knowledge base with retrieval-augmented generation (RAG) technology, using the Shanghai Cigarette Factory as a practical case. The study integrates the open-source Qwen1.5-32B model with the BAAI/BGE-large-zh-v1.5 embedding model, and presents a concrete construction scheme in which model selection carefully balances computational resource constraints against model performance. Experimental results show that the proposed RAG scheme effectively improves the accuracy of information retrieval and question answering, and that INT4 quantization significantly reduces GPU memory usage with minimal impact on answer quality. This work offers a technical scheme and practical guidance of reference value for the secure and efficient private deployment and application of LLMs in vertical domains.
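The RAG workflow the abstract summarizes can be illustrated with a minimal, self-contained sketch: embed the knowledge-base chunks and the query, retrieve the most similar chunks, and assemble them into a prompt for the LLM. This is not the paper's implementation; a toy bag-of-words embedding stands in for BAAI/BGE-large-zh-v1.5, and the chunk texts, `retrieve`, and `build_prompt` are illustrative names.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; in the paper's setup this role is
    # played by the BGE-large-zh-v1.5 dense embedding model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    # Rank knowledge-base chunks by similarity to the query, keep top-k.
    qv = embed(query)
    return sorted(chunks, key=lambda c: cosine(qv, embed(c)), reverse=True)[:k]

def build_prompt(query, chunks, k=2):
    # Ground the LLM's answer in the retrieved context (the RAG step).
    context = "\n".join(retrieve(query, chunks, k))
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"

chunks = [
    "The factory safety manual requires annual inspection of boilers.",
    "Employee onboarding takes place every Monday morning.",
    "Boiler pressure must not exceed the rated limit during operation.",
]
prompt = build_prompt("What does the manual say about boiler inspection?", chunks, k=2)
print(prompt)
```

In the deployed system, the prompt built this way would be sent to the locally hosted Qwen1.5-32B model, so that answers are grounded in the enterprise knowledge base rather than in the model's parametric memory alone.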
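The INT4 quantization the abstract credits with reducing GPU memory usage amounts to storing each weight as a 4-bit integer plus a shared scale. A minimal sketch of symmetric per-tensor INT4 quantization follows; it is an illustration of the general technique, not the specific quantization scheme used for Qwen1.5-32B, and the weight values are made up.

```python
def quantize_int4(weights):
    # Symmetric per-tensor INT4: map floats to integers in [-8, 7]
    # using a single scale derived from the largest magnitude.
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Recover approximate float weights; error is at most scale / 2
    # for values that were not clipped.
    return [v * scale for v in q]

weights = [0.12, -0.07, 0.31, -0.25, 0.02]
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)
```

Each weight now needs 4 bits instead of 16 or 32, which is where the memory saving comes from; the small rounding error per weight is why, as the abstract reports, the impact on answer quality can stay minimal.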