Total: 1
The retrieval-augmented generation (RAG) technique enables generative AI models to extract accurate facts from external unstructureddata sources. For structured data, RAG is further augmented by function calls to query databases. This paper presents an industrialcase study that implements RAG in a large financial institution’s call center. The study showcases experiences and architecture for ascalable RAG deployment. It also introduces enhancements to RAG for retrieving facts from structured data sources using data embeddings, achieving low latency and high reliability. Our optimized production application demonstratesan average response time of only 7.33 seconds. Additionally, the paper compares various open-source and closed-source models for answer generation in an industrial context.