Experience with Large Language Model Applications for Information Retrieval from Enterprise Proprietary Data
Published in Product-Focused Software Process Improvement (PROFES 2024), 2024
Recommended citation: L Yu, E Alégroth, P Chatzipetrou, T Gorschek (2024). "Experience with Large Language Model Applications for Information Retrieval from Enterprise Proprietary Data." PROFES 2024. https://doi.org/10.1007/978-3-031-78386-9_7
Large Language Models (LLMs) offer promising capabilities for information retrieval and processing. However, deploying LLMs to query proprietary enterprise data poses unique challenges, particularly for companies with strict data-security policies. This study shares our experience setting up a secure LLM environment within a FinTech company and using it for enterprise information retrieval while adhering to data-privacy protocols. We conducted three workshops and 30 interviews with industrial engineers to gather data and requirements; the interviews further enriched the insights collected from the workshops. We report the steps to deploy an LLM solution in an industrial sandboxed environment and the lessons learned from the experience. These lessons cover LLM configuration (e.g., chunk_size and top_k settings), local document ingestion, and the evaluation of LLM outputs. They serve as a practical guide for practitioners seeking to use private data with LLMs to improve usability and user experience or to explore new business opportunities.
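To make the chunk_size and top_k settings mentioned above concrete, the sketch below shows how these two parameters typically interact in a retrieval-augmented pipeline: chunk_size controls how a local document is split into passages, and top_k controls how many of the best-matching passages are retrieved for the prompt. This is an illustrative toy example, not the paper's implementation; the bag-of-words similarity stands in for the embedding model a real deployment would use, and the sample document text is invented.

```python
# Illustrative sketch (assumption: not the paper's actual pipeline) of how
# chunk_size and top_k interact in retrieval-augmented generation (RAG).
from collections import Counter
import math


def chunk_text(text: str, chunk_size: int) -> list[str]:
    """Split a document into fixed-size word chunks (chunk_size in words)."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity over bag-of-words vectors (a stand-in for embeddings)."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: str, chunks: list[str], top_k: int) -> list[str]:
    """Return the top_k chunks most similar to the query."""
    q = Counter(query.lower().split())
    scored = sorted(chunks,
                    key=lambda c: cosine_similarity(q, Counter(c.lower().split())),
                    reverse=True)
    return scored[:top_k]


# Hypothetical local document ingested into the sandboxed environment.
doc = ("Our sandboxed LLM environment ingests local documents. "
       "Chunk size controls how much context each retrieved passage carries. "
       "Top-k controls how many passages are placed in the prompt.")

chunks = chunk_text(doc, chunk_size=10)   # smaller chunks -> finer retrieval granularity
hits = retrieve("how does chunk size affect retrieved context", chunks, top_k=2)
```

In practice, the retrieved `hits` would be concatenated into the LLM prompt as context; tuning chunk_size trades retrieval precision against per-passage context, while top_k trades prompt length against recall.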
