2505.07672

OnPrem.LLM: A Privacy-Conscious Document Intelligence Toolkit

Author: Arun S. Maiya

We present OnPrem.LLM, a Python-based toolkit for applying large language models (LLMs) to sensitive, non-public data in offline or restricted environments. The system is designed for privacy-preserving use cases and provides prebuilt pipelines for document processing and storage, retrieval-augmented generation (RAG), information extraction, summarization, classification, and prompt/output processing with minimal configuration. OnPrem.LLM supports multiple LLM backends -- including llama.cpp, Ollama, vLLM, and Hugging Face Transformers -- with quantized model support, GPU acceleration, and seamless backend switching. Although designed for fully local execution, OnPrem.LLM also supports integration with a wide range of cloud LLM providers when permitted, enabling hybrid deployments that balance performance with data control. A no-code web interface extends accessibility to non-technical users.
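The backend switching and opt-in cloud use described above can be illustrated with a small sketch. This is not OnPrem.LLM's actual API: the `Pipeline` class, the `BACKENDS` registry, the `allow_cloud` flag, and the two stub backends are all hypothetical names chosen for illustration. The sketch assumes only that every backend exposes the same prompt-in, text-out interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict


# Hypothetical stand-ins for real backends. Each one just tags the prompt,
# so the dispatch logic can be exercised without any model installed.
def _local_backend(prompt: str) -> str:
    return f"[local] {prompt}"


def _cloud_backend(prompt: str) -> str:
    return f"[cloud] {prompt}"


# Registry mapping a backend name to a callable with a shared signature.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "local": _local_backend,
    "cloud": _cloud_backend,
}


@dataclass
class Pipeline:
    """Illustrative pipeline (not the toolkit's API) with a swappable backend."""

    backend: str = "local"
    allow_cloud: bool = False  # assumption: cloud use must be enabled explicitly

    def prompt(self, text: str) -> str:
        if self.backend not in BACKENDS:
            raise ValueError(f"unknown backend: {self.backend!r}")
        if self.backend == "cloud" and not self.allow_cloud:
            raise PermissionError("cloud backends are disabled for this deployment")
        return BACKENDS[self.backend](text)
```

Because calling code depends only on `Pipeline.prompt`, changing the `backend` field is the only change needed to move between local and hybrid operation, and the default configuration never sends data off the machine.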

Subjects: Computation and Language, Artificial Intelligence, Machine Learning

Published: 2025-05-12 15:36:27 UTC