RAG Studio (Private-by-Default)
A local-first Retrieval-Augmented Generation (RAG) application with three interfaces: a CLI for queries, a FastAPI web UI, and a Streamlit UI. Document question answering stays private by default; OpenAI generation is optional and intended for cloud deployments.
Overview
The app ingests documents from resources/documents/, builds embeddings with SentenceTransformers, and stores the vectors in ChromaDB for retrieval. Responses are generated either by a local Llama GGUF model run through llama-cpp-python or by OpenAI. The Streamlit cloud mode supports safe public demos through environment toggles such as PUBLIC_DEMO_MODE and AUTO_BOOTSTRAP_DEMO.
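As an illustration of how environment toggles like these might be read, here is a minimal sketch. The variable names PUBLIC_DEMO_MODE and AUTO_BOOTSTRAP_DEMO come from the overview; the helper names env_flag and load_demo_settings, the set of accepted truthy strings, and the default of "off" are assumptions for illustration and may not match what the app actually does.

```python
import os

# Assumption: these strings count as "on". The project's real parsing
# rules are not documented here and may differ.
_TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name, default=False, environ=None):
    """Return True when the variable `name` holds a truthy string."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        # Unset variables fall back to the default (off unless stated).
        return default
    return raw.strip().lower() in _TRUTHY


def load_demo_settings(environ=None):
    """Collect the two demo toggles named in the overview (hypothetical helper)."""
    return {
        "public_demo_mode": env_flag("PUBLIC_DEMO_MODE", environ=environ),
        "auto_bootstrap_demo": env_flag("AUTO_BOOTSTRAP_DEMO", environ=environ),
    }


if __name__ == "__main__":
    # Reads the real process environment when no mapping is passed.
    print(load_demo_settings())
```

Defaulting every toggle to off fits the private-by-default goal: a deployment has to opt in to demo behavior explicitly.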
Tech & Tools
- Python
- ChromaDB, SentenceTransformers
- llama-cpp-python (local GGUF) or OpenAI backend
- Streamlit + FastAPI + CLI tooling
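The ingest, embed, store, and retrieve flow from the overview can be sketched with the standard library alone. This is a toy stand-in, not the project's code: word-count vectors replace SentenceTransformers embeddings, an in-memory list replaces ChromaDB, and the names embed, TinyVectorStore, and build_prompt are hypothetical. It only shows the shape of the pipeline up to the prompt that would be handed to a generation backend.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class TinyVectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self._items = []  # list of (doc_id, text, vector)

    def add(self, doc_id, text):
        self._items.append((doc_id, text, embed(text)))

    def query(self, question, k=2):
        """Return the k stored passages most similar to the question."""
        q = embed(question)
        scored = [(cosine(q, vec), doc_id, text) for doc_id, text, vec in self._items]
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(doc_id, text) for _, doc_id, text in scored[:k]]


def build_prompt(question, passages):
    """Assemble retrieved passages and the question into one prompt string."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    store = TinyVectorStore()
    store.add("notes", "ChromaDB stores embedding vectors for retrieval")
    store.add("ui", "Streamlit renders the web interface")
    question = "where are vectors stored"
    print(build_prompt(question, store.query(question, k=1)))
```

In the real app the prompt would go to the local GGUF model or to OpenAI; that step is left out here because the section does not describe how the backend is selected.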
Links
For local setup, install the dependencies listed in requirements_local.txt, run python ingest.py to ingest the documents, then launch the app with either streamlit run streamlit_app.py or python web_app.py. Review the privacy notes in the README to keep private documents and generated indexes out of git.