Retrieval-Augmented Generation (RAG) library on SQLite.
`haiku.rag` is a Retrieval-Augmented Generation (RAG) library built to work on SQLite alone, with no need for an external vector database. It uses sqlite-vec to store embeddings and performs semantic (vector) search as well as full-text search, combined through Reciprocal Rank Fusion. Both open-source (Ollama) and commercial (OpenAI, VoyageAI) embedding providers are supported.
- Local SQLite: No external servers required
- Multiple embedding providers: Ollama, VoyageAI, OpenAI
- Hybrid search: Vector + full-text search with Reciprocal Rank Fusion
- Question answering: Built-in QA agents on your documents
- File monitoring: Auto-index files when run as server
- 40+ file formats: PDF, DOCX, HTML, Markdown, audio, URLs
- MCP server: Expose as tools for AI assistants
- CLI & Python API: Use from command line or Python
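The hybrid search listed above merges the vector and full-text rankings with Reciprocal Rank Fusion. As an illustrative sketch (not haiku.rag's internal code; the document IDs and the conventional `k=60` constant are assumptions), RRF scores each item by the reciprocal of its rank in every list it appears in:

```python
# Sketch of Reciprocal Rank Fusion (RRF): fuse several ranked
# result lists into a single score per item. Hypothetical data.

def rrf(rankings, k=60):
    """Sum 1 / (k + rank) for each list an item appears in."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

vector_hits = ["doc_a", "doc_b", "doc_c"]  # semantic-search order
fts_hits = ["doc_b", "doc_d", "doc_a"]     # full-text-search order
fused = rrf([vector_hits, fts_hits])
print(fused[0][0])  # doc_b ranks first: high in both lists
```

Because only ranks matter, RRF needs no score normalization between the two search backends, which is why it works well for combining vector similarity with BM25-style full-text scores.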
```bash
# Install
uv pip install haiku.rag

# Add documents
haiku-rag add "Your content here"
haiku-rag add-src document.pdf

# Search
haiku-rag search "query"

# Ask questions
haiku-rag ask "Who is the author of haiku.rag?"

# Start server with file monitoring
export MONITOR_DIRECTORIES="/path/to/docs"
haiku-rag serve
```
```python
from haiku.rag.client import HaikuRAG

async with HaikuRAG("database.db") as client:
    # Add a document
    doc = await client.create_document("Your content")

    # Search
    results = await client.search("query")
    for chunk, score in results:
        print(f"{score:.3f}: {chunk.content}")

    # Ask questions
    answer = await client.ask("Who is the author of haiku.rag?")
    print(answer)
```
Use with AI assistants like Claude Desktop:
```bash
haiku-rag serve --stdio
```

Provides tools for document management and search directly in your AI assistant.
Full documentation at: https://ggozad.github.io/haiku.rag/
- Installation - Provider setup
- Configuration - Environment variables
- CLI - Command reference
- Python API - Complete API docs