-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Add RAGAnything processing to LightRAG's webui #2042
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
@hzywhite Hi, I'm interested in using this feature. Are you ready to merge? |
|
Hi is there any reason why this PR has not yet been merged? The tests all passed and looks fine |
|
I found LightRAG from RAGAnything repo as Ready to use RAG solution that uses RAGAnything. I found RAGAnything while searching ready to use RAG with mineru backend for PDF processing, and very disappointing that LightRAG does not support RAGAnything out of the box. This MR is MUST HAVE for LightRAG, just add new environment variable to choose which backend to use. FIX: pip install "lightrag-hku[api] @ git+https://github.com/HKUDS/LightRAG.git@RAGAnything"
# Install upstream raganything after lightrag.
pip install "raganything[all] @ git+https://github.com/HKUDS/RAG-Anything.git"ERROR /documents/paginated HTTP/1.1 500
INFO: 127.0.0.1:53198 - "POST /documents/paginated HTTP/1.1" 500
ERROR: Error getting paginated documents: 1 validation error for DocStatusResponse
scheme_name
Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/string_type
ERROR: Traceback (most recent call last):
File "/Users/appleroot/projects/RANY/.venv/lib/python3.13/site-packages/lightrag/api/routers/document_routes.py", line 2935, in get_documents_paginated
DocStatusResponse(
~~~~~~~~~~~~~~~~~^
id=doc_id,
^^^^^^^^^^
...<11 lines>...
multimodal_content=doc.multimodal_content,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/Users/appleroot/projects/RANY/.venv/lib/python3.13/site-packages/pydantic/main.py", line 253, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for DocStatusResponse
scheme_name
Input should be a valid string [type=string_type, input_value=None, input_type=NoneType]
For further information visit https://errors.pydantic.dev/2.11/v/string_typeFIX: # DELETE YOUR PREVIOUS RAG DATA
rm -rf inputs/ lightrag.log rag_storage/ERROR: FIX: Restart Browser or Right Click So, here are commands i did for clean install LightRAG with MinerU support. mkdir my-rag && cd my-rag
# Create python venv in new folder
python3 -m venv .venv
. .venv/bin/activate
# Install correct combination of packages
pip install "lightrag-hku[api] @ git+https://github.com/HKUDS/LightRAG.git@RAGAnything"
pip install "mineru[core] @ git+https://github.com/opendatalab/MinerU.git"
pip install "raganything[all] @ git+https://github.com/HKUDS/RAG-Anything.git@ui"
# save env.example file
wget "https://raw.githubusercontent.com/HKUDS/LightRAG/refs/heads/RAGAnything/env.example"
# copy and edit .env file
cp env.example .env
# nano .env
# launch server
lightrag-server |
|
@hzywhite also am waiting for this to be merged, but I'm curious lightrag/api/webui/assets whats with all these compiled assets. They don't look to be intentionally there, at least IMO they shouldn't. |
|
This is a highly anticipated feature, and I’ll be able to dedicate time to researching and testing it only after addressing my current tasks. Please resolve the conflicts with the main branch first. Thank you. |
This is to clone the repository and run the server without rebuilding the frontend project. |
@danielaskdd |
The CI pipeline-generated frontend build code cannot be directly added to the repository, correct? Are you suggesting that the CI pipeline should build the frontend assets and push them to PyPI instead? I don't have experience with this process—could you please share your insights? |
To keep this thread focused, I opened a new issue |
| raganything_error_message = None | ||
|
|
||
| try: | ||
| api_key = get_env_value("LLM_BINDING_API_KEY", "", str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested locally and I think this will override api_key = os.getenv("LIGHTRAG_API_KEY") or args.key and pass the wrong api_key to create_document_routes
|
@7frank I also tested this locally and I think you are incorrect. The api key does not get overridden and that edit is on purpose so that raganything uses the same llm binding as lightrag. My only comment would be that the queries don't utilize the vlm enhanced query from raganything and there should probably be two additional query routes for raganything queries. Other than that, this works great for me locally following the simple instructions included on the PR Lacking a merge on these is pretty inconvenient for the time being. Raganything enhances lightrag 1000x |
Possible. Did you try accessing LightRAG via the API or through the web frontend? In my case, I wanted to access the LightRAG API using a simple MCP. Before the merge, this worked without authentication. After the merge, all routes suddenly required an API key that was never set — My experience was as follows: Routes such as the documents route now use an API key for authentication. See here: LightRAG/lightrag/api/lightrag_server.py Lines 762 to 769 in 9bc5f15
The API key they use is LightRAG/lightrag/api/lightrag_server.py Line 199 in 9bc5f15
However, in the PR, it was overridden by this line: LightRAG/lightrag/api/lightrag_server.py Line 631 in 9bc5f15
As a result, when using LightRAG via the API (not the web frontend), it now returns 401 errors because the API requires a key, even though no I renamed the variable for the local scope, and the error disappeared. |
|
Really looking forward to the merger of raganything and the lightrag server |
|
Is it possible to help here to getting this done? Looking really forward. |
Overview
This document outlines the key differences between the current working branch and the main branch, focusing on the integration of RAGAnything functionality into the LightRAG server.
Operating Procedure (Message on September 16th, 2025)
1.Install RAG-Anything
2.Install the RAGAnything branch of Lightrag
3.Add .env file to lightrag and start running it
Modified Files
lightrag/api/lightrag_server.pyNew Imports
RAGManagerfromlightrag.ragmanagerRAGAnythingandRAGAnythingConfigfromraganythingKey Changes
1. Enhanced LightRAG Initialization
Change: Added
input_dirparameter to the LightRAG initialization.2. RAGAnything Configuration Setup
Purpose: Configures RAGAnything with comprehensive document processing capabilities including:
mineruordoclingparsers3. LLM Model Function Definition
Feature: Standardized LLM interaction using GPT-4o-mini with caching support.
4. Vision Model Function for Image Processing
Capability: Enhanced vision processing using GPT-4o for image analysis with base64 encoding support.
5. Embedding Function Configuration
Specifications:
6. RAGAnything Initialization
Integration:
7. Updated Route Creation
Enhancement: Document routes now receive both
ragandrag_anythinginstances for comprehensive document processing.Summary of New Capabilities
Enhanced Document Processing
Improved AI Integration
Architecture Improvements
Configuration Requirements
Environment Variables
LLM_BINDING_API_KEY: API key for LLM servicesLLM_BINDING_HOST: Base URL for LLM servicesDependencies
raganything: New dependency for enhanced document processingNotes