Web Crawling and RAG Capabilities for AI Agents and AI Coding Assistants
A powerful implementation of the Model Context Protocol (MCP) integrated with Crawl4AI and Supabase for providing AI agents and AI coding assistants with advanced web crawling and RAG capabilities.
With this MCP server, you can scrape anything and then use that knowledge anywhere for RAG.
The primary goal is to bring this MCP server into Archon as I evolve it to be more of a knowledge engine for AI coding assistants to build AI agents. This first version of the Crawl4AI/RAG MCP server will be improved upon greatly soon, especially making it more configurable so you can use different embedding models and run everything locally with Ollama.
This MCP server provides tools that enable AI agents to crawl websites, store content in a vector database (Supabase), and perform RAG over the crawled content. It follows best practices for building MCP servers, based on the Mem0 MCP server template I previously shared on my channel.
The server includes several advanced RAG strategies that can be enabled to enhance retrieval quality:
- Contextual Embeddings for enriched semantic understanding
- Hybrid Search combining vector and keyword search
- Agentic RAG for specialized code example extraction
- Reranking for improved result relevance using cross-encoder models
See the Configuration section below for details on how to enable and configure these strategies.
The Crawl4AI RAG MCP server is just the beginning. Here's where we're headed:
- Integration with Archon: Building this system directly into Archon to create a comprehensive knowledge engine for AI coding assistants to build better AI agents.
- Multiple Embedding Models: Expanding beyond OpenAI to support a variety of embedding models, including the ability to run everything locally with Ollama for complete control and privacy.
- Advanced RAG Strategies: Implementing sophisticated retrieval techniques like contextual retrieval, late chunking, and others to move beyond basic "naive lookups" and significantly enhance the power and precision of the RAG system, especially as it integrates with Archon.
- Enhanced Chunking Strategy: Implementing a Context 7-inspired chunking approach that focuses on examples and creates distinct, semantically meaningful sections for each chunk, improving retrieval precision.
- Performance Optimization: Increasing crawling and indexing speed to make it more realistic to "quickly" index new documentation and then leverage it within the same prompt in an AI coding assistant.
- Smart URL Detection: Automatically detects and handles different URL types (regular webpages, sitemaps, text files)
- Recursive Crawling: Follows internal links to discover content
- Parallel Processing: Efficiently crawls multiple pages simultaneously
- Content Chunking: Intelligently splits content by headers and size for better processing
- Vector Search: Performs RAG over crawled content, optionally filtering by data source for precision
- Source Retrieval: Retrieve sources available for filtering to guide the RAG process
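The content chunking feature above can be pictured with a minimal sketch like the one below. The function name and the 5000-character limit are illustrative assumptions, not the server's actual implementation:

```python
# Illustrative sketch only: split markdown into chunks by headers, then by size.
# Names and the character limit are assumptions for demonstration purposes.
import re

def chunk_markdown(text: str, max_chars: int = 5000) -> list[str]:
    # First split on top-level and second-level markdown headers.
    sections = re.split(r"\n(?=#{1,2} )", text)
    chunks: list[str] = []
    for section in sections:
        if len(section) <= max_chars:
            chunks.append(section.strip())
            continue
        # Oversized sections are further split into size-bounded pieces.
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars].strip())
    return [chunk for chunk in chunks if chunk]
```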
The server provides essential web crawling and search tools:
- `crawl_single_page`: Quickly crawl a single web page and store its content in the vector database
- `smart_crawl_url`: Intelligently crawl a full website based on the type of URL provided (sitemap, llms-full.txt, or a regular webpage that needs to be crawled recursively)
- `get_available_sources`: Get a list of all available sources (domains) in the database
- `perform_rag_query`: Search for relevant content using semantic search with optional source filtering
- `search_code_examples` (requires `USE_AGENTIC_RAG=true`): Search specifically for code examples and their summaries from crawled documentation. This tool provides targeted code snippet retrieval for AI coding assistants.
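Once the server is running over SSE (see below), an MCP client can invoke these tools programmatically. Here is a minimal sketch using the official `mcp` Python SDK; the tool argument names (`url`, `query`) are assumptions based on the descriptions above, so check the tool schemas the server actually reports:

```python
# Minimal sketch of calling the server's tools from an MCP client over SSE.
# Assumes the server is reachable at http://localhost:8051/sse; argument
# names are illustrative and should be verified against the tool schemas.
import asyncio
from mcp import ClientSession
from mcp.client.sse import sse_client

async def main() -> None:
    async with sse_client("http://localhost:8051/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Crawl a site, then query the stored content.
            await session.call_tool("smart_crawl_url", {"url": "https://example.com"})
            result = await session.call_tool(
                "perform_rag_query", {"query": "how do I configure the client?"}
            )
            print(result)

asyncio.run(main())
```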
- Docker/Docker Desktop if running the MCP server as a container (recommended)
- Python 3.12+ if running the MCP server directly through uv
- Supabase (database for RAG)
- OpenAI API key (for generating embeddings)
To run the MCP server with Docker (recommended):

1. Clone this repository:

   ```bash
   git clone https://github.com/coleam00/mcp-crawl4ai-rag.git
   cd mcp-crawl4ai-rag
   ```

2. Build the Docker image:

   ```bash
   docker build -t mcp/crawl4ai-rag --build-arg PORT=8051 .
   ```

3. Create a `.env` file based on the configuration section below
To run the MCP server directly through uv instead:

1. Clone this repository:

   ```bash
   git clone https://github.com/coleam00/mcp-crawl4ai-rag.git
   cd mcp-crawl4ai-rag
   ```

2. Install uv if you don't have it:

   ```bash
   pip install uv
   ```

3. Create and activate a virtual environment:

   ```bash
   uv venv
   .venv\Scripts\activate
   # on Mac/Linux: source .venv/bin/activate
   ```

4. Install dependencies:

   ```bash
   uv pip install -e .
   crawl4ai-setup
   ```

5. Create a `.env` file based on the configuration section below
Before running the server, you need to set up the database with the pgvector extension:
1. Go to the SQL Editor in your Supabase dashboard (create a new project first if necessary)
2. Create a new query and paste the contents of `crawled_pages.sql`
3. Run the query to create the necessary tables and functions
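The exact schema is defined in `crawled_pages.sql`. As a rough, hypothetical illustration of what a pgvector-backed table for this kind of setup looks like (table and column names and the embedding dimension are assumptions, not the actual schema):

```sql
-- Hypothetical sketch only; the real tables and functions come from crawled_pages.sql.
create extension if not exists vector;

create table if not exists crawled_pages (
    id bigserial primary key,
    url text not null,
    source text not null,          -- domain used for source filtering
    content text not null,         -- the chunk text
    embedding vector(1536)         -- assumed OpenAI embedding dimension
);

-- An index like this enables fast approximate nearest-neighbor search.
create index on crawled_pages using ivfflat (embedding vector_cosine_ops);
```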
Create a `.env` file in the project root with the following variables:

```
# MCP Server Configuration
HOST=0.0.0.0
PORT=8051
TRANSPORT=sse

# OpenAI API Configuration
OPENAI_API_KEY=your_openai_api_key

# LLM for summaries and contextual embeddings
MODEL_CHOICE=gpt-4.1-nano

# RAG Strategies (set to "true" or "false", default to "false")
USE_CONTEXTUAL_EMBEDDINGS=false
USE_HYBRID_SEARCH=false
USE_AGENTIC_RAG=false
USE_RERANKING=false

# Supabase Configuration
SUPABASE_URL=your_supabase_project_url
SUPABASE_SERVICE_KEY=your_supabase_service_key
```
The Crawl4AI RAG MCP server supports four powerful RAG strategies that can be enabled independently:
When enabled, this strategy enhances each chunk's embedding with additional context from the entire document. The system passes both the full document and the specific chunk to an LLM (configured via `MODEL_CHOICE`) to generate enriched context that gets embedded alongside the chunk content.
- When to use: Enable this when you need high-precision retrieval where context matters, such as technical documentation where terms might have different meanings in different sections.
- Trade-offs: Slower indexing due to LLM calls for each chunk, but significantly better retrieval accuracy.
- Cost: Additional LLM API calls during indexing.
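A minimal sketch of the idea, assuming the OpenAI Python SDK; the prompt wording, model choices, and function name are illustrative, not the server's exact implementation:

```python
# Sketch of contextual embeddings: ask an LLM to situate a chunk within its
# document, then embed the generated context together with the chunk.
# Prompt wording and model names are assumptions for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def embed_chunk_with_context(full_document: str, chunk: str) -> list[float]:
    # Ask the LLM for a short description of how this chunk fits the document.
    context = client.chat.completions.create(
        model="gpt-4.1-nano",
        messages=[{
            "role": "user",
            "content": (
                "Here is a document:\n" + full_document[:10000] +
                "\n\nBriefly describe how this chunk fits into the document:\n" + chunk
            ),
        }],
    ).choices[0].message.content

    # Embed the enriched text (generated context + original chunk content).
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=context + "\n---\n" + chunk,
    )
    return response.data[0].embedding
```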
Combines traditional keyword search with semantic vector search to provide more comprehensive results. The system performs both searches in parallel and intelligently merges results, prioritizing documents that appear in both result sets.
- When to use: Enable this when users might search using specific technical terms, function names, or when exact keyword matches are important alongside semantic understanding.
- Trade-offs: Slightly slower search queries but more robust results, especially for technical content.
- Cost: No additional API costs, just computational overhead.
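Conceptually, the merge step looks something like the sketch below. The real server performs both searches against Supabase; the function name and data shapes here are assumptions for illustration:

```python
# Sketch of hybrid-search merging: results that appear in both the vector and
# keyword result sets are ranked first. Data shapes are illustrative only.
def merge_hybrid_results(vector_hits: list[dict], keyword_hits: list[dict],
                         limit: int = 10) -> list[dict]:
    keyword_ids = {hit["id"] for hit in keyword_hits}
    seen: set = set()
    in_both, vector_only = [], []

    for hit in vector_hits:
        if hit["id"] in keyword_ids:
            in_both.append(hit)       # strongest signal: matched both ways
        else:
            vector_only.append(hit)
        seen.add(hit["id"])

    keyword_only = [hit for hit in keyword_hits if hit["id"] not in seen]
    return (in_both + vector_only + keyword_only)[:limit]
```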
Enables specialized code example extraction and storage. When crawling documentation, the system identifies code blocks (≥300 characters), extracts them with surrounding context, generates summaries, and stores them in a separate vector database table specifically designed for code search.
- When to use: Essential for AI coding assistants that need to find specific code examples, implementation patterns, or usage examples from documentation.
- Trade-offs: Significantly slower crawling due to code extraction and summarization, requires more storage space.
- Cost: Additional LLM API calls for summarizing each code example.
- Benefits: Provides a dedicated `search_code_examples` tool that AI agents can use to find specific code implementations.
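A rough sketch of the extraction step: the 300-character threshold comes from the description above, while the function name, context window size, and regex are illustrative assumptions:

```python
# Sketch of agentic-RAG code extraction: pull fenced code blocks of at least
# 300 characters out of crawled markdown, keeping surrounding context.
import re

FENCE = "`" * 3  # a markdown code fence (three backticks)
CODE_BLOCK = re.compile(FENCE + r"[\w-]*\n(.*?)" + FENCE, re.DOTALL)

def extract_code_examples(markdown: str, min_chars: int = 300,
                          context_chars: int = 500) -> list[dict]:
    examples = []
    for match in CODE_BLOCK.finditer(markdown):
        code = match.group(1)
        if len(code) < min_chars:
            continue  # only keep substantial examples, per the 300-char rule
        before = markdown[max(0, match.start() - context_chars):match.start()]
        after = markdown[match.end():match.end() + context_chars]
        # In the real pipeline an LLM also generates a summary here, and the
        # result is stored in a dedicated Supabase table for code search.
        examples.append({"code": code, "context_before": before,
                         "context_after": after})
    return examples
```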
Applies cross-encoder reranking to search results after initial retrieval. Uses a lightweight cross-encoder model (`cross-encoder/ms-marco-MiniLM-L-6-v2`) to score each result against the original query, then reorders results by relevance.
- When to use: Enable this when search precision is critical and you need the most relevant results at the top. Particularly useful for complex queries where semantic similarity alone might not capture query intent.
- Trade-offs: Adds ~100-200ms to search queries depending on result count, but significantly improves result ordering.
- Cost: No additional API costs - uses a local model that runs on CPU.
- Benefits: Better result relevance, especially for complex queries. Works with both regular RAG search and code example search.
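The reranking step can be sketched with the `sentence-transformers` library, which provides this cross-encoder model. This assumes results are plain text strings; the function name is illustrative:

```python
# Sketch of cross-encoder reranking: score each (query, result) pair and sort
# by score. The model name matches the one described above.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, results: list[str]) -> list[str]:
    scores = reranker.predict([(query, result) for result in results])
    ranked = sorted(zip(results, scores), key=lambda pair: pair[1], reverse=True)
    return [result for result, _ in ranked]
```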
For general documentation RAG:

```
USE_CONTEXTUAL_EMBEDDINGS=false
USE_HYBRID_SEARCH=true
USE_AGENTIC_RAG=false
USE_RERANKING=true
```

For AI coding assistant with code examples:

```
USE_CONTEXTUAL_EMBEDDINGS=true
USE_HYBRID_SEARCH=true
USE_AGENTIC_RAG=true
USE_RERANKING=true
```

For fast, basic RAG:

```
USE_CONTEXTUAL_EMBEDDINGS=false
USE_HYBRID_SEARCH=true
USE_AGENTIC_RAG=false
USE_RERANKING=false
```
With Docker:

```bash
docker run --env-file .env -p 8051:8051 mcp/crawl4ai-rag
```

Or directly with uv:

```bash
uv run src/crawl4ai_mcp.py
```
The server will start and listen on the configured host and port.
Once you have the server running with SSE transport, you can connect to it using this configuration:
```json
{
  "mcpServers": {
    "crawl4ai-rag": {
      "transport": "sse",
      "url": "http://localhost:8051/sse"
    }
  }
}
```
Note for Windsurf users: Use `serverUrl` instead of `url` in your configuration:

```json
{
  "mcpServers": {
    "crawl4ai-rag": {
      "transport": "sse",
      "serverUrl": "http://localhost:8051/sse"
    }
  }
}
```

Note for Docker users: Use `host.docker.internal` instead of `localhost` if your client is running in a different container. This will apply if you are using this MCP server within n8n!
Add this server to your MCP configuration for Claude Desktop, Windsurf, or any other MCP client:
```json
{
  "mcpServers": {
    "crawl4ai-rag": {
      "command": "python",
      "args": ["path/to/crawl4ai-mcp/src/crawl4ai_mcp.py"],
      "env": {
        "TRANSPORT": "stdio",
        "OPENAI_API_KEY": "your_openai_api_key",
        "SUPABASE_URL": "your_supabase_url",
        "SUPABASE_SERVICE_KEY": "your_supabase_service_key"
      }
    }
  }
}
```
To run the server through Docker with stdio instead, use a configuration like this:

```json
{
  "mcpServers": {
    "crawl4ai-rag": {
      "command": "docker",
      "args": ["run", "--rm", "-i",
               "-e", "TRANSPORT",
               "-e", "OPENAI_API_KEY",
               "-e", "SUPABASE_URL",
               "-e", "SUPABASE_SERVICE_KEY",
               "mcp/crawl4ai-rag"],
      "env": {
        "TRANSPORT": "stdio",
        "OPENAI_API_KEY": "your_openai_api_key",
        "SUPABASE_URL": "your_supabase_url",
        "SUPABASE_SERVICE_KEY": "your_supabase_service_key"
      }
    }
  }
}
```
This implementation provides a foundation for building more complex MCP servers with web crawling capabilities. To build your own:
- Add your own tools by creating methods with the `@mcp.tool()` decorator
- Create your own lifespan function to add your own dependencies
- Modify the `utils.py` file for any helper functions you need
- Extend the crawling capabilities by adding more specialized crawlers
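For instance, a new tool can be registered with a sketch like the following. This assumes the server is built on FastMCP from the MCP Python SDK, as the `@mcp.tool()` decorator suggests; the server instance name and the tool body are purely illustrative:

```python
# Illustrative sketch of registering an extra tool on a FastMCP server.
# The server name and the tool's logic here are assumptions.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("crawl4ai-rag")

@mcp.tool()
async def count_crawled_words(url: str) -> str:
    """Example tool: report how many words a stored page contains."""
    # In a real tool you would look the page up in Supabase; this is a stub.
    return f"No stored content found for {url} (stub implementation)."
```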