diff --git a/README.md b/README.md index 9ecd17e..ae99317 100644 --- a/README.md +++ b/README.md @@ -33,3 +33,4 @@ To contribute to the tutorials, please check out our [Contributing Guidelines](. | [Retrieving a Context Window Around a Sentence](./tutorials/42_Sentence_Window_Retriever.ipynb) | [](https://colab.research.google.com/github/deepset-ai/haystack-tutorials/blob/main/tutorials/42_Sentence_Window_Retriever.ipynb) | | | [Build a Tool-Calling Agent](./tutorials/43_Building_a_Tool_Calling_Agent.ipynb) | [](https://colab.research.google.com/github/deepset-ai/haystack-tutorials/blob/main/tutorials/43_Building_a_Tool_Calling_Agent.ipynb) | | | [Creating Custom SuperComponents](./tutorials/44_Creating_Custom_SuperComponents.ipynb) | [](https://colab.research.google.com/github/deepset-ai/haystack-tutorials/blob/main/tutorials/44_Creating_Custom_SuperComponents.ipynb) | | +| [Creating a Multi-Agent System with Haystack](./tutorials/45_Creating_a_Multi_Agent_System.ipynb) | [](https://colab.research.google.com/github/deepset-ai/haystack-tutorials/blob/main/tutorials/45_Creating_a_Multi_Agent_System.ipynb) | | \ No newline at end of file diff --git a/index.toml b/index.toml index 990e312..d8a0c65 100644 --- a/index.toml +++ b/index.toml @@ -25,7 +25,6 @@ aliases = [] completion_time = "15 min" created_at = 2023-11-30 dependencies = ["colorama"] -featured = true [[tutorial]] title = "Serializing LLM Pipelines" @@ -218,3 +217,15 @@ completion_time = "20 min" created_at = 2025-04-22 dependencies = ["sentence-transformers>=4.1.0", "datasets", "accelerate"] featured = true + +[[tutorial]] +title = "Creating a Multi-Agent System with Haystack" +description = "Use agents specialized in specific tasks to build more complex, modular agent workflows" +level = "advanced" +weight = 10 +notebook = "45_Creating_a_Multi_Agent_System.ipynb" +aliases = [] +completion_time = "20 min" +created_at = 2025-06-02 +dependencies = [] +featured = true diff --git 
a/tutorials/45_Creating_a_Multi_Agent_System.ipynb b/tutorials/45_Creating_a_Multi_Agent_System.ipynb new file mode 100644 index 0000000..1369a74 --- /dev/null +++ b/tutorials/45_Creating_a_Multi_Agent_System.ipynb @@ -0,0 +1,1060 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "4cEpvzDt72dk" + }, + "source": [ + "# Tutorial: Creating a Multi-Agent System with Haystack\n", + "\n", + "- **Level**: Advanced\n", + "- **Time to complete**: 20 minutes\n", + "- **Components Used**: [`Agent`](https://docs.haystack.deepset.ai/docs/agent), [`DuckduckgoApiWebSearch`](https://haystack.deepset.ai/integrations/duckduckgo-api-websearch), [`OpenAIChatGenerator`](https://docs.haystack.deepset.ai/docs/openaichatgenerator), [`DocumentWriter`](https://docs.haystack.deepset.ai/docs/documentwriter)\n", + "- **Prerequisites**: You need an [OpenAI API Key](https://platform.openai.com/api-keys), and a [Notion Integration](https://developers.notion.com/docs/create-a-notion-integration#getting-started) set up beforehand\n", + "- **Goal**: After completing this tutorial, you'll have learned how to build a multi-agent system in Haystack where each agent is specialized for a specific task.\n", + "\n", + "## Overview\n", + "**Multi-agent** systems are made up of several intelligent agents that work together to solve complex tasks more effectively than a single agent alone. Each agent takes on a specific role or skill, allowing for distributed reasoning, task specialization, and smooth coordination within one unified system.\n", + "\n", + "What makes this possible in Haystack is the ability to use **agents as tools** for other agents. 
This powerful pattern allows you to compose modular, specialized agents and orchestrate them through a main agent that delegates tasks based on context.\n", + "\n", + "In this tutorial, you'll build a simple yet powerful multi-agent setup with one main agent and two sub-agents: one focused on researching information, and the other on saving it." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "t1OHZgmV8L8H" + }, + "source": [ + "## Preparing the Environment\n", + "\n", + "First, let's install required packages:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "8Q3xX8ppqrqC", + "outputId": "895d9478-fd39-41a0-b255-4078acb08c0b" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m514.7/514.7 kB\u001b[0m \u001b[31m8.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m96.7/96.7 kB\u001b[0m \u001b[31m8.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m74.5/74.5 kB\u001b[0m \u001b[31m5.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m3.3/3.3 MB\u001b[0m \u001b[31m55.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h" + ] + } + ], + "source": [ + "!pip install -q haystack-ai duckduckgo-api-haystack" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "d7SkhsYX8SQS" + }, + "source": [ + "### Enter API Keys\n", + "\n", + "Enter API keys required for this tutorial. 
Learn how to get your `NOTION_API_KEY` after creating a Notion integration [here](https://developers.notion.com/docs/create-a-notion-integration#get-your-api-secret)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "PU26Ayr8rI3t", + "outputId": "eea1c012-870f-4376-ac9c-879d4e4a14e2" + }, + "outputs": [], + "source": [ + "from getpass import getpass\n", + "import os\n", + "\n", + "if not os.environ.get(\"OPENAI_API_KEY\"):\n", + " os.environ[\"OPENAI_API_KEY\"] = getpass(\"Enter your OpenAI API key:\")\n", + "if not os.environ.get(\"NOTION_API_KEY\"):\n", + " os.environ[\"NOTION_API_KEY\"] = getpass(\"Enter your NOTION API key:\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "0pmZAVPSRVZi" + }, + "source": [ + "## Creating Tools for the Research Agent\n", + "\n", + "Let's set up the tools for your research agent. Its job is to gather information on a topic, and in this tutorial, it will use just two sources: the whole web and Wikipedia. In other scenarios, it could also connect to a retrieval or RAG pipeline linked to a document store.\n", + "\n", + "You'll create two [`ComponentTool`s](https://docs.haystack.deepset.ai/docs/componenttool) using the [DuckduckgoApiWebSearch](https://haystack.deepset.ai/integrations/duckduckgo-api-websearch) component: one for general web search, and one limited to Wikipedia by setting `allowed_domain`.\n", + "\n", + "One more step before wrapping up: `DuckduckgoApiWebSearch` returns a list of documents, but the `Agent` works best when tools return a single string in this setting. To handle that, define a `doc_to_string` function that converts the Document list into a string. This function, used as the `outputs_to_string` handler, can also add custom elements like filenames or links before returning the output." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "YRREx0yzRaE2" + }, + "outputs": [], + "source": [ + "from haystack.tools import ComponentTool\n", + "from haystack.components.websearch import DuckduckgoApiWebSearch\n", + "\n", + "\n", + "def doc_to_string(documents) -> str:\n", + " \"\"\"\n", + " Handles the tool output before conversion to ChatMessage.\n", + " \"\"\"\n", + " result_str = \"\"\n", + " for document in documents:\n", + " result_str += f\"File Content for {document.meta['link']}\\n\\n {document.content}\"\n", + "\n", + " if len(result_str) > 150_000: # trim if the content is too large\n", + " result_str = result_str[:150_000] + \"...(large file can't be fully displayed)\"\n", + "\n", + " return result_str\n", + "\n", + "\n", + "web_search = ComponentTool(\n", + " component=DuckduckgoApiWebSearch(top_k=5, backend=\"lite\"),\n", + " name=\"web_search\",\n", + " description=\"Search the web\",\n", + " outputs_to_string={\"source\": \"documents\", \"handler\": doc_to_string},\n", + ")\n", + "\n", + "wiki_search = ComponentTool(\n", + " component=DuckduckgoApiWebSearch(top_k=5, backend=\"lite\", allowed_domain=\"https://en.wikipedia.org\"),\n", + " name=\"wiki_search\",\n", + " description=\"Search Wikipedia\",\n", + " outputs_to_string={\"source\": \"documents\", \"handler\": doc_to_string},\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "C5qn6kyJ2hSX" + }, + "source": [ + "If you're hitting rate limits with DuckDuckGo, you can use [`SerperDevWebSearch`](https://docs.haystack.deepset.ai/docs/serperdevwebsearch) as your websearch component for these tools. You need to enter the free [Serper API Key](https://serper.dev/api-key) to use `SerperDevWebSearch`. 
" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "4vnseY962j96" + }, + "outputs": [], + "source": [ + "# from getpass import getpass\n", + "# import os\n", + "\n", + "# from haystack.components.websearch import SerperDevWebSearch\n", + "# from haystack.tools import ComponentTool\n", + "\n", + "# if not os.environ.get(\"SERPERDEV_API_KEY\"):\n", + "# os.environ[\"SERPERDEV_API_KEY\"] = getpass(\"Enter your SERPER API key:\")\n", + "\n", + "# web_search = ComponentTool(\n", + "# component=SerperDevWebSearch(top_k=5),\n", + "# name=\"web_search\",\n", + "# description=\"Search the web\",\n", + "# outputs_to_string={\"source\": \"documents\", \"handler\": doc_to_string},\n", + "# )\n", + "\n", + "# wiki_search = ComponentTool(\n", + "# component=SerperDevWebSearch(top_k=5, allowed_domains=[\"https://www.wikipedia.org/\", \"https://en.wikipedia.org\"]),\n", + "# name=\"wiki_search\",\n", + "# description=\"Search Wikipedia\",\n", + "# outputs_to_string={\"source\": \"documents\", \"handler\": doc_to_string},\n", + "# )" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "03J-XZo5xfnm" + }, + "source": [ + "## Initializing the Research Agent\n", + "\n", + "Now it's time to bring your research agent to life. This agent will solely responsible for finding information. Use `OpenAIChatGenerator` or any other [chat generator](https://docs.haystack.deepset.ai/docs/generators) that supports function calling.\n", + "\n", + "Pass in the `web_search` and `wiki_search` tools you created earlier. To keep things transparent, enable streaming using the built-in `print_streaming_chunk` function. 
This will display the agent's tool calls and results in real time, so you can follow its actions step by step.\n", + "\n" + ] + }, + { + "cell_type": "code", + "execution_count": 61, + "metadata": { + "id": "nNWU3iwDxhsA" + }, + "outputs": [], + "source": [ + "from haystack.components.agents import Agent\n", + "from haystack.dataclasses import ChatMessage\n", + "from haystack.components.generators.utils import print_streaming_chunk\n", + "from haystack.components.generators.chat import OpenAIChatGenerator\n", + "\n", + "research_agent = Agent(\n", + " chat_generator=OpenAIChatGenerator(model=\"gpt-4o-mini\"),\n", + " system_prompt=\"\"\"\n", + " You are a research agent that can find information on web or specifically on wikipedia.\n", + " Use wiki_search tool if you need facts and use web_search tool for latest news on topics.\n", + " Use one tool at a time. Try different queries if you need more information.\n", + " Only use the retrieved context, do not use your own knowledge.\n", + " Summarize all the retrieved information before returning a response to the user.\n", + " \"\"\",\n", + " tools=[web_search, wiki_search],\n", + " streaming_callback=print_streaming_chunk,\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 62, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "DdsLzu8F7mh-", + "outputId": "ea9230d7-33de-4496-e6dc-39566baa05e6" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: wiki_search \n", + "Arguments: {\"query\":\"Florence Nightingale contributions to nursing\"}\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "File Content for https://en.wikipedia.org/wiki/Florence_Nightingale\n", + "\n", + " She significantly reduced death rates by improving hygiene and living standards. 
Nightingale gave nursing a favourable reputation and became an icon of ...File Content for https://en.wikipedia.org/wiki/Nightingale_Pledge\n", + "\n", + " The Nightingale Pledge is a statement of the ethics and principles of the nursing profession in the United States, and it is not used outside the US.File Content for https://en.wikipedia.org/wiki/Florence_Nightingale_Faculty_of_Nursing_and_Midwifery\n", + "\n", + " Established on 9 July 1860 by Florence Nightingale, the founder of modern nursing, it was a model for many similar training schools through the UK, Commonwealth ...File Content for https://en.wikipedia.org/wiki/History_of_nursing\n", + "\n", + " The Crimean War was a significant development in nursing history when English nurse Florence Nightingale laid the foundations of professional nursing with the ...File Content for https://en.wikipedia.org/wiki/Nursing\n", + "\n", + " Florence Nightingale laid the foundations of professional nursing after the Crimean War, in light of a comprehensive statistical study she made of sanitation ...\n", + "\n", + "Florence Nightingale is widely recognized as the founder of modern nursing due to her significant contributions to the profession, particularly during and after the Crimean War. Here are the key aspects of her contributions:\n", + "\n", + "1. **Improvement of Hygiene and Standards**: Nightingale played a crucial role in reducing death rates in military hospitals by implementing better hygiene practices and improving living conditions for soldiers. Her emphasis on sanitation and cleanliness laid the groundwork for future nursing practices.\n", + "\n", + "2. **Establishment of Nursing Education**: In 1860, she founded the Nightingale School of Nursing at St. Thomas' Hospital in London, which became a model for nursing training programs worldwide. This institution was pivotal in professionalizing nursing and training skilled nurses.\n", + "\n", + "3. 
**Statistical Analysis**: Nightingale utilized statistics to demonstrate the impact of sanitation on health outcomes, making a compelling case for reform in healthcare settings. Her work in this area contributed to the development of modern evidence-based nursing practices.\n", + "\n", + "4. **Nightingale Pledge**: The Nightingale Pledge, akin to the Hippocratic Oath for doctors, was established to outline the ethics and principles that nursing professionals should uphold. This pledge is recognized in the United States and signifies a commitment to patient care and professional conduct.\n", + "\n", + "Through these contributions, Florence Nightingale not only transformed nursing into a respected profession but also left a lasting legacy that shaped healthcare practices and nursing education.\n", + "\n" + ] + } + ], + "source": [ + "result = research_agent.run(\n", + " messages=[ChatMessage.from_user(\"Can you tell me about Florence Nightingale's contributions to nursing?\")]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "vYiKvW9T--Yh" + }, + "source": [ + "Print the final answer of the agent through `result[\"last_message\"].text`" + ] + }, + { + "cell_type": "code", + "execution_count": 63, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "6L_-Sutd9TBk", + "outputId": "b2052351-f7f5-4fe5-ff0f-25b754f7b74d" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Final Answer: Florence Nightingale is widely recognized as the founder of modern nursing due to her significant contributions to the profession, particularly during and after the Crimean War. Here are the key aspects of her contributions:\n", + "\n", + "1. **Improvement of Hygiene and Standards**: Nightingale played a crucial role in reducing death rates in military hospitals by implementing better hygiene practices and improving living conditions for soldiers. 
Her emphasis on sanitation and cleanliness laid the groundwork for future nursing practices.\n", + "\n", + "2. **Establishment of Nursing Education**: In 1860, she founded the Nightingale School of Nursing at St. Thomas' Hospital in London, which became a model for nursing training programs worldwide. This institution was pivotal in professionalizing nursing and training skilled nurses.\n", + "\n", + "3. **Statistical Analysis**: Nightingale utilized statistics to demonstrate the impact of sanitation on health outcomes, making a compelling case for reform in healthcare settings. Her work in this area contributed to the development of modern evidence-based nursing practices.\n", + "\n", + "4. **Nightingale Pledge**: The Nightingale Pledge, akin to the Hippocratic Oath for doctors, was established to outline the ethics and principles that nursing professionals should uphold. This pledge is recognized in the United States and signifies a commitment to patient care and professional conduct.\n", + "\n", + "Through these contributions, Florence Nightingale not only transformed nursing into a respected profession but also left a lasting legacy that shaped healthcare practices and nursing education.\n" + ] + } + ], + "source": [ + "print(\"Final Answer:\", result[\"last_message\"].text)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "DQ_vsUkx0sLb" + }, + "source": [ + "## Creating Tools for the Writer Agent\n", + "\n", + "Next, let's set up tools for your second agent, the writer agent. Its job is to save content, either to Notion or to a document store (to `InMemoryDocumentStore` in this tutorial).\n", + "\n", + "### Notion Writer Tool\n", + "\n", + "Start by creating a [custom component](https://docs.haystack.deepset.ai/docs/custom-components) that can add a new page to a Notion workspace given the page title and content. 
To use it, you'll need an active [Notion integration](https://developers.notion.com/docs/create-a-notion-integration) with [access to a parent page](https://developers.notion.com/docs/create-a-notion-integration#give-your-integration-page-permissions) where new content will be stored. Grab the [`page_id`](https://developers.notion.com/docs/create-a-notion-integration#environment-variables) of that parent page from its URL, and pass it to the component as an init parameter.\n", + "\n", + "Here's a basic implementation to get you started:" + ] + }, + { + "cell_type": "code", + "execution_count": 36, + "metadata": { + "id": "nBPs30VuyBiK" + }, + "outputs": [], + "source": [ + "from haystack import component\n", + "from typing import Optional\n", + "from haystack.utils import Secret\n", + "import requests\n", + "\n", + "\n", + "@component\n", + "class NotionPageCreator:\n", + " \"\"\"\n", + " Create a page in Notion using the provided title and content.\n", + " \"\"\"\n", + "\n", + " def __init__(\n", + " self,\n", + " page_id: str,\n", + " notion_version: str = \"2022-06-28\",\n", + " api_key: Secret = Secret.from_env_var(\"NOTION_API_KEY\"), # to use the environment variable NOTION_API_KEY\n", + " ):\n", + " \"\"\"\n", + " Initialize with the target Notion parent page ID and API version.\n", + " \"\"\"\n", + " self.api_key = api_key\n", + " self.notion_version = notion_version\n", + " self.page_id = page_id\n", + "\n", + " @component.output_types(success=bool, status_code=int, error=Optional[str])\n", + " def run(self, title: str, content: str):\n", + " \"\"\"\n", + " :param title: The title of the Notion page.\n", + " :param content: The content of the Notion page.\n", + " \"\"\"\n", + " headers = {\n", + " \"Authorization\": f\"Bearer {self.api_key.resolve_value()}\",\n", + " \"Content-Type\": \"application/json\",\n", + " \"Notion-Version\": self.notion_version,\n", + " }\n", + "\n", + " payload = {\n", + " \"parent\": {\"page_id\": self.page_id},\n", + "
\"properties\": {\"title\": [{\"text\": {\"content\": title}}]},\n", + " \"children\": [\n", + " {\n", + " \"object\": \"block\",\n", + " \"type\": \"paragraph\",\n", + " \"paragraph\": {\"rich_text\": [{\"type\": \"text\", \"text\": {\"content\": content}}]},\n", + " }\n", + " ],\n", + " }\n", + "\n", + " response = requests.post(\"https://api.notion.com/v1/pages\", headers=headers, json=payload)\n", + "\n", + " if response.status_code == 200 or response.status_code == 201:\n", + " return {\"success\": True, \"status_code\": response.status_code}\n", + " else:\n", + " return {\"success\": False, \"status_code\": response.status_code, \"error\": response.text}" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "n_OJKcG5E-I1" + }, + "source": [ + "Give the component a quick test to confirm it's working properly." + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "gq47BHAdFcxA", + "outputId": "ff1a2db6-7ae0-47bb-bece-9953e550a0e6" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'success': True, 'status_code': 200}" + ] + }, + "execution_count": 30, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "notion_writer = NotionPageCreator(page_id=\"<your_page_id>\")\n", + "notion_writer.run(title=\"My first page\", content=\"The content of my first page\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "OHA3nnEJIWEb" + }, + "source": [ + "> 💡 When turning a custom component into a tool using `ComponentTool`, make sure its input parameters are well-defined. You can do this in one of two ways:\n", + "1. Pass a `properties` dictionary to `ComponentTool`, or\n", + "2. 
Use parameter annotations in the `run` method's docstring, like so:\n", + "```python\n", + " def run(self, title: str, content: str):\n", + " \"\"\"\n", + " :param title: The title of the Notion page.\n", + " :param content: The content of the Notion page.\n", + " \"\"\"\n", + "```\n", + "This approach also applies to setting the tool's `description`.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 38, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "Dv2JvAs5HuHv", + "outputId": "da24f4b2-1964-4b3b-aa3b-9b8cfd8bc590" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'properties': {'title': {'description': 'The title of the Notion page.',\n", + " 'type': 'string'},\n", + " 'content': {'description': 'The content of the Notion page.',\n", + " 'type': 'string'}},\n", + " 'required': ['title', 'content'],\n", + " 'type': 'object'}" + ] + }, + "execution_count": 38, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from haystack.tools import ComponentTool\n", + "\n", + "notion_writer = ComponentTool(\n", + " component=NotionPageCreator(page_id=\"<your_page_id>\"),\n", + " name=\"notion_writer\",\n", + " description=\"Use this tool to write/save content to Notion.\",\n", + ")\n", + "notion_writer.parameters # see how parameters are automatically generated by the ComponentTool" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "3dG9twy3HgbE" + }, + "source": [ + "### Document Store Writer Tool\n", + "\n", + "Let's now build the other tool for the writer agent; this one will save content to an [InMemoryDocumentStore](https://docs.haystack.deepset.ai/docs/inmemorydocumentstore).\n", + "\n", + "To make this work, start by creating a pipeline that includes the custom `DocumentAdapter` component along with the [DocumentWriter](https://docs.haystack.deepset.ai/docs/documentwriter). 
Once the pipeline is ready, wrap it in a `SuperComponent` and then convert it into a tool using `ComponentTool`.\n", + "\n", + "> 💡 Tip: You could also [create a tool](https://docs.haystack.deepset.ai/docs/tool#tool-initialization) from a simple function that runs the pipeline. However, the recommended approach is to use `SuperComponent` together with `ComponentTool`, especially if you plan to deploy the tool with [Hayhooks](https://docs.haystack.deepset.ai/docs/hayhooks), since this method supports better serialization. Learn more about `SuperComponents` in [Tutorial: Creating Custom SuperComponents](https://haystack.deepset.ai/tutorials/44_creating_custom_supercomponents)" + ] + }, + { + "cell_type": "code", + "execution_count": 47, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "XY3za4sKKEPU", + "outputId": "c5a696ad-5b39-48f2-f3f4-d4efa805d7dc" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "{'type': 'object',\n", + " 'properties': {'title': {'type': 'string',\n", + " 'description': 'The title of the Document'},\n", + " 'content': {'type': 'string', 'description': 'The content of the Document'}},\n", + " 'required': ['title', 'content']}" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "from haystack import Pipeline, component, Document, SuperComponent\n", + "from haystack.components.writers import DocumentWriter\n", + "from haystack.document_stores.in_memory import InMemoryDocumentStore\n", + "from typing import List\n", + "\n", + "\n", + "@component\n", + "class DocumentAdapter:\n", + " @component.output_types(documents=List[Document])\n", + " def run(self, content: str, title: str):\n", + " return {\"documents\": [Document(content=content, meta={\"title\": title})]}\n", + "\n", + "\n", + "document_store = InMemoryDocumentStore()\n", + "\n", + "doc_store_writer_pipeline = Pipeline()\n", + "doc_store_writer_pipeline.add_component(\"adapter\", 
DocumentAdapter())\n", + "doc_store_writer_pipeline.add_component(\"writer\", DocumentWriter(document_store=document_store))\n", + "doc_store_writer_pipeline.connect(\"adapter\", \"writer\")\n", + "\n", + "doc_store_writer = ComponentTool(\n", + " component=SuperComponent(doc_store_writer_pipeline),\n", + " name=\"doc_store_writer\",\n", + " description=\"Use this tool to write/save content to document store\",\n", + " parameters={\n", + " \"type\": \"object\",\n", + " \"properties\": {\n", + " \"title\": {\"type\": \"string\", \"description\": \"The title of the Document\"},\n", + " \"content\": {\"type\": \"string\", \"description\": \"The content of the Document\"},\n", + " },\n", + " \"required\": [\"title\", \"content\"],\n", + " },\n", + ")\n", + "doc_store_writer.parameters" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "Yu4-L-XhBK1K" + }, + "source": [ + "## Initializing the Writer Agent\n", + "\n", + "Now let's bring the writer agent to life. Its job is to save or write information using the tools you've set up.\n", + "\n", + "Provide the `notion_writer` and `doc_store_writer` tools as inputs. In other use cases, you could include tools for other platforms like **Google Drive**, or connect to MCP servers using [MCPTool](https://docs.haystack.deepset.ai/docs/mcptool).\n", + "\n", + "Enable streaming with the built-in `print_streaming_chunk` function to see the agent's actions in real time. Be sure to write a clear and descriptive system prompt to guide the agent's behavior.\n", + "\n", + "Lastly, set `exit_conditions=[\"notion_writer\", \"doc_store_writer\"]` so the agent knows to stop once it calls one of these tools. Since this agent's job is to act, not reply, we don't want it to return a final response." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 48, + "metadata": { + "id": "4zl5YDywzUxq" + }, + "outputs": [], + "source": [ + "from haystack.components.agents import Agent\n", + "from haystack.dataclasses import ChatMessage\n", + "from haystack.components.generators.utils import print_streaming_chunk\n", + "from haystack.components.generators.chat import OpenAIChatGenerator\n", + "\n", + "writer_agent = Agent(\n", + " chat_generator=OpenAIChatGenerator(model=\"gpt-4o-mini\"),\n", + " system_prompt=\"\"\"\n", + " You are a writer agent that saves given information to different locations.\n", + " Do not change the provided content before saving.\n", + " Infer the title from the text if not provided.\n", + " When you need to save provided information to Notion, use notion_writer tool.\n", + " When you need to save provided information to document store, use doc_store_writer tool\n", + " If no location is mentioned, use notion_writer tool to save the information.\n", + " \"\"\",\n", + " tools=[doc_store_writer, notion_writer],\n", + " streaming_callback=print_streaming_chunk,\n", + " exit_conditions=[\"notion_writer\", \"doc_store_writer\"],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "twnKB4gs1AOb" + }, + "source": [ + "Let's test the Writer Agent" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "OGGK_OlGL6km", + "outputId": "3696e67a-f3c0-49fc-b755-a523d2186f71" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: notion_writer \n", + "Arguments: {\"title\":\"Florence Nightingale: Founder of Modern Nursing\",\"content\":\"Florence Nightingale is widely recognized as the founder of modern nursing, and her contributions significantly transformed the field. 
\\nFlorence Nightingale's legacy endures, as she set professional standards that have shaped nursing into a respected and essential component of the healthcare system. Her influence is still felt in nursing education and practice today.\"}\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "{'success': True, 'status_code': 200}\n", + "\n" + ] + } + ], + "source": [ + "result = writer_agent.run(\n", + " messages=[\n", + " ChatMessage.from_user(\n", + " \"\"\"\n", + "Save this text on Notion:\n", + "\n", + "Florence Nightingale is widely recognized as the founder of modern nursing, and her contributions significantly transformed the field.\n", + "Florence Nightingale's legacy endures, as she set professional standards that have shaped nursing into a respected and essential component of the healthcare system. Her influence is still felt in nursing education and practice today.\n", + "\"\"\"\n", + " )\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "H5gPNKUOTEhS" + }, + "source": [ + "## Creating the Multi-Agent System\n", + "\n", + "So far, you've built two sub-agents, one for research and one for writing, along with their respective tools. Now it's time to bring everything together into a **single multi-agent system**.\n", + "\n", + "To do this, wrap both `research_agent` and `writer_agent` with `ComponentTool`, then pass them as tools to your `main_agent`. This setup allows the main agent to coordinate the overall workflow by delegating tasks to the right sub-agent, each of which already knows how to handle its own tools." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 64, + "metadata": { + "id": "uEoKaKpXBXHC" + }, + "outputs": [], + "source": [ + "from haystack.components.agents import Agent\n", + "from haystack.components.generators.chat import OpenAIChatGenerator\n", + "from haystack.components.generators.utils import print_streaming_chunk\n", + "\n", + "research_tool = ComponentTool(\n", + " component=research_agent,\n", + " description=\"Use this tool to find information on the web or specifically on Wikipedia\",\n", + " name=\"research_tool\",\n", + ")\n", + "writer_tool = ComponentTool(\n", + " component=writer_agent,\n", + " description=\"Use this tool to write content into the document store or Notion\",\n", + " name=\"writer_tool\",\n", + ")\n", + "\n", + "main_agent = Agent(\n", + " chat_generator=OpenAIChatGenerator(model=\"gpt-4o-mini\"),\n", + " system_prompt=\"\"\"\n", + " You are an assistant that has access to several tools.\n", + " Understand the user query and use the relevant tool to answer the query.\n", + " You can use `research_tool` to do research on the web and Wikipedia and `writer_tool` to save information into the document store or Notion.\n", + " \"\"\",\n", + " streaming_callback=print_streaming_chunk,\n", + " tools=[research_tool, writer_tool],\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zhqb6-TDZh-o" + }, + "source": [ + "Let's test this multi-agent system!"
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "ug8DC_MYeavM", + "outputId": "e3768fa1-fbdf-4f5f-922f-cff8e3345688" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: research_tool \n", + "Arguments: {\"messages\":[{\"role\":\"user\",\"content\":[{\"text\":\"Can you provide an overview of the history of the Silk Road?\"}]}]}\n", + "\n", + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: wiki_search \n", + "Arguments: {\"query\":\"History of the Silk Road\"}\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "File Content for https://en.wikipedia.org/wiki/Silk_Road\n", + "\n", + " The Silk Road was a network of Asian trade routes active from the second century BCE until the mid-15th century. Spanning over 6,400 km (4,000 mi), ...File Content for https://en.wikipedia.org/wiki/Silk_Road_(marketplace)\n", + "\n", + " The name \"Silk Road\" comes from a historical network of trade routes started during the Han Dynasty (206 BCE – 220 CE) between Europe, India, China, and many ...File Content for https://en.wikipedia.org/wiki/The_Silk_Roads\n", + "\n", + " The Silk Roads: A New History of the World is a 2015 non-fiction book written by English historian Peter Frankopan, a historian at the University of Oxford.File Content for https://en.wikipedia.org/wiki/Cities_along_the_Silk_Road\n", + "\n", + " It came into existence in the 2nd century BCE, when Emperor Wu of the Han dynasty was in power, and lasted until the 15th century CE, when the Ottoman Empire ...File Content for https://en.wikipedia.org/wiki/Northern_Silk_Road\n", + "\n", + " The Northern Silk Road is a historic inland trade route in Northwest China and Central Asia originating in the ancient Chinese capital of Chang'an (modern ...\n", + "\n", + "The Silk Road was a network of trade routes that facilitated commerce and cultural exchange between 
various civilizations, particularly between Europe and Asia. It was active from the 2nd century BCE until the mid-15th century and spanned over 6,400 kilometers (approximately 4,000 miles).\n", + "\n", + "The name \"Silk Road\" is derived from the lucrative silk trade that was carried out along these routes, starting during the Han Dynasty (206 BCE – 220 CE) under Emperor Wu. The routes connected China with India, Persia, and further to Europe, allowing for the exchange of goods, ideas, and cultures.\n", + "\n", + "Historically, the Silk Road comprised various routes, including both overland and maritime paths. It played a significant role in the development of the civilizations that it connected by facilitating trade in not only silk but also other commodities like spices, textiles, and precious stones.\n", + "\n", + "The importance of the Silk Road diminished in the late 15th century due to the rise of maritime trade routes and the expansion of empires, such as the Ottoman Empire, which changed the dynamics of trade and cultural exchange in the regions it linked.\n", + "\n", + "Overall, the Silk Road is a crucial part of world history, representing an extensive system of trade that fostered interactions among diverse cultures over centuries.\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "{'messages': [{'role': 'system', 'meta': {}, 'name': None, 'content': [{'text': '\\n You are a research agent that can find information on web or specifically on wikipedia. \\n Use wiki_search tool if you need facts and use web_search tool for latest news on topics.\\n Use one tool at a time. 
Try different queries if you need more information.\\n Only use the retrieved context, do not use your own knowledge.\\n Summarize the all retrieved information before returning response to the user.\\n '}]}, {'role': 'user', 'meta': {}, 'name': None, 'content': [{'text': 'Can you provide an overview of the history of the Silk Road?'}]}, {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'tool_calls', 'completion_start_time': '2025-05-28T15:16:49.904424', 'usage': None}, 'name': None, 'content': [{'tool_call': {'tool_name': 'wiki_search', 'arguments': {'query': 'History of the Silk Road'}, 'id': 'call_pxDTCHYeS27s9drCvQ507ie8'}}]}, {'role': 'tool', 'meta': {}, 'name': None, 'content': [{'tool_call_result': {'result': 'File Content for https://en.wikipedia.org/wiki/Silk_Road\\n\\n The Silk Road was a network of Asian trade routes active from the second century BCE until the mid-15th century. Spanning over 6,400 km (4,000 mi), ...File Content for https://en.wikipedia.org/wiki/Silk_Road_(marketplace)\\n\\n The name \"Silk Road\" comes from a historical network of trade routes started during the Han Dynasty (206 BCE – 220 CE) between Europe, India, China, and many ...File Content for https://en.wikipedia.org/wiki/The_Silk_Roads\\n\\n The Silk Roads: A New History of the World is a 2015 non-fiction book written by English historian Peter Frankopan, a historian at the University of Oxford.File Content for https://en.wikipedia.org/wiki/Cities_along_the_Silk_Road\\n\\n It came into existence in the 2nd century BCE, when Emperor Wu of the Han dynasty was in power, and lasted until the 15th century CE, when the Ottoman Empire ...File Content for https://en.wikipedia.org/wiki/Northern_Silk_Road\\n\\n The Northern Silk Road is a historic inland trade route in Northwest China and Central Asia originating in the ancient Chinese capital of Chang\\'an (modern ...', 'origin': {'tool_name': 'wiki_search', 'arguments': {'query': 'History of 
the Silk Road'}, 'id': 'call_pxDTCHYeS27s9drCvQ507ie8'}, 'error': False}}]}, {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'stop', 'completion_start_time': '2025-05-28T15:16:51.627946', 'usage': None}, 'name': None, 'content': [{'text': 'The Silk Road was a network of trade routes that facilitated commerce and cultural exchange between various civilizations, particularly between Europe and Asia. It was active from the 2nd century BCE until the mid-15th century and spanned over 6,400 kilometers (approximately 4,000 miles).\\n\\nThe name \"Silk Road\" is derived from the lucrative silk trade that was carried out along these routes, starting during the Han Dynasty (206 BCE – 220 CE) under Emperor Wu. The routes connected China with India, Persia, and further to Europe, allowing for the exchange of goods, ideas, and cultures.\\n\\nHistorically, the Silk Road comprised various routes, including both overland and maritime paths. It played a significant role in the development of the civilizations that it connected by facilitating trade in not only silk but also other commodities like spices, textiles, and precious stones.\\n\\nThe importance of the Silk Road diminished in the late 15th century due to the rise of maritime trade routes and the expansion of empires, such as the Ottoman Empire, which changed the dynamics of trade and cultural exchange in the regions it linked.\\n\\nOverall, the Silk Road is a crucial part of world history, representing an extensive system of trade that fostered interactions among diverse cultures over centuries.'}]}], 'last_message': {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'stop', 'completion_start_time': '2025-05-28T15:16:51.627946', 'usage': None}, 'name': None, 'content': [{'text': 'The Silk Road was a network of trade routes that facilitated commerce and cultural exchange between various civilizations, particularly between Europe and 
Asia. It was active from the 2nd century BCE until the mid-15th century and spanned over 6,400 kilometers (approximately 4,000 miles).\\n\\nThe name \"Silk Road\" is derived from the lucrative silk trade that was carried out along these routes, starting during the Han Dynasty (206 BCE – 220 CE) under Emperor Wu. The routes connected China with India, Persia, and further to Europe, allowing for the exchange of goods, ideas, and cultures.\\n\\nHistorically, the Silk Road comprised various routes, including both overland and maritime paths. It played a significant role in the development of the civilizations that it connected by facilitating trade in not only silk but also other commodities like spices, textiles, and precious stones.\\n\\nThe importance of the Silk Road diminished in the late 15th century due to the rise of maritime trade routes and the expansion of empires, such as the Ottoman Empire, which changed the dynamics of trade and cultural exchange in the regions it linked.\\n\\nOverall, the Silk Road is a crucial part of world history, representing an extensive system of trade that fostered interactions among diverse cultures over centuries.'}]}}\n", + "\n", + "The Silk Road was a vast network of trade routes that facilitated commerce and cultural exchange among various civilizations, primarily between Europe and Asia. 
Here’s an overview of its history:\n", + "\n", + "- **Time Period**: The Silk Road was active from the 2nd century BCE until the mid-15th century, spanning over 6,400 kilometers (approximately 4,000 miles).\n", + "\n", + "- **Origin of the Name**: The term \"Silk Road\" comes from the lucrative silk trade that flourished along these routes, a practice that began during the Han Dynasty (206 BCE – 220 CE) under Emperor Wu.\n", + "\n", + "- **Geographical Connections**: The routes connected China with India, Persia, and extended to Europe, allowing not only for the trade of silk but also various commodities such as spices, textiles, and precious stones.\n", + "\n", + "- **Cultural Exchange**: The Silk Road served as a conduit for the exchange of goods, ideas, technologies, and cultures, significantly impacting the development of the civilizations involved.\n", + "\n", + "- **Dynamics Changes**: The significance of the Silk Road diminished in the late 15th century due to the advent of maritime trade routes and the rise of empires like the Ottoman Empire, which altered trade dynamics in the regions connected by the Silk Road.\n", + "\n", + "Overall, the Silk Road represents a crucial chapter in world history, illustrating how trade can foster interactions among diverse cultures over centuries.\n", + "\n" + ] + } + ], + "source": [ + "from haystack.dataclasses import ChatMessage\n", + "\n", + "result = main_agent.run(\n", + " messages=[\n", + " ChatMessage.from_user(\n", + " \"\"\"\n", + " Can you research the history of the Silk Road?\n", + " \"\"\"\n", + " )\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 66, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/", + "height": 143 + }, + "id": "2tuantjBeA1M", + "outputId": "d1eefc1f-d769-4954-9fa4-dfbd6e0ac837" + }, + "outputs": [ + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "'The Silk Road was a vast 
network of trade routes that facilitated commerce and cultural exchange among various civilizations, primarily between Europe and Asia. Here’s an overview of its history:\\n\\n- **Time Period**: The Silk Road was active from the 2nd century BCE until the mid-15th century, spanning over 6,400 kilometers (approximately 4,000 miles).\\n\\n- **Origin of the Name**: The term \"Silk Road\" comes from the lucrative silk trade that flourished along these routes, a practice that began during the Han Dynasty (206 BCE – 220 CE) under Emperor Wu.\\n\\n- **Geographical Connections**: The routes connected China with India, Persia, and extended to Europe, allowing not only for the trade of silk but also various commodities such as spices, textiles, and precious stones.\\n\\n- **Cultural Exchange**: The Silk Road served as a conduit for the exchange of goods, ideas, technologies, and cultures, significantly impacting the development of the civilizations involved.\\n\\n- **Dynamics Changes**: The significance of the Silk Road diminished in the late 15th century due to the advent of maritime trade routes and the rise of empires like the Ottoman Empire, which altered trade dynamics in the regions connected by the Silk Road.\\n\\nOverall, the Silk Road represents a crucial chapter in world history, illustrating how trade can foster interactions among diverse cultures over centuries.'" + ] + }, + "execution_count": 66, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "result[\"last_message\"].text" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "colab": { + "base_uri": "https://localhost:8080/" + }, + "id": "2vFzZA1d44N9", + "outputId": "a066a45a-4ea4-48ac-a029-918514c143eb" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: research_tool \n", + "Arguments: {\"messages\":[{\"role\":\"user\",\"content\":[{\"text\":\"Summarize how RAG pipelines 
work.\"}]}]}\n", + "\n", + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: wiki_search \n", + "Arguments: {\"query\":\"RAG pipelines\"}\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "File Content for https://en.wikipedia.org/wiki/Retrieval-augmented_generation\n", + "\n", + " Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information. With RAG, LLMs do ...File Content for https://en.wikipedia.org/wiki/Prompt_engineering\n", + "\n", + " GraphRAG (coined by Microsoft Research) is a technique that extends RAG with the use of a knowledge graph (usually, LLM-generated) to allow the model to ...File Content for https://en.wikipedia.org/wiki/Large_language_model\n", + "\n", + " Retrieval-augmented generation (RAG) is another approach that enhances LLMs by integrating them with document retrieval systems. Given a query, a document ...File Content for https://en.wikipedia.org/wiki/RagTime\n", + "\n", + " RagTime is a frame-oriented business publishing software which combines word processing, spreadsheets, simple drawings, image processing, and chartsFile Content for https://en.wikipedia.org/wiki/Zelten_oil_field\n", + "\n", + " SOC operates the Raguba field in the central part of the Sirte Basin. The field is connected by pipeline to the main line between the Zelten field and Brega.\n", + "\n", + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: wiki_search \n", + "Arguments: {\"query\":\"Retrieval-augmented generation\"}\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "File Content for https://en.wikipedia.org/wiki/Retrieval-augmented_generation\n", + "\n", + " Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information.File Content for https://en.wikipedia.org/wiki/Rag\n", + "\n", + " Retrieval-augmented generation, generative AI with the addition of information retrieval capabilities. See also. 
edit · All pages with titles beginning with Rag ...File Content for https://en.wikipedia.org/wiki/Prompt_engineering\n", + "\n", + " Retrieval-augmented generation (RAG) is a technique that enables generative ... Graph retrieval-augmented generation. edit. GraphRAG with a knowledge ...File Content for https://en.wikipedia.org/wiki/Large_language_model\n", + "\n", + " Retrieval-augmented generation (RAG) is another approach that enhances LLMs by integrating them with document retrieval systems. Given a query, a document ...File Content for https://en.wikipedia.org/wiki/Category:Information_retrieval_systems\n", + "\n", + " Retrieval-augmented generation · RetrievalWare · Reverse DNS lookup · Reverse telephone directory. S. List of search appliance vendors · Search engine ( ...\n", + "\n", + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: wiki_search \n", + "Arguments: {\"query\":\"how RAG pipelines work\"}\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "File Content for https://en.wikipedia.org/wiki/Retrieval-augmented_generation\n", + "\n", + " RAG improves large language models (LLMs) by incorporating information retrieval before generating responses. Unlike traditional LLMs that rely on static ...File Content for https://en.wikipedia.org/wiki/Prompt_engineering\n", + "\n", + " Prompt engineering is the process of structuring or crafting an instruction in order to produce the best possible output from a generative artificial ...File Content for https://en.wikipedia.org/wiki/RNA_therapeutics\n", + "\n", + " The main types of RNA therapeutics are those based on messenger RNA (mRNA), antisense RNA (asRNA), RNA interference (RNAi), RNA activation (RNAa) and RNA ...File Content for https://en.wikipedia.org/wiki/Aviation_Cadet_Training_Program_(USN)\n", + "\n", + " The US Navy had four programs (NavCad, NAP, AVMIDN, and MarCad) for the training of naval aviators.File Content for https://en.wikipedia.org/wiki/DataStax\n", + "\n", + " DataStax, Inc. 
is a real-time data for AI company based in Santa Clara, California. Its product Astra DB is a cloud database-as-a-service based on Apache ...\n", + "\n", + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: wiki_search \n", + "Arguments: {\"query\":\"Retrieval-augmented generation RAG mechanism\"}\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "File Content for https://en.wikipedia.org/wiki/Retrieval-augmented_generation\n", + "\n", + " Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information. With RAG ...File Content for https://en.wikipedia.org/wiki/Prompt_engineering\n", + "\n", + " Retrieval-augmented generation (RAG) is a technique that enables generative artificial intelligence (Gen AI) models to retrieve and incorporate new information.File Content for https://en.wikipedia.org/wiki/Large_language_model\n", + "\n", + " Retrieval-augmented generation (RAG) is another approach that enhances LLMs by integrating them with document retrieval systems. Given a query, a document ...File Content for https://en.wikipedia.org/wiki/Generative_artificial_intelligence\n", + "\n", + " Generative artificial intelligence is a subfield of artificial intelligence that uses generative models to produce text, images, videos, or other forms of ...File Content for https://en.wikipedia.org/wiki/Confidential_computing\n", + "\n", + " Confidential computing is a security and privacy-enhancing computational technique focused on protecting data in use.\n", + "\n", + "Retrieval-Augmented Generation (RAG) is a method that enhances large language models (LLMs) by integrating the capability to retrieve external information before generating responses. Here’s a summary of how RAG pipelines work:\n", + "\n", + "1. **Information Retrieval**: The first step in a RAG pipeline involves querying an external document retrieval system. 
Given an input query, the system retrieves relevant documents or pieces of information from a larger corpus.\n", + "\n", + "2. **Response Generation**: After retrieving the relevant documents, the RAG leverages the LLM to generate a response. The LLM uses the external documents as context, allowing it to produce more accurate and informed outputs compared to traditional LLMs that rely purely on their pre-existing knowledge.\n", + "\n", + "3. **Integration of New Information**: By incorporating real-time data retrieved from documents, RAG systems can provide answers that are more relevant, up-to-date, and comprehensive. This bridging between retrieval systems and generative models allows for better-informed and nuanced responses.\n", + "\n", + "In essence, RAG works by utilizing a two-step process, combining the strengths of information retrieval with the generative capabilities of LLMs, to enhance the quality of generated content.\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "{'messages': [{'role': 'system', 'meta': {}, 'name': None, 'content': [{'text': '\\n You are a research agent that can find information on web or specifically on wikipedia. \\n Use wiki_search tool if you need facts and use web_search tool for latest news on topics.\\n Use one tool at a time. 
Try different queries if you need more information.\\n Only use the retrieved context, do not use your own knowledge.\\n Summarize the all retrieved information before returning response to the user.\\n '}]}, {'role': 'user', 'meta': {}, 'name': None, 'content': [{'text': 'Summarize how RAG pipelines work.'}]}, {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'tool_calls', 'completion_start_time': '2025-05-28T15:18:33.573590', 'usage': None}, 'name': None, 'content': [{'tool_call': {'tool_name': 'wiki_search', 'arguments': {'query': 'RAG pipelines'}, 'id': 'call_xnAEMUy7iCVrRSDbViUJHqXG'}}]}, {'role': 'tool', 'meta': {}, 'name': None, 'content': [{'tool_call_result': {'result': 'File Content for https://en.wikipedia.org/wiki/Retrieval-augmented_generation\\n\\n Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information. With RAG, LLMs do ...File Content for https://en.wikipedia.org/wiki/Prompt_engineering\\n\\n GraphRAG (coined by Microsoft Research) is a technique that extends RAG with the use of a knowledge graph (usually, LLM-generated) to allow the model to ...File Content for https://en.wikipedia.org/wiki/Large_language_model\\n\\n Retrieval-augmented generation (RAG) is another approach that enhances LLMs by integrating them with document retrieval systems. Given a query, a document ...File Content for https://en.wikipedia.org/wiki/RagTime\\n\\n RagTime is a frame-oriented business publishing software which combines word processing, spreadsheets, simple drawings, image processing, and chartsFile Content for https://en.wikipedia.org/wiki/Zelten_oil_field\\n\\n SOC operates the Raguba field in the central part of the Sirte Basin. 
The field is connected by pipeline to the main line between the Zelten field and Brega.', 'origin': {'tool_name': 'wiki_search', 'arguments': {'query': 'RAG pipelines'}, 'id': 'call_xnAEMUy7iCVrRSDbViUJHqXG'}, 'error': False}}]}, {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'tool_calls', 'completion_start_time': '2025-05-28T15:18:35.182539', 'usage': None}, 'name': None, 'content': [{'tool_call': {'tool_name': 'wiki_search', 'arguments': {'query': 'Retrieval-augmented generation'}, 'id': 'call_ECjlLz2bcH0JO0po00qG50LT'}}]}, {'role': 'tool', 'meta': {}, 'name': None, 'content': [{'tool_call_result': {'result': 'File Content for https://en.wikipedia.org/wiki/Retrieval-augmented_generation\\n\\n Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information.File Content for https://en.wikipedia.org/wiki/Rag\\n\\n Retrieval-augmented generation, generative AI with the addition of information retrieval capabilities. See also. edit · All pages with titles beginning with Rag ...File Content for https://en.wikipedia.org/wiki/Prompt_engineering\\n\\n Retrieval-augmented generation (RAG) is a technique that enables generative ... Graph retrieval-augmented generation. edit. GraphRAG with a knowledge ...File Content for https://en.wikipedia.org/wiki/Large_language_model\\n\\n Retrieval-augmented generation (RAG) is another approach that enhances LLMs by integrating them with document retrieval systems. Given a query, a document ...File Content for https://en.wikipedia.org/wiki/Category:Information_retrieval_systems\\n\\n Retrieval-augmented generation · RetrievalWare · Reverse DNS lookup · Reverse telephone directory. S. 
List of search appliance vendors · Search engine ( ...', 'origin': {'tool_name': 'wiki_search', 'arguments': {'query': 'Retrieval-augmented generation'}, 'id': 'call_ECjlLz2bcH0JO0po00qG50LT'}, 'error': False}}]}, {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'tool_calls', 'completion_start_time': '2025-05-28T15:18:36.804910', 'usage': None}, 'name': None, 'content': [{'tool_call': {'tool_name': 'wiki_search', 'arguments': {'query': 'how RAG pipelines work'}, 'id': 'call_fIBCEMoLaUUDoJnZw137JVSP'}}]}, {'role': 'tool', 'meta': {}, 'name': None, 'content': [{'tool_call_result': {'result': 'File Content for https://en.wikipedia.org/wiki/Retrieval-augmented_generation\\n\\n RAG improves large language models (LLMs) by incorporating information retrieval before generating responses. Unlike traditional LLMs that rely on static ...File Content for https://en.wikipedia.org/wiki/Prompt_engineering\\n\\n Prompt engineering is the process of structuring or crafting an instruction in order to produce the best possible output from a generative artificial ...File Content for https://en.wikipedia.org/wiki/RNA_therapeutics\\n\\n The main types of RNA therapeutics are those based on messenger RNA (mRNA), antisense RNA (asRNA), RNA interference (RNAi), RNA activation (RNAa) and RNA ...File Content for https://en.wikipedia.org/wiki/Aviation_Cadet_Training_Program_(USN)\\n\\n The US Navy had four programs (NavCad, NAP, AVMIDN, and MarCad) for the training of naval aviators.File Content for https://en.wikipedia.org/wiki/DataStax\\n\\n DataStax, Inc. is a real-time data for AI company based in Santa Clara, California. 
Its product Astra DB is a cloud database-as-a-service based on Apache ...', 'origin': {'tool_name': 'wiki_search', 'arguments': {'query': 'how RAG pipelines work'}, 'id': 'call_fIBCEMoLaUUDoJnZw137JVSP'}, 'error': False}}]}, {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'tool_calls', 'completion_start_time': '2025-05-28T15:18:38.455416', 'usage': None}, 'name': None, 'content': [{'tool_call': {'tool_name': 'wiki_search', 'arguments': {'query': 'Retrieval-augmented generation RAG mechanism'}, 'id': 'call_3GRCPA5myd3uwqFhqbvXn1qk'}}]}, {'role': 'tool', 'meta': {}, 'name': None, 'content': [{'tool_call_result': {'result': 'File Content for https://en.wikipedia.org/wiki/Retrieval-augmented_generation\\n\\n Retrieval-augmented generation (RAG) is a technique that enables large language models (LLMs) to retrieve and incorporate new information. With RAG ...File Content for https://en.wikipedia.org/wiki/Prompt_engineering\\n\\n Retrieval-augmented generation (RAG) is a technique that enables generative artificial intelligence (Gen AI) models to retrieve and incorporate new information.File Content for https://en.wikipedia.org/wiki/Large_language_model\\n\\n Retrieval-augmented generation (RAG) is another approach that enhances LLMs by integrating them with document retrieval systems. 
Given a query, a document ...File Content for https://en.wikipedia.org/wiki/Generative_artificial_intelligence\\n\\n Generative artificial intelligence is a subfield of artificial intelligence that uses generative models to produce text, images, videos, or other forms of ...File Content for https://en.wikipedia.org/wiki/Confidential_computing\\n\\n Confidential computing is a security and privacy-enhancing computational technique focused on protecting data in use.', 'origin': {'tool_name': 'wiki_search', 'arguments': {'query': 'Retrieval-augmented generation RAG mechanism'}, 'id': 'call_3GRCPA5myd3uwqFhqbvXn1qk'}, 'error': False}}]}, {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'stop', 'completion_start_time': '2025-05-28T15:18:41.265652', 'usage': None}, 'name': None, 'content': [{'text': 'Retrieval-Augmented Generation (RAG) is a method that enhances large language models (LLMs) by integrating the capability to retrieve external information before generating responses. Here’s a summary of how RAG pipelines work:\\n\\n1. **Information Retrieval**: The first step in a RAG pipeline involves querying an external document retrieval system. Given an input query, the system retrieves relevant documents or pieces of information from a larger corpus.\\n\\n2. **Response Generation**: After retrieving the relevant documents, the RAG leverages the LLM to generate a response. The LLM uses the external documents as context, allowing it to produce more accurate and informed outputs compared to traditional LLMs that rely purely on their pre-existing knowledge.\\n\\n3. **Integration of New Information**: By incorporating real-time data retrieved from documents, RAG systems can provide answers that are more relevant, up-to-date, and comprehensive. 
This bridging between retrieval systems and generative models allows for better-informed and nuanced responses.\\n\\nIn essence, RAG works by utilizing a two-step process, combining the strengths of information retrieval with the generative capabilities of LLMs, to enhance the quality of generated content.'}]}], 'last_message': {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'stop', 'completion_start_time': '2025-05-28T15:18:41.265652', 'usage': None}, 'name': None, 'content': [{'text': 'Retrieval-Augmented Generation (RAG) is a method that enhances large language models (LLMs) by integrating the capability to retrieve external information before generating responses. Here’s a summary of how RAG pipelines work:\\n\\n1. **Information Retrieval**: The first step in a RAG pipeline involves querying an external document retrieval system. Given an input query, the system retrieves relevant documents or pieces of information from a larger corpus.\\n\\n2. **Response Generation**: After retrieving the relevant documents, the RAG leverages the LLM to generate a response. The LLM uses the external documents as context, allowing it to produce more accurate and informed outputs compared to traditional LLMs that rely purely on their pre-existing knowledge.\\n\\n3. **Integration of New Information**: By incorporating real-time data retrieved from documents, RAG systems can provide answers that are more relevant, up-to-date, and comprehensive. 
This bridging between retrieval systems and generative models allows for better-informed and nuanced responses.\\n\\nIn essence, RAG works by utilizing a two-step process, combining the strengths of information retrieval with the generative capabilities of LLMs, to enhance the quality of generated content.'}]}}\n", + "\n", + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: writer_tool \n", + "Arguments: {\"messages\":[{\"role\":\"user\",\"content\":[{\"text\":\"Summary of how RAG pipelines work:\\n\\nRetrieval-Augmented Generation (RAG) is a method that enhances large language models (LLMs) by integrating the capability to retrieve external information before generating responses. Here’s a summary of how RAG pipelines work:\\n\\n1. **Information Retrieval**: The first step in a RAG pipeline involves querying an external document retrieval system. Given an input query, the system retrieves relevant documents or pieces of information from a larger corpus.\\n\\n2. **Response Generation**: After retrieving the relevant documents, the RAG leverages the LLM to generate a response. The LLM uses the external documents as context, allowing it to produce more accurate and informed outputs compared to traditional LLMs that rely purely on their pre-existing knowledge.\\n\\n3. **Integration of New Information**: By incorporating real-time data retrieved from documents, RAG systems can provide answers that are more relevant, up-to-date, and comprehensive. 
This bridging between retrieval systems and generative models allows for better-informed and nuanced responses.\\n\\nIn essence, RAG works by utilizing a two-step process, combining the strengths of information retrieval with the generative capabilities of LLMs, to enhance the quality of generated content.\"}]}]}\n", + "\n", + "\n", + "\n", + "[TOOL CALL]\n", + "Tool: notion_writer \n", + "Arguments: {\"title\":\"How RAG Pipelines Work\",\"content\":\"Retrieval-Augmented Generation (RAG) is a method that enhances large language models (LLMs) by integrating the capability to retrieve external information before generating responses. Here’s a summary of how RAG pipelines work:\\n\\n1. **Information Retrieval**: The first step in a RAG pipeline involves querying an external document retrieval system. Given an input query, the system retrieves relevant documents or pieces of information from a larger corpus.\\n\\n2. **Response Generation**: After retrieving the relevant documents, the RAG leverages the LLM to generate a response. The LLM uses the external documents as context, allowing it to produce more accurate and informed outputs compared to traditional LLMs that rely purely on their pre-existing knowledge.\\n\\n3. **Integration of New Information**: By incorporating real-time data retrieved from documents, RAG systems can provide answers that are more relevant, up-to-date, and comprehensive. 
This bridging between retrieval systems and generative models allows for better-informed and nuanced responses.\\n\\nIn essence, RAG works by utilizing a two-step process, combining the strengths of information retrieval with the generative capabilities of LLMs, to enhance the quality of generated content.\"}\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "{'success': True, 'status_code': 200}\n", + "\n", + "\n", + "\n", + "[TOOL RESULT]\n", + "{'messages': [{'role': 'system', 'meta': {}, 'name': None, 'content': [{'text': '\\n You are a writer agent that saves given information to different locations.\\n Do not change the provided content before saving.\\n Infer the title from the text if not provided. \\n When you need to save provided information to Notion, use notion_writer tool.\\n When you need to save provided information to document store, use doc_store_writer tool\\n If no location is mentioned, use notion_writer tool to save the information.\\n '}]}, {'role': 'user', 'meta': {}, 'name': None, 'content': [{'text': 'Summary of how RAG pipelines work:\\n\\nRetrieval-Augmented Generation (RAG) is a method that enhances large language models (LLMs) by integrating the capability to retrieve external information before generating responses. Here’s a summary of how RAG pipelines work:\\n\\n1. **Information Retrieval**: The first step in a RAG pipeline involves querying an external document retrieval system. Given an input query, the system retrieves relevant documents or pieces of information from a larger corpus.\\n\\n2. **Response Generation**: After retrieving the relevant documents, the RAG leverages the LLM to generate a response. The LLM uses the external documents as context, allowing it to produce more accurate and informed outputs compared to traditional LLMs that rely purely on their pre-existing knowledge.\\n\\n3. 
**Integration of New Information**: By incorporating real-time data retrieved from documents, RAG systems can provide answers that are more relevant, up-to-date, and comprehensive. This bridging between retrieval systems and generative models allows for better-informed and nuanced responses.\\n\\nIn essence, RAG works by utilizing a two-step process, combining the strengths of information retrieval with the generative capabilities of LLMs, to enhance the quality of generated content.'}]}, {'role': 'assistant', 'meta': {'model': 'gpt-4o-mini-2024-07-18', 'index': 0, 'finish_reason': 'tool_calls', 'completion_start_time': '2025-05-28T15:18:50.908456', 'usage': None}, 'name': None, 'content': [{'tool_call': {'tool_name': 'notion_writer', 'arguments': {'title': 'How RAG Pipelines Work', 'content': 'Retrieval-Augmented Generation (RAG) is a method that enhances large language models (LLMs) by integrating the capability to retrieve external information before generating responses. Here’s a summary of how RAG pipelines work:\\n\\n1. **Information Retrieval**: The first step in a RAG pipeline involves querying an external document retrieval system. Given an input query, the system retrieves relevant documents or pieces of information from a larger corpus.\\n\\n2. **Response Generation**: After retrieving the relevant documents, the RAG leverages the LLM to generate a response. The LLM uses the external documents as context, allowing it to produce more accurate and informed outputs compared to traditional LLMs that rely purely on their pre-existing knowledge.\\n\\n3. **Integration of New Information**: By incorporating real-time data retrieved from documents, RAG systems can provide answers that are more relevant, up-to-date, and comprehensive. 
This bridging between retrieval systems and generative models allows for better-informed and nuanced responses.\\n\\nIn essence, RAG works by utilizing a two-step process, combining the strengths of information retrieval with the generative capabilities of LLMs, to enhance the quality of generated content.'}, 'id': 'call_t8V2bODfsxlOiTeCdVWWznir'}}]}, {'role': 'tool', 'meta': {}, 'name': None, 'content': [{'tool_call_result': {'result': \"{'success': True, 'status_code': 200}\", 'origin': {'tool_name': 'notion_writer', 'arguments': {'title': 'How RAG Pipelines Work', 'content': 'Retrieval-Augmented Generation (RAG) is a method that enhances large language models (LLMs) by integrating the capability to retrieve external information before generating responses. Here’s a summary of how RAG pipelines work:\\n\\n1. **Information Retrieval**: The first step in a RAG pipeline involves querying an external document retrieval system. Given an input query, the system retrieves relevant documents or pieces of information from a larger corpus.\\n\\n2. **Response Generation**: After retrieving the relevant documents, the RAG leverages the LLM to generate a response. The LLM uses the external documents as context, allowing it to produce more accurate and informed outputs compared to traditional LLMs that rely purely on their pre-existing knowledge.\\n\\n3. **Integration of New Information**: By incorporating real-time data retrieved from documents, RAG systems can provide answers that are more relevant, up-to-date, and comprehensive. 
This bridging between retrieval systems and generative models allows for better-informed and nuanced responses.\\n\\nIn essence, RAG works by utilizing a two-step process, combining the strengths of information retrieval with the generative capabilities of LLMs, to enhance the quality of generated content.'}, 'id': 'call_t8V2bODfsxlOiTeCdVWWznir'}, 'error': False}}]}], 'last_message': {'role': 'tool', 'meta': {}, 'name': None, 'content': [{'tool_call_result': {'result': \"{'success': True, 'status_code': 200}\", 'origin': {'tool_name': 'notion_writer', 'arguments': {'title': 'How RAG Pipelines Work', 'content': 'Retrieval-Augmented Generation (RAG) is a method that enhances large language models (LLMs) by integrating the capability to retrieve external information before generating responses. Here’s a summary of how RAG pipelines work:\\n\\n1. **Information Retrieval**: The first step in a RAG pipeline involves querying an external document retrieval system. Given an input query, the system retrieves relevant documents or pieces of information from a larger corpus.\\n\\n2. **Response Generation**: After retrieving the relevant documents, the RAG leverages the LLM to generate a response. The LLM uses the external documents as context, allowing it to produce more accurate and informed outputs compared to traditional LLMs that rely purely on their pre-existing knowledge.\\n\\n3. **Integration of New Information**: By incorporating real-time data retrieved from documents, RAG systems can provide answers that are more relevant, up-to-date, and comprehensive. 
This bridging between retrieval systems and generative models allows for better-informed and nuanced responses.\\n\\nIn essence, RAG works by utilizing a two-step process, combining the strengths of information retrieval with the generative capabilities of LLMs, to enhance the quality of generated content.'}, 'id': 'call_t8V2bODfsxlOiTeCdVWWznir'}, 'error': False}}]}}\n", + "\n", + "The summary of how Retrieval-Augmented Generation (RAG) pipelines work has been successfully saved in Notion. If you need anything else, feel free to ask!\n", + "\n" + ] + } + ], + "source": [ + "result = main_agent.run(\n", + " messages=[\n", + " ChatMessage.from_user(\n", + " \"\"\"\n", + " Summarize how RAG pipelines work and save it in Notion\n", + " \"\"\"\n", + " )\n", + " ]\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "iK5SgKoXep5F" + }, + "source": [ + "## What's next\n", + "\n", + "🎉 Congratulations! You've just built a multi-agent system with Haystack, where specialized agents work together to research and write, each with their own tools and responsibilities. You now have a flexible foundation for building more complex, modular agent workflows.\n", + "\n", + "Curious to keep exploring? 
Here are a few great next steps:\n", + "\n", + "* [DevOps Support Agent with Human in the Loop](https://haystack.deepset.ai/cookbook/agent_with_human_in_the_loop)\n", + "* [Building an Agentic RAG with Fallback to Websearch](https://haystack.deepset.ai/tutorials/36_building_fallbacks_with_conditional_routing)\n", + "* [Introduction to Multimodal Text Generation](https://haystack.deepset.ai/cookbook/multimodal_intro)\n", + "\n", + "To stay up to date on the latest Haystack developments, you can [sign up for our newsletter](https://landing.deepset.ai/haystack-community-updates) or [join the Haystack Discord community](https://discord.gg/Dr63fr9NDS).\n", + "\n" + ] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + }, + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +}