Comparing changes

base repository: openai/openai-agents-python
base: v0.0.12
head repository: openai/openai-agents-python
compare: main

Commits on Apr 23, 2025

  1. Adding extra_headers parameters to ModelSettings (#550)

    jonnyk20 authored Apr 23, 2025

    111fc9e
  2. Examples: Fix financial_research_agent instructions (#573)

    seratch authored Apr 23, 2025

    178020e
  3. Allow cancel out of the streaming result (#579)

    Fix for #574 
    
    @rm-openai I'm not sure how to add a test within the repo but I have
    pasted a test script below that seems to work
    
    ```python
    import asyncio
    from openai.types.responses import ResponseTextDeltaEvent
    from agents import Agent, Runner
    
    async def main():
        agent = Agent(
            name="Joker",
            instructions="You are a helpful assistant.",
        )
    
        result = Runner.run_streamed(agent, input="Please tell me 5 jokes.")
        num_visible_event = 0
        async for event in result.stream_events():
            if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
                print(event.data.delta, end="", flush=True)
                num_visible_event += 1
                print(num_visible_event)
                if num_visible_event == 3:
                    result.cancel()
    
    
    if __name__ == "__main__":
        asyncio.run(main())
    ```
    handrew authored Apr 23, 2025

    a113fea

Commits on Apr 24, 2025

  1. Create to_json_dict for ModelSettings (#582)

    Now that `ModelSettings` has `Reasoning`, a non-primitive object,
    `dataclasses.asdict()` won't work. It will raise an error when you try
    to serialize (e.g. for tracing). This ensures the object is actually
    serializable.
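
    As a rough illustration of the problem and the new helper (a hedged sketch; the
    method name comes from this commit's title):

    ```python
    from agents import ModelSettings
    from openai.types.shared import Reasoning

    settings = ModelSettings(reasoning=Reasoning(effort="high"))
    # dataclasses.asdict(settings) keeps the nested Reasoning instance, which is not
    # JSON serializable when the settings are exported (e.g. for tracing).
    payload = settings.to_json_dict()  # plain, JSON-compatible dict
    ```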
    rm-openai authored Apr 24, 2025

    3755ea8
  2. Prevent MCP ClientSession hang (#580)

    Per
    https://modelcontextprotocol.io/specification/draft/basic/lifecycle#timeouts
    
    "Implementations SHOULD establish timeouts for all sent requests, to
    prevent hung connections and resource exhaustion. When the request has
    not received a success or error response within the timeout period, the
    sender SHOULD issue a cancellation notification for that request and
    stop waiting for a response.
    
    SDKs and other middleware SHOULD allow these timeouts to be configured
    on a per-request basis."
    
    I picked 5 seconds since that's the default for SSE.
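
    A hedged sketch of configuring the timeout (the parameter name here is an
    assumption; see this PR for the exact API):

    ```python
    from agents.mcp import MCPServerStdio

    server = MCPServerStdio(
        params={
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
        },
        # Per-request read timeout so a silent server can't hang the run (assumed name).
        client_session_timeout_seconds=5,
    )
    ```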
    njbrake authored Apr 24, 2025

    af80e3a
  3. Fix stream error using LiteLLM (#589)

    In response to issue #587, I implemented a solution that first checks whether
    the `refusal` and `usage` attributes exist on the `delta` object.
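
    Roughly, the defensive read looks like this (an illustrative sketch, not the
    exact SDK code):

    ```python
    from types import SimpleNamespace

    # LiteLLM stream chunks may omit these attributes entirely.
    chunk = SimpleNamespace(choices=[SimpleNamespace(delta=SimpleNamespace(content="Hi"))])
    delta = chunk.choices[0].delta
    refusal = getattr(delta, "refusal", None)
    usage = getattr(chunk, "usage", None)
    ```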
    
    I added a unit test similar to `test_openai_chatcompletions_stream.py`.
    
    Let me know if I should change something.
    
    ---------
    
    Co-authored-by: Rohan Mehta <rm@openai.com>
    DanieleMorotti and rm-openai authored Apr 24, 2025

    e11b822
  4. More tests for cancelling streamed run (#590)

    rm-openai authored Apr 24, 2025

    45eb41f
  5. v0.0.13 (#593)

    rm-openai authored Apr 24, 2025

    3bbc7c4
  6. Add usage to context in streaming (#595)

    rm-openai authored Apr 24, 2025

    8fd7773
  7. Make the TTS voices type exportable (#577)

    When using the voice agent in typed code, it is suboptimal and error-prone
    to type the TTS voice variables in your code independently.
    
    With this commit we are making the type exportable so that developers
    can just use that and be future-proof.
    
    Example of usage in code:
    
    ```python
    DEFAULT_TTS_VOICE: TTSModelSettings.TTSVoice = "alloy"

    ...

    tts_voice: TTSModelSettings.TTSVoice = DEFAULT_TTS_VOICE

    ...

    output = await VoicePipeline(
        workflow=workflow,
        config=VoicePipelineConfig(
            tts_settings=TTSModelSettings(
                buffer_size=512,
                transform_data=transform_data,
                voice=tts_voice,
                instructions=tts_instructions,
            )
        ),
    ).run(audio_input)
    ```
    
    ---------
    
    Co-authored-by: Rohan Mehta <rm@openai.com>
    mangiucugna and rm-openai authored Apr 24, 2025

    aa197e1

Commits on Apr 25, 2025

  1. docs: add FutureAGI to tracing documentation (#592)

    Hi Team! 
    
    This PR adds FutureAGI to the tracing documentation as one of the
    automatic tracing processors for OpenAI agents SDK.
    
    
    ![image](https://github.com/user-attachments/assets/4de3aadc-5efa-4712-8b02-decdedf8f8ef)
    NVJKKartik authored Apr 25, 2025

    4187fba

Commits on Apr 29, 2025

  1. Update litellm version (#626)

    Addresses #614
    pakrym-oai authored Apr 29, 2025

    db0ee9d

Commits on Apr 30, 2025

  1. 0.0.14 release (#635)

    pakrym-oai authored Apr 30, 2025

    f976349

Commits on May 14, 2025

  1. Fixed a bug for "detail" attribute in input image (#685)

    When an input image is given as input, the code tries to access the
    'detail' key, that may not be present as noted in #159.
    
    With this pull request, now it tries to access the key, otherwise set
    the value to `None`.
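
    For context, an input image item without a `detail` key now works; the exact
    message shape below is an assumption based on the issue:

    ```python
    from agents import Agent, Runner

    agent = Agent(name="Assistant", instructions="Describe the image briefly.")
    user_input = [
        {
            "role": "user",
            "content": [
                # No "detail" key here; previously this raised a KeyError.
                {"type": "input_image", "image_url": "https://example.com/photo.png"},
            ],
        }
    ]
    result = Runner.run_sync(agent, user_input)
    ```
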
    @pakrym-oai  or @rm-openai let me know if you want any changes.
    DanieleMorotti authored May 14, 2025

    2c46dae
  2. feat: pass extra_body through to LiteLLM acompletion (#638)

    **Purpose**  
    Allow arbitrary `extra_body` parameters (e.g. `cached_content`) to be
    forwarded into the LiteLLM call. Useful for context caching in Gemini
    models
    ([docs](https://ai.google.dev/gemini-api/docs/caching?lang=python)).
    
    **Example usage**  
    ```python
    import os
    from agents import Agent, ModelSettings
    from agents.extensions.models.litellm_model import LitellmModel
    
    cache_name = "cachedContents/34jopukfx5di"  # previously stored context
    
    gemini_model = LitellmModel(
        model="gemini/gemini-1.5-flash-002",
        api_key=os.getenv("GOOGLE_API_KEY")
    )
    
    agent = Agent(
        name="Cached Gemini Agent",
        model=gemini_model,
        model_settings=ModelSettings(
            extra_body={"cached_content": cache_name}
        )
    )
    ```
    AshokSaravanan222 authored May 14, 2025

    1994f9d
  3. Update search_agent.py (#677)

    Added missing word "be" in prompt instructions.
    
    This is unlikely to change the agent functionality in most cases, but
    optimal clarity in prompt language is a best practice.
    leohpark authored May 14, 2025

    02b6e70
  4. feat: Streamable HTTP support (#643)

    Co-authored-by: aagarwal25 <akshit_agarwal@intuit.com>
    Akshit97 and aagarwal25 authored May 14, 2025

    1847008

Commits on May 15, 2025

  1. v0.0.15 (#701)

    rm-openai authored May 15, 2025

    5fe096d

Commits on May 18, 2025

  1. Create AGENTS.md (#707)

    Adding an AGENTS.md file for Codex use
    dkundel-openai authored May 18, 2025

    c282324
  2. Added mcp 'instructions' attribute to the server (#706)

    Added the `instructions` attribute to the MCP servers to solve #704 .
    
    Let me know if you want to add an example to the documentation.
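
    In the meantime, a hedged sketch of reading the new attribute (the attribute
    name comes from this PR's description):

    ```python
    import asyncio

    from agents.mcp import MCPServerStdio

    async def main() -> None:
        async with MCPServerStdio(
            params={
                "command": "npx",
                "args": ["-y", "@modelcontextprotocol/server-filesystem", "."],
            },
        ) as server:
            # Populated from the server's initialize result after connecting (may be None).
            print(server.instructions)

    asyncio.run(main())
    ```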
    DanieleMorotti authored May 18, 2025

    003cbfe

Commits on May 19, 2025

  1. Add Galileo to external tracing processors list (#662)

    franz101 authored May 19, 2025

    428c9a6

Commits on May 20, 2025

  1. Dev/add usage details to Usage class (#726)

    PR to enhance the `Usage` object and related logic, to support more
    granular token accounting, matching the details available in the [OpenAI
    Responses API](https://platform.openai.com/docs/api-reference/responses)
    . Specifically, it:
    
    - Adds `input_tokens_details` and `output_tokens_details` fields to the
    `Usage` dataclass, storing detailed token breakdowns (e.g.,
    `cached_tokens`, `reasoning_tokens`).
    - Flows this change through
    - Updates and extends tests to match
    - Adds a test for the Usage.add method
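
    A minimal sketch of the fields described above (the detail objects mirror the
    Responses API usage payload; treat the exact attribute names as assumptions):

    ```python
    from agents.usage import Usage

    usage = Usage()
    # After a run, these carry granular breakdowns, e.g.:
    #   usage.input_tokens_details.cached_tokens
    #   usage.output_tokens_details.reasoning_tokens
    print(usage.total_tokens)
    ```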
    
    ### Motivation
    - Aligns the SDK’s usage with the latest OpenAI responses API Usage
    object
    - Supports downstream use cases that require fine-grained token usage
    data (e.g., billing, analytics, optimization) requested by startups
    
    ---------
    
    Co-authored-by: Wulfie Bain <wulfie@openai.com>
    WJPBProjects authored May 20, 2025

    466b44d

Commits on May 21, 2025

  1. Upgrade openAI sdk version (#730)

    ---
    [//]: # (BEGIN SAPLING FOOTER)
    * #732
    * #731
    * __->__ #730
    rm-openai authored May 21, 2025

    ce2e2a4
  2. Hosted MCP support (#731)

    ---
    [//]: # (BEGIN SAPLING FOOTER)
    * #732
    * __->__ #731
    rm-openai authored May 21, 2025

    9fa5c39
  3. Add support for local shell, image generator, code interpreter tools (#…

    rm-openai authored May 21, 2025

    079764f
  4. v0.0.16 (#733)

    rm-openai authored May 21, 2025

    1992be3
  5. fix Gemini token validation issue with LiteLLM (#735)

    Fix for #734
    handrew authored May 21, 2025

    1364f44

Commits on May 23, 2025

  1. Fix visualization recursion with cycle detection (#737)

    ## Summary
    - avoid infinite recursion in visualization by tracking visited agents
    - test cycle detection in graph utility
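
    A minimal sketch of the kind of cycle this guards against (rendering requires
    the optional graphviz dependency):

    ```python
    from agents import Agent
    from agents.extensions.visualization import draw_graph

    agent_a = Agent(name="A")
    agent_b = Agent(name="B", handoffs=[agent_a])
    agent_a.handoffs.append(agent_b)  # A -> B -> A handoff cycle
    draw_graph(agent_a)  # completes instead of recursing forever
    ```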
    
    ## Testing
    - `make mypy`
    - `make tests` 
    
    Resolves #668
    rm-openai authored May 23, 2025

    db462e3
  2. Update MCP and tool docs (#736)

    ## Summary
    - mention MCPServerStreamableHttp in MCP server docs
    - document CodeInterpreterTool, HostedMCPTool, ImageGenerationTool and
    LocalShellTool
    - update Japanese translations
    rm-openai authored May 23, 2025

    a96108e
  3. Fix Gemini API content filter handling (#746)

    ## Summary
    - avoid AttributeError when Gemini API returns `None` for chat message
    - return empty output if message is filtered
    - add regression test
    
    ## Testing
    - `make format`
    - `make lint`
    - `make mypy`
    - `make tests`
    
    Towards #744
    rm-openai authored May 23, 2025

    6e078bf

Commits on May 29, 2025

  1. Add Portkey AI as a tracing provider (#785)

    This PR adds Portkey AI as a tracing provider. Portkey helps you take
    your OpenAI agents from prototype to production.
    
    Portkey turns your experimental OpenAI Agents into production-ready
    systems by providing:
    
    - Complete observability of every agent step, tool use, and interaction
    - Built-in reliability with fallbacks, retries, and load balancing
    - Cost tracking and optimization to manage your AI spend
    - Access to 1600+ LLMs through a single integration
    - Guardrails to keep agent behavior safe and compliant
    - Version-controlled prompts for consistent agent performance
    
    
    Towards #786
    siddharthsambharia-portkey authored May 29, 2025

    d46e2ec
  2. Added RunErrorDetails object for MaxTurnsExceeded exception (#743)

    ### Summary
    
    Introduced the `RunErrorDetails` object to get partial results from a
    run interrupted by `MaxTurnsExceeded` exception. In this proposal the
    `RunErrorDetails` object contains all the fields from `RunResult` with
    `final_output` set to `None` and `output_guardrail_results` set to an
    empty list. We can decide to return less information.
    
    @rm-openai At the moment the exception doesn't return the
    `RunErrorDetails` object for the streaming mode. Do you have any
    suggestions on how to deal with it? The relevant code is in the
    `_check_errors` function of the `agents/result.py` file.
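
    A hedged sketch of how callers might use this (the attribute that exposes
    `RunErrorDetails` on the exception is an assumption here):

    ```python
    import asyncio

    from agents import Agent, Runner
    from agents.exceptions import MaxTurnsExceeded

    async def main() -> None:
        agent = Agent(name="Assistant", instructions="Answer concisely.")
        try:
            await Runner.run(agent, "Summarize this repo.", max_turns=1)
        except MaxTurnsExceeded as exc:
            details = exc.run_data  # RunErrorDetails with partial results (assumed attribute)
            print(details.new_items, details.final_output)  # final_output is None

    asyncio.run(main())
    ```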
    
    ### Test plan
    
    I have not implemented any tests currently, but if needed I can
    implement a basic test to retrieve partial data.
    
    ### Issue number
    
    This PR is an attempt to solve issue #719 
    
    ### Checks
    
    - [x] I've added new tests (if relevant)
    - [ ] I've added/updated the relevant documentation
    - [x] I've run `make lint` and `make format`
    - [x] I've made sure tests pass
    DanieleMorotti authored May 29, 2025

    7196862
  3. Fixed Python syntax (#665)

    sarmadgulzar authored May 29, 2025

    47fa8e8

Commits on May 30, 2025

  1. Small fix for litellm model (#789)

    Small fix:
    
    Removing `import litellm.types`, as it's outside the try/except block for
    importing litellm (so the import error message isn't displayed), and the
    line isn't actually needed. I was reproducing a GitHub issue and came
    across this in the process.
    robtinn authored May 30, 2025

    b699d9a
  2. Fix typo in assertion message for handoff function (#780)

    ### Overview
    
    This PR fixes a typo in the assert statement within the `handoff`
    function in `handoffs.py`, changing `'on_input'` to `'on_handoff'` for
    accuracy and clarity.
    
    ### Changes
    
    - Corrected the word “on_input” to “on_handoff” in the docstring.
    
    ### Motivation
    
    Clear and correct documentation improves code readability and reduces
    confusion for users and contributors.
    
    ### Checklist
    
    - [x] I have reviewed the docstring after making the change.
    - [x] No functionality is affected.
    - [x] The change follows the repository’s contribution guidelines.
    Rehan-Ul-Haq authored May 30, 2025

    16fb29c
  3. Fix typo: Replace 'two' with 'three' in /docs/mcp.md (#757)

    The documentation in `docs/mcp.md` listed three server types (stdio,
    HTTP over SSE, Streamable HTTP) but incorrectly stated "two kinds of
    servers" in the heading. This PR fixes the numerical discrepancy.
    
    **Changes:** 
    
    - Modified from "two kinds of servers" to "three kinds of servers". 
    - File: `docs/mcp.md` (line 11).
    luochang212 authored May 30, 2025

    0a28d71
  4. Update input_guardrails.py (#774)

    Changed the function comment, as input_guardrails only deals with input
    messages.
    venkatnaveen7 authored May 30, 2025

    ad80f78
  5. docs: fix typo in docstring for is_strict_json_schema method (#775)

    ### Overview
    
    This PR fixes a small typo in the docstring of the
    `is_strict_json_schema` abstract method of the `AgentOutputSchemaBase`
    class in `agent_output.py`.
    
    ### Changes
    
    - Corrected the word “valis” to “valid” in the docstring.
    
    ### Motivation
    
    Clear and correct documentation improves code readability and reduces
    confusion for users and contributors.
    
    ### Checklist
    
    - [x] I have reviewed the docstring after making the change.
    - [x] No functionality is affected.
    - [x] The change follows the repository’s contribution guidelines.
    Rehan-Ul-Haq authored May 30, 2025

    6438350
  6. Add comment to handoff_occured misspelling (#792)

    People keep trying to fix this, but it's a breaking change.
    rm-openai authored May 30, 2025

    cfe9099

Commits on Jun 2, 2025

  1. Fix #777 by handling MCPCall events in RunImpl (#799)

    This pull request resolves #777; if you think we should introduce a new
    item type for MCP call output, please let me know. As other hosted tools
    use this event, I believe using the same should be good to go, though.
    seratch authored Jun 2, 2025

    3e7b286
  2. Ensure item.model_dump only contains JSON serializable types (#801)

    The EmbeddedResource from MCP tool call contains a field with type
    AnyUrl that is not JSON-serializable. To avoid this exception, use
    item.model_dump(mode="json") to ensure a JSON-serializable return value.
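
    A minimal illustration of the underlying Pydantic behavior:

    ```python
    from pydantic import AnyUrl, BaseModel

    class Resource(BaseModel):
        uri: AnyUrl

    r = Resource(uri="https://example.com/doc")
    r.model_dump()              # keeps a URL object in the dict (not JSON serializable)
    r.model_dump(mode="json")   # {'uri': 'https://example.com/doc'} (plain string)
    ```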
    westhood authored Jun 2, 2025

    775d3e2
  3. Don't cache agent tools during a run (#803)

    ### Summary:
    Towards #767. We were caching the list of tools for an agent, so if you
    did `agent.tools.append(...)` from a tool call, the next call to the
    model wouldn't include the new tool. This is a bug.
    
    ### Test Plan:
    Unit tests. Note that now MCP tools are listed each time the agent runs
    (users can still cache the `list_tools` however).
    rm-openai authored Jun 2, 2025

    d4c7a23
  4. Only start tracing worker thread on first span/trace (#804)

    Closes #796. Shouldn't start a busy-waiting thread if there aren't any
    traces.
    
    Test plan
    ```
    import threading
    assert threading.active_count() == 1
    import agents
    assert threading.active_count() == 1
    ```
    rm-openai authored Jun 2, 2025

    995af4d

Commits on Jun 3, 2025

  1. Add is_enabled to FunctionTool (#808)

    ### Summary:
    Allows a user to do `function_tool(is_enabled=<some_callable>)`; the
    callable is called when the agent runs.
    
    This allows you to dynamically enable/disable a tool based on the
    context/env.
    
    The meta-goal is to allow `Agent` to be effectively immutable. That
    enables some nice things down the line, and this allows you to
    dynamically modify the tools list without mutating the agent.
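
    A hedged sketch of the usage (the callable's exact signature is an assumption):

    ```python
    from agents import Agent, RunContextWrapper, function_tool

    def only_for_admins(ctx: RunContextWrapper, agent: Agent) -> bool:
        # Enable the tool only when the run context marks the user as an admin.
        return bool(getattr(ctx.context, "is_admin", False))

    @function_tool(is_enabled=only_for_admins)
    def delete_account(user_id: str) -> str:
        """Delete the given account."""
        return f"deleted {user_id}"
    ```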
    
    ### Test Plan:
    Unit tests
    rm-openai authored Jun 3, 2025

    4046fcb

Commits on Jun 4, 2025

  1. v0.0.17 (#809)

    bump version
    rm-openai authored Jun 4, 2025

    204bec1
  2. Add REPL run_demo_loop helper (#811)

    rm-openai authored Jun 4, 2025

    4a529e6
  3. Crosslink to js/ts (#815)

    rm-openai authored Jun 4, 2025

    05db7a6
  4. Add release documentation (#814)

    ## Summary
    - describe semantic versioning and release steps
    - add release page to documentation nav
    
    ## Testing
    - `make format`
    - `make lint`
    - `make mypy`
    - `make tests`
    - `make build-docs`
    
    
    ------
    https://chatgpt.com/codex/tasks/task_i_68409d25afdc83218ad362d10c8a80a1
    rm-openai authored Jun 4, 2025

    5c7c678

Commits on Jun 9, 2025

  1. Fix handoff transfer message JSON (#818)

    ## Summary
    - ensure `Handoff.get_transfer_message` emits valid JSON
    - test transfer message validity
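
    A small check of the behavior described above, assuming the public `handoff()`
    helper and `get_transfer_message()` method:

    ```python
    import json

    from agents import Agent, handoff

    billing_agent = Agent(name="Billing agent")
    transfer = handoff(billing_agent)
    message = transfer.get_transfer_message(billing_agent)
    assert isinstance(json.loads(message), dict)  # parses as valid JSON after this fix
    ```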
    
    ## Testing
    - `make format`
    - `make lint`
    - `make mypy`
    - `make tests`
    
    
    ------
    https://chatgpt.com/codex/tasks/task_i_68432f925b048324a16878d28e850841
    jhills20 authored Jun 9, 2025

    c98e234
  2. docs: custom output extraction (#817)

    In deep agent workflows, each sub‐agent automatically performs an LLM
    step to summarize its tool calls before returning to its parent. This
    leads to:
    1. Excessive latency: every nested agent invokes the LLM, compounding
    delays.
    2. Loss of raw tool data: summaries may strip out details the top‐level
    agent needs.
    
    We discovered that `Agent.as_tool(...)` already accepts an
    (undocumented) `custom_output_extractor` parameter. By providing a
    callback, a parent agent can override what the sub-agent returns, e.g.
    hand back raw tool outputs or a custom slice, so that only the final
    agent does summarization.
    
    ---
    
    This PR adds a “Custom output extraction” section to the Markdown docs
    under “Agents as tools,” with a minimal code example.
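
    A hedged sketch of the parameter in use (helper names here are assumptions):

    ```python
    from agents import Agent
    from agents.items import ItemHelpers

    research_agent = Agent(name="Researcher", instructions="Collect raw facts.")

    async def extract_raw_outputs(run_result) -> str:
        # Hand back the sub-agent's raw message text instead of another LLM summary.
        return ItemHelpers.text_message_outputs(run_result.new_items)

    research_tool = research_agent.as_tool(
        tool_name="researcher",
        tool_description="Run the research sub-agent and return its raw output.",
        custom_output_extractor=extract_raw_outputs,
    )
    ```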
    jleguina authored Jun 9, 2025

    dcb88e6
Showing with 13,926 additions and 2,639 deletions.
  1. +1 −0 .github/workflows/issues.yml
  2. +14 −0 .vscode/launch.json
  3. +71 −0 AGENTS.md
  4. +1 −0 CLAUDE.md
  5. +6 −2 Makefile
  6. +130 −5 README.md
  7. +1 −0 docs/agents.md
  8. +2 −1 docs/context.md
  9. +1 −1 docs/guardrails.md
  10. +2 −0 docs/index.md
  11. +22 −22 docs/ja/guardrails.md
  12. +13 −5 docs/ja/mcp.md
  13. +0 −106 docs/ja/models.md
  14. +73 −34 docs/ja/models/index.md
  15. +22 −0 docs/ja/repl.md
  16. +4 −0 docs/ja/tools.md
  17. +2 −1 docs/ja/tracing.md
  18. +132 −5 docs/mcp.md
  19. +16 −0 docs/models/index.md
  20. +8 −0 docs/ref/memory.md
  21. +6 −0 docs/ref/repl.md
  22. +24 −0 docs/release.md
  23. +19 −0 docs/repl.md
  24. +36 −1 docs/running_agents.md
  25. +43 −18 docs/scripts/translate_docs.py
  26. +319 −0 docs/sessions.md
  27. +32 −1 docs/tools.md
  28. +3 −0 docs/tracing.md
  29. +1 −1 examples/agent_patterns/input_guardrails.py
  30. +1 −1 examples/agent_patterns/llm_as_a_judge.py
  31. +2 −4 examples/basic/agent_lifecycle_example.py
  32. +45 −0 examples/basic/hello_world_jupyter.ipynb
  33. +0 −11 examples/basic/hello_world_jupyter.py
  34. +6 −8 examples/basic/lifecycle_example.py
  35. +79 −0 examples/basic/prompt_template.py
  36. +77 −0 examples/basic/session_example.py
  37. +1 −1 examples/financial_research_agent/main.py
  38. 0 examples/hosted_mcp/__init__.py
  39. +61 −0 examples/hosted_mcp/approvals.py
  40. +47 −0 examples/hosted_mcp/simple.py
  41. +29 −0 examples/mcp/prompt_server/README.md
  42. +110 −0 examples/mcp/prompt_server/main.py
  43. +37 −0 examples/mcp/prompt_server/server.py
  44. +13 −0 examples/mcp/streamablehttp_example/README.md
  45. +83 −0 examples/mcp/streamablehttp_example/main.py
  46. +33 −0 examples/mcp/streamablehttp_example/server.py
  47. +114 −0 examples/realtime/demo.py
  48. +271 −0 examples/realtime/ui.py
  49. +3 −0 examples/reasoning_content/__init__.py
  50. +124 −0 examples/reasoning_content/main.py
  51. +88 −0 examples/reasoning_content/runner_example.py
  52. +1 −1 examples/research_bot/agents/search_agent.py
  53. +34 −0 examples/tools/code_interpreter.py
  54. +54 −0 examples/tools/image_generator.py
  55. +7 −0 mkdocs.yml
  56. +7 −6 pyproject.toml
  57. +33 −2 src/agents/__init__.py
  58. +267 −11 src/agents/_run_impl.py
  59. +95 −29 src/agents/agent.py
  60. +1 −1 src/agents/agent_output.py
  61. +38 −5 src/agents/exceptions.py
  62. +36 −9 src/agents/extensions/models/litellm_model.py
  63. +35 −18 src/agents/extensions/visualization.py
  64. +9 −6 src/agents/function_schema.py
  65. +17 −3 src/agents/handoffs.py
  66. +57 −3 src/agents/items.py
  67. +26 −17 src/agents/lifecycle.py
  68. +17 −1 src/agents/mcp/__init__.py
  69. +291 −24 src/agents/mcp/server.py
  70. +90 −7 src/agents/mcp/util.py
  71. +3 −0 src/agents/memory/__init__.py
  72. +369 −0 src/agents/memory/session.py
  73. +89 −4 src/agents/model_settings.py
  74. +32 −3 src/agents/models/chatcmpl_converter.py
  75. +157 −19 src/agents/models/chatcmpl_stream_handler.py
  76. +6 −0 src/agents/models/interface.py
  77. +47 −15 src/agents/models/openai_chatcompletions.py
  78. +77 −19 src/agents/models/openai_responses.py
  79. +76 −0 src/agents/prompts.py
  80. +3 −0 src/agents/realtime/README.md
  81. +69 −0 src/agents/realtime/__init__.py
  82. +80 −0 src/agents/realtime/agent.py
  83. +125 −0 src/agents/realtime/config.py
  84. +215 −0 src/agents/realtime/events.py
  85. +100 −0 src/agents/realtime/items.py
  86. +100 −0 src/agents/realtime/model.py
  87. +150 −0 src/agents/realtime/model_events.py
  88. +410 −0 src/agents/realtime/openai_realtime.py
  89. +118 −0 src/agents/realtime/runner.py
  90. +368 −0 src/agents/realtime/session.py
  91. +62 −0 src/agents/repl.py
  92. +59 −13 src/agents/result.py
  93. +386 −109 src/agents/run.py
  94. +3 −0 src/agents/stream_events.py
  95. +162 −7 src/agents/tool.py
  96. +29 −0 src/agents/tool_context.py
  97. +10 −5 src/agents/tracing/__init__.py
  98. +16 −16 src/agents/tracing/create.py
  99. +1 −1 src/agents/tracing/processor_interface.py
  100. +29 −3 src/agents/tracing/processors.py
  101. +294 −0 src/agents/tracing/provider.py
  102. +13 −206 src/agents/tracing/setup.py
  103. +9 −10 src/agents/tracing/util.py
  104. +21 −1 src/agents/usage.py
  105. +12 −0 src/agents/util/_pretty_print.py
  106. +2 −0 src/agents/voice/__init__.py
  107. +4 −3 src/agents/voice/model.py
  108. +6 −0 src/agents/voice/pipeline.py
  109. +8 −0 src/agents/voice/workflow.py
  110. +8 −2 tests/conftest.py
  111. +23 −4 tests/fake_model.py
  112. +58 −5 tests/mcp/helpers.py
  113. +14 −8 tests/mcp/test_caching.py
  114. +39 −21 tests/mcp/test_mcp_tracing.py
  115. +14 −6 tests/mcp/test_mcp_util.py
  116. +301 −0 tests/mcp/test_prompt_server.py
  117. +7 −2 tests/mcp/test_server_errors.py
  118. +246 −0 tests/mcp/test_tool_filtering.py
  119. +176 −0 tests/model_settings/test_serialization.py
  120. +177 −0 tests/models/test_kwargs_functionality.py
  121. +301 −0 tests/models/test_litellm_chatcompletions_stream.py
  122. +44 −0 tests/models/test_litellm_extra_body.py
  123. +5 −4 tests/models/test_map.py
  124. 0 tests/realtime/__init__.py
  125. +25 −0 tests/realtime/test_agent.py
  126. +12 −0 tests/realtime/test_model_events.py
  127. +385 −0 tests/realtime/test_openai_realtime.py
  128. +224 −0 tests/realtime/test_runner.py
  129. +1,159 −0 tests/realtime/test_session.py
  130. +257 −0 tests/realtime/test_tracing.py
  131. +6 −5 tests/test_agent_config.py
  132. +97 −0 tests/test_agent_prompt.py
  133. +35 −0 tests/test_agent_runner.py
  134. +37 −0 tests/test_agent_runner_streamed.py
  135. +116 −0 tests/test_cancel_streaming.py
  136. +44 −1 tests/test_computer_action.py
  137. +100 −0 tests/test_extra_headers.py
  138. +12 −0 tests/test_function_schema.py
  139. +66 −17 tests/test_function_tool.py
  140. +5 −4 tests/test_function_tool_decorator.py
  141. +102 −8 tests/test_handoff_tool.py
  142. +11 −2 tests/test_items_helpers.py
  143. +53 −2 tests/test_openai_chatcompletions.py
  144. +17 −2 tests/test_openai_chatcompletions_stream.py
  145. +8 −8 tests/test_output_tool.py
  146. +285 −0 tests/test_reasoning_content.py
  147. +28 −0 tests/test_repl.py
  148. +4 −2 tests/test_responses.py
  149. +29 −1 tests/test_responses_tracing.py
  150. +2 −1 tests/test_result_cast.py
  151. +26 −0 tests/test_run.py
  152. +1 −1 tests/test_run_config.py
  153. +48 −0 tests/test_run_error_details.py
  154. +43 −4 tests/test_run_step_execution.py
  155. +31 −21 tests/test_run_step_processing.py
  156. +400 −0 tests/test_session.py
  157. +0 −4 tests/test_tracing_errors_streamed.py
  158. +52 −0 tests/test_usage.py
  159. +15 −0 tests/test_visualization.py
  160. +0 −1 tests/voice/conftest.py
  161. +8 −2 tests/voice/test_workflow.py
  162. +1,707 −1,696 uv.lock
1 change: 1 addition & 0 deletions .github/workflows/issues.yml
@@ -21,6 +21,7 @@ jobs:
days-before-pr-stale: 10
days-before-pr-close: 7
stale-pr-label: "stale"
exempt-issue-labels: "skip-stale"
stale-pr-message: "This PR is stale because it has been open for 10 days with no activity."
close-pr-message: "This PR was closed because it has been inactive for 7 days since being marked as stale."
repo-token: ${{ secrets.GITHUB_TOKEN }}
14 changes: 14 additions & 0 deletions .vscode/launch.json
@@ -0,0 +1,14 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "Python Debugger: Python File",
"type": "debugpy",
"request": "launch",
"program": "${file}"
}
]
}
71 changes: 71 additions & 0 deletions AGENTS.md
@@ -0,0 +1,71 @@
Welcome to the OpenAI Agents SDK repository. This file contains the main points for new contributors.

## Repository overview

- **Source code**: `src/agents/` contains the implementation.
- **Tests**: `tests/` with a short guide in `tests/README.md`.
- **Examples**: under `examples/`.
- **Documentation**: markdown pages live in `docs/` with `mkdocs.yml` controlling the site.
- **Utilities**: developer commands are defined in the `Makefile`.
- **PR template**: `.github/PULL_REQUEST_TEMPLATE/pull_request_template.md` describes the information every PR must include.

## Local workflow

1. Format, lint and type‑check your changes:

```bash
make format
make lint
make mypy
```

2. Run the tests:

```bash
make tests
```

To run a single test, use `uv run pytest -s -k <test_name>`.

3. Build the documentation (optional but recommended for docs changes):

```bash
make build-docs
```

Coverage can be generated with `make coverage`.

All python commands should be run via `uv run python ...`

## Snapshot tests

Some tests rely on inline snapshots. See `tests/README.md` for details on updating them:

```bash
make snapshots-fix # update existing snapshots
make snapshots-create # create new snapshots
```

Run `make tests` again after updating snapshots to ensure they pass.

## Style notes

- Write comments as full sentences and end them with a period.

## Pull request expectations

PRs should use the template located at `.github/PULL_REQUEST_TEMPLATE/pull_request_template.md`. Provide a summary, test plan and issue number if applicable, then check that:

- New tests are added when needed.
- Documentation is updated.
- `make lint` and `make format` have been run.
- The full test suite passes.

Commit messages should be concise and written in the imperative mood. Small, focused commits are preferred.

## What reviewers look for

- Tests covering new behaviour.
- Consistent style: code formatted with `uv run ruff format`, imports sorted, and type hints passing `uv run mypy .`.
- Clear documentation for any public API changes.
- Clean history and a helpful PR description.
1 change: 1 addition & 0 deletions CLAUDE.md
@@ -0,0 +1 @@
Read the AGENTS.md file for instructions.
8 changes: 6 additions & 2 deletions Makefile
@@ -7,6 +7,10 @@ format:
uv run ruff format
uv run ruff check --fix

.PHONY: format-check
format-check:
uv run ruff format --check

.PHONY: lint
lint:
uv run ruff check
@@ -55,5 +59,5 @@ serve-docs:
deploy-docs:
uv run mkdocs gh-deploy --force --verbose

.PHONY: check
check: format-check lint mypy tests
135 changes: 130 additions & 5 deletions README.md
@@ -4,27 +4,146 @@ The OpenAI Agents SDK is a lightweight yet powerful framework for building multi

<img src="https://cdn.openai.com/API/docs/images/orchestration.png" alt="Image of the Agents Tracing UI" style="max-height: 803px;">

> [!NOTE]
> Looking for the JavaScript/TypeScript version? Check out [Agents SDK JS/TS](https://github.com/openai/openai-agents-js).
### Core concepts:

1. [**Agents**](https://openai.github.io/openai-agents-python/agents): LLMs configured with instructions, tools, guardrails, and handoffs
2. [**Handoffs**](https://openai.github.io/openai-agents-python/handoffs/): A specialized tool call used by the Agents SDK for transferring control between agents
3. [**Guardrails**](https://openai.github.io/openai-agents-python/guardrails/): Configurable safety checks for input and output validation
4. [**Tracing**](https://openai.github.io/openai-agents-python/tracing/): Built-in tracking of agent runs, allowing you to view, debug and optimize your workflows
4. [**Sessions**](#sessions): Automatic conversation history management across agent runs
5. [**Tracing**](https://openai.github.io/openai-agents-python/tracing/): Built-in tracking of agent runs, allowing you to view, debug and optimize your workflows

Explore the [examples](examples) directory to see the SDK in action, and read our [documentation](https://openai.github.io/openai-agents-python/) for more details.

## Sessions

The Agents SDK provides built-in session memory to automatically maintain conversation history across multiple agent runs, eliminating the need to manually handle `.to_input_list()` between turns.

### Quick start

```python
from agents import Agent, Runner, SQLiteSession

# Create agent
agent = Agent(
name="Assistant",
instructions="Reply very concisely.",
)

# Create a session instance
session = SQLiteSession("conversation_123")

# First turn
result = await Runner.run(
agent,
"What city is the Golden Gate Bridge in?",
session=session
)
print(result.final_output) # "San Francisco"

# Second turn - agent automatically remembers previous context
result = await Runner.run(
agent,
"What state is it in?",
session=session
)
print(result.final_output) # "California"

# Also works with synchronous runner
result = Runner.run_sync(
agent,
"What's the population?",
session=session
)
print(result.final_output) # "Approximately 39 million"
```

### Session options

- **No memory** (default): No session memory when session parameter is omitted
- **`session: Session = DatabaseSession(...)`**: Use a Session instance to manage conversation history

```python
from agents import Agent, Runner, SQLiteSession

# Custom SQLite database file
session = SQLiteSession("user_123", "conversations.db")
agent = Agent(name="Assistant")

# Different session IDs maintain separate conversation histories
result1 = await Runner.run(
agent,
"Hello",
session=session
)
result2 = await Runner.run(
agent,
"Hello",
session=SQLiteSession("user_456", "conversations.db")
)
```

### Custom session implementations

You can implement your own session memory by creating a class that follows the `Session` protocol:

```python
from agents.memory import Session
from typing import List

class MyCustomSession:
"""Custom session implementation following the Session protocol."""

def __init__(self, session_id: str):
self.session_id = session_id
# Your initialization here

async def get_items(self, limit: int | None = None) -> List[dict]:
# Retrieve conversation history for the session
pass

async def add_items(self, items: List[dict]) -> None:
# Store new items for the session
pass

async def pop_item(self) -> dict | None:
# Remove and return the most recent item from the session
pass

async def clear_session(self) -> None:
# Clear all items for the session
pass

# Use your custom session
agent = Agent(name="Assistant")
result = await Runner.run(
agent,
"Hello",
session=MyCustomSession("my_session")
)
```

## Get started

1. Set up your Python environment

```
- Option A: Using venv (traditional method)
```bash
python -m venv env
source env/bin/activate
source env/bin/activate # On Windows: env\Scripts\activate
```

- Option B: Using uv (recommended)
```bash
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```

2. Install Agents SDK

```
```bash
pip install openai-agents
```

@@ -47,7 +166,7 @@ print(result.final_output)

(_If running this, ensure you set the `OPENAI_API_KEY` environment variable_)

(_For Jupyter notebook users, see [hello_world_jupyter.py](examples/basic/hello_world_jupyter.py)_)
(_For Jupyter notebook users, see [hello_world_jupyter.ipynb](examples/basic/hello_world_jupyter.ipynb)_)

## Handoffs example

@@ -160,10 +279,16 @@ make sync

2. (After making changes) lint/test

```
make check # run tests linter and typechecker
```

Or to run them individually:
```
make tests # run tests
make mypy # run typechecker
make lint # run linter
make format-check # run style checker
```

## Acknowledgements
1 change: 1 addition & 0 deletions docs/agents.md
@@ -6,6 +6,7 @@ Agents are the core building block in your apps. An agent is a large language mo

The most common properties of an agent you'll configure are:

- `name`: A required string that identifies your agent.
- `instructions`: also known as a developer message or system prompt.
- `model`: which LLM to use, and optional `model_settings` to configure model tuning parameters like temperature, top_p, etc.
- `tools`: Tools that the agent can use to achieve its tasks.
3 changes: 2 additions & 1 deletion docs/context.md
@@ -38,7 +38,8 @@ class UserInfo: # (1)!

@function_tool
async def fetch_user_age(wrapper: RunContextWrapper[UserInfo]) -> str: # (2)!
return f"User {wrapper.context.name} is 47 years old"
"""Fetch the age of the user. Call this function to get user's age information."""
return f"The user {wrapper.context.name} is 47 years old"

async def main():
user_info = UserInfo(name="John", uid=123)
2 changes: 1 addition & 1 deletion docs/guardrails.md
@@ -23,7 +23,7 @@ Input guardrails run in 3 steps:

Output guardrails run in 3 steps:

1. First, the guardrail receives the same input passed to the agent.
1. First, the guardrail receives the output produced by the agent.
2. Next, the guardrail function runs to produce a [`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput], which is then wrapped in an [`OutputGuardrailResult`][agents.guardrail.OutputGuardrailResult]
3. Finally, we check if [`.tripwire_triggered`][agents.guardrail.GuardrailFunctionOutput.tripwire_triggered] is true. If true, an [`OutputGuardrailTripwireTriggered`][agents.exceptions.OutputGuardrailTripwireTriggered] exception is raised, so you can appropriately respond to the user or handle the exception.

2 changes: 2 additions & 0 deletions docs/index.md
@@ -5,6 +5,7 @@ The [OpenAI Agents SDK](https://github.com/openai/openai-agents-python) enables
- **Agents**, which are LLMs equipped with instructions and tools
- **Handoffs**, which allow agents to delegate to other agents for specific tasks
- **Guardrails**, which enable the inputs to agents to be validated
- **Sessions**, which automatically maintains conversation history across agent runs

In combination with Python, these primitives are powerful enough to express complex relationships between tools and agents, and allow you to build real-world applications without a steep learning curve. In addition, the SDK comes with built-in **tracing** that lets you visualize and debug your agentic flows, as well as evaluate them and even fine-tune models for your application.

@@ -21,6 +22,7 @@ Here are the main features of the SDK:
- Python-first: Use built-in language features to orchestrate and chain agents, rather than needing to learn new abstractions.
- Handoffs: A powerful feature to coordinate and delegate between multiple agents.
- Guardrails: Run input validations and checks in parallel to your agents, breaking early if the checks fail.
- Sessions: Automatic conversation history management across agent runs, eliminating manual state handling.
- Function tools: Turn any Python function into a tool, with automatic schema generation and Pydantic-powered validation.
- Tracing: Built-in tracing that lets you visualize, debug and monitor your workflows, as well as use the OpenAI suite of evaluation, fine-tuning and distillation tools.

44 changes: 22 additions & 22 deletions docs/ja/guardrails.md
@@ -4,44 +4,44 @@ search:
---
# ガードレール

ガードレールは エージェント と _並列_ に実行され、 ユーザー入力 のチェックとバリデーションを行います。たとえば、顧客からのリクエストを支援するために非常に賢い (そのため遅く / 高価な) モデルを使うエージェントがあるとします。悪意のある ユーザー がモデルに数学の宿題を手伝わせようとするのは避けたいですよね。その場合、 高速 / 低コスト のモデルでガードレールを実行できます。ガードレールが悪意のある利用を検知した場合、即座にエラーを送出して高価なモデルの実行を停止し、時間と費用を節約できます
ガードレールは エージェント と _並行して_ 実行され、ユーザー入力のチェックとバリデーションを行えます。例えば、とても賢い(つまり遅く/高価な)モデルを使用してカスタマーリクエストを処理するエージェントがあるとします。悪意のある ユーザー がモデルに数学の宿題を手伝わせようとするのは避けたいでしょう。そこで、速く/安価なモデルで動くガードレールを実行できます。ガードレールが悪意のある利用を検知すると、直ちにエラーを送出して高価なモデルの実行を停止し、時間とコストを節約できます

ガードレールには 2 種類あります
ガードレールには 2 種類あります:

1. Input ガードレールは最初の ユーザー入力 に対して実行されます
2. Output ガードレールは最終的なエージェント出力に対して実行されます
1. 入力ガードレール は初期 ユーザー 入力に対して実行されます
2. 出力ガードレール は最終的なエージェント出力に対して実行されます

## Input ガードレール
## 入力ガードレール

Input ガードレールは 3 つのステップで実行されます。
入力ガードレールは 3 ステップで実行されます:

1. まず、ガードレールはエージェントに渡されたものと同じ入力を受け取ります。
2. 次に、ガードレール関数が実行され [`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput] を生成し、それが [`InputGuardrailResult`][agents.guardrail.InputGuardrailResult] でラップされます
3. 最後に [`.tripwire_triggered`][agents.guardrail.GuardrailFunctionOutput.tripwire_triggered] が true かどうかを確認します。true の場合、[`InputGuardrailTripwireTriggered`][agents.exceptions.InputGuardrailTripwireTriggered] 例外が送出されるので、 ユーザー への適切な応答や例外処理を行えます
2. 次に、ガードレール関数が実行され [`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput] を生成し、それが [`InputGuardrailResult`][agents.guardrail.InputGuardrailResult] にラップされます
3. 最後に [.tripwire_triggered][agents.guardrail.GuardrailFunctionOutput.tripwire_triggered] が true かどうかを確認します。true の場合、[`InputGuardrailTripwireTriggered`][agents.exceptions.InputGuardrailTripwireTriggered] 例外が送出されるので、適切に ユーザー に応答したり例外を処理できます

!!! Note

Input ガードレールは ユーザー入力 に対して実行されることを想定しているため、エージェントのガードレールが実行されるのはそのエージェントが *最初* のエージェントである場合だけです。「なぜ `guardrails` プロパティがエージェントにあり、 `Runner.run` に渡さないのか?」と思うかもしれません。ガードレールは実際の エージェント に密接に関連する場合が多く、エージェントごとに異なるガードレールを実行するため、コードを同じ場所に置くことで可読性が向上するからです
入力ガードレールは ユーザー 入力に対して実行されることを意図しているため、ガードレールは *最初* のエージェントでのみ実行されます。「なぜ `guardrails` プロパティがエージェントにあり、`Runner.run` に渡さないのか」と疑問に思うかもしれません。これは、ガードレールが実際の エージェント と密接に関連していることが多いからです。異なるエージェントには異なるガードレールを実行するため、コードを同じ場所に置くことで可読性が向上します

## Output ガードレール
## 出力ガードレール

Output ガードレールは 3 つのステップで実行されます。
出力ガードレールは 3 ステップで実行されます:

1. まず、ガードレールはエージェントに渡されたものと同じ入力を受け取ります
2. 次に、ガードレール関数が実行され [`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput] を生成し、それが [`OutputGuardrailResult`][agents.guardrail.OutputGuardrailResult] でラップされます
3. 最後に [`.tripwire_triggered`][agents.guardrail.GuardrailFunctionOutput.tripwire_triggered] が true かどうかを確認します。true の場合、[`OutputGuardrailTripwireTriggered`][agents.exceptions.OutputGuardrailTripwireTriggered] 例外が送出されるので、 ユーザー への適切な応答や例外処理を行えます
1. まず、ガードレールはエージェントが生成した出力を受け取ります
2. 次に、ガードレール関数が実行され [`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput] を生成し、それが [`OutputGuardrailResult`][agents.guardrail.OutputGuardrailResult] にラップされます
3. 最後に [.tripwire_triggered][agents.guardrail.GuardrailFunctionOutput.tripwire_triggered] が true かどうかを確認します。true の場合、[`OutputGuardrailTripwireTriggered`][agents.exceptions.OutputGuardrailTripwireTriggered] 例外が送出されるので、適切に ユーザー に応答したり例外を処理できます

!!! Note

Output ガードレールは最終的なエージェント出力に対して実行されることを想定しているため、エージェントのガードレールが実行されるのはそのエージェントが *最後* のエージェントである場合だけです。Input ガードレール同様、ガードレールは実際の エージェント に密接に関連するため、コードを同じ場所に置くことで可読性が向上します。
出力ガードレールは最終的なエージェント出力に対して実行されることを意図しているため、ガードレールは *最後* のエージェントでのみ実行されます。入力ガードレールの場合と同様、ガードレールが実際の エージェント と密接に関連していることが多いため、コードを同じ場所に置くことで可読性が向上します。

## トリップワイヤ
## トリップワイヤー

入力または出力がガードレールに失敗した場合、ガードレールはトリップワイヤを用いてそれを通知できます。ガードレールがトリップワイヤを発火したことを検知すると、ただちに `{Input,Output}GuardrailTripwireTriggered` 例外を送出してエージェントの実行を停止します
入力または出力がガードレールを通過できなかった場合、ガードレールはトリップワイヤーでそれを示すことができます。トリップワイヤーがトリガーされたガードレールを検知した時点で、直ちに `{Input,Output}GuardrailTripwireTriggered` 例外を送出し、エージェントの実行を停止します

## ガードレールの実装

入力を受け取り、[`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput] を返す関数を用意する必要があります。次の例では、内部で エージェント を実行してこれを行います。
入力を受け取り、[`GuardrailFunctionOutput`][agents.guardrail.GuardrailFunctionOutput] を返す関数を提供する必要があります。この例では、内部で エージェント を実行してこれを行います。

```python
from pydantic import BaseModel
@@ -94,12 +94,12 @@ async def main():
print("Math homework guardrail tripped")
```

1. この エージェント をガードレール関数内で使用します
2. これはエージェントの入力 / コンテキストを受け取り、結果を返すガードレール関数です。
1. このエージェントをガードレール関数内で使用します
2. これはエージェントの入力/コンテキストを受け取り、結果を返すガードレール関数です。
3. ガードレール結果に追加情報を含めることができます。
4. これはワークフローを定義する実際のエージェントです。

Output ガードレールも同様です
出力ガードレールも同様です

```python
from pydantic import BaseModel
@@ -155,4 +155,4 @@ async def main():
1. これは実際のエージェントの出力型です。
2. これはガードレールの出力型です。
3. これはエージェントの出力を受け取り、結果を返すガードレール関数です。
4. これはワークフローを定義する実際のエージェントです。
4. これはワークフローを定義する実際のエージェントです。
18 changes: 13 additions & 5 deletions docs/ja/mcp.md
@@ -12,23 +12,31 @@ Agents SDK は MCP をサポートしており、これにより幅広い MCP

## MCP サーバー

現在、MCP 仕様では使用するトランスポート方式に基づき 2 種類のサーバーが定義されています。
現在、MCP 仕様では使用するトランスポート方式に基づき 3 種類のサーバーが定義されています。

1. **stdio** サーバー: アプリケーションのサブプロセスとして実行されます。ローカルで動かすイメージです。
1. **stdio** サーバー: アプリケーションのサブプロセスとして実行されます。ローカルで動かすイメージです。
2. **HTTP over SSE** サーバー: リモートで動作し、 URL 経由で接続します。
3. **Streamable HTTP** サーバー: MCP 仕様に定義された Streamable HTTP トランスポートを使用してリモートで動作します。

これらのサーバーへは [`MCPServerStdio`][agents.mcp.server.MCPServerStdio][`MCPServerSse`][agents.mcp.server.MCPServerSse] クラスを使用して接続できます。
これらのサーバーへは [`MCPServerStdio`][agents.mcp.server.MCPServerStdio][`MCPServerSse`][agents.mcp.server.MCPServerSse][`MCPServerStreamableHttp`][agents.mcp.server.MCPServerStreamableHttp] クラスを使用して接続できます。

たとえば、[公式 MCP filesystem サーバー](https://www.npmjs.com/package/@modelcontextprotocol/server-filesystem)を利用する場合は次のようになります。

```python
from agents.run_context import RunContextWrapper

async with MCPServerStdio(
params={
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", samples_dir],
}
) as server:
tools = await server.list_tools()
# 注意:実際には通常は MCP サーバーをエージェントに追加し、
# フレームワークがツール一覧の取得を自動的に処理するようにします。
# list_tools() への直接呼び出しには run_context と agent パラメータが必要です。
run_context = RunContextWrapper(context=None)
agent = Agent(name="test", instructions="test")
tools = await server.list_tools(run_context, agent)
```

## MCP サーバーの利用
@@ -46,7 +54,7 @@ agent=Agent(

## キャッシュ

エージェントが実行されるたびに、MCP サーバーへ `list_tools()` が呼び出されます。サーバーがリモートの場合は特にレイテンシが発生します。ツール一覧を自動でキャッシュしたい場合は、[`MCPServerStdio`][agents.mcp.server.MCPServerStdio][`MCPServerSse`][agents.mcp.server.MCPServerSse] の両方に `cache_tools_list=True` を渡してください。ツール一覧が変更されないと確信できる場合のみ使用してください。
エージェントが実行されるたびに、MCP サーバーへ `list_tools()` が呼び出されます。サーバーがリモートの場合は特にレイテンシが発生します。ツール一覧を自動でキャッシュしたい場合は、[`MCPServerStdio`][agents.mcp.server.MCPServerStdio][`MCPServerSse`][agents.mcp.server.MCPServerSse][`MCPServerStreamableHttp`][agents.mcp.server.MCPServerStreamableHttp] の各クラスに `cache_tools_list=True` を渡してください。ツール一覧が変更されないと確信できる場合のみ使用してください。

キャッシュを無効化したい場合は、サーバーで `invalidate_tools_cache()` を呼び出します。

106 changes: 0 additions & 106 deletions docs/ja/models.md

This file was deleted.

107 changes: 73 additions & 34 deletions docs/ja/models/index.md
@@ -4,21 +4,52 @@ search:
---
# モデル

Agents SDK には、標準で 2 種類の OpenAI モデルサポートが含まれています
Agents SDK は OpenAI モデルを 2 つの形態で即利用できます

- **推奨**: [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel] — 新しい [Responses API](https://platform.openai.com/docs/api-reference/responses) を利用して OpenAI API を呼び出します。
- [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel][Chat Completions API](https://platform.openai.com/docs/api-reference/chat) を利用して OpenAI API を呼び出します。
- **推奨**: [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel] は、新しい [Responses API](https://platform.openai.com/docs/api-reference/responses) を使用して OpenAI API を呼び出します。
- [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel] は、[Chat Completions API](https://platform.openai.com/docs/api-reference/chat) を使用して OpenAI API を呼び出します。

## 非 OpenAI モデル

ほとんどの非 OpenAI モデルは [LiteLLM インテグレーション](./litellm.md) 経由で利用できます。まず、litellm 依存グループをインストールします:

```bash
pip install "openai-agents[litellm]"
```

次に、`litellm/` 接頭辞を付けて任意の [サポート対象モデル](https://docs.litellm.ai/docs/providers) を使用します:

```python
claude_agent = Agent(model="litellm/anthropic/claude-3-5-sonnet-20240620", ...)
gemini_agent = Agent(model="litellm/gemini/gemini-2.5-flash-preview-04-17", ...)
```

### 非 OpenAI モデルを利用するその他の方法

他の LLM プロバイダーを統合する方法は、あと 3 つあります([こちら](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/) に例があります)。

1. [`set_default_openai_client`][agents.set_default_openai_client]
`AsyncOpenAI` インスタンスを LLM クライアントとしてグローバルに使用したい場合に便利です。LLM プロバイダーが OpenAI 互換の API エンドポイントを持ち、`base_url``api_key` を設定できる場合に使用します。設定例は [examples/model_providers/custom_example_global.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_global.py) にあります。
2. [`ModelProvider`][agents.models.interface.ModelProvider]
`Runner.run` レベルでカスタムモデルプロバイダーを指定できます。これにより「この run のすべてのエージェントでカスタムプロバイダーを使う」と宣言できます。設定例は [examples/model_providers/custom_example_provider.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_provider.py) にあります。
3. [`Agent.model`][agents.agent.Agent.model]
特定のエージェントインスタンスにモデルを指定できます。エージェントごとに異なるプロバイダーを組み合わせることが可能です。設定例は [examples/model_providers/custom_example_agent.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_agent.py) にあります。ほとんどのモデルを簡単に利用する方法として [LiteLLM インテグレーション](./litellm.md) を利用できます。

`platform.openai.com` の API キーを持っていない場合は、`set_tracing_disabled()` でトレーシングを無効化するか、[別のトレーシングプロセッサー](../tracing.md) を設定することをお勧めします。

!!! note
これらの例では、Responses API をまだサポートしていない LLM プロバイダーが多いため、Chat Completions API/モデルを使用しています。LLM プロバイダーが Responses API をサポートしている場合は、Responses を使用することを推奨します。

## モデルの組み合わせ

1 つのワークフロー内で、エージェントごとに異なるモデルを使用したい場合があります。たとえば、振り分けには小さく高速なモデルを、複雑なタスクには大きく高性能なモデルを使う、といった使い分けです[`Agent`][agents.Agent] を設定する際は、以下のいずれかで特定のモデルを指定できます
1 つのワークフロー内でエージェントごとに異なるモデルを使用したい場合があります。たとえば、振り分けには小さく高速なモデルを、複雑なタスクには大きく高性能なモデルを使用するといったケースです[`Agent`][agents.Agent] を設定する際、次のいずれかの方法でモデルを選択できます

1. OpenAI モデル名を直接渡す
2. 任意のモデル名と、それを `Model` インスタンスへマッピングできる [`ModelProvider`][agents.models.interface.ModelProvider] を渡す
1. モデル名を直接指定する
2. 任意のモデル名と、その名前を Model インスタンスへマッピングできる [`ModelProvider`][agents.models.interface.ModelProvider] を指定する
3. [`Model`][agents.models.interface.Model] 実装を直接渡す

!!!note
SDK は [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel][`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel] の両方の形に対応していますが、ワークフローごとに 1 つのモデル形を使用することを推奨します。2 つの形ではサポートする機能・ツールが異なるためです。どうしても混在させる場合は、利用するすべての機能が両方で利用可能であることを確認してください
SDK は [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIResponsesModel][`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel] の両形態をサポートしていますが、各ワークフローで 1 つのモデル形態に統一することを推奨します。2 つの形態はサポートする機能とツールが異なるためです。混在させる場合は、使用する機能が双方で利用可能かを必ず確認してください

```python
from agents import Agent, Runner, AsyncOpenAI, OpenAIChatCompletionsModel
@@ -51,10 +82,10 @@ async def main():
print(result.final_output)
```

1. OpenAI モデル名を直接指定
1. OpenAI のモデル名を直接設定
2. [`Model`][agents.models.interface.Model] 実装を提供

エージェントで使用するモデルをさらに細かく設定したい場合は`temperature` などのオプションを指定できる [`ModelSettings`][agents.models.interface.ModelSettings] を渡します
エージェントで使用するモデルをさらに構成したい場合は`temperature` などのオプションパラメーターを指定できる [`ModelSettings`][agents.models.interface.ModelSettings] を渡せます

```python
from agents import Agent, ModelSettings
@@ -67,50 +98,58 @@ english_agent = Agent(
)
```

## 他の LLM プロバイダーの利用

他の LLM プロバイダーは 3 通りの方法で利用できます(コード例は [こちら](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/))。

1. [`set_default_openai_client`][agents.set_default_openai_client]
OpenAI 互換の API エンドポイントを持つ場合に、`AsyncOpenAI` インスタンスをグローバルに LLM クライアントとして設定できます。`base_url``api_key` を設定するケースです。設定例は [examples/model_providers/custom_example_global.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_global.py)
OpenAI の Responses API を使用する場合、`user``service_tier` など[その他のオプションパラメーター](https://platform.openai.com/docs/api-reference/responses/create) があります。トップレベルで指定できない場合は、`extra_args` で渡してください。

2. [`ModelProvider`][agents.models.interface.ModelProvider]
`Runner.run` レベルで「この実行中のすべてのエージェントにカスタムモデルプロバイダーを使う」と宣言できます。設定例は [examples/model_providers/custom_example_provider.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_provider.py)

3. [`Agent.model`][agents.agent.Agent.model]
特定の Agent インスタンスにモデルを指定できます。エージェントごとに異なるプロバイダーを組み合わせられます。設定例は [examples/model_providers/custom_example_agent.py](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/custom_example_agent.py)。多くのモデルを簡単に使う方法として [LiteLLM 連携](./litellm.md) があります。

`platform.openai.com` の API キーを持たない場合は、`set_tracing_disabled()` でトレーシングを無効化するか、[別のトレーシングプロセッサー](../tracing.md) を設定することを推奨します。

!!! note
    これらの例では Chat Completions API/モデルを使用しています。多くの LLM プロバイダーがまだ Responses API をサポートしていないためです。もしプロバイダーが Responses API をサポートしている場合は、Responses の使用を推奨します。

```python
from agents import Agent, ModelSettings

english_agent = Agent(
    name="English agent",
    instructions="You only speak English",
    model="gpt-4o",
    model_settings=ModelSettings(
        temperature=0.1,
        extra_args={"service_tier": "flex", "user": "user_12345"},
    ),
)
```

## 他の LLM プロバイダーでよくある問題
## 他の LLM プロバイダー使用時の一般的な問題

### Tracing クライアントの 401 エラー

トレースは OpenAI サーバーへアップロードされるため、OpenAI API キーがない場合にエラーになります。解決策は次の 3 つです。
Tracing 関連のエラーが発生する場合、トレースは OpenAI サーバーへアップロードされるため、OpenAI API キーが必要です。対応方法は次の 3 つです。

1. トレーシングを完全に無効化する: [`set_tracing_disabled(True)`][agents.set_tracing_disabled]
2. トレーシング用の OpenAI キーを設定する: [`set_tracing_export_api_key(...)`][agents.set_tracing_export_api_key]
このキーはトレースのアップロードにのみ使用され[platform.openai.com](https://platform.openai.com/) のものが必要です
3. OpenAI 以外のトレースプロセッサーを使う。詳しくは [tracing ドキュメント](../tracing.md#custom-tracing-processors) を参照してください。
2. トレース用に OpenAI キーを設定する: [`set_tracing_export_api_key(...)`][agents.set_tracing_export_api_key]
この API キーはトレースのアップロードのみに使用され[platform.openai.com](https://platform.openai.com/) で取得したものが必要です
3. OpenAI 以外のトレースプロセッサーを使用する。詳細は [tracing のドキュメント](../tracing.md#custom-tracing-processors) を参照してください。

### Responses API サポート
### Responses API のサポート

SDK は既定で Responses API を使用しますが、多くの LLM プロバイダーはまだ対応していません。そのため 404 などのエラーが発生する場合があります。対処方法は 2 つです。
SDK はデフォルトで Responses API を使用しますが、ほとんどの LLM プロバイダーはまだ非対応です。その結果、404 などのエラーが発生することがあります。対処方法は次の 2 つです。

1. [`set_default_openai_api("chat_completions")`][agents.set_default_openai_api] を呼び出す
環境変数 `OPENAI_API_KEY``OPENAI_BASE_URL` を設定している場合に機能します
`OPENAI_API_KEY``OPENAI_BASE_URL` を環境変数で設定している場合に有効です
2. [`OpenAIChatCompletionsModel`][agents.models.openai_chatcompletions.OpenAIChatCompletionsModel] を使用する
コード例は [こちら](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/) にあります。
例は [こちら](https://github.com/openai/openai-agents-python/tree/main/examples/model_providers/) にあります。

### structured outputs のサポート

一部のモデルプロバイダーは [structured outputs](https://platform.openai.com/docs/guides/structured-outputs) をサポートしていません。その場合、次のようなエラーが発生することがあります。

```
BadRequestError: Error code: 400 - {'error': {'message': "'response_format.type' : value is not one of the allowed values ['text','json_object']", 'type': 'invalid_request_error'}}
```

これは一部プロバイダーの制限で、JSON 出力はサポートしていても `json_schema` を指定できません。現在修正に取り組んでいますが、JSON スキーマ出力をサポートしているプロバイダーを利用することを推奨します。そうでない場合、不正な JSON によりアプリが頻繁に壊れる可能性があります。
これは一部プロバイダーの制限で、JSON 出力自体はサポートしていても `json_schema` を指定できないことが原因です。修正に向けて取り組んでいますが、JSON スキーマ出力をサポートしているプロバイダーを使用することをお勧めします。そうでないと、不正な JSON が返されてアプリが頻繁に壊れる可能性があります。

## プロバイダーを跨いだモデルの組み合わせ

モデルプロバイダーごとの機能差に注意しないと、エラーが発生します。たとえば OpenAI は structured outputs、マルチモーダル入力、ホスト型の file search や web search をサポートしていますが、多くの他プロバイダーは非対応です。以下の制限に留意してください。

- 対応していないプロバイダーには未サポートの `tools` を送らない
- テキストのみのモデルを呼び出す前にマルチモーダル入力を除外する
- structured JSON 出力をサポートしていないプロバイダーでは、不正な JSON が返ることがある点に注意する
22 changes: 22 additions & 0 deletions docs/ja/repl.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
---
search:
exclude: true
---
# REPL ユーティリティ

`run_demo_loop` を使うと、ターミナルから手軽にエージェントを試せます。

```python
import asyncio
from agents import Agent, run_demo_loop

async def main() -> None:
agent = Agent(name="Assistant", instructions="あなたは親切なアシスタントです")
await run_demo_loop(agent)

if __name__ == "__main__":
asyncio.run(main())
```

`run_demo_loop` は入力を繰り返し受け取り、会話履歴を保持したままエージェントを実行します。既定ではストリーミング出力を表示します。
`quit` または `exit` と入力するか `Ctrl-D` を押すと終了します。
4 changes: 4 additions & 0 deletions docs/ja/tools.md
Original file line number Diff line number Diff line change
@@ -17,6 +17,10 @@ OpenAI は [`OpenAIResponsesModel`][agents.models.openai_responses.OpenAIRespons
- [`WebSearchTool`][agents.tool.WebSearchTool] はエージェントに Web 検索を行わせます。
- [`FileSearchTool`][agents.tool.FileSearchTool] は OpenAI ベクトルストアから情報を取得します。
- [`ComputerTool`][agents.tool.ComputerTool] はコンピュータ操作タスクを自動化します。
- [`CodeInterpreterTool`][agents.tool.CodeInterpreterTool] はサンドボックス環境でコードを実行します。
- [`HostedMCPTool`][agents.tool.HostedMCPTool] はリモート MCP サーバーのツールをモデルから直接利用できるようにします。
- [`ImageGenerationTool`][agents.tool.ImageGenerationTool] はプロンプトから画像を生成します。
- [`LocalShellTool`][agents.tool.LocalShellTool] はローカルマシンでシェルコマンドを実行します。

```python
from agents import Agent, FileSearchTool, Runner, WebSearchTool
3 changes: 2 additions & 1 deletion docs/ja/tracing.md
Original file line number Diff line number Diff line change
@@ -119,4 +119,5 @@ async def main():
- [Comet Opik](https://www.comet.com/docs/opik/tracing/integrations/openai_agents)
- [Langfuse](https://langfuse.com/docs/integrations/openaiagentssdk/openai-agents)
- [Langtrace](https://docs.langtrace.ai/supported-integrations/llm-frameworks/openai-agents-sdk)
- [Okahu‑Monocle](https://github.com/monocle2ai/monocle)
- [Okahu‑Monocle](https://github.com/monocle2ai/monocle)
- [Portkey AI](https://portkey.ai/docs/integrations/agents/openai-agents)
137 changes: 132 additions & 5 deletions docs/mcp.md
Original file line number Diff line number Diff line change
@@ -4,27 +4,35 @@ The [Model context protocol](https://modelcontextprotocol.io/introduction) (aka

> MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP like a USB-C port for AI applications. Just as USB-C provides a standardized way to connect your devices to various peripherals and accessories, MCP provides a standardized way to connect AI models to different data sources and tools.
The Agents SDK has support for MCP. This enables you to use a wide range of MCP servers to provide tools to your Agents.
The Agents SDK has support for MCP. This enables you to use a wide range of MCP servers to provide tools and prompts to your Agents.

## MCP servers

Currently, the MCP spec defines two kinds of servers, based on the transport mechanism they use:
Currently, the MCP spec defines three kinds of servers, based on the transport mechanism they use:

1. **stdio** servers run as a subprocess of your application. You can think of them as running "locally".
2. **HTTP over SSE** servers run remotely. You connect to them via a URL.
3. **Streamable HTTP** servers run remotely using the Streamable HTTP transport defined in the MCP spec.

You can use the [`MCPServerStdio`][agents.mcp.server.MCPServerStdio] and [`MCPServerSse`][agents.mcp.server.MCPServerSse] classes to connect to these servers.
You can use the [`MCPServerStdio`][agents.mcp.server.MCPServerStdio], [`MCPServerSse`][agents.mcp.server.MCPServerSse], and [`MCPServerStreamableHttp`][agents.mcp.server.MCPServerStreamableHttp] classes to connect to these servers.

For example, this is how you'd use the [official MCP filesystem server](https://www.npmjs.com/package/@modelcontextprotocol/server-filesystem).

```python
from agents.run_context import RunContextWrapper

async with MCPServerStdio(
params={
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", samples_dir],
}
) as server:
tools = await server.list_tools()
# Note: In practice, you typically add the server to an Agent
# and let the framework handle tool listing automatically.
# Direct calls to list_tools() require run_context and agent parameters.
run_context = RunContextWrapper(context=None)
agent = Agent(name="test", instructions="test")
tools = await server.list_tools(run_context, agent)
```

## Using MCP servers
@@ -40,9 +48,128 @@ agent=Agent(
)
```

## Tool filtering

You can filter which tools are available to your Agent by configuring tool filters on MCP servers. The SDK supports both static and dynamic tool filtering.

### Static tool filtering

For simple allow/block lists, you can use static filtering:

```python
from agents.mcp import create_static_tool_filter

# Only expose specific tools from this server
server = MCPServerStdio(
params={
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", samples_dir],
},
tool_filter=create_static_tool_filter(
allowed_tool_names=["read_file", "write_file"]
)
)

# Exclude specific tools from this server
server = MCPServerStdio(
params={
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", samples_dir],
},
tool_filter=create_static_tool_filter(
blocked_tool_names=["delete_file"]
)
)

```

**When both `allowed_tool_names` and `blocked_tool_names` are configured, the processing order is:**
1. First apply `allowed_tool_names` (allowlist) - only keep the specified tools
2. Then apply `blocked_tool_names` (blocklist) - exclude specified tools from the remaining tools

For example, if you configure `allowed_tool_names=["read_file", "write_file", "delete_file"]` and `blocked_tool_names=["delete_file"]`, only `read_file` and `write_file` tools will be available.
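
As a rough sketch of that combined configuration (reusing the filesystem server and the `samples_dir` placeholder from the examples above, and assuming both parameters can be passed to `create_static_tool_filter` together):

```python
from agents.mcp import MCPServerStdio, create_static_tool_filter

# The allowlist keeps read_file, write_file and delete_file; the blocklist then removes
# delete_file, so the agent ends up seeing only read_file and write_file.
server = MCPServerStdio(
    params={
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", samples_dir],
    },
    tool_filter=create_static_tool_filter(
        allowed_tool_names=["read_file", "write_file", "delete_file"],
        blocked_tool_names=["delete_file"],
    ),
)
```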

### Dynamic tool filtering

For more complex filtering logic, you can use dynamic filters with functions:

```python
from agents.mcp import ToolFilterContext

# Simple synchronous filter
def custom_filter(context: ToolFilterContext, tool) -> bool:
"""Example of a custom tool filter."""
# Filter logic based on tool name patterns
return tool.name.startswith("allowed_prefix")

# Context-aware filter
def context_aware_filter(context: ToolFilterContext, tool) -> bool:
"""Filter tools based on context information."""
# Access agent information
agent_name = context.agent.name

# Access server information
server_name = context.server_name

# Implement your custom filtering logic here
return some_filtering_logic(agent_name, server_name, tool)

# Asynchronous filter
async def async_filter(context: ToolFilterContext, tool) -> bool:
"""Example of an asynchronous filter."""
# Perform async operations if needed
result = await some_async_check(context, tool)
return result

server = MCPServerStdio(
params={
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-filesystem", samples_dir],
},
tool_filter=custom_filter # or context_aware_filter or async_filter
)
```

The `ToolFilterContext` provides access to:
- `run_context`: The current run context
- `agent`: The agent requesting the tools
- `server_name`: The name of the MCP server

## Prompts

MCP servers can also provide prompts that can be used to dynamically generate agent instructions. This allows you to create reusable instruction templates that can be customized with parameters.

### Using prompts

MCP servers that support prompts provide two key methods:

- `list_prompts()`: Lists all available prompts on the server
- `get_prompt(name, arguments)`: Gets a specific prompt with optional parameters

```python
# List available prompts
prompts_result = await server.list_prompts()
for prompt in prompts_result.prompts:
print(f"Prompt: {prompt.name} - {prompt.description}")

# Get a specific prompt with parameters
prompt_result = await server.get_prompt(
"generate_code_review_instructions",
{"focus": "security vulnerabilities", "language": "python"}
)
instructions = prompt_result.messages[0].content.text

# Use the prompt-generated instructions with an Agent
agent = Agent(
name="Code Reviewer",
instructions=instructions, # Instructions from MCP prompt
mcp_servers=[server]
)
```

## Caching

Every time an Agent runs, it calls `list_tools()` on the MCP server. This can be a latency hit, especially if the server is a remote server. To automatically cache the list of tools, you can pass `cache_tools_list=True` to both [`MCPServerStdio`][agents.mcp.server.MCPServerStdio] and [`MCPServerSse`][agents.mcp.server.MCPServerSse]. You should only do this if you're certain the tool list will not change.
Every time an Agent runs, it calls `list_tools()` on the MCP server. This can be a latency hit, especially if the server is a remote server. To automatically cache the list of tools, you can pass `cache_tools_list=True` to [`MCPServerStdio`][agents.mcp.server.MCPServerStdio], [`MCPServerSse`][agents.mcp.server.MCPServerSse], and [`MCPServerStreamableHttp`][agents.mcp.server.MCPServerStreamableHttp]. You should only do this if you're certain the tool list will not change.

If you want to invalidate the cache, you can call `invalidate_tools_cache()` on the servers.
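
A minimal sketch of opting into the cache and invalidating it later (reusing the filesystem server and the `samples_dir` placeholder from the earlier examples):

```python
from agents.mcp import MCPServerStdio

# Cache the tool list so repeated agent runs don't re-query the server.
server = MCPServerStdio(
    params={
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", samples_dir],
    },
    cache_tools_list=True,
)

# If the server's tools change later, drop the cached list so the next run re-fetches it.
server.invalidate_tools_cache()
```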

16 changes: 16 additions & 0 deletions docs/models/index.md
Original file line number Diff line number Diff line change
@@ -93,6 +93,22 @@ english_agent = Agent(
)
```

Also, when you use OpenAI's Responses API, [there are a few other optional parameters](https://platform.openai.com/docs/api-reference/responses/create) (e.g., `user`, `service_tier`, and so on). If they are not available at the top level, you can use `extra_args` to pass them as well.

```python
from agents import Agent, ModelSettings

english_agent = Agent(
name="English agent",
instructions="You only speak English",
model="gpt-4o",
model_settings=ModelSettings(
temperature=0.1,
extra_args={"service_tier": "flex", "user": "user_12345"},
),
)
```

## Common issues with using other LLM providers

### Tracing client error 401
8 changes: 8 additions & 0 deletions docs/ref/memory.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,8 @@
# Memory

::: agents.memory

options:
members:
- Session
- SQLiteSession
6 changes: 6 additions & 0 deletions docs/ref/repl.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
# `repl`

::: agents.repl
options:
members:
- run_demo_loop
24 changes: 24 additions & 0 deletions docs/release.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
# Release process/changelog

The project follows a slightly modified version of semantic versioning using the form `0.Y.Z`. The leading `0` indicates the SDK is still evolving rapidly. Increment the components as follows:

## Minor (`Y`) versions

We will increase minor versions `Y` for **breaking changes** to any public interfaces that are not marked as beta. For example, going from `0.0.x` to `0.1.x` might include breaking changes.

If you don't want breaking changes, we recommend pinning to `0.0.x` versions in your project.
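
For example, a pin of this shape (the exact version number is only illustrative) accepts patch releases while excluding `0.1.x` and later:

```bash
pip install "openai-agents>=0.0.16,<0.1.0"
```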

## Patch (`Z`) versions

We will increment `Z` for non-breaking changes:

- Bug fixes
- New features
- Changes to private interfaces
- Updates to beta features

## Breaking change changelog

### 0.1.0

In this version, [`MCPServer.list_tools()`][agents.mcp.server.MCPServer] has two new params: `run_context` and `agent`. You'll need to add these params to any classes that subclass `MCPServer`.
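
A rough sketch of what that looks like for a subclass (a minimal illustration, not a complete implementation; the other abstract methods of `MCPServer` are omitted):

```python
from agents.mcp import MCPServer


class MyMCPServer(MCPServer):
    # Before 0.1.0 the signature was:
    #     async def list_tools(self):
    # From 0.1.0 on, accept the two new parameters as well:
    async def list_tools(self, run_context, agent):
        # Use run_context / agent here if your server filters tools per run.
        ...
```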
19 changes: 19 additions & 0 deletions docs/repl.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
# REPL utility

The SDK provides `run_demo_loop` for quick interactive testing.

```python
import asyncio
from agents import Agent, run_demo_loop

async def main() -> None:
agent = Agent(name="Assistant", instructions="You are a helpful assistant.")
await run_demo_loop(agent)

if __name__ == "__main__":
asyncio.run(main())
```

`run_demo_loop` prompts for user input in a loop, keeping the conversation
history between turns. By default it streams model output as it is produced.
Type `quit` or `exit` (or press `Ctrl-D`) to leave the loop.
37 changes: 36 additions & 1 deletion docs/running_agents.md
Original file line number Diff line number Diff line change
@@ -65,7 +65,9 @@ Calling any of the run methods can result in one or more agents running (and hen

At the end of the agent run, you can choose what to show to the user. For example, you might show the user every new item generated by the agents, or just the final output. Either way, the user might then ask a followup question, in which case you can call the run method again.

You can use the base [`RunResultBase.to_input_list()`][agents.result.RunResultBase.to_input_list] method to get the inputs for the next turn.
### Manual conversation management

You can manually manage conversation history using the [`RunResultBase.to_input_list()`][agents.result.RunResultBase.to_input_list] method to get the inputs for the next turn:

```python
async def main():
@@ -84,6 +86,39 @@ async def main():
# California
```

### Automatic conversation management with Sessions

For a simpler approach, you can use [Sessions](sessions.md) to automatically handle conversation history without manually calling `.to_input_list()`:

```python
from agents import Agent, Runner, SQLiteSession, trace

async def main():
agent = Agent(name="Assistant", instructions="Reply very concisely.")

# Create session instance
session = SQLiteSession("conversation_123")

    with trace(workflow_name="Conversation", group_id="thread_123"):
# First turn
result = await Runner.run(agent, "What city is the Golden Gate Bridge in?", session=session)
print(result.final_output)
# San Francisco

# Second turn - agent automatically remembers previous context
result = await Runner.run(agent, "What state is it in?", session=session)
print(result.final_output)
# California
```

Sessions automatically:

- Retrieve conversation history before each run
- Store new messages after each run
- Maintain separate conversations for different session IDs

See the [Sessions documentation](sessions.md) for more details.

## Exceptions

The SDK raises exceptions in certain cases. The full list is in [`agents.exceptions`][]. As an overview:
61 changes: 43 additions & 18 deletions docs/scripts/translate_docs.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
# ruff: noqa
import os
import sys
import argparse
from openai import OpenAI
from concurrent.futures import ThreadPoolExecutor

@@ -263,24 +265,47 @@ def translate_single_source_file(file_path: str) -> None:


def main():
# Traverse the source directory
for root, _, file_names in os.walk(source_dir):
# Skip the target directories
if any(lang in root for lang in languages):
continue
# Increasing this will make the translation faster; you can decide considering the model's capacity
concurrency = 6
with ThreadPoolExecutor(max_workers=concurrency) as executor:
futures = []
for file_name in file_names:
filepath = os.path.join(root, file_name)
futures.append(executor.submit(translate_single_source_file, filepath))
if len(futures) >= concurrency:
for future in futures:
future.result()
futures.clear()

print("Translation completed.")
parser = argparse.ArgumentParser(description="Translate documentation files")
parser.add_argument(
"--file", type=str, help="Specific file to translate (relative to docs directory)"
)
args = parser.parse_args()

if args.file:
# Translate a single file
# Handle both "foo.md" and "docs/foo.md" formats
if args.file.startswith("docs/"):
# Remove "docs/" prefix if present
relative_file = args.file[5:]
else:
relative_file = args.file

file_path = os.path.join(source_dir, relative_file)
if os.path.exists(file_path):
translate_single_source_file(file_path)
print(f"Translation completed for {relative_file}")
else:
print(f"Error: File {file_path} does not exist")
sys.exit(1)
else:
# Traverse the source directory (original behavior)
for root, _, file_names in os.walk(source_dir):
# Skip the target directories
if any(lang in root for lang in languages):
continue
# Increasing this will make the translation faster; you can decide considering the model's capacity
concurrency = 6
with ThreadPoolExecutor(max_workers=concurrency) as executor:
futures = []
for file_name in file_names:
filepath = os.path.join(root, file_name)
futures.append(executor.submit(translate_single_source_file, filepath))
if len(futures) >= concurrency:
for future in futures:
future.result()
futures.clear()

print("Translation completed.")


if __name__ == "__main__":
319 changes: 319 additions & 0 deletions docs/sessions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,319 @@
# Sessions

The Agents SDK provides built-in session memory to automatically maintain conversation history across multiple agent runs, eliminating the need to manually handle `.to_input_list()` between turns.

Session memory stores conversation history for a specific session, allowing agents to maintain context without requiring explicit manual memory management. This is particularly useful for building chat applications or multi-turn conversations where you want the agent to remember previous interactions.

## Quick start

```python
from agents import Agent, Runner, SQLiteSession

# Create agent
agent = Agent(
name="Assistant",
instructions="Reply very concisely.",
)

# Create a session instance with a session ID
session = SQLiteSession("conversation_123")

# First turn
result = await Runner.run(
agent,
"What city is the Golden Gate Bridge in?",
session=session
)
print(result.final_output) # "San Francisco"

# Second turn - agent automatically remembers previous context
result = await Runner.run(
agent,
"What state is it in?",
session=session
)
print(result.final_output) # "California"

# Also works with synchronous runner
result = Runner.run_sync(
agent,
"What's the population?",
session=session
)
print(result.final_output) # "Approximately 39 million"
```

## How it works

When session memory is enabled:

1. **Before each run**: The runner automatically retrieves the conversation history for the session and prepends it to the input items.
2. **After each run**: All new items generated during the run (user input, assistant responses, tool calls, etc.) are automatically stored in the session.
3. **Context preservation**: Each subsequent run with the same session includes the full conversation history, allowing the agent to maintain context.

This eliminates the need to manually call `.to_input_list()` and manage conversation state between runs.
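
Conceptually, a turn with a session behaves roughly like the following sketch (continuing the quick start example above; this is an approximation, not the actual runner internals):

```python
user_message = {"role": "user", "content": "What state is it in?"}

history = await session.get_items()                         # retrieve prior turns
result = await Runner.run(agent, history + [user_message])   # run with the full context
# store everything new from this turn (the user message plus the agent's new items)
await session.add_items(result.to_input_list()[len(history):])
```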

## Memory operations

### Basic operations

Sessions supports several operations for managing conversation history:

```python
from agents import SQLiteSession

session = SQLiteSession("user_123", "conversations.db")

# Get all items in a session
items = await session.get_items()

# Add new items to a session
new_items = [
{"role": "user", "content": "Hello"},
{"role": "assistant", "content": "Hi there!"}
]
await session.add_items(new_items)

# Remove and return the most recent item
last_item = await session.pop_item()
print(last_item) # {"role": "assistant", "content": "Hi there!"}

# Clear all items from a session
await session.clear_session()
```

### Using pop_item for corrections

The `pop_item` method is particularly useful when you want to undo or modify the last item in a conversation:

```python
from agents import Agent, Runner, SQLiteSession

agent = Agent(name="Assistant")
session = SQLiteSession("correction_example")

# Initial conversation
result = await Runner.run(
agent,
"What's 2 + 2?",
session=session
)
print(f"Agent: {result.final_output}")

# User wants to correct their question
assistant_item = await session.pop_item()  # Remove agent's response
user_item = await session.pop_item()  # Remove user's question

# Ask a corrected question
result = await Runner.run(
agent,
"What's 2 + 3?",
session=session
)
print(f"Agent: {result.final_output}")
```

## Memory options

### No memory (default)

```python
# Default behavior - no session memory
result = await Runner.run(agent, "Hello")
```

### SQLite memory

```python
from agents import SQLiteSession

# In-memory database (lost when process ends)
session = SQLiteSession("user_123")

# Persistent file-based database
session = SQLiteSession("user_123", "conversations.db")

# Use the session
result = await Runner.run(
agent,
"Hello",
session=session
)
```

### Multiple sessions

```python
from agents import Agent, Runner, SQLiteSession

agent = Agent(name="Assistant")

# Different sessions maintain separate conversation histories
session_1 = SQLiteSession("user_123", "conversations.db")
session_2 = SQLiteSession("user_456", "conversations.db")

result1 = await Runner.run(
agent,
"Hello",
session=session_1
)
result2 = await Runner.run(
agent,
"Hello",
session=session_2
)
```

## Custom memory implementations

You can implement your own session memory by creating a class that follows the [`Session`][agents.memory.session.Session] protocol:

```python
from agents.memory import Session
from typing import List

class MyCustomSession:
"""Custom session implementation following the Session protocol."""

def __init__(self, session_id: str):
self.session_id = session_id
# Your initialization here

async def get_items(self, limit: int | None = None) -> List[dict]:
"""Retrieve conversation history for this session."""
# Your implementation here
pass

async def add_items(self, items: List[dict]) -> None:
"""Store new items for this session."""
# Your implementation here
pass

async def pop_item(self) -> dict | None:
"""Remove and return the most recent item from this session."""
# Your implementation here
pass

async def clear_session(self) -> None:
"""Clear all items for this session."""
# Your implementation here
pass

# Use your custom session
agent = Agent(name="Assistant")
result = await Runner.run(
agent,
"Hello",
session=MyCustomSession("my_session")
)
```

## Session management

### Session ID naming

Use meaningful session IDs that help you organize conversations:

- User-based: `"user_12345"`
- Thread-based: `"thread_abc123"`
- Context-based: `"support_ticket_456"`

### Memory persistence

- Use in-memory SQLite (`SQLiteSession("session_id")`) for temporary conversations
- Use file-based SQLite (`SQLiteSession("session_id", "path/to/db.sqlite")`) for persistent conversations
- Consider implementing custom session backends for production systems (Redis, PostgreSQL, etc.); a rough sketch of a Redis-backed session follows below
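
As an illustration only, a Redis-backed session following the same protocol might look roughly like this (the `redis` package, key naming, and JSON encoding are all assumptions of this sketch):

```python
import json
from typing import List

import redis.asyncio as redis  # assumed dependency: the redis-py package


class RedisSession:
    """Illustrative Redis-backed session following the Session protocol."""

    def __init__(self, session_id: str, url: str = "redis://localhost:6379/0"):
        self.session_id = session_id
        self._key = f"session:{session_id}"
        self._client = redis.from_url(url)

    async def get_items(self, limit: int | None = None) -> List[dict]:
        start = -limit if limit else 0
        raw = await self._client.lrange(self._key, start, -1)
        return [json.loads(item) for item in raw]

    async def add_items(self, items: List[dict]) -> None:
        if items:
            await self._client.rpush(self._key, *(json.dumps(item) for item in items))

    async def pop_item(self) -> dict | None:
        raw = await self._client.rpop(self._key)
        return json.loads(raw) if raw else None

    async def clear_session(self) -> None:
        await self._client.delete(self._key)
```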

### Session management

```python
# Clear a session when conversation should start fresh
await session.clear_session()

# Different agents can share the same session
support_agent = Agent(name="Support")
billing_agent = Agent(name="Billing")
session = SQLiteSession("user_123")

# Both agents will see the same conversation history
result1 = await Runner.run(
support_agent,
"Help me with my account",
session=session
)
result2 = await Runner.run(
billing_agent,
"What are my charges?",
session=session
)
```

## Complete example

Here's a complete example showing session memory in action:

```python
import asyncio
from agents import Agent, Runner, SQLiteSession


async def main():
# Create an agent
agent = Agent(
name="Assistant",
instructions="Reply very concisely.",
)

# Create a session instance that will persist across runs
session = SQLiteSession("conversation_123", "conversation_history.db")

print("=== Sessions Example ===")
print("The agent will remember previous messages automatically.\n")

# First turn
print("First turn:")
print("User: What city is the Golden Gate Bridge in?")
result = await Runner.run(
agent,
"What city is the Golden Gate Bridge in?",
session=session
)
print(f"Assistant: {result.final_output}")
print()

# Second turn - the agent will remember the previous conversation
print("Second turn:")
print("User: What state is it in?")
result = await Runner.run(
agent,
"What state is it in?",
session=session
)
print(f"Assistant: {result.final_output}")
print()

# Third turn - continuing the conversation
print("Third turn:")
print("User: What's the population of that state?")
result = await Runner.run(
agent,
"What's the population of that state?",
session=session
)
print(f"Assistant: {result.final_output}")
print()

print("=== Conversation Complete ===")
print("Notice how the agent remembered the context from previous turns!")
print("Sessions automatically handles conversation history.")


if __name__ == "__main__":
asyncio.run(main())
```

## API Reference

For detailed API documentation, see:

- [`Session`][agents.memory.Session] - Protocol interface
- [`SQLiteSession`][agents.memory.SQLiteSession] - SQLite implementation
33 changes: 32 additions & 1 deletion docs/tools.md
Original file line number Diff line number Diff line change
@@ -13,6 +13,10 @@ OpenAI offers a few built-in tools when using the [`OpenAIResponsesModel`][agent
- The [`WebSearchTool`][agents.tool.WebSearchTool] lets an agent search the web.
- The [`FileSearchTool`][agents.tool.FileSearchTool] allows retrieving information from your OpenAI Vector Stores.
- The [`ComputerTool`][agents.tool.ComputerTool] allows automating computer use tasks.
- The [`CodeInterpreterTool`][agents.tool.CodeInterpreterTool] lets the LLM execute code in a sandboxed environment.
- The [`HostedMCPTool`][agents.tool.HostedMCPTool] exposes a remote MCP server's tools to the model.
- The [`ImageGenerationTool`][agents.tool.ImageGenerationTool] generates images from a prompt.
- The [`LocalShellTool`][agents.tool.LocalShellTool] runs shell commands on your machine.

```python
from agents import Agent, FileSearchTool, Runner, WebSearchTool
@@ -266,7 +270,7 @@ The `agent.as_tool` function is a convenience method to make it easy to turn an
```python
@function_tool
async def run_my_agent() -> str:
"""A tool that runs the agent with custom configs".
"""A tool that runs the agent with custom configs"""

agent = Agent(name="My agent", instructions="...")

@@ -280,6 +284,33 @@ async def run_my_agent() -> str:
return str(result.final_output)
```

### Custom output extraction

In certain cases, you might want to modify the output of the tool-agents before returning it to the central agent. This may be useful if you want to:

- Extract a specific piece of information (e.g., a JSON payload) from the sub-agent's chat history.
- Convert or reformat the agent’s final answer (e.g., transform Markdown into plain text or CSV).
- Validate the output or provide a fallback value when the agent’s response is missing or malformed.

You can do this by supplying the `custom_output_extractor` argument to the `as_tool` method:

```python
async def extract_json_payload(run_result: RunResult) -> str:
# Scan the agent’s outputs in reverse order until we find a JSON-like message from a tool call.
for item in reversed(run_result.new_items):
if isinstance(item, ToolCallOutputItem) and item.output.strip().startswith("{"):
return item.output.strip()
# Fallback to an empty JSON object if nothing was found
return "{}"


json_tool = data_agent.as_tool(
tool_name="get_data_json",
tool_description="Run the data agent and return only its JSON payload",
custom_output_extractor=extract_json_payload,
)
```

## Handling errors in function tools

When you create a function tool via `@function_tool`, you can pass a `failure_error_function`. This is a function that provides an error response to the LLM in case the tool call crashes.
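
A minimal sketch, assuming the handler receives the run context and the raised exception and returns the string shown to the LLM (the tool itself is just an illustration):

```python
from agents import RunContextWrapper, function_tool


def friendly_error(ctx: RunContextWrapper, error: Exception) -> str:
    # The returned string is what the LLM sees instead of a raw traceback.
    return f"The tool failed: {error}. Please try again with different arguments."


@function_tool(failure_error_function=friendly_error)
def divide(a: float, b: float) -> float:
    """Divide a by b."""
    return a / b
```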
3 changes: 3 additions & 0 deletions docs/tracing.md
Original file line number Diff line number Diff line change
@@ -101,6 +101,7 @@ To customize this default setup, to send traces to alternative or additional bac

- [Weights & Biases](https://weave-docs.wandb.ai/guides/integrations/openai_agents)
- [Arize-Phoenix](https://docs.arize.com/phoenix/tracing/integrations-tracing/openai-agents-sdk)
- [Future AGI](https://docs.futureagi.com/future-agi/products/observability/auto-instrumentation/openai_agents)
- [MLflow (self-hosted/OSS)](https://mlflow.org/docs/latest/tracing/integrations/openai-agent)
- [MLflow (Databricks hosted)](https://docs.databricks.com/aws/en/mlflow/mlflow-tracing#-automatic-tracing)
- [Braintrust](https://braintrust.dev/docs/guides/traces/integrations#openai-agents-sdk)
@@ -114,3 +115,5 @@ To customize this default setup, to send traces to alternative or additional bac
- [Langfuse](https://langfuse.com/docs/integrations/openaiagentssdk/openai-agents)
- [Langtrace](https://docs.langtrace.ai/supported-integrations/llm-frameworks/openai-agents-sdk)
- [Okahu-Monocle](https://github.com/monocle2ai/monocle)
- [Galileo](https://v2docs.galileo.ai/integrations/openai-agent-integration#openai-agent-integration)
- [Portkey AI](https://portkey.ai/docs/integrations/agents/openai-agents)
2 changes: 1 addition & 1 deletion examples/agent_patterns/input_guardrails.py
Original file line number Diff line number Diff line change
@@ -20,7 +20,7 @@
Guardrails are checks that run in parallel to the agent's execution.
They can be used to do things like:
- Check if input messages are off-topic
- Check that output messages don't violate any policies
- Check that input messages don't violate any policies
- Take over control of the agent's execution if an unexpected input is detected
In this example, we'll setup an input guardrail that trips if the user is asking to do math homework.
2 changes: 1 addition & 1 deletion examples/agent_patterns/llm_as_a_judge.py
Original file line number Diff line number Diff line change
@@ -32,7 +32,7 @@ class EvaluationFeedback:
instructions=(
"You evaluate a story outline and decide if it's good enough."
"If it's not good enough, you provide feedback on what needs to be improved."
"Never give it a pass on the first try."
"Never give it a pass on the first try. After 5 attempts, you can give it a pass if story outline is good enough - do not go for perfection"
),
output_type=EvaluationFeedback,
)
6 changes: 2 additions & 4 deletions examples/basic/agent_lifecycle_example.py
Original file line number Diff line number Diff line change
@@ -101,12 +101,10 @@ async def main() -> None:
### (Start Agent) 1: Agent Start Agent started
### (Start Agent) 2: Agent Start Agent started tool random_number
### (Start Agent) 3: Agent Start Agent ended tool random_number with result 37
### (Start Agent) 4: Agent Start Agent started
### (Start Agent) 5: Agent Start Agent handed off to Multiply Agent
### (Start Agent) 4: Agent Start Agent handed off to Multiply Agent
### (Multiply Agent) 1: Agent Multiply Agent started
### (Multiply Agent) 2: Agent Multiply Agent started tool multiply_by_two
### (Multiply Agent) 3: Agent Multiply Agent ended tool multiply_by_two with result 74
### (Multiply Agent) 4: Agent Multiply Agent started
### (Multiply Agent) 5: Agent Multiply Agent ended with output number=74
### (Multiply Agent) 4: Agent Multiply Agent ended with output number=74
Done!
"""
45 changes: 45 additions & 0 deletions examples/basic/hello_world_jupyter.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"id": "8a77ee2e-22f2-409c-837d-b994978b0aa2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"A function calls self, \n",
"Unraveling layers deep, \n",
"Base case ends the quest. \n",
"\n",
"Infinite loops lurk, \n",
"Mind the base condition well, \n",
"Or it will not work. \n",
"\n",
"Trees and lists unfold, \n",
"Elegant solutions bloom, \n",
"Recursion's art told.\n"
]
}
],
"source": [
"from agents import Agent, Runner\n",
"\n",
"agent = Agent(name=\"Assistant\", instructions=\"You are a helpful assistant\")\n",
"\n",
"# Intended for Jupyter notebooks where there's an existing event loop\n",
"result = await Runner.run(agent, \"Write a haiku about recursion in programming.\") # type: ignore[top-level-await] # noqa: F704\n",
"print(result.final_output)"
]
}
],
"metadata": {
"language_info": {
"name": "python"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
11 changes: 0 additions & 11 deletions examples/basic/hello_world_jupyter.py

This file was deleted.

14 changes: 6 additions & 8 deletions examples/basic/lifecycle_example.py
Original file line number Diff line number Diff line change
@@ -105,14 +105,12 @@ async def main() -> None:
Enter a max number: 250
### 1: Agent Start Agent started. Usage: 0 requests, 0 input tokens, 0 output tokens, 0 total tokens
### 2: Tool random_number started. Usage: 1 requests, 148 input tokens, 15 output tokens, 163 total tokens
### 3: Tool random_number ended with result 101. Usage: 1 requests, 148 input tokens, 15 output tokens, 163 total tokens
### 4: Agent Start Agent started. Usage: 1 requests, 148 input tokens, 15 output tokens, 163 total tokens
### 5: Handoff from Start Agent to Multiply Agent. Usage: 2 requests, 323 input tokens, 30 output tokens, 353 total tokens
### 6: Agent Multiply Agent started. Usage: 2 requests, 323 input tokens, 30 output tokens, 353 total tokens
### 7: Tool multiply_by_two started. Usage: 3 requests, 504 input tokens, 46 output tokens, 550 total tokens
### 8: Tool multiply_by_two ended with result 202. Usage: 3 requests, 504 input tokens, 46 output tokens, 550 total tokens
### 9: Agent Multiply Agent started. Usage: 3 requests, 504 input tokens, 46 output tokens, 550 total tokens
### 10: Agent Multiply Agent ended with output number=202. Usage: 4 requests, 714 input tokens, 63 output tokens, 777 total tokens
### 3: Tool random_number ended with result 101. Usage: 1 requests, 148 input tokens, 15 output tokens, 163 total tokens
### 4: Handoff from Start Agent to Multiply Agent. Usage: 2 requests, 323 input tokens, 30 output tokens, 353 total tokens
### 5: Agent Multiply Agent started. Usage: 2 requests, 323 input tokens, 30 output tokens, 353 total tokens
### 6: Tool multiply_by_two started. Usage: 3 requests, 504 input tokens, 46 output tokens, 550 total tokens
### 7: Tool multiply_by_two ended with result 202. Usage: 3 requests, 504 input tokens, 46 output tokens, 550 total tokens
### 8: Agent Multiply Agent ended with output number=202. Usage: 4 requests, 714 input tokens, 63 output tokens, 777 total tokens
Done!
"""
79 changes: 79 additions & 0 deletions examples/basic/prompt_template.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
import argparse
import asyncio
import random

from agents import Agent, GenerateDynamicPromptData, Runner

"""
NOTE: This example will not work out of the box, because the default prompt ID will not be available
in your project.
To use it, please:
1. Go to https://platform.openai.com/playground/prompts
2. Create a new prompt variable, `poem_style`.
3. Create a system prompt with the content:
```
Write a poem in {{poem_style}}
```
4. Run the example with the `--prompt-id` flag.
"""

DEFAULT_PROMPT_ID = "pmpt_6850729e8ba481939fd439e058c69ee004afaa19c520b78b"


class DynamicContext:
def __init__(self, prompt_id: str):
self.prompt_id = prompt_id
self.poem_style = random.choice(["limerick", "haiku", "ballad"])
print(f"[debug] DynamicContext initialized with poem_style: {self.poem_style}")


async def _get_dynamic_prompt(data: GenerateDynamicPromptData):
ctx: DynamicContext = data.context.context
return {
"id": ctx.prompt_id,
"version": "1",
"variables": {
"poem_style": ctx.poem_style,
},
}


async def dynamic_prompt(prompt_id: str):
context = DynamicContext(prompt_id)

agent = Agent(
name="Assistant",
prompt=_get_dynamic_prompt,
)

result = await Runner.run(agent, "Tell me about recursion in programming.", context=context)
print(result.final_output)


async def static_prompt(prompt_id: str):
agent = Agent(
name="Assistant",
prompt={
"id": prompt_id,
"version": "1",
"variables": {
"poem_style": "limerick",
},
},
)

result = await Runner.run(agent, "Tell me about recursion in programming.")
print(result.final_output)


if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--dynamic", action="store_true")
parser.add_argument("--prompt-id", type=str, default=DEFAULT_PROMPT_ID)
args = parser.parse_args()

if args.dynamic:
asyncio.run(dynamic_prompt(args.prompt_id))
else:
asyncio.run(static_prompt(args.prompt_id))
77 changes: 77 additions & 0 deletions examples/basic/session_example.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
"""
Example demonstrating session memory functionality.
This example shows how to use session memory to maintain conversation history
across multiple agent runs without manually handling .to_input_list().
"""

import asyncio

from agents import Agent, Runner, SQLiteSession


async def main():
# Create an agent
agent = Agent(
name="Assistant",
instructions="Reply very concisely.",
)

# Create a session instance that will persist across runs
session_id = "conversation_123"
session = SQLiteSession(session_id)

print("=== Session Example ===")
print("The agent will remember previous messages automatically.\n")

# First turn
print("First turn:")
print("User: What city is the Golden Gate Bridge in?")
result = await Runner.run(
agent,
"What city is the Golden Gate Bridge in?",
session=session,
)
print(f"Assistant: {result.final_output}")
print()

# Second turn - the agent will remember the previous conversation
print("Second turn:")
print("User: What state is it in?")
result = await Runner.run(agent, "What state is it in?", session=session)
print(f"Assistant: {result.final_output}")
print()

# Third turn - continuing the conversation
print("Third turn:")
print("User: What's the population of that state?")
result = await Runner.run(
agent,
"What's the population of that state?",
session=session,
)
print(f"Assistant: {result.final_output}")
print()

print("=== Conversation Complete ===")
print("Notice how the agent remembered the context from previous turns!")
print("Sessions automatically handles conversation history.")

# Demonstrate the limit parameter - get only the latest 2 items
print("\n=== Latest Items Demo ===")
latest_items = await session.get_items(limit=2)
print("Latest 2 items:")
for i, msg in enumerate(latest_items, 1):
role = msg.get("role", "unknown")
content = msg.get("content", "")
print(f" {i}. {role}: {content}")

print(f"\nFetched {len(latest_items)} out of total conversation history.")

# Get all items to show the difference
all_items = await session.get_items()
print(f"Total items in session: {len(all_items)}")


if __name__ == "__main__":
asyncio.run(main())
2 changes: 1 addition & 1 deletion examples/financial_research_agent/main.py
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@


# Entrypoint for the financial bot example.
# Run this as `python -m examples.financial_bot.main` and enter a
# Run this as `python -m examples.financial_research_agent.main` and enter a
# financial research query, for example:
# "Write up an analysis of Apple Inc.'s most recent quarter."
async def main() -> None:
Empty file added examples/hosted_mcp/__init__.py
Empty file.
61 changes: 61 additions & 0 deletions examples/hosted_mcp/approvals.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
import argparse
import asyncio

from agents import (
Agent,
HostedMCPTool,
MCPToolApprovalFunctionResult,
MCPToolApprovalRequest,
Runner,
)

"""This example demonstrates how to use the hosted MCP support in the OpenAI Responses API, with
approval callbacks."""


def approval_callback(request: MCPToolApprovalRequest) -> MCPToolApprovalFunctionResult:
answer = input(f"Approve running the tool `{request.data.name}`? (y/n) ")
result: MCPToolApprovalFunctionResult = {"approve": answer == "y"}
if not result["approve"]:
result["reason"] = "User denied"
return result


async def main(verbose: bool, stream: bool):
agent = Agent(
name="Assistant",
tools=[
HostedMCPTool(
tool_config={
"type": "mcp",
"server_label": "gitmcp",
"server_url": "https://gitmcp.io/openai/codex",
"require_approval": "always",
},
on_approval_request=approval_callback,
)
],
)

if stream:
result = Runner.run_streamed(agent, "Which language is this repo written in?")
async for event in result.stream_events():
if event.type == "run_item_stream_event":
print(f"Got event of type {event.item.__class__.__name__}")
print(f"Done streaming; final result: {result.final_output}")
else:
res = await Runner.run(agent, "Which language is this repo written in?")
print(res.final_output)

    if verbose and not stream:
        for item in res.new_items:
            print(item)


if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", action="store_true", default=False)
parser.add_argument("--stream", action="store_true", default=False)
args = parser.parse_args()

asyncio.run(main(args.verbose, args.stream))
47 changes: 47 additions & 0 deletions examples/hosted_mcp/simple.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
import argparse
import asyncio

from agents import Agent, HostedMCPTool, Runner

"""This example demonstrates how to use the hosted MCP support in the OpenAI Responses API, with
approvals not required for any tools. You should only use this for trusted MCP servers."""


async def main(verbose: bool, stream: bool):
agent = Agent(
name="Assistant",
tools=[
HostedMCPTool(
tool_config={
"type": "mcp",
"server_label": "gitmcp",
"server_url": "https://gitmcp.io/openai/codex",
"require_approval": "never",
}
)
],
)

if stream:
result = Runner.run_streamed(agent, "Which language is this repo written in?")
async for event in result.stream_events():
if event.type == "run_item_stream_event":
print(f"Got event of type {event.item.__class__.__name__}")
print(f"Done streaming; final result: {result.final_output}")
else:
res = await Runner.run(agent, "Which language is this repo written in?")
print(res.final_output)
# The repository is primarily written in multiple languages, including Rust and TypeScript...

    if verbose and not stream:
        for item in res.new_items:
            print(item)


if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", action="store_true", default=False)
parser.add_argument("--stream", action="store_true", default=False)
args = parser.parse_args()

asyncio.run(main(args.verbose, args.stream))
29 changes: 29 additions & 0 deletions examples/mcp/prompt_server/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
# MCP Prompt Server Example

This example uses a local MCP prompt server in [server.py](server.py).

Run the example via:

```
uv run python examples/mcp/prompt_server/main.py
```

## Details

The example uses the `MCPServerStreamableHttp` class from `agents.mcp`. The server runs in a sub-process at `http://localhost:8000/mcp` and provides user-controlled prompts that generate agent instructions.

The server exposes prompts like `generate_code_review_instructions` that take parameters such as focus area and programming language. The agent calls these prompts to dynamically generate its system instructions based on user-provided parameters.

## Workflow

The example demonstrates two key functions:

1. **`show_available_prompts`** - Lists all available prompts on the MCP server, showing users what prompts they can select from. This demonstrates the discovery aspect of MCP prompts.

2. **`demo_code_review`** - Shows the complete user-controlled prompt workflow:
- Calls `generate_code_review_instructions` with specific parameters (focus: "security vulnerabilities", language: "python")
- Uses the generated instructions to create an Agent with specialized code review capabilities
- Runs the agent against vulnerable sample code (command injection via `os.system`)
- The agent analyzes the code and provides security-focused feedback using available tools

This pattern allows users to dynamically configure agent behavior through MCP prompts rather than hardcoded instructions.
110 changes: 110 additions & 0 deletions examples/mcp/prompt_server/main.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,110 @@
import asyncio
import os
import shutil
import subprocess
import time
from typing import Any

from agents import Agent, Runner, gen_trace_id, trace
from agents.mcp import MCPServer, MCPServerStreamableHttp
from agents.model_settings import ModelSettings


async def get_instructions_from_prompt(mcp_server: MCPServer, prompt_name: str, **kwargs) -> str:
"""Get agent instructions by calling MCP prompt endpoint (user-controlled)"""
print(f"Getting instructions from prompt: {prompt_name}")

try:
prompt_result = await mcp_server.get_prompt(prompt_name, kwargs)
content = prompt_result.messages[0].content
if hasattr(content, 'text'):
instructions = content.text
else:
instructions = str(content)
print("Generated instructions")
return instructions
except Exception as e:
print(f"Failed to get instructions: {e}")
return f"You are a helpful assistant. Error: {e}"


async def demo_code_review(mcp_server: MCPServer):
"""Demo: Code review with user-selected prompt"""
print("=== CODE REVIEW DEMO ===")

# User explicitly selects prompt and parameters
instructions = await get_instructions_from_prompt(
mcp_server,
"generate_code_review_instructions",
focus="security vulnerabilities",
language="python",
)

agent = Agent(
name="Code Reviewer Agent",
instructions=instructions, # Instructions from MCP prompt
model_settings=ModelSettings(tool_choice="auto"),
)

message = """Please review this code:
def process_user_input(user_input):
command = f"echo {user_input}"
os.system(command)
return "Command executed"
"""

print(f"Running: {message[:60]}...")
result = await Runner.run(starting_agent=agent, input=message)
print(result.final_output)
print("\n" + "=" * 50 + "\n")


async def show_available_prompts(mcp_server: MCPServer):
"""Show available prompts for user selection"""
print("=== AVAILABLE PROMPTS ===")

prompts_result = await mcp_server.list_prompts()
print("User can select from these prompts:")
for i, prompt in enumerate(prompts_result.prompts, 1):
print(f" {i}. {prompt.name} - {prompt.description}")
print()


async def main():
async with MCPServerStreamableHttp(
name="Simple Prompt Server",
params={"url": "http://localhost:8000/mcp"},
) as server:
trace_id = gen_trace_id()
with trace(workflow_name="Simple Prompt Demo", trace_id=trace_id):
print(f"Trace: https://platform.openai.com/traces/trace?trace_id={trace_id}\n")

await show_available_prompts(server)
await demo_code_review(server)


if __name__ == "__main__":
if not shutil.which("uv"):
raise RuntimeError("uv is not installed")

process: subprocess.Popen[Any] | None = None
try:
this_dir = os.path.dirname(os.path.abspath(__file__))
server_file = os.path.join(this_dir, "server.py")

print("Starting Simple Prompt Server...")
process = subprocess.Popen(["uv", "run", server_file])
time.sleep(3)
print("Server started\n")
except Exception as e:
print(f"Error starting server: {e}")
exit(1)

try:
asyncio.run(main())
finally:
if process:
process.terminate()
print("Server terminated.")
37 changes: 37 additions & 0 deletions examples/mcp/prompt_server/server.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
from mcp.server.fastmcp import FastMCP

# Create server
mcp = FastMCP("Prompt Server")


# Instruction-generating prompts (user-controlled)
@mcp.prompt()
def generate_code_review_instructions(
focus: str = "general code quality", language: str = "python"
) -> str:
"""Generate agent instructions for code review tasks"""
print(f"[debug-server] generate_code_review_instructions({focus}, {language})")

return f"""You are a senior {language} code review specialist. Your role is to provide comprehensive code analysis with focus on {focus}.
INSTRUCTIONS:
- Analyze code for quality, security, performance, and best practices
- Provide specific, actionable feedback with examples
- Identify potential bugs, vulnerabilities, and optimization opportunities
- Suggest improvements with code examples when applicable
- Be constructive and educational in your feedback
- Focus particularly on {focus} aspects
RESPONSE FORMAT:
1. Overall Assessment
2. Specific Issues Found
3. Security Considerations
4. Performance Notes
5. Recommended Improvements
6. Best Practices Suggestions
Use the available tools to check current time if you need timestamps for your analysis."""


if __name__ == "__main__":
mcp.run(transport="streamable-http")
13 changes: 13 additions & 0 deletions examples/mcp/streamablehttp_example/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,13 @@
# MCP Streamable HTTP Example

This example uses a local Streamable HTTP server in [server.py](server.py).

Run the example via:

```
uv run python examples/mcp/streamablehttp_example/main.py
```

## Details

The example uses the `MCPServerStreamableHttp` class from `agents.mcp`. The server runs in a sub-process at `http://localhost:8000/mcp`.
83 changes: 83 additions & 0 deletions examples/mcp/streamablehttp_example/main.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
import asyncio
import os
import shutil
import subprocess
import time
from typing import Any

from agents import Agent, Runner, gen_trace_id, trace
from agents.mcp import MCPServer, MCPServerStreamableHttp
from agents.model_settings import ModelSettings


async def run(mcp_server: MCPServer):
agent = Agent(
name="Assistant",
instructions="Use the tools to answer the questions.",
mcp_servers=[mcp_server],
model_settings=ModelSettings(tool_choice="required"),
)

# Use the `add` tool to add two numbers
message = "Add these numbers: 7 and 22."
print(f"Running: {message}")
result = await Runner.run(starting_agent=agent, input=message)
print(result.final_output)

# Run the `get_weather` tool
message = "What's the weather in Tokyo?"
print(f"\n\nRunning: {message}")
result = await Runner.run(starting_agent=agent, input=message)
print(result.final_output)

# Run the `get_secret_word` tool
message = "What's the secret word?"
print(f"\n\nRunning: {message}")
result = await Runner.run(starting_agent=agent, input=message)
print(result.final_output)


async def main():
async with MCPServerStreamableHttp(
name="Streamable HTTP Python Server",
params={
"url": "http://localhost:8000/mcp",
},
) as server:
trace_id = gen_trace_id()
with trace(workflow_name="Streamable HTTP Example", trace_id=trace_id):
print(f"View trace: https://platform.openai.com/traces/trace?trace_id={trace_id}\n")
await run(server)


if __name__ == "__main__":
# Let's make sure the user has uv installed
if not shutil.which("uv"):
raise RuntimeError(
"uv is not installed. Please install it: https://docs.astral.sh/uv/getting-started/installation/"
)

# We'll run the Streamable HTTP server in a subprocess. Usually this would be a remote server, but for this
# demo, we'll run it locally at http://localhost:8000/mcp
process: subprocess.Popen[Any] | None = None
try:
this_dir = os.path.dirname(os.path.abspath(__file__))
server_file = os.path.join(this_dir, "server.py")

print("Starting Streamable HTTP server at http://localhost:8000/mcp ...")

# Run `uv run server.py` to start the Streamable HTTP server
process = subprocess.Popen(["uv", "run", server_file])
# Give it 3 seconds to start
time.sleep(3)

print("Streamable HTTP server started. Running example...\n\n")
except Exception as e:
print(f"Error starting Streamable HTTP server: {e}")
exit(1)

try:
asyncio.run(main())
finally:
if process:
process.terminate()
33 changes: 33 additions & 0 deletions examples/mcp/streamablehttp_example/server.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,33 @@
import random

import requests
from mcp.server.fastmcp import FastMCP

# Create server
mcp = FastMCP("Echo Server")


@mcp.tool()
def add(a: int, b: int) -> int:
"""Add two numbers"""
print(f"[debug-server] add({a}, {b})")
return a + b


@mcp.tool()
def get_secret_word() -> str:
print("[debug-server] get_secret_word()")
return random.choice(["apple", "banana", "cherry"])


@mcp.tool()
def get_current_weather(city: str) -> str:
print(f"[debug-server] get_current_weather({city})")

endpoint = "https://wttr.in"
response = requests.get(f"{endpoint}/{city}")
return response.text


if __name__ == "__main__":
mcp.run(transport="streamable-http")
114 changes: 114 additions & 0 deletions examples/realtime/demo.py
@@ -0,0 +1,114 @@
import asyncio
import os
import sys
from typing import TYPE_CHECKING

import numpy as np

from agents.realtime import RealtimeSession

# Add the current directory to path so we can import ui
sys.path.append(os.path.dirname(os.path.abspath(__file__)))

from agents import function_tool
from agents.realtime import RealtimeAgent, RealtimeRunner, RealtimeSessionEvent

if TYPE_CHECKING:
from .ui import AppUI
else:
# Try both import styles
try:
# Try relative import first (when used as a package)
from .ui import AppUI
except ImportError:
# Fall back to direct import (when run as a script)
from ui import AppUI


@function_tool
def get_weather(city: str) -> str:
"""Get the weather in a city."""
return f"The weather in {city} is sunny."


agent = RealtimeAgent(
name="Assistant",
instructions="You always greet the user with 'Top of the morning to you'.",
tools=[get_weather],
)


class Example:
def __init__(self) -> None:
self.ui = AppUI()
self.ui.connected = asyncio.Event()
self.ui.last_audio_item_id = None
# Set the audio callback
self.ui.set_audio_callback(self.on_audio_recorded)

self.session: RealtimeSession | None = None

async def run(self) -> None:
# Start UI in a separate task instead of waiting for it to complete
ui_task = asyncio.create_task(self.ui.run_async())

# Set up session immediately without waiting for UI to finish
runner = RealtimeRunner(agent)
async with await runner.run() as session:
self.session = session
self.ui.set_is_connected(True)
async for event in session:
await self.on_event(event)

# Wait for UI task to complete when session ends
await ui_task

async def on_audio_recorded(self, audio_bytes: bytes) -> None:
"""Called when audio is recorded by the UI."""
try:
# Send the audio to the session
assert self.session is not None
await self.session.send_audio(audio_bytes)
except Exception as e:
self.ui.log_message(f"Error sending audio: {e}")

async def on_event(self, event: RealtimeSessionEvent) -> None:
# Display event in the UI
try:
if event.type == "agent_start":
self.ui.add_transcript(f"Agent started: {event.agent.name}")
elif event.type == "agent_end":
self.ui.add_transcript(f"Agent ended: {event.agent.name}")
elif event.type == "handoff":
self.ui.add_transcript(
f"Handoff from {event.from_agent.name} to {event.to_agent.name}"
)
elif event.type == "tool_start":
self.ui.add_transcript(f"Tool started: {event.tool.name}")
elif event.type == "tool_end":
self.ui.add_transcript(f"Tool ended: {event.tool.name}; output: {event.output}")
elif event.type == "audio_end":
self.ui.add_transcript("Audio ended")
elif event.type == "audio":
np_audio = np.frombuffer(event.audio.data, dtype=np.int16)
self.ui.play_audio(np_audio)
elif event.type == "audio_interrupted":
self.ui.add_transcript("Audio interrupted")
elif event.type == "error":
self.ui.add_transcript(f"Error: {event.error}")
elif event.type == "history_updated":
pass
elif event.type == "history_added":
pass
elif event.type == "raw_model_event":
self.ui.log_message(f"Raw model event: {event.data}")
else:
self.ui.log_message(f"Unknown event type: {event.type}")
except Exception as e:
# This can happen if the UI has already exited
self.ui.log_message(f"Event handling error: {str(e)}")


if __name__ == "__main__":
example = Example()
asyncio.run(example.run())