
Commit 907029a

Merge branch 'master' into langchain-ruff-arg
2 parents 9c56fee + 8b8d90b

File tree

55 files changed: +807 −181 lines changed


.github/copilot-instructions.md

Lines changed: 151 additions & 0 deletions
### 1. Avoid Breaking Changes (Stable Public Interfaces)

* Carefully preserve **function signatures**, argument positions, and names for any exported/public methods.
* Be cautious when **renaming**, **removing**, or **reordering** arguments — even small changes can break downstream consumers.
* Use keyword-only arguments or clearly mark experimental features to isolate unstable APIs.

Bad:

```python
def get_user(id, verbose=False):  # Changed from `user_id`
```

Good:

```python
def get_user(user_id: str, verbose: bool = False):  # Maintains stable interface
```

🧠 *Ask yourself:* “Would this change break someone's code if they used it last week?”
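To make the keyword-only bullet concrete, here is a minimal sketch (a hypothetical variant of the `get_user` example above) of isolating option-style parameters so later reordering cannot break positional callers:

```python
def get_user(user_id: str, *, verbose: bool = False) -> dict:
    # Everything after `*` is keyword-only: callers must write verbose=True,
    # so adding or reordering options later cannot break positional calls.
    return {"user_id": user_id, "verbose": verbose}
```

A call like `get_user("abc", True)` now fails fast with a `TypeError` instead of silently binding `True` to the wrong parameter.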
---

### 2. Simplify Code and Use Clear Variable Names

* Prefer descriptive, **self-explanatory variable names**. Avoid overly short or cryptic identifiers.
* Break up overly long or deeply nested functions for **readability and maintainability**.
* Avoid unnecessary abstraction or premature optimization.
* All generated Python code must include type hints.

Bad:

```python
def p(u, d):
    return [x for x in u if x not in d]
```

Good:

```python
from typing import List, Set

def filter_unknown_users(users: List[str], known_users: Set[str]) -> List[str]:
    return [user for user in users if user not in known_users]
```

---
### 3. Ensure Unit Tests Cover New and Updated Functionality

* Every new feature or bugfix should be **covered by a unit test**.
* Test edge cases and failure conditions.
* Use `pytest`, `unittest`, or the project’s existing framework consistently.

Checklist:

* [ ] Does the test suite fail if your new logic is broken?
* [ ] Are all expected behaviors exercised (happy path, invalid input, etc.)?
* [ ] Do tests use fixtures or mocks where needed?
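As an illustrative sketch (reusing the `filter_unknown_users` example from section 2), pytest-style tests that satisfy this checklist might look like:

```python
from typing import List, Set

def filter_unknown_users(users: List[str], known_users: Set[str]) -> List[str]:
    return [user for user in users if user not in known_users]

def test_filters_out_known_users() -> None:
    # happy path: only unknown users survive
    assert filter_unknown_users(["ann", "bob"], {"bob"}) == ["ann"]

def test_empty_input_returns_empty_list() -> None:
    # edge case: no users at all
    assert filter_unknown_users([], {"bob"}) == []
```

Each test would fail if the filtering logic were inverted or dropped, which is exactly what the first checklist item asks for.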
---

### 4. Look for Suspicious or Risky Code

* Watch out for:

  * Use of `eval()`, `exec()`, or `pickle` on user-controlled input.
  * Silent failure modes (`except: pass`).
  * Unreachable code or commented-out blocks.
  * Race conditions or resource leaks (file handles, sockets, threads).

Bad:

```python
def load_config(path):
    with open(path) as f:
        return eval(f.read())  # ⚠️ Never eval config
```

Good:

```python
import json

def load_config(path: str) -> dict:
    with open(path) as f:
        return json.load(f)
```
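For the silent-failure bullet above, one possible fix (a sketch; the `parse_port` helper is hypothetical) is to replace `except: pass` with a handler that catches only the expected exception and logs it:

```python
import logging

logger = logging.getLogger(__name__)

def parse_port(value: str, default: int = 8080) -> int:
    """Parse a port number, falling back to a default on bad input."""
    try:
        return int(value)
    except ValueError:  # narrow: only the error we actually expect
        logger.warning("Invalid port %r, falling back to %d", value, default)
        return default
```

A bare `except: pass` here would also swallow bugs like passing `None`, whereas the narrow clause lets genuinely unexpected errors surface.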
---

### 5. Use Google-Style Docstrings (with Args section)

* All public functions should include a **Google-style docstring**.
* Include an `Args:` section where relevant.
* Types should NOT be written in the docstring — use type hints instead.

Bad:

```python
def send_email(to, msg):
    """Send an email to a recipient."""
```

Good:

```python
def send_email(to: str, msg: str) -> None:
    """Sends an email to a recipient.

    Args:
        to: The email address of the recipient.
        msg: The message body.
    """
```

📌 *Tip:* Keep descriptions concise but clear. Only document return values if non-obvious.

---
### 6. Propose Better Designs When Applicable

* If there's a **cleaner**, **more scalable**, or **simpler** design, highlight it.
* Suggest improvements, even if they require some refactoring — especially if the new code would:

  * Reduce duplication
  * Make unit testing easier
  * Improve separation of concerns
  * Add clarity without adding complexity

Instead of:

```python
def save(data, db_conn):
    # manually serializes fields
```

You might suggest:

```python
# Suggest using dataclasses or Pydantic for automatic serialization and validation
```
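A minimal sketch of that suggestion using only the standard library (the `User` model is hypothetical; Pydantic would add validation on top of this):

```python
from dataclasses import dataclass, asdict

@dataclass
class User:
    user_id: str
    email: str

def serialize(user: User) -> dict:
    # asdict() walks the dataclass fields automatically, replacing
    # hand-written field-by-field serialization code.
    return asdict(user)
```

Adding a field to `User` now updates serialization for free, which removes one common source of duplication.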
---

### 7. Misc

* When suggesting package installation commands, use `uv pip install` as this project uses `uv`.
* When creating tools for agents, use the `@tool` decorator from `langchain_core.tools`. The tool's docstring serves as its functional description for the agent.
* Avoid suggesting deprecated components, such as the legacy `LLMChain`.
* We use Conventional Commits format for pull request titles. Example PR titles:

  * feat(core): add multi-tenant support
  * fix(cli): resolve flag parsing error
  * docs: update API usage examples
  * docs(openai): update API usage examples

.github/workflows/_release.yml

Lines changed: 1 addition & 1 deletion
```diff
@@ -340,7 +340,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        partner: [openai, anthropic]
+        partner: [openai]
       fail-fast: false # Continue testing other partners if one fails
     env:
       ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
```

docs/docs/how_to/custom_chat_model.ipynb

Lines changed: 1 addition & 1 deletion
```diff
@@ -18,7 +18,7 @@
 "\n",
 "Wrapping your LLM with the standard [`BaseChatModel`](https://python.langchain.com/api_reference/core/language_models/langchain_core.language_models.chat_models.BaseChatModel.html) interface allow you to use your LLM in existing LangChain programs with minimal code modifications!\n",
 "\n",
-"As an bonus, your LLM will automatically become a LangChain [Runnable](/docs/concepts/runnables/) and will benefit from some optimizations out of the box (e.g., batch via a threadpool), async support, the `astream_events` API, etc.\n",
+"As a bonus, your LLM will automatically become a LangChain [Runnable](/docs/concepts/runnables/) and will benefit from some optimizations out of the box (e.g., batch via a threadpool), async support, the `astream_events` API, etc.\n",
 "\n",
 "## Inputs and outputs\n",
 "\n",
```

docs/docs/integrations/text_embedding/azureopenai.ipynb

Lines changed: 1 addition & 1 deletion
```diff
@@ -131,7 +131,7 @@
 "source": [
 "## Indexing and Retrieval\n",
 "\n",
-"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n",
+"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n",
 "\n",
 "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`."
 ]
```

docs/docs/integrations/text_embedding/google_generative_ai.ipynb

Lines changed: 1 addition & 1 deletion
```diff
@@ -173,7 +173,7 @@
 "source": [
 "## Indexing and Retrieval\n",
 "\n",
-"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n",
+"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n",
 "\n",
 "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`."
 ]
```

docs/docs/integrations/text_embedding/google_vertex_ai_palm.ipynb

Lines changed: 1 addition & 1 deletion
```diff
@@ -167,7 +167,7 @@
 "source": [
 "## Indexing and Retrieval\n",
 "\n",
-"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/).\n",
+"Embedding models are often used in retrieval-augmented generation (RAG) flows, both as part of indexing data as well as later retrieving it. For more detailed instructions, please see our [RAG tutorials](/docs/tutorials/rag).\n",
 "\n",
 "Below, see how to index and retrieve data using the `embeddings` object we initialized above. In this example, we will index and retrieve a sample document in the `InMemoryVectorStore`."
 ]
```

libs/core/langchain_core/language_models/fake_chat_models.py

Lines changed: 6 additions & 2 deletions
```diff
@@ -36,6 +36,8 @@ def _generate(
         run_manager: Optional[CallbackManagerForLLMRun] = None,
         **kwargs: Any,
     ) -> ChatResult:
+        if self.sleep is not None:
+            time.sleep(self.sleep)
         response = self.responses[self.i]
         if self.i < len(self.responses) - 1:
             self.i += 1
@@ -61,9 +63,9 @@ class FakeListChatModel(SimpleChatModel):
     """List of responses to **cycle** through in order."""
     sleep: Optional[float] = None
     i: int = 0
-    """List of responses to **cycle** through in order."""
-    error_on_chunk_number: Optional[int] = None
     """Internally incremented after every model invocation."""
+    error_on_chunk_number: Optional[int] = None
+    """If set, raise an error on the specified chunk number during streaming."""

     @property
     @override
@@ -79,6 +81,8 @@ def _call(
         **kwargs: Any,
     ) -> str:
         """First try to lookup in queries, else return 'foo' or 'bar'."""
+        if self.sleep is not None:
+            time.sleep(self.sleep)
         response = self.responses[self.i]
         if self.i < len(self.responses) - 1:
             self.i += 1
```

libs/core/langchain_core/output_parsers/openai_tools.py

Lines changed: 38 additions & 8 deletions
```diff
@@ -234,23 +234,53 @@ def parse_result(self, result: list[Generation], *, partial: bool = False) -> An
         Returns:
             The parsed tool calls.
         """
-        parsed_result = super().parse_result(result, partial=partial)
-
+        generation = result[0]
+        if not isinstance(generation, ChatGeneration):
+            msg = "This output parser can only be used with a chat generation."
+            raise OutputParserException(msg)
+        message = generation.message
+        if isinstance(message, AIMessage) and message.tool_calls:
+            parsed_tool_calls = [dict(tc) for tc in message.tool_calls]
+            for tool_call in parsed_tool_calls:
+                if not self.return_id:
+                    _ = tool_call.pop("id")
+        else:
+            try:
+                raw_tool_calls = copy.deepcopy(message.additional_kwargs["tool_calls"])
+            except KeyError:
+                if self.first_tool_only:
+                    return None
+                return []
+            parsed_tool_calls = parse_tool_calls(
+                raw_tool_calls,
+                partial=partial,
+                strict=self.strict,
+                return_id=self.return_id,
+            )
+            # For backwards compatibility
+            for tc in parsed_tool_calls:
+                tc["type"] = tc.pop("name")
         if self.first_tool_only:
+            parsed_result = list(
+                filter(lambda x: x["type"] == self.key_name, parsed_tool_calls)
+            )
             single_result = (
-                parsed_result
-                if parsed_result and parsed_result["type"] == self.key_name
+                parsed_result[0]
+                if parsed_result and parsed_result[0]["type"] == self.key_name
                 else None
             )
             if self.return_id:
                 return single_result
             if single_result:
                 return single_result["args"]
             return None
-        parsed_result = [res for res in parsed_result if res["type"] == self.key_name]
-        if not self.return_id:
-            parsed_result = [res["args"] for res in parsed_result]
-        return parsed_result
+        return (
+            [res for res in parsed_tool_calls if res["type"] == self.key_name]
+            if self.return_id
+            else [
+                res["args"] for res in parsed_tool_calls if res["type"] == self.key_name
+            ]
+        )


 # Common cause of ValidationError is truncated output due to max_tokens.
```

libs/core/langchain_core/outputs/__init__.py

Lines changed: 8 additions & 9 deletions
```diff
@@ -1,24 +1,23 @@
 """Output classes.

-**Output** classes are used to represent the output of a language model call
-and the output of a chat.
+Used to represent the output of a language model call and the output of a chat.

-The top container for information is the `LLMResult` object. `LLMResult` is used by
-both chat models and LLMs. This object contains the output of the language
-model and any additional information that the model provider wants to return.
+The top container for information is the `LLMResult` object. `LLMResult` is used by both
+chat models and LLMs. This object contains the output of the language model and any
+additional information that the model provider wants to return.

 When invoking models via the standard runnable methods (e.g. invoke, batch, etc.):
+
 - Chat models will return `AIMessage` objects.
 - LLMs will return regular text strings.

 In addition, users can access the raw output of either LLMs or chat models via
-callbacks. The on_chat_model_end and on_llm_end callbacks will return an
+callbacks. The ``on_chat_model_end`` and ``on_llm_end`` callbacks will return an
 LLMResult object containing the generated outputs and any additional information
 returned by the model provider.

-In general, if information is already available
-in the AIMessage object, it is recommended to access it from there rather than
-from the `LLMResult` object.
+In general, if information is already available in the AIMessage object, it is
+recommended to access it from there rather than from the `LLMResult` object.
 """

 from typing import TYPE_CHECKING
```

libs/core/langchain_core/outputs/chat_generation.py

Lines changed: 5 additions & 1 deletion
```diff
@@ -27,7 +27,11 @@ class ChatGeneration(Generation):
     """

     text: str = ""
-    """*SHOULD NOT BE SET DIRECTLY* The text contents of the output message."""
+    """The text contents of the output message.
+
+    .. warning::
+        SHOULD NOT BE SET DIRECTLY!
+    """
     message: BaseMessage
     """The message output by the chat model."""
     # Override type to be ChatGeneration, ignore mypy error as this is intentional
```
