v1.73.6.rc #12146

Merged: 26 commits, Jul 3, 2025

Commits
a83f8f3
fix - using on python 3.9
ishaan-jaff Jun 28, 2025
9b4f26f
[⚡️ Python SDK Import] - 2 second faster import times (#12135)
ishaan-jaff Jun 28, 2025
e77eb82
docs(index.md): initial pre-release note
krrishdholakia Jun 28, 2025
f93326a
🧹 Refactor init.py to use a model registry (#12138)
ishaan-jaff Jun 28, 2025
092e359
fix import loc
ishaan-jaff Jun 28, 2025
6b623f9
test whitelisted models
ishaan-jaff Jun 28, 2025
79a8b1a
Revert "🧹 Refactor init.py to use a model registry (#12138)" (#12141)
ishaan-jaff Jun 28, 2025
0c19414
[⚡️ Python SDK import] - reduce python sdk import time by .3s (#12140)
ishaan-jaff Jun 28, 2025
ca89104
fix imports
ishaan-jaff Jun 28, 2025
04f5ccf
fix - revert list team changes
ishaan-jaff Jun 28, 2025
9735202
fix for o-series param checks
ishaan-jaff Jun 28, 2025
a021e7c
bump poetry
ishaan-jaff Jun 29, 2025
123631c
docs(index.md): update release note with cleaner table for updated mo…
krrishdholakia Jun 29, 2025
f7af890
`/v1/messages` - Remove hardcoded model name on streaming + Tags - en…
krrishdholakia Jun 29, 2025
50dd998
docs - update release notes
ishaan-jaff Jun 29, 2025
7da4e21
Benefits of using gemini-cli with LiteLLM
ishaan-jaff Jun 29, 2025
cec7e49
UI QA Fixes - prevent team model reset on model add + return team-onl…
krrishdholakia Jun 29, 2025
79ffb8a
docs gemini cli x litellm
ishaan-jaff Jun 29, 2025
7cc2ce0
docs: index.md
krrishdholakia Jun 29, 2025
1df5c16
docs(index.md): add more hyperlinks to docs
krrishdholakia Jun 29, 2025
f76375b
docs(index.md): add batch api cost tracking to docs
krrishdholakia Jun 29, 2025
f3a701b
docs(index.md): update docs
krrishdholakia Jun 29, 2025
ddfd411
VertexAI Anthropic - streaming cost tracking w/ prompt caching fixes …
krrishdholakia Jul 1, 2025
29880cc
Fix rendering ui on non-root images (#12226)
krrishdholakia Jul 2, 2025
54eb913
build(pyproject.toml): version rc2
krrishdholakia Jul 2, 2025
17e5806
fix(streaming_handler.py): store finish reason, even if is_finished i…
krrishdholakia Jul 2, 2025
1 change: 1 addition & 0 deletions .circleci/config.yml
@@ -1358,6 +1358,7 @@ jobs:
# - run: python ./tests/documentation_tests/test_general_setting_keys.py
- run: python ./tests/code_coverage_tests/check_licenses.py
- run: python ./tests/code_coverage_tests/router_code_coverage.py
- run: python ./tests/code_coverage_tests/test_proxy_types_import.py
- run: python ./tests/code_coverage_tests/callback_manager_test.py
- run: python ./tests/code_coverage_tests/recursive_detector.py
- run: python ./tests/code_coverage_tests/test_router_strategy_async.py
198 changes: 192 additions & 6 deletions docs/my-website/docs/proxy/cost_tracking.md
@@ -255,6 +255,198 @@ curl -L -X GET 'http://localhost:4000/user/daily/activity?start_date=2025-03-20&

See our [Swagger API](https://litellm-api.up.railway.app/#/Budget%20%26%20Spend%20Tracking/get_user_daily_activity_user_daily_activity_get) for more details on the `/user/daily/activity` endpoint

## Custom Tags

Requirements:

- Virtual Keys & a database should be set up, see [virtual keys](https://docs.litellm.ai/docs/proxy/virtual_keys)

**Note:** By default, LiteLLM will track `User-Agent` as a custom tag for cost tracking. This enables viewing usage for tools like Claude Code, Gemini CLI, etc.


<Image img={require('../../img/claude_cli_tag_usage.png')} />
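
For example, a client that identifies itself via `User-Agent` will have that value recorded as a tag automatically. A minimal offline sketch (the tool name `claude-cli/1.0` is a hypothetical example, not a required value):

```python
import urllib.request

# Any OpenAI-compatible client sends a User-Agent header; LiteLLM records it
# as a spend tag by default. Nothing is sent here; we only build the request.
req = urllib.request.Request(
    "http://0.0.0.0:4000/chat/completions",
    headers={
        "Authorization": "Bearer sk-1234",
        "User-Agent": "claude-cli/1.0",  # hypothetical tool identifier
    },
)
print(req.get_header("User-agent"))  # the value LiteLLM would tag
```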


### Client-side spend tag

<Tabs>
<TabItem value="key" label="Set on Key">

```bash
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
    "metadata": {
        "tags": ["tag1", "tag2", "tag3"]
    }
}'
```

</TabItem>
<TabItem value="team" label="Set on Team">

```bash
curl -L -X POST 'http://0.0.0.0:4000/team/new' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
    "metadata": {
        "tags": ["tag1", "tag2", "tag3"]
    }
}'
```

</TabItem>
<TabItem value="openai" label="OpenAI Python v1.0.0+">

Pass the `metadata` you want to track via `extra_body={"metadata": { }}`:

```python
import openai
client = openai.OpenAI(
api_key="anything",
base_url="http://0.0.0.0:4000"
)


response = client.chat.completions.create(
model="gpt-3.5-turbo",
messages = [
{
"role": "user",
"content": "this is a test request, write a short poem"
}
],
extra_body={
"metadata": {
"tags": ["model-anthropic-claude-v2.1", "app-ishaan-prod"] # 👈 Key Change
}
}
)

print(response)
```
</TabItem>


<TabItem value="openai js" label="OpenAI JS">

```js
const openai = require('openai');

async function runOpenAI() {
const client = new openai.OpenAI({
apiKey: 'sk-1234',
baseURL: 'http://0.0.0.0:4000'
});

try {
const response = await client.chat.completions.create({
model: 'gpt-3.5-turbo',
messages: [
{
role: 'user',
content: "this is a test request, write a short poem"
},
],
metadata: {
tags: ["model-anthropic-claude-v2.1", "app-ishaan-prod"] // 👈 Key Change
}
});
console.log(response);
} catch (error) {
console.log("got this exception from server");
console.error(error);
}
}

// Call the asynchronous function
runOpenAI();
```
</TabItem>

<TabItem value="Curl" label="Curl Request">

Pass `metadata` as part of the request body

```shell
curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data '{
"model": "gpt-3.5-turbo",
"messages": [
{
"role": "user",
"content": "what llm are you"
}
],
"metadata": {"tags": ["model-anthropic-claude-v2.1", "app-ishaan-prod"]}
}'
```
</TabItem>
<TabItem value="langchain" label="Langchain">

```python
from langchain.chat_models import ChatOpenAI
from langchain.prompts.chat import (
ChatPromptTemplate,
HumanMessagePromptTemplate,
SystemMessagePromptTemplate,
)
from langchain.schema import HumanMessage, SystemMessage

chat = ChatOpenAI(
openai_api_base="http://0.0.0.0:4000",
model = "gpt-3.5-turbo",
temperature=0.1,
extra_body={
"metadata": {
"tags": ["model-anthropic-claude-v2.1", "app-ishaan-prod"]
}
}
)

messages = [
SystemMessage(
        content="You are a helpful assistant that I'm using to make a test request to."
),
HumanMessage(
content="test from litellm. tell me why it's amazing in 1 sentence"
),
]
response = chat(messages)

print(response)
```

</TabItem>
</Tabs>



### Add custom headers to spend tracking

You can configure LiteLLM to record the values of additional request headers as spend tags.

```yaml
litellm_settings:
extra_spend_tag_headers:
- "x-custom-header"
```
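
A request carrying that header would then be tagged with its value. A minimal offline sketch (the header name matches the config above; the value `my-batch-job` is hypothetical):

```python
import urllib.request

# Nothing is sent here; we only build the request. LiteLLM would record the
# value of the configured "x-custom-header" header as a spend tag.
req = urllib.request.Request(
    "http://0.0.0.0:4000/chat/completions",
    headers={
        "Authorization": "Bearer sk-1234",
        "x-custom-header": "my-batch-job",  # hypothetical tag value
    },
)
print(req.get_header("X-custom-header"))
```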

### Disable user-agent tracking

You can disable user-agent tracking by setting `litellm_settings.disable_user_agent_tracking` to `true`.

```yaml
litellm_settings:
disable_user_agent_tracking: true
```
## ✨ (Enterprise) Generate Spend Reports

Use this to charge other teams, customers, and users.
@@ -617,11 +809,5 @@ Logging specific key,value pairs in spend logs metadata is an enterprise feature
:::


## ✨ Custom Tags

:::info

Tracking spend with Custom tags is an enterprise feature. [See here](./enterprise.md#tracking-spend-for-custom-tags)

:::

3 changes: 3 additions & 0 deletions docs/my-website/docs/proxy/custom_root_ui.md
@@ -12,6 +12,9 @@ Requires v1.72.3 or higher.

:::

Limitations:
- This does not work in [litellm non-root](./deploy#non-root---without-internet-connection) images, as it requires write access to the UI files.

## Usage

### 1. Set `SERVER_ROOT_PATH` in your .env
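
For example (the `/api/v1` path is an assumption; use whatever base path your deployment needs):

```shell
# .env -- the proxy serves the UI and all routes under this base path
export SERVER_ROOT_PATH="/api/v1"
```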