
Commit b4fc703

LiteLLM Stable release notes (#10919)
* docs(index/v1.70.1-stable): style improvements
* style: add style improvements to docs
* docs: cleanup docs
* docs: more style improvements
* docs: style improvements
* docs(gemini/realtime): add docs on realtime api via Google AI Studio
* docs: add openai example to anthropic web search docs
* docs: add missing doc links
* docs: doc cleanup
* docs: add more doc links
* fix: cleanup
* docs: add docker information
* docs: update doc links
* docs: add demo instance details to docs
1 parent ac54139 commit b4fc703

17 files changed: +586 additions, -412 deletions

docs/my-website/docs/embedding/supported_embedding.md

Lines changed: 0 additions & 30 deletions
````diff
@@ -225,36 +225,6 @@ response = embedding(
 | text-embedding-3-large | `embedding('text-embedding-3-large', input)` | `os.environ['OPENAI_API_KEY']` |
 | text-embedding-ada-002 | `embedding('text-embedding-ada-002', input)` | `os.environ['OPENAI_API_KEY']` |
 
-## Azure OpenAI Embedding Models
-
-### API keys
-This can be set as env variables or passed as **params to litellm.embedding()**
-```python
-import os
-os.environ['AZURE_API_KEY'] =
-os.environ['AZURE_API_BASE'] =
-os.environ['AZURE_API_VERSION'] =
-```
-
-### Usage
-```python
-from litellm import embedding
-response = embedding(
-    model="azure/<your deployment name>",
-    input=["good morning from litellm"],
-    api_key=api_key,
-    api_base=api_base,
-    api_version=api_version,
-)
-print(response)
-```
-
-| Model Name | Function Call |
-|----------------------|---------------------------------------------|
-| text-embedding-ada-002 | `embedding(model="azure/<your deployment name>", input=input)` |
-
-h/t to [Mikko](https://www.linkedin.com/in/mikkolehtimaki/) for this integration
-
 ## OpenAI Compatible Embedding Models
 Use this for calling `/embedding` endpoints on OpenAI Compatible Servers, example https://github.com/xorbitsai/inference
````

docs/my-website/docs/observability/phoenix_integration.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -1,6 +1,6 @@
 import Image from '@theme/IdealImage';
 
-# Phoenix OSS
+# Arize Phoenix OSS
 
 Open source tracing and evaluation platform
 
````
docs/my-website/docs/providers/anthropic.md

Lines changed: 78 additions & 4 deletions
````diff
@@ -847,13 +847,50 @@ curl http://0.0.0.0:4000/v1/chat/completions \
 <TabItem value="web_search" label="Web Search">
 
 :::info
-
-Unified web search (same param across OpenAI + Anthropic) coming soon!
+Live from v1.70.1+
 :::
 
+LiteLLM maps OpenAI's `search_context_size` param to Anthropic's `max_uses` param.
+
+| OpenAI | Anthropic |
+| --- | --- |
+| Low | 1 |
+| Medium | 5 |
+| High | 10 |
+
 <Tabs>
 <TabItem value="sdk" label="SDK">
 
+<Tabs>
+<TabItem value="openai" label="OpenAI Format">
+
+```python
+from litellm import completion
+
+model = "claude-3-5-sonnet-20241022"
+messages = [{"role": "user", "content": "What's the weather like today?"}]
+
+resp = completion(
+    model=model,
+    messages=messages,
+    web_search_options={
+        "search_context_size": "medium",
+        "user_location": {
+            "type": "approximate",
+            "approximate": {
+                "city": "San Francisco",
+            },
+        }
+    }
+)
+
+print(resp)
+```
+</TabItem>
+<TabItem value="anthropic" label="Anthropic Format">
+
 ```python
 from litellm import completion
 
````
````diff
@@ -873,8 +910,11 @@ resp = completion(
 
 print(resp)
 ```
+</TabItem>
 
+</Tabs>
 </TabItem>
+
 <TabItem value="proxy" label="PROXY">
 
 1. Setup config.yaml
````
````diff
@@ -894,22 +934,56 @@ litellm --config /path/to/config.yaml
 
 3. Test it!
 
+<Tabs>
+<TabItem value="openai" label="OpenAI Format">
+
 ```bash
 curl http://0.0.0.0:4000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $LITELLM_KEY" \
   -d '{
     "model": "claude-3-5-sonnet-latest",
-    "messages": [{"role": "user", "content": "There's a syntax error in my primes.py file. Can you help me fix it?"}],
-    "tools": [{"type": "web_search_20250305", "name": "web_search", "max_uses": 5}]
+    "messages": [{"role": "user", "content": "What'\''s the weather like today?"}],
+    "web_search_options": {
+      "search_context_size": "medium",
+      "user_location": {
+        "type": "approximate",
+        "approximate": {
+          "city": "San Francisco"
+        }
+      }
+    }
+}'
+```
+</TabItem>
+<TabItem value="anthropic" label="Anthropic Format">
+
+```bash
+curl http://0.0.0.0:4000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer $LITELLM_KEY" \
+  -d '{
+    "model": "claude-3-5-sonnet-latest",
+    "messages": [{"role": "user", "content": "What'\''s the weather like today?"}],
+    "tools": [{
+      "type": "web_search_20250305",
+      "name": "web_search",
+      "max_uses": 5
+    }]
 }'
 ```
+
+</TabItem>
+</Tabs>
 </TabItem>
 </Tabs>
 
 </TabItem>
 </Tabs>
 
+
 ## Usage - Vision
````
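The mapping table above fully determines the translation. As a quick illustration, here is a small self-contained sketch of how an OpenAI-style `web_search_options` dict maps onto the Anthropic `web_search` tool shown in the Anthropic-format tabs. The helper name and dict are hypothetical, not LiteLLM's internal code; only the values come from the docs above.

```python
# Hypothetical illustration only -- not LiteLLM's internal implementation.
# Encodes the search_context_size -> max_uses table from the docs above.
OPENAI_TO_ANTHROPIC_MAX_USES = {"low": 1, "medium": 5, "high": 10}

def to_anthropic_web_search_tool(web_search_options: dict) -> dict:
    """Rewrite OpenAI-style web_search_options as an Anthropic web_search tool."""
    size = web_search_options.get("search_context_size", "medium")
    return {
        "type": "web_search_20250305",
        "name": "web_search",
        "max_uses": OPENAI_TO_ANTHROPIC_MAX_USES[size],
    }

# "medium" -> {'type': 'web_search_20250305', 'name': 'web_search', 'max_uses': 5}
print(to_anthropic_web_search_tool({"search_context_size": "medium"}))
```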
docs/my-website/docs/providers/azure.md renamed to docs/my-website/docs/providers/azure/azure.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -11,7 +11,7 @@ import TabItem from '@theme/TabItem';
 |-------|-------|
 | Description | Azure OpenAI Service provides REST API access to OpenAI's powerful language models including o1, o1-mini, GPT-4o, GPT-4o mini, GPT-4 Turbo with Vision, GPT-4, GPT-3.5-Turbo, and Embeddings model series |
 | Provider Route on LiteLLM | `azure/`, [`azure/o_series/`](#azure-o-series-models) |
-| Supported Operations | [`/chat/completions`](#azure-openai-chat-completion-models), [`/completions`](#azure-instruct-models), [`/embeddings`](../embedding/supported_embedding#azure-openai-embedding-models), [`/audio/speech`](#azure-text-to-speech-tts), [`/audio/transcriptions`](../audio_transcription), `/fine_tuning`, [`/batches`](#azure-batches-api), `/files`, [`/images`](../image_generation#azure-openai-image-generation-models) |
+| Supported Operations | [`/chat/completions`](#azure-openai-chat-completion-models), [`/completions`](#azure-instruct-models), [`/embeddings`](./azure_embedding), [`/audio/speech`](#azure-text-to-speech-tts), [`/audio/transcriptions`](../audio_transcription), `/fine_tuning`, [`/batches`](#azure-batches-api), `/files`, [`/images`](../image_generation#azure-openai-image-generation-models) |
 | Link to Provider Doc | [Azure OpenAI ↗](https://learn.microsoft.com/en-us/azure/ai-services/openai/overview)
 
 ## API Keys, Params
````
docs/my-website/docs/providers/azure/azure_embedding.md (new file)

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@

import Image from '@theme/IdealImage';
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# Azure OpenAI Embeddings

### API keys
These can be set as env variables or passed as **params to litellm.embedding()**
```python
import os
os.environ['AZURE_API_KEY'] = ""      # your azure api key
os.environ['AZURE_API_BASE'] = ""     # e.g. https://your-endpoint.openai.azure.com/
os.environ['AZURE_API_VERSION'] = ""  # e.g. "2023-05-15"
```

### Usage
```python
from litellm import embedding

# assumes api_key, api_base, api_version hold your Azure values
response = embedding(
    model="azure/<your deployment name>",
    input=["good morning from litellm"],
    api_key=api_key,
    api_base=api_base,
    api_version=api_version,
)
print(response)
```

| Model Name | Function Call |
|----------------------|---------------------------------------------|
| text-embedding-ada-002 | `embedding(model="azure/<your deployment name>", input=input)` |

h/t to [Mikko](https://www.linkedin.com/in/mikkolehtimaki/) for this integration

## **Usage - LiteLLM Proxy Server**

Here's how to call Azure OpenAI models with the LiteLLM Proxy Server

### 1. Save key in your environment

```bash
export AZURE_API_KEY=""
```

### 2. Start the proxy

```yaml
model_list:
  - model_name: text-embedding-ada-002
    litellm_params:
      model: azure/my-deployment-name
      api_base: https://openai-gpt-4-test-v-1.openai.azure.com/
      api_version: "2023-05-15"
      api_key: os.environ/AZURE_API_KEY # The `os.environ/` prefix tells litellm to read this from the env.
```

### 3. Test it

<Tabs>
<TabItem value="Curl" label="Curl Request">

```shell
curl --location 'http://0.0.0.0:4000/embeddings' \
--header 'Content-Type: application/json' \
--data '{
    "model": "text-embedding-ada-002",
    "input": ["write a litellm poem"]
}'
```
</TabItem>
<TabItem value="openai" label="OpenAI v1.0.0+">

```python
import openai
from openai import OpenAI

# set base_url to your proxy server
# set api_key to send to proxy server
client = OpenAI(api_key="<proxy-api-key>", base_url="http://0.0.0.0:4000")

response = client.embeddings.create(
    input=["hello from litellm"],
    model="text-embedding-ada-002"
)

print(response)
```
</TabItem>
</Tabs>
Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@

# Gemini Realtime API - Google AI Studio

| Feature | Description | Comments |
| --- | --- | --- |
| Proxy | ✅ | |
| SDK | ⌛️ | Experimental access via `litellm._arealtime`. |

## Proxy Usage

### Add model to config

```yaml
model_list:
  - model_name: "gemini-2.0-flash"
    litellm_params:
      model: gemini/gemini-2.0-flash-live-001
    model_info:
      mode: realtime
```

### Start proxy

```bash
litellm --config /path/to/config.yaml

# RUNNING on http://0.0.0.0:4000
```

### Test

Run this script using node - `node test.js`

```js
// test.js
const WebSocket = require("ws");

// your LiteLLM proxy key
const LITELLM_API_KEY = process.env.LITELLM_API_KEY;

// model name matches the model_name configured in config.yaml above
const url = "ws://0.0.0.0:4000/v1/realtime?model=gemini-2.0-flash";

const ws = new WebSocket(url, {
  headers: {
    "api-key": `${LITELLM_API_KEY}`,
    "OpenAI-Beta": "realtime=v1",
  },
});

ws.on("open", function open() {
  console.log("Connected to server.");
  ws.send(JSON.stringify({
    type: "response.create",
    response: {
      modalities: ["text"],
      instructions: "Please assist the user.",
    }
  }));
});

ws.on("message", function incoming(message) {
  console.log(JSON.parse(message.toString()));
});

ws.on("error", function handleError(error) {
  console.error("Error: ", error);
});
```
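If you prefer Python to Node, here is a rough equivalent of `test.js`: a sketch assuming the same proxy endpoint, model name, and headers as above. It uses the third-party `websockets` package (`pip install websockets`), which is not part of LiteLLM; note the request-headers keyword is `additional_headers` in websockets >= 14 (`extra_headers` in older releases).

```python
# test.py -- rough Python equivalent of test.js (assumes websockets >= 14)
import asyncio
import json
import os

import websockets

URL = "ws://0.0.0.0:4000/v1/realtime?model=gemini-2.0-flash"
HEADERS = {
    "api-key": os.environ["LITELLM_API_KEY"],  # your LiteLLM proxy key
    "OpenAI-Beta": "realtime=v1",
}

async def main():
    async with websockets.connect(URL, additional_headers=HEADERS) as ws:
        # same payload the node script sends
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {
                "modalities": ["text"],
                "instructions": "Please assist the user.",
            },
        }))
        async for message in ws:
            event = json.loads(message)
            print(event)
            if event.get("type") == "response.done":  # final event, per the list below
                break

asyncio.run(main())
```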
## Limitations

- Does not support audio transcription.
- Does not support tool calling.

## Supported OpenAI Realtime Events

- `session.created`
- `response.created`
- `response.output_item.added`
- `conversation.item.created`
- `response.content_part.added`
- `response.text.delta`
- `response.audio.delta`
- `response.text.done`
- `response.audio.done`
- `response.content_part.done`
- `response.output_item.done`
- `response.done`

## [Supported Session Params](https://github.com/BerriAI/litellm/blob/e87b536d038f77c2a2206fd7433e275c487179ee/litellm/llms/gemini/realtime/transformation.py#L155)

## More Examples
### [Gemini Realtime API with Audio Input/Output](../../../docs/tutorials/gemini_realtime_with_audio)

docs/my-website/docs/providers/litellm_proxy.md

Lines changed: 4 additions & 2 deletions
````diff
@@ -163,9 +163,11 @@ LiteLLM Proxy works seamlessly with Langchain, LlamaIndex, OpenAI JS, Anthropic
 
 [Learn how to use LiteLLM proxy with these libraries →](../proxy/user_keys)
 
-## Flags to send requests to litellm proxy
+## Send all SDK requests to LiteLLM Proxy
 
-Use the following options to route all requests through your LiteLLM proxy, regardless of the model specified.
+Use this when calling LiteLLM Proxy from any library / codebase already using the LiteLLM SDK.
+
+These flags will route all requests through your LiteLLM proxy, regardless of the model specified.
 
 When enabled, requests will use `LITELLM_PROXY_API_BASE` with `LITELLM_PROXY_API_KEY` as the authentication.
````
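To make the renamed section concrete, here is a minimal sketch of what this looks like from the SDK side. It assumes the `litellm.use_litellm_proxy` flag that this docs page goes on to describe; the env var names come from the diff above, while the flag name and model are illustrative.

```python
# Minimal sketch: route every LiteLLM SDK call through the proxy.
import os
import litellm

# The env vars named above: proxy address + key used as authentication.
os.environ["LITELLM_PROXY_API_BASE"] = "http://0.0.0.0:4000"
os.environ["LITELLM_PROXY_API_KEY"] = "sk-..."  # your proxy key

litellm.use_litellm_proxy = True  # flag described on this docs page (assumed name)

# Even though a provider model is named, the request goes to the proxy,
# which resolves the model against its own model_list.
response = litellm.completion(
    model="claude-3-5-sonnet-latest",
    messages=[{"role": "user", "content": "Hello via the proxy"}],
)
print(response)
```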
