
Fix/issue 10113 embeddings use non default tokenizer #10629


Conversation

camfarineau (Contributor) commented May 7, 2025

Title

Embeddings: Do not use the default gpt-3.5-turbo tokenizer

Relevant issues

Fixes #10113

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement - see details)
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🐛 Bug Fix

Changes

  • embeddings: allow passthrough of a list of lists of tokens to self-hosted vLLM models (see the sketch below)
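
As a rough illustration of the behavior this enables (a minimal sketch; the proxy URL, API key, model alias, and token IDs are placeholder assumptions, not values from this PR):

```python
# Minimal sketch, assuming a LiteLLM proxy running locally with a
# "hosted_vllm/..." embedding model registered under the alias "my-vllm-embeddings".
# The URL, key, alias, and token IDs below are placeholders.
import openai

client = openai.OpenAI(base_url="http://localhost:4000", api_key="sk-1234")

# Input is a list of lists of token IDs; with this fix the proxy passes the
# token arrays through to the hosted_vllm model instead of decoding them
# with the default gpt-3.5-turbo tokenizer first.
response = client.embeddings.create(
    model="my-vllm-embeddings",
    input=[[15339, 1917], [15339, 4435]],
)
print(len(response.data))
```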

litellm_issue_10113_unit_test_local


vercel bot commented May 7, 2025

The latest updates on your projects.

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| litellm | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | May 13, 2025 8:46am |

CLAassistant commented May 7, 2025

CLA assistant check
All committers have signed the CLA.

```diff
@@ -3814,14 +3814,15 @@ async def embeddings(  # noqa: PLR0915
     if m["model_name"] == data["model"] and (
         m["litellm_params"]["model"] in litellm.open_ai_embedding_models
         or m["litellm_params"]["model"].startswith("azure/")
+        or m["litellm_params"]["model"].startswith("hosted_vllm/")
```
Contributor:
add a unit test for this in test_proxy_server.py in tests/litellm - so there are no future regressions.
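
A rough sketch of what such a regression test could look like (illustrative only; the test name, helper, and model entries are assumptions, and the check is re-implemented inline so the example is self-contained - this is not the test added in the PR):

```python
# tests/litellm/test_proxy_server.py - illustrative sketch only.
import pytest

TOKEN_ARRAY_PROVIDER_PREFIXES = ("azure/", "hosted_vllm/")  # hypothetical constant


def accepts_token_arrays(requested_model: str, llm_model_list: list) -> bool:
    """True if the deployment backing `requested_model` can take raw token arrays."""
    for m in llm_model_list:
        if m["model_name"] != requested_model:
            continue  # not the deployment the request is targeting
        return m["litellm_params"]["model"].startswith(TOKEN_ARRAY_PROVIDER_PREFIXES)
    return False


@pytest.mark.parametrize(
    "underlying_model, expected",
    [
        ("hosted_vllm/BAAI/bge-small-en-v1.5", True),  # pass token arrays through
        ("cohere/embed-english-v3.0", False),          # fall back to decoding
    ],
)
def test_embeddings_token_array_passthrough(underlying_model, expected):
    llm_model_list = [
        {"model_name": "my-embeddings", "litellm_params": {"model": underlying_model}},
    ]
    assert accepts_token_arrays("my-embeddings", llm_model_list) is expected
```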

Contributor:

ideally - this can be a list in constants.py - stating the list of LLM providers which support an array of tokens as input

constants.py file -

DEFAULT_BATCH_SIZE = 512

example list -

LITELLM_CHAT_PROVIDERS = [

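A sketch of what the suggested constant could look like (the constant name and the exact set of providers are assumptions for illustration, not the constant that ships in litellm/constants.py):

```python
# litellm/constants.py (illustrative sketch; name and contents are hypothetical)

# Providers whose embedding endpoints accept an array (or array of arrays)
# of token IDs as `input`, so the proxy can pass tokens through without
# decoding them with the default gpt-3.5-turbo tokenizer.
EMBEDDING_PROVIDERS_SUPPORTING_INPUT_ARRAY_OF_TOKENS = [
    "openai",
    "azure",
    "hosted_vllm",
]
```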
…accept arrays of tokens as input

When a list of tokens is passed as input, determine the provider of the model by going through the list of models (`llm_model_list`): first match on the model name, then get that deployment's provider and check whether it accepts arrays of tokens. If it does, pass the tokens through; otherwise decode them.
Previously, the provider and the model name were checked in the same condition while looping over `llm_model_list`, so the input could be decoded even when the entry being checked was not the target model. A simplified sketch of the fixed flow is below.
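
A minimal sketch of the described flow (simplified and given its own helper name for illustration; this is not the literal proxy_server.py code):

```python
# Simplified illustration of the two-step check described above.
import litellm


def should_decode_token_input(data: dict, llm_model_list: list) -> bool:
    """Return True if a token-array input must be decoded before calling the model."""
    for m in llm_model_list:
        # Step 1: only consider the deployment whose public model name matches
        # the name in the request.
        if m["model_name"] != data["model"]:
            continue
        # Step 2: look at that deployment's underlying model/provider and decide
        # whether it accepts arrays of tokens directly.
        underlying = m["litellm_params"]["model"]
        accepts_token_arrays = (
            underlying in litellm.open_ai_embedding_models
            or underlying.startswith(("azure/", "hosted_vllm/"))
        )
        return not accepts_token_arrays
    # No matching deployment found: fall back to decoding.
    return True
```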
camfarineau (Contributor, Author):

Screenshot of the new test passing locally
litellm_issue_10113_unit_test_local

camfarineau marked this pull request as ready for review May 13, 2025 08:46
krrishdholakia merged commit b88e56e into BerriAI:main May 15, 2025
6 checks passed
Development

Successfully merging this pull request may close these issues.

[Bug]: Proxy Server - Embeddings - Error while trying to decode tokens from non-OpenAI models using a default gpt-3.5-turbo tokenizer
3 participants