Add a buffer to the maximum number of tokens generated (to fix #353) #354
Description
Issue #353 found that the LiteLLM agent now reports that our model's request exceeds the maximum allowable number of tokens.
The problem stems from a disparity between `tiktoken`'s token counts and the counts used elsewhere:
- For the full prompt parser prompt, `tiktoken` counts 2569 tokens, so we set `max_tokens` for LiteLLM to 4097 - 2569 = 1528.
- However, OpenAI's API counts 2576 tokens for the same prompt, so prompt plus completion exceeds the 4097-token limit (OpenAI's online tokenizer counts 2862 tokens).
A naive solution is to add a buffer, e.g. generate 300 fewer tokens than the computed limit (so `max_tokens` becomes 1228 instead of 1528). This PR implements that solution; a sketch of the idea follows.
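For illustration only, here is a minimal sketch of the buffer approach. The names `compute_max_tokens`, `MODEL_CONTEXT_LIMIT`, and `TOKEN_BUFFER` are hypothetical and do not come from the repo; the actual change may live elsewhere and use different constants.

```python
import tiktoken

# Hypothetical constants for illustration; the repo's actual values may differ.
MODEL_CONTEXT_LIMIT = 4097  # context window of the target model (e.g. gpt-3.5-turbo)
TOKEN_BUFFER = 300          # safety margin for tokenizer count discrepancies


def compute_max_tokens(prompt: str, model: str = "gpt-3.5-turbo") -> int:
    """Return a max_tokens value that leaves a buffer below the context limit."""
    encoding = tiktoken.encoding_for_model(model)
    prompt_tokens = len(encoding.encode(prompt))
    # Subtract the buffer so that small disagreements between tiktoken's count
    # and the count enforced by OpenAI's API cannot push the request over the limit.
    return max(MODEL_CONTEXT_LIMIT - prompt_tokens - TOKEN_BUFFER, 1)
```

With the prompt parser prompt from the issue (2569 tokens per `tiktoken`), this would yield 4097 - 2569 - 300 = 1228, matching the value described above.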
References
N/A
Blocked by
N/A