NVIDIA object never overrides GPT2 Fast Tokenizer #164

@lucian-cap

Description

I am testing out a basic RAG pipeline by chunking HTML pages to use as a document library. To determine whether a chunk is large enough, I am currently trying to count the number of tokens in each chunk. After noticing some odd behavior, however, I dug into the class code a bit and found that the NVIDIA class never overrides the default tokenization behavior inherited from BaseLanguageModel.

It always instantiates a tokenizer by using the transformers library to download the GPT-2 fast tokenizer, completely ignoring the NVIDIA NIM I am pointing it at.
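To illustrate the behavior described above, here is a minimal, self-contained sketch of how the inherited fallback appears to work. The class and function names are simplifications for illustration, not the real langchain-core API, and the GPT-2 tokenizer is replaced by a whitespace stand-in so the example runs without transformers:

```python
from typing import Callable, List, Optional


def gpt2_like_token_ids(text: str) -> List[int]:
    """Stand-in for the GPT-2 fast tokenizer the base class downloads via
    transformers -- here simply one id per whitespace-separated word."""
    return [hash(word) % 50257 for word in text.split()]


class BaseLanguageModelSketch:
    # If this is left unset, the subclass falls back to the GPT-2-style
    # tokenizer regardless of which NIM the model actually targets.
    custom_get_token_ids: Optional[Callable[[str], List[int]]] = None

    def get_token_ids(self, text: str) -> List[int]:
        if self.custom_get_token_ids is not None:
            return self.custom_get_token_ids(text)
        return gpt2_like_token_ids(text)  # GPT-2 fallback; model is ignored

    def get_num_tokens(self, text: str) -> int:
        return len(self.get_token_ids(text))


class NVIDIASketch(BaseLanguageModelSketch):
    # Does not override get_token_ids, so every token count comes from
    # the GPT-2 fallback rather than the deployed model's tokenizer.
    pass
```

With this structure, `NVIDIASketch().get_num_tokens(...)` reflects the fallback tokenizer no matter which endpoint the object is configured against, which matches the behavior reported above.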

Is this expected behavior? If so, why? The documentation for NVIDIA.get_num_tokens() states that it is "Useful for checking if an input fits in a model's context window," but how is that useful if the token count comes from a different model's tokenizer? I can see that the BaseLanguageModel class checks whether the custom_get_token_ids attribute is set and uses it if so, so I can set that myself, but why isn't the object setting it on instantiation, for example by checking with the NIM in some way?
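For anyone hitting the same thing, a workaround sketch based on the `custom_get_token_ids` attribute mentioned above. The tokenizer below is a whitespace stand-in; in a real pipeline you would wrap the actual tokenizer for the deployed checkpoint (for example one loaded with transformers). The constructor usage is an assumption about how the field can be supplied, not confirmed against the package:

```python
from typing import List


def nim_token_ids(text: str) -> List[int]:
    """Hypothetical tokenizer matched to the target NIM. Here it is just
    one id per whitespace-separated word, purely for illustration."""
    return list(range(len(text.split())))


# Hypothetical usage -- class name taken from this issue's title, keyword
# from BaseLanguageModel's custom_get_token_ids field:
# llm = NVIDIA(model="...", custom_get_token_ids=nim_token_ids)
# llm.get_num_tokens(chunk)  # now counts with nim_token_ids, not GPT-2
```

This at least makes `get_num_tokens()` consistent with the tokenizer you choose, though it still leaves the burden of picking the right tokenizer on the caller.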
