I am testing a basic RAG pipeline that chunks HTML pages for use as a document library. To decide whether a chunk is large enough, I count the number of tokens in it. After seeing some odd counts, I dug into the class code and found that the NVIDIA class never overrides the token-counting behavior it inherits from BaseLanguageModel.
It always instantiates a tokenizer by using the transformers library to download the GPT2 fast tokenizer, completely ignoring the NVIDIA NIM I am pointing it at.
Is this expected behavior? If so, why? The documentation for NVIDIA.get_num_tokens() states "Useful for checking if an input fits in a model’s context window." but how is that useful if the token count comes from a different model's tokenizer? I can see that BaseLanguageModel checks whether the custom_get_token_ids attribute is set and uses it if so, so I can set that myself, but why isn't the object setting it on instantiation by checking with the NIM in some way?
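For reference, the workaround I have in mind looks roughly like the sketch below. It only assumes what the base class shows: custom_get_token_ids is a callable that takes a string and returns a list of token ids. The helper make_token_id_fn and the WhitespaceTokenizer stand-in are names I made up for illustration so the snippet runs without downloading anything. The commented-out wiring at the bottom is untested, and the model names in it are placeholders.

```python
from typing import Callable, Dict, List, Protocol


class HasEncode(Protocol):
    """Anything with an encode(text) -> list of ints method."""

    def encode(self, text: str) -> List[int]: ...


def make_token_id_fn(tokenizer: HasEncode) -> Callable[[str], List[int]]:
    """Wrap a tokenizer in the str -> list[int] shape custom_get_token_ids expects."""

    def get_token_ids(text: str) -> List[int]:
        return list(tokenizer.encode(text))

    return get_token_ids


class WhitespaceTokenizer:
    """Stand-in for illustration only; NOT the tokenizer of any real model."""

    def __init__(self) -> None:
        self.vocab: Dict[str, int] = {}

    def encode(self, text: str) -> List[int]:
        # Assign each new whitespace-separated word the next free integer id.
        return [self.vocab.setdefault(word, len(self.vocab)) for word in text.split()]


token_id_fn = make_token_id_fn(WhitespaceTokenizer())
chunk = "count the tokens in this chunk"
print(len(token_id_fn(chunk)))

# Hypothetical wiring against the real classes (untested, placeholders in <>):
# from transformers import AutoTokenizer
# from langchain_nvidia_ai_endpoints import NVIDIA
# tok = AutoTokenizer.from_pretrained("<HF repo matching the model served by the NIM>")
# llm = NVIDIA(model="<model name>", custom_get_token_ids=make_token_id_fn(tok))
# llm.get_num_tokens(chunk)  # should then use tok instead of GPT2
```

With a real Hugging Face tokenizer, encode() adds special tokens by default, so the count may be slightly higher than the raw text tokens. This still leaves me choosing the matching tokenizer by hand, which is the part I expected the class to handle.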