microsoft/phi-2 model outputs nonsense on NPU #27
Description
Describe the bug
After I compile the microsoft/phi-2 model with intel_npu_acceleration_library, the model's output is complete nonsense. It only produces text like: to- or in of ", as for, on, and, is,, and, are,., and,,,,, and,,,, and,,, and,,,, and,,, and,, and,,, and,,
To Reproduce
Steps to reproduce the behavior:
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
from langchain.llms import HuggingFacePipeline
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import intel_npu_acceleration_library
import torch
model_id = "microsoft/Phi-2"
model = AutoModelForCausalLM.from_pretrained(model_id, use_cache=True).eval()
tokenizer = AutoTokenizer.from_pretrained(model_id, use_default_system_prompt=True)
npu_model = intel_npu_acceleration_library.compile(model, dtype=torch.int8)
pipe = pipeline(
    "text-generation",
    model=npu_model,
    tokenizer=tokenizer,
    max_length=256,
    temperature=0.9,
    top_p=0.95,
    repetition_penalty=1.2
)
local_llm = HuggingFacePipeline(pipeline=pipe)
pipe.model.config.pad_token_id = pipe.model.config.eos_token_id
template = """Question: {question}
Answer: """
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(
    prompt=prompt,
    llm=local_llm
)
question = "What's the distance between the Earth and the Moon?"
print(llm_chain.run(question))
The output is:
Question: What's the distance between the Earth and the Moon?
Answer: to- or in of ", as for, on, and, is,, and, are,., and,,,,, and,,,, and,,, and,,,, and,,, and,, and,,, and,,, and,,, and,, and,, and,, and,, a....
Expected behavior
When running the original model on the CPU (without compiling it for the NPU), the output is:
_Question: What's the distance between the Earth and the Moon?
Answer: The average distance from the Earth to the moon is about 238,855 miles._
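To narrow down where the NPU output starts to go wrong, it might help to compare the token ids generated by the CPU model and by the compiled model for the same prompt. The snippet below is only a minimal sketch of that comparison: `first_divergence` is a hypothetical helper (not part of intel_npu_acceleration_library or transformers), and the token ids are made-up placeholders rather than real phi-2 output.

```python
from typing import Optional, Sequence


def first_divergence(reference: Sequence[int], candidate: Sequence[int]) -> Optional[int]:
    """Return the index of the first position where two token-id
    sequences differ, or None if they are identical."""
    for i, (ref_tok, cand_tok) in enumerate(zip(reference, candidate)):
        if ref_tok != cand_tok:
            return i
    # One sequence is a prefix of the other: they diverge where the shorter one ends.
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate))
    return None


# Placeholder ids (assumption): stand-ins for the output of each model.
cpu_ids = [24361, 25, 383, 2811, 5253]
npu_ids = [24361, 25, 284, 12, 393]

print(first_divergence(cpu_ids, npu_ids))
```

With real models, the two lists could come from something like `model.generate(**inputs, do_sample=False)[0].tolist()` on each model, so that sampling randomness does not hide the difference. If the sequences already differ at the first generated token, that would suggest the problem lies in the compiled model's forward pass rather than in the sampling settings.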
Desktop (please complete the following information):
- OS: Windows 11