Difference between pipeline and model.generate?

I tried the following two things and found a significant difference between `pipeline` and `model.generate` when completing sequences.

(1) Using model.generate directly

model_pr = AutoModelForCausalLM.from_pretrained("gpt2")
tokenizer = AutoTokenizer.from_pretrained("gpt2")
input_tok = tokenizer("My name is Merve and my favorite", return_tensors="pt")
tokenizer.decode(model_pr.generate(**input_tok)[0])
'My name is Merve and my favorite+ and my CR+ and my CR+ and my CR'

(2) Using pipeline to do the same thing

generator = pipeline('text-generation', model='gpt2')
generator("My name is Merve and my favorite")
[{'generated_text': 'My name is Merve and my favorite brand is Baskin-Robbins.\n\n"They bring a whole lot of stuff to the table and we have to come up with a new way of making a big deal," explains Jeff.'}]

For some reason the pipeline gives much more sensible output. My understanding was that both should produce similar responses.

Has anyone figured this out? I'm experiencing the same problem.

Hi,

After digging a bit into the code base, I found that GPT-2 by default uses do_sample=True and max_length=50 when generating text; these are set as task-specific defaults in its config, as seen here: config.json · openai-community/gpt2 at main.
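For reference, those defaults can be transcribed as a plain Python dict (values copied from the config.json linked above; the file on the Hub is authoritative):

```python
# Generation defaults from openai-community/gpt2's config.json, transcribed
# here for reference.
task_specific_params = {
    "text-generation": {
        "do_sample": True,   # sample instead of greedy decoding
        "max_length": 50,    # stop after 50 tokens (prompt included)
    }
}

# The text-generation pipeline picks these up for its task.
defaults = task_specific_params["text-generation"]
print(defaults["do_sample"], defaults["max_length"])
```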

Hence to get the equivalent behaviour, one can do the following:

from transformers import pipeline, set_seed, AutoModelForCausalLM, AutoTokenizer

set_seed(42)

pipe = pipeline(model="gpt2")

prompt = "hello world, my name is"

result = pipe(prompt, do_sample=False)[0]["generated_text"]
print(result)

# equivalent to:
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer(prompt, return_tensors="pt")

generated_ids = model.generate(**inputs, do_sample=False, max_length=50)
generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
print(generated_text)

The pipeline uses the generate() method behind the scenes, but applies some default generation keyword arguments, which are documented here. I didn't get equivalent results when using sampling (despite using `set_seed()`).
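A minimal sketch of that layering (illustrative only, not the actual transformers internals): the task defaults from the model config apply first, and keyword arguments you pass at call time override them, which is why do_sample=False above disables sampling while max_length=50 still applies:

```python
# Illustrative dict merge mimicking how pipeline(..., do_sample=False)
# overrides gpt2's config defaults; later entries win.
config_defaults = {"do_sample": True, "max_length": 50}  # from config.json
user_kwargs = {"do_sample": False}                       # passed at call time

effective = {**config_defaults, **user_kwargs}
print(effective)  # {'do_sample': False, 'max_length': 50}
```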