Assistance Needed: Unrecognized Model Error in AutoTokenizer.from_pretrained for Gemma-2-2B

Hello everyone,

I’m encountering an issue when trying to load my locally stored model (Gemma-2-2b) in my FastAPI application. Specifically, when I call:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "./models/Gemma-2-2b",
    trust_remote_code=True,
    use_auth_token="MY_Token",
)
model = AutoModelForCausalLM.from_pretrained(
    "./models/Gemma-2-2b",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto" if torch.cuda.is_available() else None,
    trust_remote_code=True,
    use_auth_token="MY_Token",
)

I receive the following error:

ValueError: Unrecognized model in ./models/Gemma-2-2b. Should have a model_type key in its config.json, or contain one of the following strings in its name: albert, align, altclip, aria, aria_text, audio-spectrogram-transformer…

What I’ve Tried:

I verified that my directory structure is correct.
I ensured I’m authenticated (using use_auth_token="MY_Token" and logging in via huggingface-cli login).
I attempted to patch the model’s config.json manually to add "model_type": "gemma", but that approach led to additional errors (the quick check I ran on the config is sketched after this list).
I also tried using trust_remote_code=True without success.
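
For completeness, the quickest way to see what the local config actually declares is just to read the file (the path matches my layout above):

import json
from pathlib import Path

config = json.loads(Path("./models/Gemma-2-2b/config.json").read_text())
# The Auto* classes dispatch on this key; the error above means it is missing or not recognized.
print(config.get("model_type"))
print(config.get("architectures"))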

My Questions:

Does Gemma-2-2b require additional configuration beyond the standard from_pretrained() approach?
Are there known compatibility issues with the latest version of Transformers and this model?
What are the best practices for resolving model loading errors when the config.json lacks a proper model_type key?
Is there an alternative method to properly initialize this model in a FastAPI setting?

System Information:

OS: Ubuntu 22.04 LTS
Python Version: 3.10
Transformers Version: 4.48.3
Model Path: ./models/Gemma-2-2b
Framework: FastAPI + Uvicorn

Sending Love and light.

This error often occurs when you have an old version of Transformers, but I don’t know if that’s the case here.
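
If it helps, a quick sanity check on the environment that is actually serving the app (nothing here is specific to your setup; CONFIG_MAPPING is the registry the Auto classes dispatch from):

import transformers
from transformers.models.auto.configuration_auto import CONFIG_MAPPING

print(transformers.__version__)    # Gemma-2 support landed around 4.42
print("gemma2" in CONFIG_MAPPING)  # True if this install can dispatch Gemma-2 configs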

Hi John,
I hope you are having a beautiful day.

Thank you for your response. You’re right to check, but it’s certainly not a version issue; I’m on Transformers 4.48.3.

Any guesses?

Sending Love,
thelifeofjoun

By the way, since the model is being loaded from a local directory, I don’t think the token or trust_remote_code are necessary:

tokenizer = AutoTokenizer.from_pretrained("./models/Gemma-2-2b")
model = AutoModelForCausalLM.from_pretrained(
    "./models/Gemma-2-2b",
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto" if torch.cuda.is_available() else None,
)
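
And for the FastAPI side, since you asked about initializing the model there: a minimal sketch is to load everything once in the lifespan hook and keep it on app.state (the route and parameter names below are just placeholders, not anything from your app):

from contextlib import asynccontextmanager

import torch
from fastapi import FastAPI
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_DIR = "./models/Gemma-2-2b"

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Load once at startup so every request reuses the same objects.
    app.state.tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
    app.state.model = AutoModelForCausalLM.from_pretrained(
        MODEL_DIR,
        torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
        device_map="auto" if torch.cuda.is_available() else None,
    )
    yield

app = FastAPI(lifespan=lifespan)

@app.get("/generate")
def generate(prompt: str):
    tokenizer, model = app.state.tokenizer, app.state.model
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return {"text": tokenizer.decode(output_ids[0], skip_special_tokens=True)}

Assuming this lives in main.py, it is served the usual way with uvicorn main:app.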