Trouble Loading a Fine-Tuned Model Locally After Training on HF - Any Ideas?

Regarding the error message that mentions the config: in my experience the most common causes are forgetting to call tokenizer.save_pretrained(), hitting an error after saving the PEFT adapter and then calling model.from_pretrained() on it directly (which sometimes happens to work), and the base-model repository referenced by the PEFT adapter being corrupted or incomplete.


Yes, this is very common. Almost all such errors fall into a small set of causes, and the fixes are fairly systematic once you know how Hugging Face saves things.

I’ll walk through:

  1. What from_pretrained() expects (background)
  2. The main causes of “missing/incompatible configuration”
  3. Concrete solutions and code patterns for each
  4. A short checklist you can use for any future model

1. Background: what from_pretrained() expects

When you call something like:

from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("you/your-model")

transformers assumes "you/your-model" is a full Transformers checkpoint, i.e. a repo or folder that contains at least:

  • config.json — with a valid "model_type" and hyperparameters
  • Weight files — model.safetensors or pytorch_model.bin, possibly sharded as model-00001-of-0000N.safetensors plus a model.safetensors.index.json index file
  • Tokenizer files — tokenizer.json or tokenizer.model, tokenizer_config.json, special_tokens_map.json, etc. (Hugging Face)

Internally, it does:

  1. AutoConfig.from_pretrained(...) → reads config.json, looks at model_type.
  2. Uses that to pick the correct model class.
  3. Loads weights into that model.

If config.json is missing, broken, or your transformers version doesn’t understand the model_type, you get exactly the “missing or incompatible configuration” errors you’re seeing.
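
As a rough sketch (the model id is a placeholder), you can reproduce those steps by hand, which is also a quick way to see exactly where loading fails:

from transformers import AutoConfig, AutoModelForCausalLM

# Step 1: read config.json; "model_type" decides which model class gets used
cfg = AutoConfig.from_pretrained("you/your-model")
print(cfg.model_type)  # e.g. "llama"

# Steps 2 and 3: build that class and load the weights into it
model = AutoModelForCausalLM.from_pretrained("you/your-model", config=cfg)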

Now, the catch: a lot of HF-hosted fine-tuning jobs don’t produce a full checkpoint—they produce PEFT/LoRA adapters, which are a different format.


2. Cause #1: You actually have a LoRA / PEFT adapter repo, not a full model

This is the single most common cause.

What the repo looks like

Open your model repo and check the file list. If you see something like:

  • adapter_model.safetensors (or adapter_model.bin)
  • adapter_config.json
  • maybe training_args.bin
  • maybe tokenizer files

and there is no config.json and no big weight files, then you have a PEFT adapter-only checkpoint, not a full model.
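
If you'd rather check from Python than in the browser, huggingface_hub can list the repo's files (the repo id is a placeholder):

from huggingface_hub import list_repo_files

files = list_repo_files("your-username/your-finetune")
print(files)
# Adapter-only repos typically show adapter_config.json + adapter_model.safetensors
# and no config.json or full weight shards.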

The PEFT docs explicitly say a PEFT checkpoint consists of:

  • adapter_model.safetensors +
  • adapter_config.json (Hugging Face)

And the quicktour shows an example adapter repo that only has those two files. (Hugging Face)

Why this causes your error

If you call:

AutoModelForCausalLM.from_pretrained("you/your-finetune")

on that adapter repo, transformers looks for config.json, doesn’t find it, and throws:

OSError: you/your-finetune does not appear to have a file named config.json (Stack Overflow)

This exact pattern shows up in:

  • A StackOverflow question where someone fine-tuned LLaMA with QLoRA, pushed to HF, and their repo had only adapters; loading with AutoModelForCausalLM.from_pretrained fails with “does not appear to have a file named config.json”. (Stack Overflow)
  • AutoTrain and autotrain-advanced issues: “fine tuned model being pushed to HF repo doesn’t have config.json” and “Missing config.json file after AutoTraining” — both resolved by explaining that AutoTrain saved an adapter model (PEFT), not a full checkpoint. (Stack Overflow)
  • Multiple PEFT/Transformers GitHub issues where people try to load adapter repos directly and get the same error. (GitHub)

So this is not a bug with your training. It’s just that you’re holding a different kind of object (an adapter), but trying to load it with the full-model API.

Solution 1A: Load it as base model + adapter (PEFT way)

You need two IDs:

  • base_model_id — the original model you fine-tuned (e.g. meta-llama/Meta-Llama-3-8B)
  • adapter_id — your fine-tuned repo on HF (the one with adapter_model.safetensors)
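
If you don't remember which base model you fine-tuned, the adapter's own config records it; a quick way to look it up (the adapter id is the same placeholder as below):

from peft import PeftConfig

peft_cfg = PeftConfig.from_pretrained("your-username/your-finetune")
print(peft_cfg.base_model_name_or_path)  # the base model the adapter was trained on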

Then:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "the/base/model/you/used"
adapter_id    = "your-username/your-finetune"

# Most adapter repos also contain the tokenizer files; if yours doesn't,
# load the tokenizer from base_model_id instead.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",   # or "cpu"
    torch_dtype="auto",
)

model = PeftModel.from_pretrained(
    base_model,
    adapter_id,
)

This is exactly how the PEFT quicktour shows loading adapters: load the base model, then wrap it with PeftModel.from_pretrained pointing at the adapter repo. (Hugging Face)
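
As a quick smoke test that the adapter actually loaded on top of the base model (the prompt and generation settings below are arbitrary):

inputs = tokenizer("Hello, what can you do?", return_tensors="pt").to(base_model.device)
output_ids = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))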

Solution 1B: Merge the adapter into a full model and save

If you want a single model that loads with plain AutoModelForCausalLM.from_pretrained(...) and no PEFT code, you can merge the LoRA weights into the base model:

from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "your-username/your-finetune"

# Load adapter as a PEFT model
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype="auto",
    device_map="cpu",   # merge on CPU to avoid GPU OOM
)

# Merge LoRA weights into base model and drop adapter structure
merged_model = model.merge_and_unload()

save_dir = "./merged-model"
merged_model.save_pretrained(save_dir, safe_serialization=True)

tokenizer = AutoTokenizer.from_pretrained(adapter_id, use_fast=True)
tokenizer.save_pretrained(save_dir)

Now you have a normal folder:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("./merged-model")
tokenizer = AutoTokenizer.from_pretrained("./merged-model")

Several TRL/PEFT examples and discussions recommend exactly this pattern for turning QLoRA fine-tunes into deployable full models. (Stack Overflow)
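
If you also want the merged version on the Hub rather than only in ./merged-model, both objects support push_to_hub (the repo name below is a placeholder):

merged_model.push_to_hub("your-username/your-finetune-merged")
tokenizer.push_to_hub("your-username/your-finetune-merged")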


3. Cause #2: Broken or incomplete config.json in a “full” checkpoint

Sometimes you do have a config.json, but from_pretrained() still fails with an “incompatible config” message such as:

  • ValueError: Unrecognized model in ./trained_model. Should have a model_type key in its config.json... (Hugging Face Forums)

This means:

  • The file exists, but is missing critical fields (especially "model_type"), or
  • It’s in a format your transformers version doesn’t understand.

Examples online:

  • HF forum thread: user tries to load ./trained_model and gets Unrecognized model ... Should have a model_type key in its config.json; root cause: they saved weights manually or used a custom config without model_type. (Hugging Face Forums)
  • StackOverflow: “Unrecognized model type when loading my trained custom transformer”; fix was to add a "model_type" key and use save_pretrained(...) to create a proper config. (Stack Overflow)

How to diagnose

Run:

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("you/your-model")
print(cfg)

If this fails with “unrecognized model” or says config.json is missing model_type, your config is the problem.

Solutions

  1. If you fine-tuned an existing HF model

    • Download the base model’s config.json from its Hub page.

    • Compare it with your fine-tuned repo’s config.json.

    • Make sure:

      • "model_type" is present and identical (e.g. "llama", "mistral", etc.).
      • Any essential architecture fields (hidden size, number of layers) match.
    • Re-upload a corrected config.json to your fine-tuned repo.

    This is exactly what people do in AutoTrain issues: they manually create or fix config.json by copying from the base model, then upload it; afterwards from_pretrained() works. (GitHub) A minimal sketch of doing this programmatically is shown after this list.

  2. Always save with save_pretrained() in your own code

If you’re saving locally:

model.save_pretrained("my_dir")
tokenizer.save_pretrained("my_dir")

This writes a correct config.json with model_type so that from_pretrained("my_dir") works later.
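
For case 1 above (you fine-tuned an existing HF model and only the config is missing or broken), here is a minimal sketch of copying the base model's config into your fine-tuned folder, assuming the fine-tune did not change the architecture (the ids and directory name are placeholders):

from transformers import AutoConfig

base_id = "the/base/model/you/used"

# Load the base model's config and write a valid config.json (with model_type)
# into the local directory that holds your fine-tuned weights.
cfg = AutoConfig.from_pretrained(base_id)
cfg.save_pretrained("./your-finetuned-dir")

After that, upload the resulting config.json to your Hub repo (or keep working from the local folder).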


4. Cause #3: transformers version too old for the model type

Recent models (Gemma 2, Qwen2.5, custom architectures) use new model_type values and new config classes. If you fine-tune on a recent environment and then try to load locally with an older transformers, you may see:

  • Unrecognized configuration class ...
  • Unrecognized model ... even though config.json looks fine. (Hugging Face Forums)

This is mentioned in multiple HF discussions about model_type errors for newer models. (Hugging Face Forums)
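
To see what you are currently running before upgrading:

import transformers, peft
print(transformers.__version__, peft.__version__)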

Solution

Upgrade your libraries:

pip install -U "transformers" "peft" "accelerate" "safetensors"

Then re-run:

from transformers import AutoConfig
cfg = AutoConfig.from_pretrained("you/your-model")
print(cfg.model_type)

If this works, from_pretrained() should also work (assuming the other issues are resolved).


5. Cause #4: Using the wrong AutoModel* class

Even with a good config, using the wrong AutoModel can give “incompatible configuration” errors.

Examples:

  • Trying to load a decoder-only LLM (LLaMA, Mistral, etc.) with AutoModelForSeq2SeqLM instead of AutoModelForCausalLM.
  • Loading a classification head with AutoModelForCausalLM, etc.

The auto classes check expected heads and architectures; mismatches can show up as config incompatibility.

Solution

Match the class to the task/model:

  • Chat / decoder-only LLM → AutoModelForCausalLM
  • T5/BART-like encoder-decoder → AutoModelForSeq2SeqLM
  • BERT/RoBERTa classifiers → AutoModelForSequenceClassification

You can confirm from the base model’s model card or its config.architectures.
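
For example, to check which head the checkpoint was saved with (the repo id is a placeholder):

from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("you/your-model")
print(cfg.architectures)  # e.g. ['LlamaForCausalLM'] -> use AutoModelForCausalLM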


6. Cause #5: Path / cache confusion (local folder vs Hub, etc.)

Sometimes the problem is just that from_pretrained() isn’t actually reading the repo you think it is.

Common pitfalls:

  • You have a local folder with the same name as the Hub repo (./you/your-model) that does not contain config.json. from_pretrained("you/your-model") prefers the local folder and fails. (GitHub)
  • You downloaded only part of a model (e.g. via --include "original/*") and the local directory is incomplete. (Hugging Face)

Solutions

  • Try loading explicitly from the Hub:

    model = AutoModelForCausalLM.from_pretrained("you/your-model", force_download=True)
    
  • Check your current working directory; rename or remove any conflicting local folder.

  • If you suspect cache corruption, clear ~/.cache/huggingface/hub (careful: this removes cached snapshots).
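
A quick sanity check for the shadowing-folder pitfall (the repo id is a placeholder): from_pretrained() treats an existing local path with that name as a local checkpoint, so check whether one exists:

import os

repo_id = "you/your-model"
# If this prints True, a local folder is shadowing the Hub repo id
# and from_pretrained() will read from it instead of downloading.
print(os.path.isdir(repo_id))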


7. Practical checklist for your case

Given your description, here’s how I’d debug your exact situation:

  1. Inspect the repo on HF

    • If you see adapter_model.safetensors + adapter_config.json and no config.json:
      → It’s a PEFT adapter; use PEFT loading (Cause #1, Solutions 1A / 1B). (Hugging Face)
  2. If config.json is there, validate it

    from transformers import AutoConfig
    cfg = AutoConfig.from_pretrained("you/your-model")
    print(cfg)
    
    • If this fails: fix config (model_type) or upgrade transformers (Cause #2/#3). (Hugging Face Forums)
  3. Check your load code

    • Decoder-only LLM → AutoModelForCausalLM.
    • Make sure you’re not pointing at the wrong path or a partial local folder.
  4. Once fixed, standard pattern

    Full model:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model = AutoModelForCausalLM.from_pretrained("you/your-model")
    tokenizer = AutoTokenizer.from_pretrained("you/your-model")
    

    Adapter:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel
    
    base_id    = "base/model"
    adapter_id = "you/your-finetune"
    
    tok   = AutoTokenizer.from_pretrained(adapter_id)  # or base_id if the adapter repo has no tokenizer files
    base  = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    

8. Short, easy-to-skim summary

  • from_pretrained() expects a full HF model directory with config.json + full weights + tokenizer.

  • Many Hub fine-tuning flows (AutoTrain, QLoRA, LoRA) save PEFT adapters only (adapter_model.safetensors, adapter_config.json) and no config.json. Trying to load those with AutoModel...from_pretrained() gives exactly the “missing config” / “incompatible configuration” errors you’re seeing. (Hugging Face)

  • For adapter repos, the correct solutions are:

    • Load base model + adapter via PeftModel.from_pretrained, or
    • Merge the adapter into the base with AutoPeftModelForCausalLM → merge_and_unload() → save_pretrained() to create a full model. (Hugging Face)
  • If config.json exists but is “incompatible”, either:

    • It’s missing "model_type" / malformed (fix by copying from base and/or always using save_pretrained()), or
    • Your transformers is too old for this model_type; upgrade to a newer version. (Hugging Face Forums)
  • Also check for simpler issues: wrong AutoModel* class, local folders shadowing the Hub repo name, or partial downloads.