chandra-grpo-lora-10-ep-50-docs

A LoRA adapter for the Chandra OCR model, fine-tuned with GRPO (Group Relative Policy Optimization).

Model Details

  • Base Model: allenai/olmOCR-2-7B-1025
  • Training Method: GRPO with unit test rewards
  • Adapter Type: LoRA (Low-Rank Adaptation)
  • Parameters: ~0.86% of base model trainable
  • Size: ~50MB
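The "~0.86% trainable" figure is a consequence of LoRA's design: for each adapted d_out × d_in weight matrix, only two low-rank factors totaling r·(d_in + d_out) parameters are trained, while the original weights stay frozen. A quick sketch of the arithmetic with hypothetical shapes (the actual rank and target modules of this adapter are not stated above):

```python
def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters LoRA trains for one frozen (d_out x d_in) weight matrix:
    a (rank x d_in) down-projection plus a (d_out x rank) up-projection."""
    return rank * (d_in + d_out)

# Hypothetical example: one 3584x3584 projection matrix adapted at rank 16.
full = 3584 * 3584                         # 12,845,056 frozen weights
extra = lora_extra_params(3584, 3584, 16)  # 114,688 trainable weights
print(f"trainable vs frozen for this matrix: {extra / full:.3%}")
# → trainable vs frozen for this matrix: 0.893%
```

With these illustrative numbers the per-matrix trainable fraction lands on the same order as the ~0.86% quoted above; the exact figure depends on which modules are adapted and the chosen rank.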

Training Details

  • Schedule: 10 epochs over 50 documents (as indicated by the adapter name)
  • Reward signal: per-document unit-test pass rate, optimized with GRPO

Usage

With PEFT (Recommended)

from peft import PeftModel, PeftConfig
from transformers import AutoModelForVision2Seq, AutoProcessor

# Load configuration
config = PeftConfig.from_pretrained("suv11235/chandra-grpo-lora-10-ep-50-docs")

# Load base model
base_model = AutoModelForVision2Seq.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype="auto",
    trust_remote_code=True,
)

# Load model with LoRA adapter
model = PeftModel.from_pretrained(base_model, "suv11235/chandra-grpo-lora-10-ep-50-docs")

# Load processor
processor = AutoProcessor.from_pretrained(
    config.base_model_name_or_path,
    trust_remote_code=True,
)

# Use for inference
# ... (process images and generate text)
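The inference step elided above can be sketched as a helper. The message layout follows the generic transformers vision-chat format; the prompt text and generation settings here are placeholders, not the prompt this adapter was trained with:

```python
def run_ocr(model, processor, image, prompt="Transcribe this document."):
    """Run one image through the model and decode the generated text.

    Note: the returned string includes the prompt tokens unless you slice
    the input length off output_ids first.
    """
    messages = [{
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": prompt},
        ],
    }]
    text = processor.apply_chat_template(messages, add_generation_prompt=True)
    inputs = processor(text=[text], images=[image], return_tensors="pt")
    # Move `inputs` to model.device before generating if running on GPU.
    output_ids = model.generate(**inputs, max_new_tokens=2048)
    return processor.batch_decode(output_ids, skip_special_tokens=True)[0]
```

A sketch under stated assumptions: pass in the `model` and `processor` loaded above together with a PIL image.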

Merge for Deployment

If you need a standalone model (e.g., for vLLM):

from peft import PeftModel, PeftConfig
from transformers import AutoModelForVision2Seq, AutoProcessor

# Load with adapter
config = PeftConfig.from_pretrained("suv11235/chandra-grpo-lora-10-ep-50-docs")
base_model = AutoModelForVision2Seq.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, "suv11235/chandra-grpo-lora-10-ep-50-docs")

# Merge the LoRA weights into the base model and save
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./merged-model")

# Save the processor too, so ./merged-model is fully standalone
processor = AutoProcessor.from_pretrained(
    config.base_model_name_or_path,
    trust_remote_code=True,
)
processor.save_pretrained("./merged-model")

Performance

Trained to improve unit test pass rates on document OCR tasks.
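The "unit test rewards" idea can be pictured as follows; this is an illustrative sketch of the concept, not the actual reward code used in training, and the example checks are hypothetical:

```python
def unit_test_reward(ocr_text, tests):
    """GRPO-style reward: the fraction of per-document unit tests
    the model's OCR transcript passes (0.0 to 1.0)."""
    if not tests:
        return 0.0
    return sum(1 for test in tests if test(ocr_text)) / len(tests)

# Hypothetical checks for one document:
tests = [
    lambda s: "Invoice #4821" in s,      # key field transcribed
    lambda s: "Total: $1,299.00" in s,   # amount transcribed exactly
    lambda s: s.count("\n") >= 3,        # basic line structure preserved
]
print(unit_test_reward("Invoice #4821\nDate: ...\nTotal: $1,299.00\nThanks", tests))
# → 1.0
```

Under GRPO, several sampled transcripts of the same document are scored this way and their rewards are compared within the group to form the policy update.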

Limitations

  • Requires PEFT library for loading
  • For vLLM/TGI deployment, merge the adapter first
  • Base model required (~14GB download)

Citation

If you use this model, please cite the base Chandra model and this fine-tuned version.
