chandra-grpo-lora-10-ep-50-docs

A LoRA adapter for the Chandra OCR model, fine-tuned with GRPO (Group Relative Policy Optimization).

Model Details

  • Base Model: allenai/olmOCR-2-7B-1025
  • Training Method: GRPO with unit test rewards
  • Adapter Type: LoRA (Low-Rank Adaptation)
  • Parameters: ~0.86% of base model trainable
  • Size: ~50MB
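The "~0.86% trainable" figure is a consequence of LoRA's design: for each adapted d_out × d_in weight matrix, only two low-rank factors totaling r·(d_in + d_out) parameters are trained, while the original weights stay frozen. A quick sketch of the arithmetic with hypothetical shapes (the actual rank and target modules of this adapter are not stated above):

```python
def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters LoRA trains for one frozen (d_out x d_in) weight matrix:
    a (rank x d_in) down-projection plus a (d_out x rank) up-projection."""
    return rank * (d_in + d_out)

# Hypothetical example: one 3584x3584 projection matrix adapted at rank 16.
full = 3584 * 3584                         # 12,845,056 frozen weights
extra = lora_extra_params(3584, 3584, 16)  # 114,688 trainable weights
print(f"trainable vs frozen for this matrix: {extra / full:.3%}")
# → trainable vs frozen for this matrix: 0.893%
```

With these illustrative numbers the per-matrix trainable fraction lands on the same order as the ~0.86% quoted above; the exact figure depends on which modules are adapted and the chosen rank.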

Training Details

  • Schedule: 10 epochs over 50 documents (as indicated by the adapter name)
  • Reward signal: per-document unit-test pass rate, optimized with GRPO

Usage

With PEFT (Recommended)

from peft import PeftModel, PeftConfig
from transformers import AutoModelForVision2Seq, AutoProcessor

# Load configuration
config = PeftConfig.from_pretrained("suv11235/chandra-grpo-lora-10-ep-50-docs")

# Load base model
base_model = AutoModelForVision2Seq.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype="auto",
    trust_remote_code=True,
)

# Load model with LoRA adapter
model = PeftModel.from_pretrained(base_model, "suv11235/chandra-grpo-lora-10-ep-50-docs")

# Load processor
processor = AutoProcessor.from_pretrained(
    config.base_model_name_or_path,
    trust_remote_code=True,
)

# Use for inference
# ... (process images and generate text)
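The inference step elided above can be sketched as a helper. The message layout follows the generic transformers vision-chat format; the prompt text and generation settings here are placeholders, not the prompt this adapter was trained with:

```python
def run_ocr(model, processor, image, prompt="Transcribe this document."):
    """Run one image through the model and decode the generated text.

    Note: the returned string includes the prompt tokens unless you slice
    the input length off output_ids first.
    """
    messages = [{
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": prompt},
        ],
    }]
    text = processor.apply_chat_template(messages, add_generation_prompt=True)
    inputs = processor(text=[text], images=[image], return_tensors="pt")
    # Move `inputs` to model.device before generating if running on GPU.
    output_ids = model.generate(**inputs, max_new_tokens=2048)
    return processor.batch_decode(output_ids, skip_special_tokens=True)[0]
```

A sketch under stated assumptions: pass in the `model` and `processor` loaded above together with a PIL image.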

Merge for Deployment

If you need a standalone model (e.g., for vLLM):

from peft import PeftModel, PeftConfig
from transformers import AutoModelForVision2Seq, AutoProcessor

# Load with adapter
config = PeftConfig.from_pretrained("suv11235/chandra-grpo-lora-10-ep-50-docs")
base_model = AutoModelForVision2Seq.from_pretrained(
    config.base_model_name_or_path,
    torch_dtype="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, "suv11235/chandra-grpo-lora-10-ep-50-docs")

# Merge the LoRA weights into the base model and save
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./merged-model")

# Save the processor too, so ./merged-model is fully standalone
processor = AutoProcessor.from_pretrained(
    config.base_model_name_or_path,
    trust_remote_code=True,
)
processor.save_pretrained("./merged-model")

Performance

Trained to improve unit test pass rates on document OCR tasks.
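The "unit test rewards" idea can be pictured as follows; this is an illustrative sketch of the concept, not the actual reward code used in training, and the example checks are hypothetical:

```python
def unit_test_reward(ocr_text, tests):
    """GRPO-style reward: the fraction of per-document unit tests
    the model's OCR transcript passes (0.0 to 1.0)."""
    if not tests:
        return 0.0
    return sum(1 for test in tests if test(ocr_text)) / len(tests)

# Hypothetical checks for one document:
tests = [
    lambda s: "Invoice #4821" in s,      # key field transcribed
    lambda s: "Total: $1,299.00" in s,   # amount transcribed exactly
    lambda s: s.count("\n") >= 3,        # basic line structure preserved
]
print(unit_test_reward("Invoice #4821\nDate: ...\nTotal: $1,299.00\nThanks", tests))
# → 1.0
```

Under GRPO, several sampled transcripts of the same document are scored this way and their rewards are compared within the group to form the policy update.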

Limitations

  • Requires PEFT library for loading
  • For vLLM/TGI deployment, merge the adapter first
  • Base model required (~14GB download)

Citation

If you use this model, please cite the base Chandra model and this fine-tuned version.
