chandra-grpo-lora-10-ep-50-docs
LoRA adapter for fine-tuned Chandra OCR model trained with GRPO (Group Relative Policy Optimization).
Model Details
- Base Model: allenai/olmOCR-2-7B-1025
- Training Method: GRPO with unit test rewards
- Adapter Type: LoRA (Low-Rank Adaptation)
- Parameters: ~0.86% of base model trainable
- Size: ~50MB
Training Details
Usage
With PEFT (Recommended)
from peft import PeftModel, PeftConfig
from transformers import AutoModelForVision2Seq, AutoProcessor
# Load configuration
config = PeftConfig.from_pretrained("suv11235/chandra-grpo-lora-10-ep-50-docs")
# Load base model
base_model = AutoModelForVision2Seq.from_pretrained(
config.base_model_name_or_path,
torch_dtype="auto",
trust_remote_code=True,
)
# Load model with LoRA adapter
model = PeftModel.from_pretrained(base_model, "suv11235/chandra-grpo-lora-10-ep-50-docs")
# Load processor
processor = AutoProcessor.from_pretrained(
config.base_model_name_or_path,
trust_remote_code=True,
)
# Use for inference
# ... (process images and generate text)
Merge for Deployment
If you need a standalone model (e.g., for vLLM):
from peft import PeftModel, PeftConfig
from transformers import AutoModelForVision2Seq
# Load with adapter
config = PeftConfig.from_pretrained("suv11235/chandra-grpo-lora-10-ep-50-docs")
base_model = AutoModelForVision2Seq.from_pretrained(
config.base_model_name_or_path,
torch_dtype="auto",
trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, "suv11235/chandra-grpo-lora-10-ep-50-docs")
# Merge and save
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./merged-model")
Performance
Trained to improve unit test pass rates on document OCR tasks.
Limitations
- Requires PEFT library for loading
- For vLLM/TGI deployment, merge the adapter first
- Base model required (~14GB download)
Citation
If you use this model, please cite the base Chandra model and this fine-tuned version.
- Downloads last month
- 31
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
๐
Ask for provider support