Cuiunbo's Collections

VLM For OCR

updated Jun 29, 2024
Upvote
4

  • Qwen/Qwen-VL

    Text Generation • Updated Jan 25, 2024 • 18.4k • 269

  • google/pix2struct-large

    Image-to-Text • 1B • Updated Sep 6, 2023 • 1.46k • 34

  • zai-org/cogagent-chat-hf

    Text Generation • 18B • Updated Dec 24, 2024 • 265 • 69

  • openbmb/MiniCPM-Llama3-V-2_5

    Image-Text-to-Text • 9B • Updated Jan 15 • 58.7k • 1.4k

  • google/paligemma-3b-pt-896

    Image-Text-to-Text • 3B • Updated Jun 22 • 371 • 123

  • UCSC-VLAA/Recap-DataComp-1B

    Viewer • Updated Jan 9 • 1.88B • 5.79k • 191

  • WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

    Paper • 2406.11069 • Published Jun 16, 2024 • 14

  • pbevan11/synthetic-ocr-correction-gpt4o

    Viewer • Updated Jul 25, 2024 • 10k • 81 • 5

  • yifeihu/ACL-23-Paper-OCR-Markdown

    Viewer • Updated Jun 8, 2024 • 2.15k • 61 • 19

  • LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

    Paper • 2406.15319 • Published Jun 21, 2024 • 64