Hugging Face
johannhartmann's Collections
Multimodal Models
Updated Jan 28, 2025
A collection of multimodal models for the GPU-poor
google/paligemma-3b-pt-896 • Image-Text-to-Text • 3B • Updated Jun 22, 2025 • 793 • 123
OpenGVLab/InternVL-Chat-V1-5 • Image-Text-to-Text • Updated Mar 25, 2025 • 3.23k • 416
alexshengzhili/llava-v1.5-13b-dpo • Text Generation • Updated Apr 13, 2024 • 40 • 5
llava-hf/llava-v1.6-mistral-7b-hf • Image-Text-to-Text • 8B • Updated Dec 22, 2025 • 718k • 304
Qwen/Qwen-VL • Text Generation • Updated Jan 25, 2024 • 46.5k • 277
zai-org/cogvlm2-llama3-chat-19B • Text Generation • 20B • Updated Sep 3, 2024 • 2.59k • 219
BK-Lee/MoAI-7B • Image-Text-to-Text • 9B • Updated Oct 2, 2024 • 8 • 45
01-ai/Yi-VL-34B • Image-Text-to-Text • Updated Jun 26, 2024 • 18 • 264
mPLUG/DocOwl1.5-Omni • 8B • Updated Apr 10, 2024 • 91 • 17
google/paligemma-3b-ft-docvqa-896 • Image-Text-to-Text • 3B • Updated Jul 19, 2024 • 326 • 9
Lin-Chen/open-llava-next-llama3-8b • Image-Text-to-Text • 8B • Updated May 27, 2024 • 9 • 26
Mizukiluke/mplug_owl_2_1 • Updated Jan 31, 2024 • 4 • 12
HuanjinYao/DenseConnector-v1.5-8B • Image-to-Text • 8B • Updated May 26, 2024 • 7
microsoft/Phi-3-vision-128k-instruct • Text Generation • Updated Dec 10, 2025 • 93.5k • 971
tiiuae/falcon-11B-vlm • Image-Text-to-Text • 11B • Updated Jun 12, 2024 • 275 • 47
AIDC-AI/Ovis1.5-Llama3-8B • Image-Text-to-Text • Updated Feb 26, 2025 • 13 • 27
HuggingFaceM4/Idefics3-8B-Llama3 • Image-Text-to-Text • Updated Dec 2, 2024 • 157k • 302
openbmb/MiniCPM-V-2_6 • Image-Text-to-Text • Updated Jun 13, 2025 • 109k • 1.03k
microsoft/Florence-2-large • Image-Text-to-Text • 0.8B • Updated Aug 4, 2025 • 1.23M • 1.77k
allenai/Molmo-7B-D-0924 • Image-Text-to-Text • Updated Dec 15, 2025 • 36.9k • 565
meta-llama/Llama-3.2-11B-Vision-Instruct • Image-Text-to-Text • Updated Dec 4, 2024 • 279k • 1.57k
BAAI/Emu3-Gen • Any-to-Any • 8B • Updated Oct 23, 2024 • 7.53k • 224
vidore/colpali-v1.2 • Visual Document Retrieval • Updated Mar 14, 2025 • 23.5k • 113
Qwen/Qwen2-VL-2B-Instruct • Image-Text-to-Text • Updated Jan 12, 2025 • 3.48M • 495
deepseek-ai/Janus-1.3B • Any-to-Any • Updated Jan 27, 2025 • 4.16k • 593
NexaAI/OmniVLM-968M • 0.5B • Updated Aug 20, 2025 • 1.59k • 530
Xkev/Llama-3.2V-11B-cot • Image-Text-to-Text • 11B • Updated Nov 16, 2025 • 781 • 158
alibaba-damo/mgp-str-base • Image-to-Text • 0.1B • Updated Dec 11, 2023 • 110k • 65
omkarthawakar/LlamaV-o1 • Question Answering • Updated Jan 13, 2025 • 66 • 96
openbmb/MiniCPM-o-2_6 • Any-to-Any • Updated Oct 5, 2025 • 118k • 1.29k
deepseek-ai/Janus-Pro-7B • Any-to-Any • Updated Feb 1, 2025 • 59.8k • 3.57k