Plans for a much smaller (quantized) model?

#8 opened by ydmhmhm

Hello! I'm interested in trying this model, but it's too big for me to run. Do you guys have any plans to quantize it for consumer GPUs? Something that fits in 16GB of VRAM? Thanks!

A quant of an 80B model is either still gigantic, or compressed to oblivion.
I wonder whether this monstrosity of a model is as much better than the competition as it is bigger. If it were, it would surely be making more waves; people are still raving about Nanobanana instead. Hunyuan is on a bit of a bad streak in this area: HY Video 1.5 also hardly made a splash (rightly so), while in the image world the smaller models ran away with it.

Hunyuan 2.1 was a very mixed experience: great capabilities in terms of prompt understanding (no editing or anything, though), but the images looked rather meh.
Flux2.dev is also big (nowhere near as big as this, 32B vs 80B), and if you use a quant small enough to run reasonably well on a 16GB card, the quality probably drops below those smaller models. I sometimes use it on a 5090 (Q8, still 32GB, plus an FP8 text encoder at 17GB). Imagine HY3 being 2-3 times bigger.
