Models: 1,089

Active filters: llama.cpp

qwp4w3hyb/Meta-Llama-3-8B-Instruct-iMat-GGUF

Text Generation • 8B • Updated Apr 29, 2024 • 1.25k • 6

mgonzs13/Mistroll-7B-v2.2-GGUF

Text Generation • 7B • Updated Apr 29, 2024 • 181

mgonzs13/ladybird-base-7B-v8-GGUF

Text Generation • 7B • Updated Apr 29, 2024 • 38

google/codegemma-1.1-2b-GGUF

Text Generation • 3B • Updated Jun 27, 2024 • 22 • 3

google/codegemma-1.1-7b-it-GGUF

Text Generation • 9B • Updated Jun 27, 2024 • 12 • 13

openbmb/MiniCPM-Llama3-V-2_5-gguf

Updated Feb 27 • 3.53k • 215

QuantFactory/Ahma-3B-GGUF

Text Generation • 4B • Updated Jul 2 • 700 • 2

mgonzs13/TextBase-7B-v0.1-GGUF

Text Generation • 7B • Updated Jun 11, 2024 • 66

QuantFactory/TextBase-7B-v0.1-GGUF

Text Generation • 7B • Updated Jun 18, 2024 • 185

njwright92/ComicBot_v.2-gguf

Text Generation • 7B • Updated Aug 30, 2024 • 189

Irathernotsay/qwen2-1.5B-medical_qa-Finetune

Text Generation • 2B • Updated Jul 17, 2024 • 11

palusi/Qwen2-0.5B-Instruct-GGUF

0.5B • Updated Jun 27, 2024 • 107

XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k

Text Generation • Updated Jul 9, 2024 • 32

ruslanmv/Medical-Llama3-v2-Q4_K_M-GGUF

8B • Updated Jun 30, 2024 • 50

XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k-GGUF

Text Generation • Updated Jul 9, 2024 • 77

XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k-GPTQ

Text Generation • Updated Jul 9, 2024 • 8

zhhan/Phi-3-mini-4k-instruct_gguf_derived

Summarization • 4B • Updated Jul 2, 2024 • 36

XavierSpycy/Meta-Llama-3-8B-Instruct-zh-10k-AWQ

Text Generation • Updated Jul 9, 2024 • 9

mgonzs13/stablelm-zephyr-3B-localmentor-GGUF

Text Generation • 3B • Updated Jul 3, 2024 • 161

google/gemma-2-2b-it-GGUF

3B • Updated Aug 27, 2024 • 187 • 85

google/gemma-2-2b-GGUF

3B • Updated Aug 2, 2024 • 110 • 17

chatpdflocal/llama3.1-8b-gguf

8B • Updated Dec 27, 2024 • 438 • 29

akshathmangudi/llama3.1-8b-gguf

Updated Jul 26, 2024

dahara1/llama-translate-gguf

8B • Updated Aug 14, 2024 • 834 • 16

jhilburn/gemma-inference

Text Generation • Updated Aug 7, 2024

ghost-x/ghost-8b-beta-1608-gguf

Text Generation • 8B • Updated Aug 26, 2024 • 303 • 6

PaulJusst/codegemma-7b-it-GGUF

Text Generation • 9B • Updated Sep 13, 2024

TheCluster/Llama-3.2-3B-Instruct-GGUF

Text Generation • 3B • Updated Sep 25, 2024 • 4

v000000/Typhon-Mixtral-v1-imatrix-v2.Q6_K-GGUF

Updated Sep 26, 2024 • 72 • 1

LPN64/LongCite-llama3.1-8b-GGUF

Text Generation • 8B • Updated Oct 1, 2024 • 365 • 6