whisper.cpp

whisper.cpp/ggml/src/ggml-cuda/CMakeLists.txt

Commit History

CUDA cmake: add `-lineinfo` for easier debug (llama/15260)

008e169

am17an committed on Aug 12, 2025

cuda: remove linking to cublasLt (llama/14790)

fafaa8b

yeahdongcn committed on Jul 21, 2025

CUDA: FA support for Deepseek (Ampere or newer) (llama/13306)

507d30c

JohannesGaessler committed on May 9, 2025

CUDA: mix virt/real CUDA archs for GGML_NATIVE=OFF (llama/13135)

9fb68a1

JohannesGaessler committed on May 6, 2025

build : fix build info on windows (llama/13239)

415b9fc

Diego Devesa committed on May 1, 2025

CUDA: compress mode option and default to size (llama/12029)

4ec988a

Erik Scholz committed on Mar 1, 2025

CUDA: app option to compile without FlashAttention (llama/12025)

fbc5f16

JohannesGaessler committed on Feb 22, 2025

CUDA: correct the lowest Maxwell supported by CUDA 12 (llama/11984)

6641178

PureJourney JohannesGaessler committed on Feb 21, 2025

cuda : add ampere to the list of default architectures (llama/11870)

1d19dec

Diego Devesa committed on Feb 14, 2025

CUDA: use mma PTX instructions for FlashAttention (llama/11583)

f328957

JohannesGaessler Diego Devesa committed on Feb 2, 2025

ggml : sync remnants (skip) (#0)

451937f
unverified

ggerganov committed on Dec 8, 2024

ggml : sync resolve (skip) (#0)

d4d67dc

ggerganov committed on Nov 19, 2024
