whisper.cpp

whisper.cpp/ggml/src/ggml-cuda/vendors

Commit History

HIP: Cleanup hipification header (llama/15285)

7cdf9cd

uvos, JohannesGaessler committed on Aug 14, 2025

HIP: bump requirement to rocm 6.1 (llama/15296)

58a3802

uvos committed on Aug 13, 2025

HIP: disable sync warp shuffel operators from clr amd_warp_sync_functions.h (llama/15273)

8fca6dd

uvos committed on Aug 12, 2025

CUDA: GEMM for FP32/FP16/BF16 and ne11 <= 16 (llama/15131)

1d24833

JohannesGaessler committed on Aug 7, 2025

llama : add gpt-oss (llama/15091)

bf225d6

ggerganov, ngxson, slaren committed on Aug 5, 2025

HIP: remove the use of __HIP_PLATFORM_AMD__, explicitly support only AMD targets (llama/14945)

e37eff3

uvos committed on Jul 29, 2025

HIP: Enable Matrix cores for MMQ Kernels, Enable stream-K for CDNA 3 (llama/14624)

5422b31

deepsek committed on Jul 26, 2025

musa: upgrade musa sdk to rc4.2.0 (llama/14498)

a687ec3

yeahdongcn committed on Jul 24, 2025

HIP : Add HIP 7.0+ compatibility for hipBLAS compute types (llama/14634)

4354560

Slobodan Josic committed on Jul 11, 2025

CUDA/HIP: Share the same unified memory allocation logic. (llama/12934)

143cb70

David Huang committed on Apr 15, 2025

cuda : fix HIP and MUSA BF16 (llama/0)

6dc5583

ggerganov committed on Apr 7, 2025

HIP: Add support for RDNA4 targets (llama/12372)

a73f01f

Slobodan Josic committed on Mar 26, 2025

CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case (llama/12183)

3a7ca19

Gaurav Garg, JohannesGaessler committed on Mar 19, 2025

cuda : enable CUDA Graph on CUDA Toolkit < 12.x (llama/12394)

1e69b8c

Gaurav Garg committed on Mar 17, 2025

CUDA/HIP: add support for selectable warp size to mmv (llama/11519)

ed08269

uvos committed on Feb 2, 2025

CUDA: use mma PTX instructions for FlashAttention (llama/11583)

f328957

JohannesGaessler, Diego Devesa committed on Feb 2, 2025

hip : Add hipGraph and VMM support to ROCM (llama/11362)

089afa0

uvos committed on Jan 24, 2025

CUDA: add BF16 support (llama/11093)

961ef57

JohannesGaessler committed on Jan 6, 2025

Add some minimal optimizations for CDNA (llama/10498)

bf49bbe

uvos committed on Nov 27, 2024

musa: enable building fat binaries, enable unified memory, and disable Flash Attention on QY1 (MTT S80) (llama/9526)

8ec75c3

R0CKSTAR committed on Sep 22, 2024

ggml : fix builds (llama/0)

524a01b

ggerganov committed on Sep 20, 2024

musa: remove Clang builtins mapping (llama/9421)

ba2469d

R0CKSTAR committed on Sep 11, 2024

cuda : organize vendor-specific headers into vendors directory (llama/8746)

ec2f307

R0CKSTAR committed on Jul 29, 2024
