Commit History
opencl: add initial mxfp4 support via mv (llama/15270)
1a0281c
lhez
shawngu-quic
committed on
opencl: allow mixed f16/f32 `add` (llama/15140)
345810b
ggml : fix field name when new ggml_backend (llama/14944)
685748d
AN Long
committed on
opencl: support sink in `soft_max` (attn sinks) (llama/15152)
d8664e4
lhez
committed on
fix profiling crash (llama/15072)
67ec576
opencl: add `swiglu_oai` and `add_id` (llama/15121)
1c97db6
lhez
committed on
opencl: fix adreno compiler detection logic (llama/15029)
e6a209e
lhez
committed on
opencl: add f16 for `add`, `sub`, `mul`, `div` (llama/14984)
4dc1834
lhez
committed on
opencl: add `mul_mat_f32_f32_l4_lm` and `mul_mat_f16_f32_l4_lm` (llama/14809)
05577c3
lhez
committed on
opencl: add fused `rms_norm_mul` (llama/14841)
5629961
lhez
committed on
opencl: remove unreachable `return` (llama/14806)
cfa3731
lhez
committed on
opencl: add conv2d kernel (llama/14403)
d579f20
ggml : add build-time message to remind about ggml_set_rows (llama/14661)
0f5d4ba
opencl: add tiled mul_mat_f16_f32 (llama/14535)
398dc49
opencl: add `set_rows` for `f16` and `f32` (llama/14547)
5e203ec
lhez
committed on
ggml : add ggml_scale_bias (llama/14417)
573d50a
opencl: add GELU_ERF (llama/14476)
b19d736
Sigbjørn Skjæret
committed on
ggml : implement GEGLU_ERF and GEGLU_QUICK ops (llama/14445)
f798922
Sigbjørn Skjæret
committed on
opencl : broadcast for soft_max (llama/14510)
4434043
lhez
committed on
kv-cache : use ggml_set_rows (llama/14285)
7d6d9e8
opencl : fix possible buffer overflow in dump_tensor (llama/14490)
deb934d
opencl : skip empty nodes on cgraph compute (llama/14491)
5c36e7c
Eric Zhang
committed on
opencl : update upscale to support align corners (llama/14488)
2b95b05
lhez
committed on
opencl : add GEGLU, REGLU, SWIGLU (llama/14456)
d70ff9f
lhez
committed on
opencl: ref count `ggml_backend_opencl_context` and refactor profiling (llama/14254)
ae0c7b8
lhez
committed on
opencl: add `mul_mv_id_q4_0_f32_8x_flat` (llama/14003)
d0a458b
lhez
committed on
opencl: add `backend_synchronize` (llama/13939)
a9ce9a8
lhez
committed on
OpenCL: Add concat, tsembd, upscale, tanh, pad and repeat (llama/13840)
5ff8785
rmatif
committed on
opencl: add new ops - `argsort`, `div`, `sub`, `addrows`, `sigmoid`, `group_norm` (llama/13787)
1ab0f23
lhez
committed on
opencl: mark `mul_mat` `f32f32` as supporting non-contiguous tensors (llama/13790)
4473109
lhez
committed on
opencl: Add support for multiple devices (llama/12622)
b6cddb5
Henry Linjamäki
committed on
opencl: fix couple crashes (llama/12795)
2eea73d
Henry Linjamäki
committed on
opencl: remove unnecessary assert for `add` (llama/13257)
a245fbf
lhez
committed on
opencl: split ggml-opencl.cl into multiple files and cleanup (llama/12886)
291a5b7
lhez
Shangqing Gu
committed on
opencl: fix incorrect local_size index in profiling log (llama/12868)
8f5d919
kimminsu
committed on
opencl: better identify Adreno GPU (llama/12760)
5560cd6
lhez
committed on
opencl: use `max_alloc_size` in backend ctx instead of querying again (llama/12705)
3847456
lhez
committed on
opencl : fix memory allocation size (llama/12649)
b00a8a9
opencl: add multi and vision rope, `gelu_quick` and `im2col` (llama/12600)
3261fcd
lhez
committed on
opencl: improve profiling (llama/12442)
4abe3ae
lhez
committed on
opencl: use OpenCL C standard supported by the device (llama/12221)
57028a7
Henry Linjamäki
committed on
opencl: Noncontiguous `norm`, `rms_norm`, disable `fp16` for some ops (llama/12217)
94449e3
lhez
committed on
opencl : fix buffer alignment (llama/12197)
7d25156
opencl : fix `ulong` kernel args were set from `int` variables (llama/12174)
67ffff0
opencl : fix profile-related errors (llama/12095)
e11a847
simon886212
ubuntu
committed on
ggml : upgrade init_tensor API to return a ggml_status (llama/11854)
d6b6852
William Tambellini
slaren
committed on
opencl: fix for small models (llama/11950)
4532dc6
lhez
Shawn Gu
Skyler Szot
committed on
opencl: Fix rope and softmax (llama/11833)
bf3b6f8
lhez
committed on