metal : use `vm_allocate` instead of `posix_memalign` on macOS (llama/7078) eb910b1 Gilad S committed on May 8, 2024
CUDA: CUDART < 11.7 workaround for __hmax, __hmax2 (llama/7019) 4cf786d JohannesGaessler committed on May 1, 2024
ggml : add Flash Attention (llama/5021) 34d3b03 ggerganov JohannesGaessler phymbert committed on Apr 30, 2024
Fix more int overflow during quant (PPL/CUDA). (llama/6563) 531387f dranger003 committed on Apr 28, 2024
gguf : enforce that tensor names are unique (llama/6905) 22e446d Xuan Son Nguyen slaren committed on Apr 28, 2024
Reset schedule earlier to allow overlap with ggml graph computation on device (llama/6933) 3a8eea8 agray3 committed on Apr 26, 2024
gguf : fix mismatch between alloc and free functions (llama/6929) d8fb433 slaren committed on Apr 26, 2024
ggml : fix redefinition of vaddvq_f32 for 32-bit ARM (llama/6906) f900de6 ggerganov committed on Apr 25, 2024
ggml : fix ggml_backend_cpu_supports_op() for CPY (llama/0) d645791 ggerganov committed on Apr 21, 2024
ggml : group all experts in a single ggml_mul_mat_id (llama/6505) f0b5c67 slaren ggerganov committed on Apr 18, 2024
fix mul_mat_id() for new input, make the ut pass (llama/6682) 6d1ba81 Neo Zhang Jianyu committed on Apr 15, 2024
fix memcpy() crash, add missed cmd in guide, fix softmax (llama/6622) 6901743 Neo Zhang Jianyu committed on Apr 14, 2024
CUDA: fix matrix multiplication logic for tests (llama/6667) 6ccb5a5 JohannesGaessler committed on Apr 13, 2024
llama : add gguf_remove_key + remove split meta during quantize (llama/6591) 1706870 jiez z5269887 committed on Apr 12, 2024
ggml : expose SSE3 and SSSE3 for MSVC when AVX is available (#2128) 340b9ae Przemysław Pawełczyk committed on May 8, 2024
build : improve disabling AVX-512 (#2129) dd6f1ab Przemysław Pawełczyk committed on May 8, 2024
minor: add CMakeSettings.json to gitignore (#2094) a361a80 stanimirovb committed on May 8, 2024
make : change GNU make default CXX from g++ to c++ (#2100) 610f480 Przemysław Pawełczyk committed on Apr 28, 2024
Remove unnecessary memory reallocation in fft (#2080) 3198674 goldwaving committed on Apr 28, 2024
whisper : more prominent log message for sub-1s audio (#2065) 5ddb20b ggerganov committed on Apr 24, 2024
main : pass nullptr when regex is empty (#2070) 8677fc4 ggerganov committed on Apr 17, 2024
readme : add up-to-date repository for Python bindings (#2063) f573a31 AIWintermuteAI committed on Apr 16, 2024
build : fix embedded Metal library generation (#2045) b0e83a9 Didzis Gosko committed on Apr 15, 2024