Hugging Face Forums
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling `cublasGemmEx( handle, opa, opb, m, n, k, &falpha, a, CUDA_R_16F, lda, b, CUDA_R_16F, ldb, &fbeta, c, CUDA_R_16F, ldc, CUDA_R_32F, CUBLAS_GEMM_DEFAULT_TENSOR_OP)`
🤗Transformers
nielsr
September 30, 2024, 1:36pm
Hi,
I would recommend following the advice in this reply: Training Model on CPU instead of GPU - #2 by sgugger.
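The linked reply is not reproduced here, so the following is only a minimal sketch of one common way to force a CPU run, not necessarily the method described there. It assumes training is launched as a Python script; `train.py` and both helper names are hypothetical. The idea is that the same bug often surfaces on CPU with a more descriptive message than a cuBLAS execution failure. Setting `CUDA_VISIBLE_DEVICES` to an empty string hides every GPU from the child process, so PyTorch falls back to CPU there.

```python
import os
import subprocess
import sys


def cpu_only_env(base=None):
    """Return a copy of the environment with all CUDA devices hidden."""
    env = dict(os.environ if base is None else base)
    # An empty CUDA_VISIBLE_DEVICES hides every GPU from the child process,
    # so torch.cuda.is_available() reports False there.
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def rerun_on_cpu(script_args):
    """Re-run a Python command (hypothetical example: ["train.py"]) with GPUs hidden."""
    return subprocess.run([sys.executable, *script_args], env=cpu_only_env())
```

For example, `rerun_on_cpu(["train.py"])` would start the (hypothetical) training script with no visible GPU. If the script hard-codes `.to("cuda")`, those calls would still need to be changed to use the CPU.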