🤗Transformers

Topic	Replies	Views	Activity
Rethinking Transformer Architecture Through the Lens of Group Theory: 🤗Transformers	1	30	July 9, 2026
Contributing Gemma 4 support and ONNX export optimizations 🤗Transformers	1	52	July 7, 2026
How should I balance learning rate and data sampling during CPT on multiple datasets? 🤗Transformers	1	34	July 7, 2026
[Research] From Functional Geometry to Dynamic Grammar: New LIMEN Audits (V23–V24) Across 7 Architectures 🤗Transformers	2	42	July 2, 2026
A comprehensive, bilingual guide to Transformers: From foundations to KV-cache compression & attention dynamics 🤗Transformers	0	24	June 29, 2026
Error fix of the 503 loop 🤗Transformers	1	58	June 25, 2026
Deprecated parameters of pipeline() included in the course 🤗Transformers	0	28	June 12, 2026
Fine tuning for social media trends 🤗Transformers	2	88	June 8, 2026
A note on interpreting internal dynamics: Stability vs. Semantic Correctness in Transformers 🤗Transformers	0	31	June 2, 2026
How can LLMs be fine-tuned for specialized domain knowledge? 🤗Transformers	3	1584	May 29, 2026
Need generative model, high-quality description generation 🤗Transformers	3	110	May 28, 2026
SFTTrainerflags blocks assistant_only_loss=True 🤗Transformers	3	131	May 26, 2026
Date format for fine-tuning AI models 🤗Transformers	5	112	May 22, 2026
Chatbot Start Prompt for GPT-J 🤗Transformers	5	1403	May 21, 2026
Automatic -100 masking of the questions in Labels 🤗Transformers	1	46	May 21, 2026
PTQ INT8 via TFLiteConverter — encoder-decoder seq2seq model loses encoder context entirely after conversion 🤗Transformers	3	115	May 16, 2026
Fucking hugging face changed the zerogpu 🤗Transformers	0	35	May 14, 2026
Train a fully open SmolLM4-750M model 🤗Transformers	0	217	May 11, 2026
The BPE pre-tokenizer was not recognized! 🤗Transformers	6	320	May 7, 2026
Custom batches in sentence-transformers for MultipleNegativesRankingLoss 🤗Transformers	3	140	May 1, 2026
I developed an experimental Graph-Native Artificial Brain engine 🤗Transformers	4	84	May 1, 2026
When i use tool its pause and restart space not working why DeepSpeed	0	22	April 30, 2026
CPU offloading error scenario 🤗Transformers	11	389	April 27, 2026
Gemma 3 12B: 4-bit Quantization failing/ignored in Transformers v5.1.0 (Gemma3ForConditionalGeneration) 🤗Transformers	9	491	April 24, 2026
Why am I facing this Error while running this code 🤗Transformers	1	115	April 23, 2026
What are the best tutorials to learn Transformers step by step? 🤗Transformers	2	172	April 20, 2026
LLM Course code errors 🤗Transformers	8	387	April 17, 2026
Independent researcher looking for technical feedback on a paper about a revision-capable language model 🤗Transformers	0	39	April 17, 2026
Why this BERTScore has a high precision? 🤗Transformers	1	144	April 16, 2026
Fine-tuning Gemma-4-E2B on MacBook M3 🤗Transformers	4	1171	April 14, 2026