AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers
Abstract
AdapterTune addresses optimization instability and capacity selection in Vision Transformer transfer learning by inserting residual low-rank bottlenecks with zero-initialized up-projections into frozen backbones, improving accuracy while training only a small fraction of the parameters.
Frozen-backbone transfer with Vision Transformers faces two under-addressed issues: optimization instability when adapters are naively inserted into a fixed feature extractor, and the absence of principled guidance for setting adapter capacity. We introduce AdapterTune, which augments each transformer block with a residual low-rank bottleneck whose up-projection is zero-initialized, guaranteeing that the adapted network starts exactly at the pretrained function and eliminating early-epoch representation drift. On the analytical side, we formalize adapter rank as a capacity budget for approximating downstream task shifts in feature space. The resulting excess-risk decomposition predicts monotonic but diminishing accuracy gains with increasing rank, an "elbow" behavior we confirm through controlled sweeps. We evaluate on 9 datasets and 3 backbone scales with multi-seed reporting throughout. On a core 5-dataset transfer suite, AdapterTune improves top-1 accuracy over head-only transfer by +14.9 points on average while training only 0.92% of the parameters required by full fine-tuning, and outperforms full fine-tuning on 10 of 15 dataset-backbone pairs. Across the full benchmark, AdapterTune improves over head-only transfer on every dataset-backbone pair tested. Ablations on rank, placement, and initialization isolate each design choice. The code is available at: https://github.com/salimkhazem/adaptertune
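The zero-initialization guarantee in the abstract can be illustrated with a minimal NumPy sketch (this is not the authors' released implementation; the function names and shapes here are illustrative assumptions). Because the up-projection starts at zero, the residual branch contributes nothing at initialization, so the adapted block computes exactly the pretrained function:

```python
import numpy as np

def make_adapter(d_model, rank, rng):
    """Low-rank adapter parameters: random down-projection A, zero up-projection B."""
    A = rng.standard_normal((d_model, rank)) / np.sqrt(d_model)  # down-projection
    B = np.zeros((rank, d_model))                                # zero-initialized up-projection
    return A, B

def adapter_forward(x, A, B):
    """Residual low-rank bottleneck: x + (x A) B."""
    return x + (x @ A) @ B

rng = np.random.default_rng(0)
A, B = make_adapter(d_model=8, rank=2, rng=rng)
x = rng.standard_normal((4, 8))  # a batch of 4 token features

# With B all zeros, the adapter is the identity at initialization,
# so training starts exactly at the pretrained function.
assert np.allclose(adapter_forward(x, A, B), x)
```

Once gradients flow into `B`, the branch departs from zero smoothly, which is what rules out the early-epoch representation drift the abstract describes.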