Unicorn accomplishes the great unification of the network architecture and the learning paradigm for four tracking tasks: single object tracking (SOT), multi-object tracking (MOT), video object segmentation (VOS), and multi-object tracking and segmentation (MOTS). Using the same model parameters, Unicorn sets new state-of-the-art performance on many challenging tracking benchmarks. This model variant has an input size of 800x1280.
This model can be used for all four tracking tasks. Benchmark results for this checkpoint:

LaSOT AUC (%): 68.5
BDD100K mMOTA (%): 41.2
DAVIS17 J&F (%): 69.2
BDD100K MOTS mMOTSA (%): 29.6
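Below is a minimal sketch of how a video frame might be prepared for the model's 800x1280 input. The function name, normalization constants, and tensor layout are assumptions for illustration, not the repository's actual preprocessing API.

import cv2
import numpy as np
import torch

def preprocess(frame: np.ndarray) -> torch.Tensor:
    # Hypothetical preprocessing: resize a BGR frame to 800x1280 (height x width)
    # and convert it to a normalized CHW float tensor with a batch dimension.
    resized = cv2.resize(frame, (1280, 800))                          # cv2 expects (width, height)
    rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).float().permute(2, 0, 1) / 255.0   # HWC -> CHW, scale to [0, 1]
    # ImageNet mean/std are an assumption here; check the repo for the exact values.
    mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
    std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
    return ((tensor - mean) / std).unsqueeze(0)                       # shape: (1, 3, 800, 1280)

frame = cv2.imread("frame.jpg")   # any video frame
inputs = preprocess(frame)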
@inproceedings{unicorn,
title={Towards Grand Unification of Object Tracking},
author={Yan, Bin and Jiang, Yi and Sun, Peize and Wang, Dong and Yuan, Zehuan and Luo, Ping and Lu, Huchuan},
booktitle={ECCV},
year={2022}
}