SAM2.1++ tracking examples
Memory-based trackers such as SAM2 demonstrate remarkable performance but still struggle with distractors. We propose a new plug-in distractor-aware memory (DAM) and a memory management strategy that substantially improve tracking robustness. The new model is demonstrated on SAM2.1, leading to SAM2.1++, which, without additional training, sets a new state of the art on six benchmarks, including the most challenging VOT/S benchmarks. We also propose a new distractor-distilled (DiDi) dataset to better study the distractor problem.
DiDi is a distractor-distilled tracking dataset created to address the scarcity of distractors in current visual object tracking benchmarks. To improve the evaluation and analysis of tracking performance in the presence of distractors, we semi-automatically distilled several existing benchmarks into the DiDi dataset. The dataset is available for download at this link.
Model | Quality | Accuracy | Robustness |
---|---|---|---|
TransT | 0.465 | 0.669 | 0.678 |
KeepTrack | 0.502 | 0.646 | 0.748 |
SeqTrack | 0.529 | 0.714 | 0.718 |
AQATrack | 0.535 | 0.693 | 0.753 |
AOT | 0.541 | 0.622 | 0.852 |
Cutie | 0.575 | 0.704 | 0.776 |
ODTrack | 0.608 | 0.740 🥇 | 0.809 |
SAM2.1Long | 0.646 | 0.719 | 0.883 |
SAM2.1 (baseline) | 0.649 🥉 | 0.720 | 0.887 🥉 |
SAMURAI | 0.680 🥈 | 0.722 🥉 | 0.930 🥈 |
SAM2.1++ (ours) | 0.694 🥇 | 0.727 🥈 | 0.944 🥇 |
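
To make the gap between rows easier to read, the following minimal sketch computes the relative gain of one model over another for each metric. The scores are copied from the table above. The names `SCORES`, `METRICS`, and `relative_gain` are illustrative only and are not part of any released SAM2.1++ code.

```python
# Scores copied from the table above: (quality, accuracy, robustness).
# The dictionary layout and function name are illustrative assumptions.
SCORES = {
    "SAM2.1 (baseline)": (0.649, 0.720, 0.887),
    "SAMURAI": (0.680, 0.722, 0.930),
    "SAM2.1++ (ours)": (0.694, 0.727, 0.944),
}
METRICS = ("quality", "accuracy", "robustness")


def relative_gain(model, reference="SAM2.1 (baseline)"):
    """Return the per-metric relative gain, in percent, of `model` over `reference`."""
    return {
        name: 100.0 * (new - old) / old
        for name, new, old in zip(METRICS, SCORES[model], SCORES[reference])
    }


if __name__ == "__main__":
    # Print the gain of SAM2.1++ over the SAM2.1 baseline, one metric per line.
    for name, gain in relative_gain("SAM2.1++ (ours)").items():
        print(f"{name}: {gain:+.1f}%")
```

With the values in the table, SAM2.1++ improves on the SAM2.1 baseline by roughly 6.9% in quality and 6.4% in robustness, while accuracy changes by about 1%. This is consistent with the stated focus on robustness.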