Abstract: Referring Multi-Object Tracking (RMOT) aims to dynamically track an arbitrary number of referred targets in a video sequence according to the language expression. Previous methods mainly ...
Abstract: Previous studies on remote sensing foundation models have demonstrated the representational ability of convolutional neural networks (CNNs) and vision transformers (ViTs). However, these ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results