Abstract: This paper improves upon the Pix2Seq object detector by extending it for videos. In the process, it introduces a new way to perform end-to-end video object detection that improves upon ...
DEIMv2 is an evolution of the DEIM framework that leverages the rich features from DINOv3. Our method is designed with various model sizes, from an ultra-light version up to S, M, L, and X, to be ...
Abstract: Satellite videos have played important roles in many applications in recent years due to the advantage of continuously providing high-temporal-resolution remote sensing images. Although much ...
April 30th, 2025: We have performed comparative experiments with mainstream tracking-by-detection methods on the OVIS dataset (see Fig. 5). Task-Specific SpatioTemporal Context-Aware Decoupling for ...