Open-vocabulary object detection (OVD) is a critical research area in computer vision, particularly for applications in autonomous driving and robotics. Many existing OVD methods adopt transformer ...