Visual relation detection is a recent effort to offer a more comprehensive understanding of visual content beyond objects, and aims to capture the various interactions between objects. Compared to still images, videos provide a more natural set of features for detecting visual relations: the dynamic interactions between objects allow relations such as “A-follow-B” and “A-towards-B” to be recognized. However, detecting visual relations in videos is more technically challenging than in images, because accurate object tracking is difficult and relation appearances in the video domain are diverse.
Our video detection technology is a novel method that helps overcome these technical challenges by combining object tracking proposal, short-term relation prediction, and greedy relational association. To validate the proposed method, we also contribute the first dataset for evaluating visual relation detection in videos, with manually labelled visual relations. On this dataset, our method achieves the best performance when compared with state-of-the-art baselines.
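To illustrate the last of the techniques named above, the following is a minimal sketch of how greedy relational association might merge short-term predictions into longer relation instances. It is not the actual implementation. All names here (ShortTermRelation, RelationInstance, greedy_associate) are hypothetical, and the sketch makes simplifying assumptions: each object carries a track ID that persists across segments, two predictions are merged only when they share the same triplet and track IDs in adjacent segments, and a merged instance is scored by the mean of its segment scores.

```python
from dataclasses import dataclass, field


@dataclass
class ShortTermRelation:
    """One relation predicted on a short video segment (hypothetical structure)."""
    segment: int      # index of the short-term segment
    subject_id: int   # assumed: track ID that persists across segments
    object_id: int
    triplet: tuple    # (subject class, predicate, object class)
    score: float      # prediction confidence


@dataclass
class RelationInstance:
    """A relation spanning one or more consecutive segments."""
    triplet: tuple
    subject_id: int
    object_id: int
    start: int        # first segment index (inclusive)
    end: int          # last segment index (inclusive)
    scores: list = field(default_factory=list)

    @property
    def score(self):
        # Assumed scoring rule: mean confidence over merged segments.
        return sum(self.scores) / len(self.scores)


def greedy_associate(predictions):
    """Greedily merge short-term predictions, most confident first."""
    ranked = sorted(predictions, key=lambda p: p.score, reverse=True)

    # Look up the best prediction for a given segment, object pair and triplet.
    index = {}
    for i, p in enumerate(ranked):
        index.setdefault((p.segment, p.subject_id, p.object_id, p.triplet), i)

    used = set()
    instances = []
    for i, seed in enumerate(ranked):
        if i in used:
            continue
        used.add(i)
        inst = RelationInstance(seed.triplet, seed.subject_id, seed.object_id,
                                seed.segment, seed.segment, [seed.score])
        # Extend the seed forward (+1) and backward (-1) through adjacent segments.
        for step in (1, -1):
            seg = seed.segment + step
            while True:
                j = index.get((seg, seed.subject_id, seed.object_id, seed.triplet))
                if j is None or j in used:
                    break
                used.add(j)
                inst.scores.append(ranked[j].score)
                if step == 1:
                    inst.end = seg
                else:
                    inst.start = seg
                seg += step
        instances.append(inst)

    return sorted(instances, key=lambda r: r.score, reverse=True)


follow = ("dog", "follow", "person")
towards = ("dog", "towards", "person")
predictions = [
    ShortTermRelation(0, 1, 2, follow, 0.8),
    ShortTermRelation(1, 1, 2, follow, 0.9),
    ShortTermRelation(2, 1, 2, follow, 0.7),
    ShortTermRelation(4, 1, 2, follow, 0.6),   # segment 3 is missing, so this stays separate
    ShortTermRelation(1, 1, 2, towards, 0.5),
]
results = greedy_associate(predictions)
for r in results:
    print(r.triplet, r.start, r.end, round(r.score, 3))
```

In this toy input, the three consecutive "follow" predictions are merged into one instance covering segments 0 to 2, while the prediction in segment 4 remains a separate instance because no prediction in segment 3 connects them. A real system would match tracklets by spatial overlap rather than by a shared ID, which this sketch deliberately leaves out.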
Technology Features, Specifications and Advantages
What can this technology do for your business?
· Helps bridge the gap between vision and language in multimedia analysis.
· Automatically detects a set of visual relation instances between pairs of objects in videos.
· Predicts visual relations between object pairs in short-term video clips.
Our video detection technology can underpin numerous visual-language tasks, such as captioning, visual search and visual question answering. It can also be used in a wide variety of computer vision applications, such as video compression, video surveillance, vision-based control, medical imaging, augmented reality (AR) and robotics.