Novel Vision Technique to Perform Visual Relation Detection in Videos

Abstract/Technology Overview

Value Proposition:

Visual relation detection is a recent effort to provide a more comprehensive understanding of visual content beyond objects; it aims to capture the various interactions between objects in images. Compared with still images, videos provide a more natural set of features for this task, because the dynamic interactions between objects allow relations such as “A-follow-B” and “A-towards-B” to be recognized. However, detecting visual relations in videos is technically harder than in images, owing to the difficulty of accurate object tracking and the diverse appearance of relations in the video domain.


Our video detection technology is a novel method that helps overcome the technical challenges in detecting visual relations in videos by combining three components: object tracking proposal, short-term relation prediction, and greedy relational association. The organization also contributes the first dataset for evaluating visual relation detection in videos, with manually labelled visual relations, to validate the proposed method. On this dataset, our method achieves the best performance in comparison with the state-of-the-art baselines.
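To give a feel for the association step, the sketch below merges short-term relation predictions into longer relation instances. It is a minimal illustration under stated assumptions, not the method's actual implementation: the `Relation` record, the `associate` function, the track identifiers, the frame-range convention (`start` inclusive, `end` exclusive) and the score averaging are all hypothetical choices made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Relation:
    """One relation prediction (hypothetical record layout)."""
    triplet: Tuple[str, str, str]   # (subject, predicate, object) labels
    tracks: Tuple[int, int]         # (subject track id, object track id)
    start: int                      # first frame, inclusive
    end: int                        # last frame, exclusive
    score: float                    # prediction confidence


def _collapse(run: List[Relation]) -> Relation:
    """Fuse a run of touching segments into one instance (mean score)."""
    first = run[0]
    return Relation(
        triplet=first.triplet,
        tracks=first.tracks,
        start=first.start,
        end=max(seg.end for seg in run),
        score=sum(seg.score for seg in run) / len(run),
    )


def associate(segments: List[Relation]) -> List[Relation]:
    """Greedily merge overlapping or adjacent short-term predictions
    that share the same triplet and the same pair of object tracks."""
    groups: Dict[tuple, List[Relation]] = defaultdict(list)
    for seg in segments:
        groups[(seg.triplet, seg.tracks)].append(seg)

    merged: List[Relation] = []
    for group in groups.values():
        group.sort(key=lambda s: s.start)      # scan in temporal order
        run, run_end = [group[0]], group[0].end
        for seg in group[1:]:
            if seg.start <= run_end:           # overlaps or touches the run
                run.append(seg)
                run_end = max(run_end, seg.end)
            else:                              # temporal gap: close the run
                merged.append(_collapse(run))
                run, run_end = [seg], seg.end
        merged.append(_collapse(run))

    # Report the most confident video-level instances first.
    return sorted(merged, key=lambda r: r.score, reverse=True)
```

Calling `associate` on a list of per-clip predictions returns one record per continuous stretch of the same relation between the same two tracked objects. Averaging the scores is only one possible design choice; a maximum or a length-weighted score would fit the same structure.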

Technology Features, Specifications and Advantages

What can this technology do for your business?

· Helps bridge the gap between vision and language in multimedia analysis.

· Automatically detects a set of visual relation instances between pairs of objects in videos.

· Predicts visual relations between object pairs in short-term video clips.
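As a toy illustration of the kind of motion cue a short clip offers, the sketch below checks whether two tracked objects move closer together or further apart over a clip. It is a hand-written heuristic for illustration only, not the technology's relation predictor; the function names, the `(x1, y1, x2, y2)` box format, the pixel `tolerance` and the three output labels are all assumptions introduced here.

```python
import math
from typing import List, Tuple

# Axis-aligned bounding box as (x1, y1, x2, y2) in pixels (assumed format).
Box = Tuple[float, float, float, float]


def box_center(box: Box) -> Tuple[float, float]:
    """Return the center point of a bounding box."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)


def center_distance(a: Box, b: Box) -> float:
    """Euclidean distance between the centers of two boxes."""
    (ax, ay), (bx, by) = box_center(a), box_center(b)
    return math.hypot(ax - bx, ay - by)


def relative_motion_cue(
    subject_boxes: List[Box],
    object_boxes: List[Box],
    tolerance: float = 1.0,
) -> str:
    """Compare center distance at the first and last frame of a clip.

    Returns "approaching", "separating" or "steady". Both lists must hold
    one box per frame, in the same frame order.
    """
    if not subject_boxes or len(subject_boxes) != len(object_boxes):
        raise ValueError("expected two non-empty box lists of equal length")
    change = (
        center_distance(subject_boxes[-1], object_boxes[-1])
        - center_distance(subject_boxes[0], object_boxes[0])
    )
    if change < -tolerance:
        return "approaching"
    if change > tolerance:
        return "separating"
    return "steady"
```

A cue like this is only available when objects are tracked across frames, which is why a single still image cannot supply it.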

Potential Applications

Our video detection technology can underpin numerous visual-language tasks such as captioning, visual search and visual question-answering. It can also be used in a wide variety of computer vision applications, such as video compression, video surveillance, vision-based control, medical imaging, Augmented Reality (AR) and robotics.

Technology Owner




Technology Category
  • Infocomm
  • Artificial Intelligence
  • Augmented Reality, Virtual Reality & Computer-Simulated Environments
  • Interactive Digital Media & Multimedia
  • Video/Image Analysis
Technology Status
  • Available for Licensing
Technology Readiness Level
  • TRL 4

