COVE
Broad access to image and video datasets has driven much of the progress on computer vision recognition problems over the last decade. These common benchmarks have played a leading role in transforming recognition research from a black art into an experimental science. Progress, however, has stagnated: although datasets continue to grow, they are developed and annotated in isolation, e.g., a collection of sporting activities, a set of objects in images, and so on. These isolated datasets suffer from task- and domain-specific bias, and knowledge transfer across them is extremely limited. This project is investigating and establishing a prototype architecture that federates across various recognition problems and modalities by defining a common namespace for entities, events, and annotations across the datasets. The project is also building a web portal for the prototype federated dataset architecture and linking two existing recognition datasets into it. The resulting federated structure is greater than the sum of its parts, and it can support new research that was not previously possible for the computer vision community and related fields.
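The common-namespace idea can be made concrete with a small sketch. The following is a minimal, hypothetical illustration (not the project's actual schema; the class names, URIs, and dataset names are all assumptions) of how dataset-local labels might be mapped onto a shared set of canonical entities, so that annotations from independently built datasets become comparable:

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entity:
    """A canonical entity in the shared namespace."""
    uri: str              # e.g., "cove:entity/dog" (illustrative URI scheme)
    description: str = ""

@dataclass
class Namespace:
    """Maps dataset-local label strings onto canonical entities."""
    entities: dict = field(default_factory=dict)  # uri -> Entity
    aliases: dict = field(default_factory=dict)   # (dataset, label) -> uri

    def register(self, entity: Entity) -> None:
        self.entities[entity.uri] = entity

    def link(self, dataset: str, label: str, uri: str) -> None:
        self.aliases[(dataset, label)] = uri

    def resolve(self, dataset: str, label: str) -> Entity:
        return self.entities[self.aliases[(dataset, label)]]

# Two independently annotated datasets name the same concept differently;
# the shared namespace makes the correspondence explicit.
ns = Namespace()
ns.register(Entity("cove:entity/dog", "domestic dog"))
ns.link("sports-activities", "dog_running", "cove:entity/dog")
ns.link("objects-in-images", "canine", "cove:entity/dog")
assert ns.resolve("sports-activities", "dog_running") is ns.resolve("objects-in-images", "canine")
```

Under this kind of mapping, an annotation made against one dataset's vocabulary resolves to the same canonical entity as its counterpart in another dataset, which is what enables knowledge transfer across the federation.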
As a first test scenario for this federated architecture, the project is investigating and constructing a new federated dataset of images and video annotated with various forms of associated text. Image and video content annotations span both the spatial and temporal dimensions, while textual annotations reflecting the depicted content range from complete free-form natural language descriptions, to more targeted phrases and referring expressions, to individual keyword lists. This dataset is being constructed to promote and enhance collaboration between the vision and language communities by providing a new multi-modal annotated dataset with associated research competitions.
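To make the annotation structure concrete, here is a minimal hypothetical sketch (field names and identifiers are illustrative assumptions, not the dataset's actual format) of a single record that combines a spatio-temporal extent with the three kinds of textual annotation described above, linked back to the common namespace:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Region:
    """Where (and, for video, when) an annotation applies."""
    x: float                         # bounding box, normalized coordinates
    y: float
    w: float
    h: float
    t_start: Optional[float] = None  # temporal extent in seconds;
    t_end: Optional[float] = None    # None for still images

@dataclass
class TextAnnotation:
    """One multi-modal annotation record (illustrative schema)."""
    media_id: str                    # image or video identifier
    region: Region                   # spatio-temporal extent
    entity_uri: str = ""             # link into the common namespace
    description: str = ""            # free-form natural language
    referring_expressions: List[str] = field(default_factory=list)
    keywords: List[str] = field(default_factory=list)

ann = TextAnnotation(
    media_id="video_0042",
    region=Region(x=0.10, y=0.25, w=0.30, h=0.40, t_start=12.0, t_end=15.5),
    entity_uri="cove:entity/dog",
    description="A brown dog chases a ball across the field.",
    referring_expressions=["the brown dog on the left"],
    keywords=["dog", "ball", "field"],
)
```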
Funded by: NSF Award 1463102 (https://nsf.gov/awardsearch/showAward?AWD_ID=1463102)
Bingyu Shen, Walter J. Scheirer
Collaborators: Sheng Liu, Neela Devi Kaushik, Kate Saenko, Jason Corso
[website]