We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CV

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Computer Vision and Pattern Recognition

Title: Towards General Purpose Vision Systems

Abstract: A special purpose learning system assumes knowledge of admissible tasks at design time. Adapting such a system to unforeseen tasks requires architecture manipulation such as adding an output head for each new task or dataset. In this work, we propose a task-agnostic vision-language system that accepts an image and a natural language task description and outputs bounding boxes, confidences, and text. The system supports a wide range of vision tasks such as classification, localization, question answering, captioning, and more. We evaluate the system's ability to learn multiple skills simultaneously, to perform tasks with novel skill-concept combinations, and to learn new skills efficiently and without forgetting.
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2104.00743 [cs.CV]
  (or arXiv:2104.00743v1 [cs.CV] for this version)

Submission history

From: Tanmay Gupta [view email]
[v1] Thu, 1 Apr 2021 19:35:21 GMT (6756kb,D)
[v2] Tue, 19 Apr 2022 21:47:02 GMT (7420kb,D)

Link back to: arXiv, form interface, contact.