GraghVQA: Language-Guided Graph Neural Networks for Graph-based Visual Question Answering

Liang, Weixin; Jiang, Yanhao; Liu, Zixuan

(Submitted on 20 Apr 2021 (v1), last revised 2 Jun 2021 (this version, v2))

Abstract: Images are more than a collection of objects or attributes: they represent a web of relationships among interconnected objects. The scene graph has emerged as a new modality for a structured graphical representation of images. A scene graph encodes objects as nodes connected via pairwise relations as edges. To support question answering on scene graphs, we propose GraphVQA, a language-guided graph neural network framework that translates and executes a natural language question as multiple iterations of message passing among graph nodes. We explore the design space of the GraphVQA framework and discuss the trade-offs of different design choices. Our experiments on the GQA dataset show that GraphVQA outperforms the state-of-the-art model by a large margin (94.78% vs. 88.43%).
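
To make the idea concrete, here is a minimal toy sketch in plain Python of a scene graph (objects as nodes, pairwise relations as edges) and one round of question-conditioned message passing. It is not the GraphVQA architecture. The object names, feature vectors, relation embeddings, question vector, the sigmoid gating rule, and the function `message_passing_step` are all assumptions invented for illustration.

```python
import math

# Toy scene graph: objects are nodes, pairwise relations are directed edges.
# All names and numbers below are made up for illustration.
nodes = {
    "man": [1.0, 0.0],
    "horse": [0.0, 1.0],
    "hat": [0.5, 0.5],
}
edges = [
    ("man", "riding", "horse"),
    ("man", "wearing", "hat"),
]

# Hypothetical relation embeddings and a stand-in for an encoded question
# such as "What is the man riding?".
relation_vecs = {"riding": [1.0, 0.0], "wearing": [0.0, 1.0]}
question_vec = [2.0, 0.0]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def message_passing_step(nodes, edges, relation_vecs, question_vec):
    """One round: each edge sends the source features to the target,
    scaled by a gate computed from the question and the relation."""
    incoming = {name: [0.0] * len(vec) for name, vec in nodes.items()}
    for src, rel, dst in edges:
        gate = sigmoid(dot(question_vec, relation_vecs[rel]))
        for i, value in enumerate(nodes[src]):
            incoming[dst][i] += gate * value
    # Residual update: keep the old features and add the aggregated messages.
    return {
        name: [old + msg for old, msg in zip(vec, incoming[name])]
        for name, vec in nodes.items()
    }


updated = message_passing_step(nodes, edges, relation_vecs, question_vec)
```

In this toy setup the question vector aligns with the "riding" relation, so the message along that edge passes through a larger gate than the one along "wearing". Repeating the step several times would correspond loosely to the multiple iterations of message passing that the abstract mentions.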

Comments:	NAACL 2021 MAI-Workshop. Code available at this https URL
Subjects:	Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2104.10283 [cs.CL]
	(or arXiv:2104.10283v2 [cs.CL] for this version)

Submission history

From: Weixin Liang
[v1] Tue, 20 Apr 2021 23:54:41 GMT (3799kb,D)
[v2] Wed, 2 Jun 2021 05:29:00 GMT (3799kb,D)
