Computer Science > Software Engineering
Title: CRaDLe: Deep Code Retrieval Based on Semantic Dependency Learning
(Submitted on 2 Dec 2020 (this version), latest version 29 Mar 2022 (v2))
Abstract: Code retrieval is a common practice for programmers to reuse existing code snippets in open-source repositories. Given a user query (i.e., a natural language description), code retrieval aims to find the most relevant snippets in a set of code snippets. The main challenge of effective code retrieval lies in mitigating the semantic gap between natural language descriptions and code snippets. With the ever-increasing amount of available open-source code, recent studies resort to neural networks to learn the semantic matching relationships between the two sources. Statement-level dependency information, which highlights the dependency relations among program statements during execution, reflects the structural importance of each statement in the code. It is favorable for accurately capturing code semantics but has never been explored for the code retrieval task. In this paper, we propose CRaDLe, a novel approach for Code Retrieval based on statement-level semantic Dependency Learning. Specifically, CRaDLe distills code representations by fusing the dependency and semantic information at the statement level and then learns a unified vector representation for each code and description pair to model the matching relationship. Comprehensive experiments and analysis on real-world datasets show that the proposed approach can accurately retrieve code snippets for a given query and significantly outperforms the state-of-the-art approaches on the task.
Submission history
From: Wenchao Gu
[v1] Wed, 2 Dec 2020 08:47:01 GMT (1884kb,D)
[v2] Tue, 29 Mar 2022 02:38:01 GMT (1049kb,D)