Which Kind Is Better in Open-domain Multi-turn Dialog, Hierarchical or Non-hierarchical Models? An Empirical Study

Lan, Tian; Mao, Xian-Ling; Wei, Wei; Huang, Heyan

(Submitted on 7 Aug 2020)

Abstract: Open-domain generative dialog systems have attracted considerable attention in academia and industry. Despite the success of single-turn dialog generation, multi-turn dialog generation remains a major challenge. So far, there are two kinds of models for open-domain multi-turn dialog generation: hierarchical and non-hierarchical models. Some recent works have shown that hierarchical models are better than non-hierarchical models under their experimental settings, while other works reach the opposite conclusion. Because adequate comparisons are lacking, it is not clear which kind of model is better in open-domain multi-turn dialog generation. In this paper, we therefore systematically evaluate nearly all representative hierarchical and non-hierarchical models under the same experimental settings to check which kind is better. Through extensive experiments, we draw three main conclusions: (1) Nearly all hierarchical models are worse than non-hierarchical models in open-domain multi-turn dialog generation, except for the HRAN model. Further analysis shows that HRAN's strong performance depends mainly on its word-level attention mechanism; (2) The other hierarchical models also improve greatly when the word-level attention mechanism is integrated into them. The modified hierarchical models even significantly outperform the non-hierarchical models; (3) The word-level attention mechanism is so powerful for hierarchical models because it leverages context information more effectively, especially fine-grained information. In addition, we have implemented all of the models and released the code.
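The abstract does not spell out how word-level attention is computed, so the following is only a minimal sketch of the general idea, assuming plain dot-product attention over toy vectors. It is not the HRAN implementation or the authors' released code, and every name in it (`attend`, `word_level_context`, `utterance_level_context`, the toy `dialog`) is hypothetical. It contrasts attending over every word in the context (fine-grained) with attending over one pooled vector per utterance (coarse-grained).

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attend(query, vectors):
    """Dot-product attention (an assumption): return weights and weighted sum."""
    weights = softmax([dot(query, v) for v in vectors])
    dim = len(vectors[0])
    summary = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, summary

def word_level_context(query, dialog):
    """Attend over every word vector of every utterance (fine-grained)."""
    words = [word for utterance in dialog for word in utterance]
    return attend(query, words)

def utterance_level_context(query, dialog):
    """Attend over one mean-pooled vector per utterance (coarse-grained)."""
    pooled = [[sum(col) / len(utt) for col in zip(*utt)] for utt in dialog]
    return attend(query, pooled)

# Toy context: two utterances, each a list of 2-dimensional word vectors.
dialog = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.5, 0.5], [1.0, 1.0], [0.0, 0.0]],
]
query = [1.0, 0.0]  # stands in for a decoder state

word_weights, word_summary = word_level_context(query, dialog)
utt_weights, utt_summary = utterance_level_context(query, dialog)
```

In this toy case both utterances pool to the same vector, so the utterance-level weights cannot tell them apart, while the word-level weights still single out individual words. That is the kind of fine-grained information the third conclusion refers to, under the assumptions stated above.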

Subjects:	Computation and Language (cs.CL); Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2008.02964 [cs.CL]
	(or arXiv:2008.02964v1 [cs.CL] for this version)

Submission history

From: Tian Lan [view email]
[v1] Fri, 7 Aug 2020 02:54:55 GMT (1777kb,D)
