Title: Combining pattern-based CRFs and weighted context-free grammars

Abstract: We consider two models for the sequence labeling (tagging) problem. The first one is a Pattern-Based Conditional Random Field (PB), in which the energy of a string (chain labeling) $x=x_1\ldots x_n\in D^n$ is a sum of terms over intervals $[i,j]$ where each term is non-zero only if the substring $x_i\ldots x_j$ equals a prespecified word $w\in \Lambda$. The second model is a Weighted Context-Free Grammar (WCFG) frequently used for natural language processing. PB and WCFG encode local and non-local interactions respectively, and thus can be viewed as complementary.
We propose a Grammatical Pattern-Based CRF model (GPB) that combines the two in a natural way. We argue that it has certain advantages over existing approaches such as the Hybrid model of Benedí and Sanchez that combines $N$-grams and WCFGs. The focus of this paper is to analyze the complexity of inference tasks in a GPB such as computing MAP. We present a polynomial-time algorithm for general GPBs and a faster version for a special case that we call Interaction Grammars.
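
The PB energy described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the names `pb_energy`, `brute_force_map`, and `costs` are hypothetical, and each pattern is assumed to carry one position-independent cost, whereas the abstract allows a separate term for every interval $[i,j]$. MAP is taken here as the labeling of minimum energy and is found by exhaustive search, which is exponential in $n$ and is not the polynomial-time algorithm the paper presents.

```python
from itertools import product


def pb_energy(x, pattern_costs):
    """Energy of string x: add a pattern's cost for each interval where it occurs.

    Assumption (not from the paper): the cost depends only on the pattern w,
    not on the interval [i, j] at which it matches.
    """
    total = 0.0
    n = len(x)
    for w, cost in pattern_costs.items():
        m = len(w)
        # Check every interval of length m; the term is zero unless x[i:i+m] == w.
        for i in range(n - m + 1):
            if x[i:i + m] == w:
                total += cost
    return total


def brute_force_map(alphabet, n, pattern_costs):
    """Return a minimum-energy string in alphabet^n by exhaustive enumeration."""
    candidates = ("".join(t) for t in product(alphabet, repeat=n))
    return min(candidates, key=lambda s: pb_energy(s, pattern_costs))


# Hypothetical pattern set Lambda with made-up costs, for illustration only.
costs = {"ab": -1.0, "bb": 2.0, "a": 0.5}
energy = pb_energy("abab", costs)          # two "ab" matches and two "a" matches
best = brute_force_map("ab", 3, costs)     # searches all 8 strings of length 3
```

With these made-up costs, `"abab"` has energy -1.0 (two occurrences of `"ab"` at -1.0 each plus two of `"a"` at 0.5 each), and the minimum-energy string of length 3 over `{a, b}` is `"bab"`.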
Comments: 11 pages
Subjects: Formal Languages and Automata Theory (cs.FL); Data Structures and Algorithms (cs.DS); Machine Learning (cs.LG)
ACM classes: I.2.7
Cite as: arXiv:1404.5475 [cs.FL]
  (or arXiv:1404.5475v2 [cs.FL] for this version)

Submission history

From: Rustem Takhanov
[v1] Tue, 22 Apr 2014 12:44:42 GMT (93kb,D)
[v2] Sat, 1 Nov 2014 13:29:52 GMT (403kb,D)