Title: Grammatical Case Based IS-A Relation Extraction with Boosting for Polish

Abstract: Pattern-based methods of IS-A relation extraction rely heavily on so-called Hearst patterns, which are ways of expressing instance enumerations of a class in natural language. While these lexico-syntactic patterns prove quite useful, they may not capture all taxonomical relations expressed in text. Therefore, in this paper we describe a novel method of IS-A relation extraction from patterns, which uses morpho-syntactical annotations along with the grammatical case of the noun phrases that constitute the entities participating in an IS-A relation. We also describe a method for increasing the number of extracted relations, which we call pseudo-subclass boosting and which has potential application in any pattern-based relation extraction method. Experiments were conducted on a corpus of about 0.5 billion web documents in Polish.
Subjects: Computation and Language (cs.CL); Information Retrieval (cs.IR)
ACM classes: H.3.1
Cite as: arXiv:1605.02916 [cs.CL]
  (or arXiv:1605.02916v1 [cs.CL] for this version)
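
For readers unfamiliar with the Hearst patterns mentioned in the abstract, the following is a minimal sketch of one classic English pattern, "CLASS such as A, B and C". It is an illustration only and not the paper's method, which targets Polish and relies on grammatical case. The names SEP, SUCH_AS and extract_is_a are hypothetical, and the regular expression assumes single-word noun phrases.

```python
import re

# Hypothetical illustration of a single English Hearst pattern.
# This is NOT the case-based Polish method described in the abstract.
# Assumption: every noun phrase is a single word.

# Separator between enumerated instances: ", ", " and ", " or ", ", and ", ", or ".
SEP = r"(?:,? (?:and|or) |, )"

# "<class> such as <instance>, <instance> and <instance>"
SUCH_AS = re.compile(rf"(\w+) such as ((?:\w+{SEP}?)+)")


def extract_is_a(sentence):
    """Return (instance, class) pairs found by the 'such as' pattern."""
    pairs = []
    for match in SUCH_AS.finditer(sentence):
        hypernym = match.group(1)
        # Split the enumeration on the same separators used in the pattern.
        instances = re.split(r",? (?:and|or) |, ", match.group(2))
        pairs.extend((inst, hypernym) for inst in instances if inst)
    return pairs


if __name__ == "__main__":
    print(extract_is_a("He plays instruments such as guitar, piano and drums."))
```

A real extractor would match over part-of-speech or morpho-syntactic tags rather than raw words, which is where annotations such as grammatical case could come in.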

Submission history

From: Dariusz Czerski
[v1] Tue, 10 May 2016 10:03:48 GMT (95kb,D)
