Title: A Novel Way of Identifying Cyber Predators

Abstract: Recurrent neural networks with long short-term memory cells (LSTM-RNNs) are highly effective at processing sequence data, particularly for language modeling and text classification. This research proposes combining sentiment analysis, a new approach to sentence vectors, and LSTM-RNNs as a novel method for Sexual Predator Identification (SPI). An LSTM-RNN language model is used to generate sentence vectors, which are the last hidden states of the language model. These sentence vectors are fed into a second LSTM-RNN classifier in order to capture suspicious conversations. Because the vectors are computed from hidden states rather than looked up, they can be generated for sentences never seen before. Fasttext is used to filter the contents of conversations and generate a sentiment score in order to identify potential predators. The experiment achieves a record-breaking accuracy and precision of 100% with a recall of 81.10%, exceeding the top-ranked result in the SPI competition.
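
The two-stage idea in the abstract (last hidden state as sentence vector, then a second LSTM over those vectors) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the weights are random and untrained, the `embed` function is a hypothetical stand-in for learned word embeddings, the layer sizes are arbitrary, and the final logistic output in `conversation_score` is an assumed placeholder for a trained classification layer. The Fasttext sentiment stage is not shown.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_lstm(input_size, hidden_size, seed=0):
    """Random, untrained weights for one LSTM layer (illustration only)."""
    rng = random.Random(seed)
    n = input_size + hidden_size
    # One weight matrix and bias per gate: input, forget, output, candidate.
    return {
        gate: {
            "W": [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                  for _ in range(hidden_size)],
            "b": [0.0] * hidden_size,
        }
        for gate in ("i", "f", "o", "g")
    }


def lstm_step(params, x, h, c):
    """One standard LSTM time step; returns the new (h, c)."""
    z = x + h  # concatenate the input vector with the previous hidden state

    def affine(gate):
        p = params[gate]
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(p["W"], p["b"])]

    i = [sigmoid(v) for v in affine("i")]
    f = [sigmoid(v) for v in affine("f")]
    o = [sigmoid(v) for v in affine("o")]
    g = [math.tanh(v) for v in affine("g")]
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def last_hidden_state(params, inputs, hidden_size):
    """Run the LSTM over a sequence and return the final hidden state."""
    h = [0.0] * hidden_size
    c = [0.0] * hidden_size
    for x in inputs:
        h, c = lstm_step(params, x, h, c)
    return h


def embed(token, dim):
    """Hypothetical embedding: a fixed pseudo-random vector per token."""
    rng = random.Random(token)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


# Arbitrary sizes chosen for the sketch.
EMB, SENT, CONV = 8, 6, 4
language_model = make_lstm(EMB, SENT, seed=1)   # stage 1: sentence encoder
classifier = make_lstm(SENT, CONV, seed=2)      # stage 2: conversation LSTM


def sentence_vector(sentence):
    """Sentence vector = last hidden state of the stage-1 LSTM."""
    tokens = sentence.lower().split()
    return last_hidden_state(language_model,
                             [embed(t, EMB) for t in tokens], SENT)


def conversation_score(sentences):
    """Feed sentence vectors to the stage-2 LSTM; squash to (0, 1).

    The logistic over the summed final state is an assumed placeholder
    for a trained output layer.
    """
    vectors = [sentence_vector(s) for s in sentences]
    final = last_hidden_state(classifier, vectors, CONV)
    return sigmoid(sum(final))
```

Because `sentence_vector` computes its output by running the recurrence rather than looking anything up, it returns a vector for any token sequence, including sentences that never appeared during training, which is the property the abstract highlights.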
Comments: 6 pages
Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY)
Cite as: arXiv:1712.03903 [cs.CL]
  (or arXiv:1712.03903v1 [cs.CL] for this version)

Submission history

From: Dan Liu
[v1] Mon, 11 Dec 2017 17:24:13 GMT (718kb)