Title: Towards Proactively Forecasting Sentence-Specific Information Popularity within Online News Documents

Abstract: Multiple studies have focused on predicting the prospective popularity of an online document as a whole, without paying attention to the contributions of its individual parts. We introduce the task of proactively forecasting popularities of sentences within online news documents solely utilizing their natural language content. We model sentence-specific popularity forecasting as a sequence regression task. For training our models, we curate InfoPop, the first dataset containing popularity labels for over 1.7 million sentences from over 50,000 online news documents. To the best of our knowledge, this is the first dataset automatically created using streams of incoming search engine queries to generate sentence-level popularity annotations. We propose a novel transfer learning approach involving sentence salience prediction as an auxiliary task. Our proposed technique coupled with a BERT-based neural model exceeds nDCG values of 0.8 for proactive sentence-specific popularity forecasting. Notably, our study presents a non-trivial takeaway: though popularity and salience are different concepts, transfer learning from salience prediction enhances popularity forecasting. We release InfoPop and make our code publicly available: this https URL
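The abstract reports results in terms of nDCG (normalized discounted cumulative gain), a ranking measure. As a rough illustration of how such a score could be computed for the sentences of a single document, here is a minimal sketch. The function names, the linear gain, the log2 rank discount, and the optional cutoff `k` are assumptions made for illustration; the abstract does not state which nDCG variant the authors use.

```python
import math


def dcg(gains):
    """Discounted cumulative gain of a list of gains, in ranked order.

    Assumes linear gain and a log2 rank discount (one common convention).
    """
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg(true_popularity, predicted_popularity, k=None):
    """nDCG for one document: rank sentences by predicted popularity and
    compare against the ideal ranking given by the true popularity labels.

    Both arguments are hypothetical per-sentence score lists of equal length.
    """
    if len(true_popularity) != len(predicted_popularity):
        raise ValueError("score lists must have the same length")

    # Sentence indices sorted by predicted popularity, highest first.
    order = sorted(
        range(len(predicted_popularity)),
        key=lambda i: predicted_popularity[i],
        reverse=True,
    )
    ranked_gains = [true_popularity[i] for i in order][:k]
    ideal_gains = sorted(true_popularity, reverse=True)[:k]

    ideal = dcg(ideal_gains)
    # If no sentence has positive popularity, the score is defined as 0 here.
    return dcg(ranked_gains) / ideal if ideal > 0 else 0.0
```

A prediction that orders the sentences exactly as the true labels do scores 1.0, and any ordering that places less popular sentences higher scores below 1.0.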
Comments: In 33rd ACM Conference on Hypertext and Social Media [HT '22] (Main Track), Link: this https URL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Journal reference: In HT '22. Association for Computing Machinery, New York, NY, USA, 11-20 (2022)
DOI: 10.1145/3511095.3531268
Cite as: arXiv:2301.00152 [cs.CL]
  (or arXiv:2301.00152v1 [cs.CL] for this version)

Submission history

From: Sayar Ghosh Roy
[v1] Sat, 31 Dec 2022 08:40:08 GMT