Computation and Language

[ total of 35 entries: 1-35 ]

New submissions for Fri, 15 Jan 21

[1]  arXiv:2101.05400 [pdf, other]
Title: Machine-Assisted Script Curation
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

We describe Machine-Aided Script Curator (MASC), a system for human-machine collaborative script authoring. Scripts produced with MASC include (1) English descriptions of sub-events that comprise a larger, complex event; (2) event types for each of those events; (3) a record of entities expected to participate in multiple sub-events; and (4) temporal sequencing between the sub-events. MASC automates portions of the script creation process with suggestions for event types, links to Wikidata, and sub-events that may have been forgotten. We illustrate how these automations are useful to the script writer with a few case-study scripts.

[2]  arXiv:2101.05469 [pdf, other]
Title: Text Augmentation in a Multi-Task View
Comments: Accepted to EACL 2021
Subjects: Computation and Language (cs.CL)

Traditional data augmentation aims to increase the coverage of the input distribution by generating augmented examples that strongly resemble original samples in an online fashion where augmented examples dominate training. In this paper, we propose an alternative perspective -- a multi-task view (MTV) of data augmentation -- in which the primary task trains on original examples and the auxiliary task trains on augmented examples. In MTV data augmentation, both original and augmented samples are weighted substantively during training, relaxing the constraint that augmented examples must resemble original data and thereby allowing us to apply stronger levels of augmentation. In empirical experiments using four common data augmentation techniques on three benchmark text classification datasets, we find that the MTV leads to higher and more robust performance improvements than traditional augmentation.

[3]  arXiv:2101.05478 [pdf, other]
Title: WER-BERT: Automatic WER Estimation with BERT in a Balanced Ordinal Classification Paradigm
Comments: Accepted Long Paper at EACL 2021
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Automatic Speech Recognition (ASR) systems are evaluated using Word Error Rate (WER), which is calculated by counting the errors in the ASR system's transcription relative to the ground truth. This calculation, however, requires manual transcription of the speech signal to obtain the ground truth. Since transcribing audio signals is a costly process, Automatic WER Evaluation (e-WER) methods have been developed which attempt to predict the WER of a speech system by relying only on the transcription and the speech signal features. While WER is a continuous variable, previous works have shown that posing e-WER as a classification problem is more effective than regression. However, when converting to a classification setting, these approaches suffer from heavy class imbalance. In this paper, we propose a new balanced paradigm for e-WER in a classification setting. Within this paradigm, we also propose WER-BERT, a BERT-based architecture with speech features for e-WER. Furthermore, we introduce a distance loss function to tackle the ordinal nature of e-WER classification. The proposed approach and paradigm are evaluated on the Librispeech dataset and a commercial (black box) ASR system, Google Cloud's Speech-to-Text API. The results and experiments demonstrate that WER-BERT establishes a new state-of-the-art in automatic WER estimation.
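
For readers unfamiliar with the metric that e-WER methods try to predict, here is a minimal sketch of the standard WER definition: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. It is not taken from the paper; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words.

    Assumption: words are obtained by plain whitespace splitting.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution (sat -> sit) and one deletion (the) over six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Computing this value needs the reference transcript, which is exactly the manual step that e-WER approaches aim to avoid.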

[4]  arXiv:2101.05494 [pdf, ps, other]
Title: Hostility Detection in Hindi leveraging Pre-Trained Language Models
Subjects: Computation and Language (cs.CL)

Hostile content on social platforms is ever increasing. This has led to the need for proper detection of hostile posts so that appropriate action can be taken to tackle them. Though a lot of work has been done recently in the English language to solve the problem of hostile content online, similar works in Indian languages are quite hard to find. This paper presents a transfer learning based approach to classify social media (e.g., Twitter, Facebook, etc.) posts in Hindi Devanagari script as Hostile or Non-Hostile. Hostile posts are further analyzed to determine if they are Hateful, Fake, Defamation, and Offensive. This paper harnesses attention-based pre-trained models fine-tuned on Hindi data, with the Hostile vs. Non-Hostile task as an auxiliary task whose features are fused for the classification of the further sub-tasks. Through this approach, we establish a robust and consistent model without any ensembling or complex pre-processing. We have presented the results from our approach in the CONSTRAINT-2021 Shared Task on hostile post detection, where our model performs extremely well, placing as 3rd runner-up in terms of Weighted Fine-Grained F1 Score.

[5]  arXiv:2101.05499 [pdf, other]
Title: ECOL: Early Detection of COVID Lies Using Content, Prior Knowledge and Source Information
Comments: to be published in Constraint-2021 Workshop @ AAAI
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Social media platforms are vulnerable to fake news dissemination, which causes negative consequences such as panic and wrong medication in the healthcare domain. Therefore, it is important to automatically detect fake news at an early stage, before it spreads widely. This paper analyzes the impact of incorporating content information, prior knowledge, and credibility of sources into models for the early detection of fake news. We propose a framework modeling those features by using the BERT language model and external sources, namely Simple English Wikipedia and source reliability tags. The experiments conducted on the CONSTRAINT datasets demonstrated the benefit of integrating these features for the early detection of fake news in the healthcare domain.

[6]  arXiv:2101.05509 [pdf, other]
Title: Transformer-based Language Model Fine-tuning Methods for COVID-19 Fake News Detection
Comments: 9 pages, 1 figure
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

With the COVID-19 pandemic, related fake news is spreading widely across social media. Believing it indiscriminately can cause great trouble in people's lives. However, universal language models may perform poorly at detecting this fake news because they lack large-scale annotated data and sufficient semantic understanding of domain-specific knowledge, while models trained on the corresponding corpora are also mediocre because of insufficient learning. In this paper, we propose a novel transformer-based language model fine-tuning approach for detecting this fake news. First, the token vocabulary of each individual model is expanded to capture the actual semantics of professional phrases. Second, we adapt the heated-up softmax loss to distinguish the hard-mined samples, which are common in fake news because of the ambiguity of short text. Third, we apply adversarial training to improve the model's robustness. Last, the predicted features extracted by the universal language model RoBERTa and the domain-specific model CT-BERT are fused by a multi-layer perceptron to integrate fine-grained and high-level specific representations. Quantitative experimental results on an existing COVID-19 fake news dataset show its superior performance compared to state-of-the-art methods across various evaluation metrics. Furthermore, the best weighted average F1 score reaches 99.02%.

[7]  arXiv:2101.05593 [pdf, ps, other]
Title: On the Temporality of Priors in Entity Linking
Journal-ref: 2020 European Conference on Information Retrieval
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Entity linking is a fundamental task in natural language processing which deals with the lexical ambiguity in texts. An important component in entity linking approaches is the mention-to-entity prior probability. Even though there is a large number of works in entity linking, the existing approaches do not explicitly consider the time aspect, specifically the temporality of an entity's prior probability. We posit that this prior probability is temporal in nature and affects the performance of entity linking systems. In this paper we systematically study the effect of the prior on the entity linking performance over the temporal validity of both texts and KBs.

[8]  arXiv:2101.05634 [pdf, other]
Title: Better Together -- An Ensemble Learner for Combining the Results of Ready-made Entity Linking Systems
Comments: SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Entity linking (EL) is the task of automatically identifying entity mentions in text and resolving them to a corresponding entity in a reference knowledge base like Wikipedia. Throughout the past decade, a plethora of EL systems and pipelines have become available, where performance of individual systems varies heavily across corpora, languages or domains. Linking performance varies even between different mentions in the same text corpus, where, for instance, some EL approaches are better able to deal with short surface forms while others may perform better when more context information is available. To this end, we argue that performance may be optimised by exploiting results from distinct EL systems on the same corpus, thereby leveraging their individual strengths on a per-mention basis. In this paper, we introduce a supervised approach which exploits the output of multiple ready-made EL systems by predicting the correct link on a per-mention basis. Experimental results obtained on existing ground truth datasets and exploiting three state-of-the-art EL systems show the effectiveness of our approach and its capacity to significantly outperform the individual EL systems as well as a set of baseline methods.

[9]  arXiv:2101.05656 [pdf, other]
Title: On Informative Tweet Identification For Tracking Mass Events
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Twitter has been heavily used as an important channel for communicating and discussing events in real time. During such major events, many uninformative tweets are also published rapidly by many users, making it hard to follow the events. In this paper, we address this problem by investigating machine learning methods for automatically identifying informative tweets among those that are relevant to a target event. We examine both traditional approaches with a rich set of handcrafted features and state-of-the-art approaches with automatically learned features. We further propose a hybrid model that leverages both the handcrafted features and the automatically learned ones. Our experiments on several large datasets of real-world events show that the latter approaches significantly outperform the former and our proposed model performs the best, suggesting highly effective mechanisms for tracking mass events.

[10]  arXiv:2101.05701 [pdf, other]
Title: TUDublin team at Constraint@AAAI2021 -- COVID19 Fake News Detection
Comments: 8 pages
Subjects: Computation and Language (cs.CL)

The paper is devoted to the participation of the TUDublin team in the Constraint@AAAI2021 - COVID19 Fake News Detection Challenge. Today, the problem of fake news detection is more acute than ever in connection with the pandemic. The amount of fake news is increasing rapidly, and it is urgently necessary to create AI tools that allow us to identify and prevent the spread of false information about COVID-19. The main goal of the work was to create a model that would carry out a binary classification of messages from social media as real or fake news in the context of COVID-19. Our team constructed an ensemble consisting of Bidirectional Long Short Term Memory, Support Vector Machine, Logistic Regression, Naive Bayes and a combination of Logistic Regression and Naive Bayes. The model allowed us to achieve 0.94 F1-score, which is within 5% of the best result.

[11]  arXiv:2101.05716 [pdf, other]
Title: SICKNL: A Dataset for Dutch Natural Language Inference
Comments: To appear at EACL 2021
Subjects: Computation and Language (cs.CL)

We present SICK-NL (read: signal), a dataset targeting Natural Language Inference in Dutch. SICK-NL is obtained by translating the SICK dataset of Marelli et al. (2014) from English into Dutch. Having a parallel inference dataset allows us to compare both monolingual and multilingual NLP models for English and Dutch on the two tasks. In the paper, we motivate and detail the translation process, perform a baseline evaluation on both the original SICK dataset and its Dutch incarnation SICK-NL, taking inspiration from Dutch skipgram embeddings and contextualised embedding models. In addition, we encapsulate two phenomena encountered in the translation to formulate stress tests and verify how well the Dutch models capture syntactic restructurings that do not affect semantics. Our main finding is that all models perform worse on SICK-NL than on SICK, indicating that the Dutch dataset is more challenging than the English original. Results on the stress tests show that models don't fully capture word order freedom in Dutch, warranting future systematic studies.

[12]  arXiv:2101.05783 [pdf, other]
Title: Persistent Anti-Muslim Bias in Large Language Models
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

It has been observed that large-scale language models capture undesirable societal biases, e.g. relating to race and gender; yet religious bias has been relatively unexplored. We demonstrate that GPT-3, a state-of-the-art contextual language model, captures persistent Muslim-violence bias. We probe GPT-3 in various ways, including prompt completion, analogical reasoning, and story generation, to understand this anti-Muslim bias, demonstrating that it appears consistently and creatively in different uses of the model and that it is severe even compared to biases about other religious groups. For instance, "Muslim" is analogized to "terrorist" in 23% of test cases, while "Jewish" is mapped to "money" in 5% of test cases. We quantify the positive distraction needed to overcome this bias with adversarial text prompts, and find that use of the most positive 6 adjectives reduces violent completions for "Muslims" from 66% to 20%, which is still higher than for other religious groups.

[13]  arXiv:2101.05786 [pdf]
Title: Persuasive Natural Language Generation -- A Literature Review
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

This literature review focuses on the use of Natural Language Generation (NLG) to automatically detect and generate persuasive texts. Extending previous research on automatic identification of persuasion in text, we concentrate on generative aspects through conceptualizing determinants of persuasion in five business-focused categories: benevolence, linguistic appropriacy, logical argumentation, trustworthiness, tools and datasets. These allow NLG to increase an existing message's persuasiveness. Previous research illustrates key aspects in each of the above mentioned five categories. A research agenda to further study persuasive NLG is developed. The review includes analysis of seventy-seven articles, outlining the existing body of knowledge and showing the steady progress in this research field.

Cross-lists for Fri, 15 Jan 21

[14]  arXiv:1701.08888 (cross-list from cs.IR) [pdf, other]
Title: Integrating Reviews into Personalized Ranking for Cold Start Recommendation
Comments: TextBPR
Journal-ref: PAKDD 2017
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)

Item recommendation task predicts a personalized ranking over a set of items for each individual user. One paradigm is the rating-based methods that concentrate on explicit feedbacks and hence face the difficulties in collecting them. Meanwhile, the ranking-based methods are presented with rated items and then rank the rated above the unrated. This paradigm takes advantage of widely available implicit feedback. It, however, usually ignores a kind of important information: item reviews. Item reviews not only justify the preferences of users, but also help alleviate the cold-start problem that fails the collaborative filtering. In this paper, we propose two novel and simple models to integrate item reviews into Bayesian personalized ranking. In each model, we make use of text features extracted from item reviews using word embeddings. On top of text features we uncover the review dimensions that explain the variation in users' feedback and these review factors represent a prior preference of users. Experiments on six real-world data sets show the benefits of leveraging item reviews on ranking prediction. We also conduct analyses to understand the proposed models.

[15]  arXiv:1803.09551 (cross-list from cs.IR) [pdf, other]
Title: Collaborative Filtering with Topic and Social Latent Factors Incorporating Implicit Feedback
Comments: 27 pages, 11 figures, 6 tables, ACM TKDD 2018
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)

Recommender systems (RSs) provide an effective way of alleviating the information overload problem by selecting personalized items for different users. Latent factors based collaborative filtering (CF) has become a popular approach for RSs due to its accuracy and scalability. Recently, online social networks and user-generated content provide diverse sources for recommendation beyond ratings. Although social matrix factorization (Social MF) and topic matrix factorization (Topic MF) successfully exploit social relations and item reviews, respectively, both of them ignore some useful information. In this paper, we investigate effective data fusion by combining the aforementioned approaches. First, we propose a novel model, MR3, to jointly model three sources of information (i.e., ratings, item reviews, and social relations) effectively for rating prediction by aligning the latent factors and hidden topics. Second, we incorporate the implicit feedback from ratings into the proposed model to enhance its capability and to demonstrate its flexibility. We achieve more accurate rating prediction on real-life datasets over various state-of-the-art methods. Furthermore, we measure the contribution from each of the three data sources and the impact of implicit feedback from ratings, followed by the sensitivity analysis of hyperparameters. Empirical studies demonstrate the effectiveness and efficacy of our proposed model and its extension.

[16]  arXiv:2101.05313 (cross-list from eess.AS) [pdf, other]
Title: Whispered and Lombard Neural Speech Synthesis
Comments: To appear in SLT 2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD)

It is desirable for a text-to-speech system to take into account the environment where synthetic speech is presented, and provide appropriate context-dependent output to the user. In this paper, we present and compare various approaches for generating different speaking styles, namely, normal, Lombard, and whisper speech, using only limited data. The following systems are proposed and assessed: 1) Pre-training and fine-tuning a model for each style. 2) Lombard and whisper speech conversion through a signal processing based approach. 3) Multi-style generation using a single model based on a speaker verification model. Our mean opinion score and AB preference listening tests show that 1) we can generate high quality speech through the pre-training/fine-tuning approach for all speaking styles. 2) Although our speaker verification (SV) model is not explicitly trained to discriminate different speaking styles, and no Lombard and whisper voice is used for pre-training this system, the SV model can be used as a style encoder for generating different style embeddings as input for the Tacotron system. We also show that the resulting synthetic Lombard speech has a significant positive impact on intelligibility gain.

[17]  arXiv:2101.05365 (cross-list from econ.GN) [pdf]
Title: Scared into Action: How Partisanship and Fear are Associated with Reactions to Public Health Directives
Comments: 54 pages, 11 figures
Subjects: General Economics (econ.GN); Computation and Language (cs.CL); Computation (stat.CO)

Differences in political ideology are increasingly appearing as an impediment to successful bipartisan communication from local leadership. For example, recent empirical findings have shown that conservatives are less likely to adhere to COVID-19 health directives. This behavior is in direct contradiction to past research which indicates that conservatives are more rule abiding, prefer to avoid loss, and are more prevention-motivated than liberals. We reconcile this disconnect between recent empirical findings and past research by using insights gathered from press releases, millions of tweets, and mobility data capturing local movement in retail, grocery, workplace, parks, and transit domains during COVID-19 shelter-in-place orders. We find that conservatives adhere to health directives when they express more fear of the virus. In order to better understand this phenomenon, we analyze both official and citizen communications and find that press releases from local and federal government, along with the number of confirmed COVID-19 cases, lead to an increase in expressions of fear on Twitter.

[18]  arXiv:2101.05405 (cross-list from cs.CR) [pdf, other]
Title: Privacy Analysis in Language Models via Training Data Leakage Report
Subjects: Cryptography and Security (cs.CR); Computation and Language (cs.CL); Machine Learning (cs.LG)

Recent advances in neural network based language models lead to successful deployments of such models, improving user experience in various applications. It has been demonstrated that strong performance of language models may come along with the ability to memorize rare training samples, which poses serious privacy threats in case the model training is conducted on confidential user content. This necessitates privacy monitoring techniques to minimize the chance of possible privacy breaches for the models deployed in practice. In this work, we introduce a methodology that investigates identifying the user content in the training data that could be leaked under a strong and realistic threat model. We propose two metrics to quantify user-level data leakage by measuring a model's ability to produce unique sentence fragments within training data. Our metrics further enable comparing different models trained on the same data in terms of privacy. We demonstrate our approach through extensive numerical studies on real-world datasets such as email and forum conversations. We further illustrate how the proposed metrics can be utilized to investigate the efficacy of mitigations like differentially private training or API hardening.

[19]  arXiv:2101.05525 (cross-list from eess.AS) [pdf, other]
Title: An evaluation of word-level confidence estimation for end-to-end automatic speech recognition
Comments: Accepted at SLT 2021
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)

Quantifying the confidence (or conversely the uncertainty) of a prediction is a highly desirable trait of an automatic system, as it improves the robustness and usefulness in downstream tasks. In this paper we investigate confidence estimation for end-to-end automatic speech recognition (ASR). Previous work has addressed confidence measures for lattice-based ASR, while current machine learning research mostly focuses on confidence measures for unstructured deep learning. However, as the ASR systems are increasingly being built upon deep end-to-end methods, there is little work that tries to develop confidence measures in this context. We fill this gap by providing an extensive benchmark of popular confidence methods on four well-known speech datasets. There are two challenges we overcome in adapting existing methods: working on structured data (sequences) and obtaining confidences at a coarser level than the predictions (words instead of tokens). Our results suggest that a strong baseline can be obtained by scaling the logits by a learnt temperature, followed by estimating the confidence as the negative entropy of the predictive distribution and, finally, sum pooling to aggregate at word level.
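
The baseline named in the last sentence of this abstract can be sketched as three steps: divide the logits by a temperature, score each token by the negative entropy of its softmax distribution, and sum the token scores within each word. The sketch below is only an illustration under stated assumptions and is not the authors' implementation: the function names, the `word_ids` alignment format, and passing the temperature as a fixed argument (rather than learning it on held-out data) are all hypothetical choices.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def negative_entropy(probs):
    """Negative Shannon entropy in nats: 0 for a one-hot distribution, lower when flatter."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


def word_confidences(token_logits, word_ids, temperature=1.0):
    """Sum-pool token-level negative entropies into one score per word.

    token_logits: one list of logits per predicted token.
    word_ids: hypothetical alignment giving the word index of each token.
    temperature: assumed to be given here; the abstract describes it as learnt.
    """
    scores = {}
    for logits, word in zip(token_logits, word_ids):
        token_score = negative_entropy(softmax(logits, temperature))
        scores[word] = scores.get(word, 0.0) + token_score
    return [scores[word] for word in sorted(scores)]


# Two tokens form word 0, one token forms word 1.
confidences = word_confidences([[0.0, 0.0], [10.0, 0.0], [0.0, 0.0]], [0, 0, 1])
```

Higher (closer to zero) scores indicate more confident words. With sum pooling, a word made of more tokens can only accumulate a lower score, which is a property of this aggregation choice.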

[20]  arXiv:2101.05611 (cross-list from cs.IR) [pdf, other]
Title: TrNews: Heterogeneous User-Interest Transfer Learning for News Recommendation
Comments: EACL 2021
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)

We investigate how to solve the cross-corpus news recommendation for unseen users in the future. This is a problem where traditional content-based recommendation techniques often fail. Luckily, in real-world recommendation services, some publisher (e.g., Daily news) may have accumulated a large corpus with lots of consumers which can be used for a newly deployed publisher (e.g., Political news). To take advantage of the existing corpus, we propose a transfer learning model (dubbed as TrNews) for news recommendation to transfer the knowledge from a source corpus to a target corpus. To tackle the heterogeneity of different user interests and of different word distributions across corpora, we design a translator-based transfer-learning strategy to learn a representation mapping between source and target corpora. The learned translator can be used to generate representations for unseen users in the future. We show through experiments on real-world datasets that TrNews is better than various baselines in terms of four metrics. We also show that our translator is effective among existing transfer strategies.

[21]  arXiv:2101.05646 (cross-list from cs.CR) [pdf]
Title: Malicious Code Detection: Run Trace Output Analysis by LSTM
Comments: 11 pages, 5 figures, 5 tables, accepted to IEEE Access
Subjects: Cryptography and Security (cs.CR); Computation and Language (cs.CL); Machine Learning (cs.LG)

Malicious software threats and their detection have been gaining importance as a subdomain of information security due to the expansion of ICT applications in daily settings. A major challenge in designing and developing anti-malware systems is the coverage of the detection, particularly the development of dynamic analysis methods that can detect polymorphic and metamorphic malware efficiently. In the present study, we propose a methodological framework for detecting malicious code by analyzing run trace outputs by Long Short-Term Memory (LSTM). We developed models of run traces of malicious and benign Portable Executable (PE) files. We created our dataset from run trace outputs obtained from dynamic analysis of PE files. The obtained dataset was in the instruction format as a sequence and was called Instruction as a Sequence Model (ISM). By splitting the first dataset into basic blocks, we obtained the second one called Basic Block as a Sequence Model (BSM). The experiments showed that the ISM achieved an accuracy of 87.51% and a false positive rate of 18.34%, while BSM achieved an accuracy of 99.26% and a false positive rate of 2.62%.

[22]  arXiv:2101.05667 (cross-list from cs.IR) [pdf, other]
Title: The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)

We propose a design pattern for tackling text ranking problems, dubbed "Expando-Mono-Duo", that has been empirically validated for a number of ad hoc retrieval tasks in different domains. At the core, our design relies on pretrained sequence-to-sequence models within a standard multi-stage ranking architecture. "Expando" refers to the use of document expansion techniques to enrich keyword representations of texts prior to inverted indexing. "Mono" and "Duo" refer to components in a reranking pipeline based on a pointwise model and a pairwise model that rerank initial candidates retrieved using keyword search. We present experimental results from the MS MARCO passage and document ranking tasks, the TREC 2020 Deep Learning Track, and the TREC-COVID challenge that validate our design. In all these tasks, we achieve effectiveness that is at or near the state of the art, in some cases using a zero-shot approach that does not exploit any training data from the target task. To support replicability, implementations of our design pattern are open-sourced in the Pyserini IR toolkit and PyGaggle neural reranking library.

[23]  arXiv:2101.05779 (cross-list from cs.LG) [pdf, other]
Title: Structured Prediction as Translation between Augmented Natural Languages
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)

We propose a new framework, Translation between Augmented Natural Languages (TANL), to solve many structured prediction language tasks including joint entity and relation extraction, nested named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, and dialogue state tracking. Instead of tackling the problem by training task-specific discriminative classifiers, we frame it as a translation task between augmented natural languages, from which the task-relevant information can be easily extracted. Our approach can match or outperform task-specific models on all tasks, and in particular, achieves new state-of-the-art results on joint entity and relation extraction (CoNLL04, ADE, NYT, and ACE2005 datasets), relation classification (FewRel and TACRED), and semantic role labeling (CoNLL-2005 and CoNLL-2012). We accomplish this while using the same architecture and hyperparameters for all tasks and even when training a single model to solve all tasks at the same time (multi-task learning). Finally, we show that our framework can also significantly improve the performance in a low-resource regime, thanks to better use of label semantics.

Replacements for Fri, 15 Jan 21

[24]  arXiv:2001.00137 (replaced) [pdf, other]
Title: Stacked DeBERT: All Attention in Incomplete Data for Text Classification
Comments: Published (this https URL), Code (this https URL)
Journal-ref: Neural Networks 136 (2021) 87-96
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[25]  arXiv:2005.11882 (replaced) [pdf, other]
Title: Sentiment Analysis: Automatically Detecting Valence, Emotions, and Other Affectual States from Text
Authors: Saif M. Mohammad
Comments: This is the author's manuscript of what is slated to appear in the Second Edition of Emotion Measurement, 2021
Journal-ref: Second Edition of Emotion Measurement, 2021
Subjects: Computation and Language (cs.CL)
[26]  arXiv:2006.01067 (replaced) [pdf, other]
Title: Aligning Faithful Interpretations with their Social Attribution
Comments: Accepted as a journal paper to TACL
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[27]  arXiv:2009.05169 (replaced) [pdf, other]
Title: Sparsifying Transformer Models with Differentiable Representation Pooling
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[28]  arXiv:2101.01628 (replaced) [pdf]
Title: Local Translation Services for Neglected Languages
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[29]  arXiv:2101.04547 (replaced) [pdf, other]
Title: Of Non-Linearity and Commutativity in BERT
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[30]  arXiv:2101.04899 (replaced) [pdf, ps, other]
Title: Experimental Evaluation of Deep Learning models for Marathi Text Classification
Comments: Accepted at ICMISC 2021
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[31]  arXiv:1812.01995 (replaced) [pdf, other]
Title: Deep Learning Model for Finding New Superconductors
Comments: 10 pages in main text. Deep learning, Machine learning, Material search, Superconductors
Journal-ref: Phys. Rev. B 103, 014509, (2021)
Subjects: Machine Learning (cs.LG); Materials Science (cond-mat.mtrl-sci); Superconductivity (cond-mat.supr-con); Computation and Language (cs.CL); Computational Physics (physics.comp-ph)
[32]  arXiv:2004.05916 (replaced) [pdf, other]
Title: Telling BERT's full story: from Local Attention to Global Aggregation
Comments: Accepted at EACL 2021
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
[33]  arXiv:2005.02171 (replaced) [pdf]
Title: Neural Computing for Online Arabic Handwriting Character Recognition using Hard Stroke Features Mining
Authors: Amjad Rehman
Comments: 16 pages
Journal-ref: IJICIC 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[34]  arXiv:2012.06185 (replaced) [pdf, ps, other]
Title: Exploring wav2vec 2.0 on speaker verification and language identification
Comments: Self-supervised, speaker verification, language identification, multi-task learning, wav2vec 2.0
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[35]  arXiv:2012.11213 (replaced) [pdf, ps, other]
Title: Self-Supervised Learning for Visual Summary Identification in Scientific Publications
Subjects: Information Retrieval (cs.IR); Computation and Language (cs.CL)