
Computation and Language

New submissions

[ total of 40 entries: 1-40 ]

New submissions for Fri, 30 Oct 20

[1]  arXiv:2010.15149 [pdf, other]
Title: DeSMOG: Detecting Stance in Media On Global Warming
Comments: 9 pages, 6 figures (excluding references and appendices). To appear in Findings of EMNLP 2020
Journal-ref: Findings of EMNLP 2020
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)

Citing opinions is a powerful yet understudied strategy in argumentation. For example, an environmental activist might say, "Leading scientists agree that global warming is a serious concern," framing a clause which affirms their own stance ("that global warming is serious") as an opinion endorsed ("[scientists] agree") by a reputable source ("leading"). In contrast, a global warming denier might frame the same clause as the opinion of an untrustworthy source with a predicate connoting doubt: "Mistaken scientists claim [...]." Our work studies opinion-framing in the global warming (GW) debate, an increasingly partisan issue that has received little attention in NLP. We introduce DeSMOG, a dataset of stance-labeled GW sentences, and train a BERT classifier to study novel aspects of argumentation in how different sides of a debate represent their own and each other's opinions. From 56K news articles, we find that similar linguistic devices for self-affirming and opponent-doubting discourse are used across GW-accepting and skeptic media, though GW-skeptical media shows more opponent-doubt. We also find that authors often characterize sources as hypocritical, by ascribing opinions expressing the author's own view to source entities known to publicly endorse the opposing view. We release our stance dataset, model, and lexicons of framing devices for future work on opinion-framing and the automatic detection of GW stance.

[2]  arXiv:2010.15225 [pdf, other]
Title: A Visuospatial Dataset for Naturalistic Verb Learning
Comments: 9 pages, 3 figures, starsem 2020
Subjects: Computation and Language (cs.CL)

We introduce a new dataset for training and evaluating grounded language models. Our data is collected within a virtual reality environment and is designed to emulate the quality of language data to which a pre-verbal child is likely to have access: That is, naturalistic, spontaneous speech paired with richly grounded visuospatial context. We use the collected data to compare several distributional semantics models for verb learning. We evaluate neural models based on 2D (pixel) features as well as feature-engineered models based on 3D (symbolic, spatial) features, and show that neither modeling approach achieves satisfactory performance. Our results are consistent with evidence from child language acquisition that emphasizes the difficulty of learning verbs from naive distributional data. We discuss avenues for future work on cognitively-inspired grounded language learning, and release our corpus with the intent of facilitating research on the topic.

[3]  arXiv:2010.15266 [pdf, other]
Title: CopyNext: Explicit Span Copying and Alignment in Sequence to Sequence Models
Comments: 4th Workshop on Structured Prediction for NLP (EMNLP 2020)
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Copy mechanisms are employed in sequence to sequence models (seq2seq) to generate reproductions of words from the input to the output. These frameworks, operating at the lexical type level, fail to provide an explicit alignment that records where each token was copied from. Further, they require contiguous token sequences from the input (spans) to be copied individually. We present a model with an explicit token-level copy operation and extend it to copying entire spans. Our model provides hard alignments between spans in the input and output, allowing for nontraditional applications of seq2seq, like information extraction. We demonstrate the approach on Nested Named Entity Recognition, achieving near state-of-the-art accuracy with an order of magnitude increase in decoding speed.

[4]  arXiv:2010.15300 [pdf, other]
Title: Uncovering Latent Biases in Text: Method and Application to Peer Review
Subjects: Computation and Language (cs.CL); Computers and Society (cs.CY); Machine Learning (cs.LG)

Quantifying systematic disparities in numerical quantities such as employment rates and wages between population subgroups provides compelling evidence for the existence of societal biases. However, biases in the text written for members of different subgroups (such as in recommendation letters for male and non-male candidates), though widely reported anecdotally, remain challenging to quantify. In this work, we introduce a novel framework to quantify bias in text caused by the visibility of subgroup membership indicators. We develop a nonparametric estimation and inference procedure to estimate this bias. We then formalize an identification strategy to causally link the estimated bias to the visibility of subgroup membership indicators, provided observations from time periods both before and after an identity-hiding policy change. We identify an application wherein "ground truth" bias can be inferred to evaluate our framework, instead of relying on synthetic or secondary data. Specifically, we apply our framework to quantify biases in the text of peer reviews from a reputed machine learning conference before and after the conference adopted a double-blind reviewing policy. We show evidence of biases in the review ratings that serves as "ground truth", and show that our proposed framework accurately detects these biases from the review text without having access to the review ratings.

[5]  arXiv:2010.15313 [pdf, other]
Title: "where is this relationship going?": Understanding Relationship Trajectories in Narrative Text
Comments: Accepted to *Sem 2020
Subjects: Computation and Language (cs.CL)

We examine a new commonsense reasoning task: given a narrative describing a social interaction that centers on two protagonists, systems make inferences about the underlying relationship trajectory. Specifically, we propose two evaluation tasks: Relationship Outlook Prediction MCQ and Resolution Prediction MCQ. In Relationship Outlook Prediction, a system maps an interaction to a relationship outlook that captures how the interaction is expected to change the relationship. In Resolution Prediction, a system attributes a given relationship outlook to a particular resolution that explains the outcome. These two tasks parallel two real-life questions that people frequently ponder as they navigate different social situations: "where is this relationship going?" and "how did we end up here?". To facilitate the investigation of human social relationships through these two tasks, we construct a new dataset, Social Narrative Tree, which consists of 1250 stories documenting a variety of daily social interactions. The narratives encode a multitude of social elements that interweave to give rise to rich commonsense knowledge of how relationships evolve with respect to social interactions. We establish baseline performance using language models, and their accuracies are significantly lower than human performance. The results demonstrate that models need to look beyond syntactic and semantic signals to comprehend complex human relationships.

[6]  arXiv:2010.15316 [pdf, other]
Title: Multiple Sclerosis Severity Classification From Clinical Text
Comments: EMNLP 2020 Clinical NLP workshop
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Multiple Sclerosis (MS) is a chronic, inflammatory and degenerative neurological disease, which is monitored by a specialist using the Expanded Disability Status Scale (EDSS) and recorded in unstructured text in the form of a neurology consult note. An EDSS measurement contains an overall "EDSS" score and several functional subscores. Typically, expert knowledge is required to interpret consult notes and generate these scores. Previous approaches used limited context length Word2Vec embeddings and keyword searches to predict scores given a consult note, but often failed when scores were not explicitly stated. In this work, we present MS-BERT, the first publicly available transformer model trained on real clinical data other than MIMIC. Next, we present MSBC, a classifier that applies MS-BERT to generate embeddings and predict EDSS and functional subscores. Lastly, we explore combining MSBC with other models through the use of Snorkel to generate scores for unlabelled consult notes. MSBC achieves state-of-the-art performance on all metrics and prediction tasks and outperforms the models generated from the Snorkel ensemble. We improve Macro-F1 by 0.12 (to 0.88) for predicting EDSS and on average by 0.29 (to 0.63) for predicting functional subscores over previous Word2Vec CNN and rule-based approaches.

[7]  arXiv:2010.15360 [pdf, other]
Title: Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection
Subjects: Computation and Language (cs.CL)

Most existing approaches to disfluency detection heavily rely on human-annotated corpora, which are expensive to obtain in practice. There have been several proposals to alleviate this issue with, for instance, self-supervised learning techniques, but they still require human-annotated corpora. In this work, we explore the unsupervised learning paradigm, which can potentially work with unlabeled text corpora that are cheaper and easier to obtain. Our model builds upon the recent work on Noisy Student Training, a semi-supervised learning approach that extends the idea of self-training. Experimental results on the commonly used English Switchboard test set show that our approach achieves competitive performance compared to the previous state-of-the-art supervised systems using contextualized word embeddings (e.g. BERT and ELECTRA).

[8]  arXiv:2010.15411 [pdf, other]
Title: Conversation Graph: Data Augmentation, Training and Evaluation for Non-Deterministic Dialogue Management
Comments: Accepted at Transactions of Association of Computational Linguistics (to be presented at ACL 2021)
Subjects: Computation and Language (cs.CL)

Task-oriented dialogue systems typically rely on large amounts of high-quality training data or require complex handcrafted rules. However, existing datasets are often limited in size considering the complexity of the dialogues. Additionally, conventional training signal inference is not suitable for non-deterministic agent behaviour, i.e. considering multiple actions as valid in identical dialogue states. We propose the Conversation Graph (ConvGraph), a graph-based representation of dialogues that can be exploited for data augmentation, multi-reference training and evaluation of non-deterministic agents. ConvGraph generates novel dialogue paths to augment data volume and diversity. Intrinsic and extrinsic evaluation across three datasets shows that data augmentation and/or multi-reference training with ConvGraph can improve dialogue success rates by up to 6.4%.

[9]  arXiv:2010.15423 [pdf, ps, other]
Title: Tilde at WMT 2020: News Task Systems
Subjects: Computation and Language (cs.CL)

This paper describes Tilde's submission to the WMT2020 shared task on news translation for both directions of the English-Polish language pair in both the constrained and the unconstrained tracks. We follow our submissions from the previous years and build our baseline systems to be morphologically motivated sub-word unit-based Transformer base models that we train using the Marian machine translation toolkit. Additionally, we experiment with different parallel and monolingual data selection schemes, as well as sampled back-translation. Our final models are ensembles of Transformer base and Transformer big models that feature right-to-left re-ranking.

[10]  arXiv:2010.15437 [pdf, other]
Title: Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model
Comments: Accepted as a short paper at INLG 2020
Subjects: Computation and Language (cs.CL)

This paper presents a novel fusion method for integrating an external language model (LM) into the Transformer-based sequence-to-sequence (seq2seq) model. While paired data are generally required to train the seq2seq model, the external LM can be trained with only unpaired data. Thus, it is important to leverage memorized knowledge in the external LM for building the seq2seq model, since it is hard to prepare a large amount of paired data. However, the existing fusion methods assume that the LM is integrated with recurrent neural network-based seq2seq models instead of the Transformer. Therefore, this paper proposes a fusion method that can explicitly utilize network structures in the Transformer. The proposed method, called memory attentive fusion, leverages the Transformer-style attention mechanism that repeats source-target attention in a multi-hop manner for reading the memorized knowledge in the LM. Our experiments on two text-style conversion tasks demonstrate that the proposed method performs better than conventional fusion methods.

[11]  arXiv:2010.15458 [pdf, other]
Title: Named Entity Recognition for Social Media Texts with Semantic Augmentation
Comments: Natural Language Processing. 9 pages, 3 figures. EMNLP-2020
Subjects: Computation and Language (cs.CL)

Existing approaches for named entity recognition suffer from data sparsity problems when conducted on short and informal texts, especially user-generated social media content. Semantic augmentation is a potential way to alleviate this problem. Given that rich semantic information is implicitly preserved in pre-trained word embeddings, they are potentially ideal resources for semantic augmentation. In this paper, we propose a neural-based approach to NER for social media texts where both local (from running text) and augmented semantics are taken into account. In particular, we obtain the augmented semantic information from a large-scale corpus, and propose an attentive semantic augmentation module and a gate module to encode and aggregate such information, respectively. Extensive experiments are performed on three benchmark datasets collected from English and Chinese social media platforms, where the results demonstrate the superiority of our approach to previous studies across all three datasets.

[12]  arXiv:2010.15466 [pdf, other]
Title: Improving Named Entity Recognition with Attentive Ensemble of Syntactic Information
Comments: Natural Language Processing. 15 pages, 3 figures, Findings of EMNLP-2020
Subjects: Computation and Language (cs.CL)

Named entity recognition (NER) is highly sensitive to sentential syntactic and semantic properties where entities may be extracted according to how they are used and placed in the running text. To model such properties, one could rely on existing resources to provide helpful knowledge to the NER task; some existing studies have shown the effectiveness of doing so, yet they are limited in how appropriately they leverage such knowledge, for example in distinguishing which knowledge is important for a particular context. In this paper, we improve NER by leveraging different types of syntactic information through attentive ensemble, which operates through the proposed key-value memory networks, syntax attention, and the gate mechanism for encoding, weighting and aggregating such syntactic information, respectively. Experimental results on six English and Chinese benchmark datasets suggest the effectiveness of the proposed model and show that it outperforms previous studies on all experimental datasets.

[13]  arXiv:2010.15535 [pdf, ps, other]
Title: Unbabel's Participation in the WMT20 Metrics Shared Task
Comments: WMT Metrics Shared Task 2020
Subjects: Computation and Language (cs.CL)

We present the contribution of the Unbabel team to the WMT 2020 Shared Task on Metrics. We intend to participate in the segment-level, document-level and system-level tracks on all language pairs, as well as the 'QE as a Metric' track. Accordingly, we illustrate results of our models in these tracks with reference to test sets from the previous year. Our submissions build upon the recently proposed COMET framework: We train several estimator models to regress on different human-generated quality scores and a novel ranking model trained on relative ranks obtained from Direct Assessments. We also propose a simple technique for converting segment-level predictions into a document-level score. Overall, our systems achieve strong results for all language pairs on previous test sets and in many cases set a new state-of-the-art.

[14]  arXiv:2010.15598 [pdf, other]
Title: May I Ask Who's Calling? Named Entity Recognition on Call Center Transcripts for Privacy Law Compliance
Authors: Micaela Kaplan
Comments: The 6th Workshop on Noisy User-generated Text (W-NUT) 2020 at EMNLP
Journal-ref: Proceedings of the 2020 EMNLP Workshop W-NUT: The Sixth Workshop on Noisy User-generated Text (2020) 1-6
Subjects: Computation and Language (cs.CL)

We investigate using Named Entity Recognition on a new type of user-generated text: a call center conversation. These conversations combine problems from spontaneous speech with problems novel to conversational Automated Speech Recognition, including incorrect recognition, alongside other common problems from noisy user-generated text. Using our own corpus with new annotations, training custom contextual string embeddings, and applying a BiLSTM-CRF, we match state-of-the-art results on our novel task.

[15]  arXiv:2010.15728 [pdf, other]
Title: Explainable Automated Coding of Clinical Notes using Hierarchical Label-wise Attention Networks and Label Embedding Initialisation
Comments: Structured abstract in full text, 17 pages, 5 figures, 4 supplementary materials (3 extra pages), submitted to Journal of Biomedical Informatics
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)

Diagnostic or procedural coding of clinical notes aims to derive a coded summary of disease-related information about patients. Such coding is usually done manually in hospitals but could potentially be automated to improve the efficiency and accuracy of medical coding. Recent studies on deep learning for automated medical coding achieved promising performances. However, the explainability of these models is usually poor, preventing them from being used confidently in supporting clinical practice. Another limitation is that these models mostly assume independence among labels, ignoring the complex correlation among medical codes which can potentially be exploited to improve the performance. We propose a Hierarchical Label-wise Attention Network (HLAN), which aims to interpret the model by quantifying importance (as attention weights) of words and sentences related to each of the labels. Secondly, we propose to enhance the major deep learning models with a label embedding (LE) initialisation approach, which learns a dense, continuous vector representation and then injects the representation into the final layers and the label-wise attention layers in the models. We evaluated the methods using three settings on the MIMIC-III discharge summaries: full codes, top-50 codes, and the UK NHS COVID-19 shielding codes. Experiments were conducted to compare HLAN and LE initialisation to the state-of-the-art neural network based methods. HLAN achieved the best Micro-level AUC and $F_1$ on the top-50 code prediction and comparable results on the NHS COVID-19 shielding code prediction to other models. By highlighting the most salient words and sentences for each label, HLAN showed more meaningful and comprehensive model interpretation compared to its downgraded baselines and the CNN-based models. LE initialisation consistently boosted most deep learning models for automated medical coding.

[16]  arXiv:2010.15778 [pdf, other]
Title: Contextual BERT: Conditioning the Language Model Using a Global State
Comments: Accepted at the TextGraphs-14 workshop at COLING'2020 - The 28th International Conference on Computational Linguistics
Subjects: Computation and Language (cs.CL)

BERT is a popular language model whose main pre-training task is to fill in the blank, i.e., predicting a word that was masked out of a sentence, based on the remaining words. In some applications, however, having an additional context can help the model make the right prediction, e.g., by taking the domain or the time of writing into account. This motivates us to advance the BERT architecture by adding a global state for conditioning on a fixed-sized context. We present our two novel approaches and apply them to an industry use-case, where we complete fashion outfits with missing articles, conditioned on a specific customer. An experimental comparison to other methods from the literature shows that our methods improve personalization significantly.

Cross-lists for Fri, 30 Oct 20

[17]  arXiv:2010.15251 (cross-list from cs.CV) [pdf, other]
Title: Fusion Models for Improved Visual Captioning
Comments: Under review at "Multi-Modal Deep Learning: Challenges and Applications", ICPR-2020
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)

Visual captioning aims to generate textual descriptions given images. Traditionally, the captioning models are trained on human annotated datasets such as Flickr30k and MS-COCO, which are limited in size and diversity. This limitation hinders the generalization capabilities of these models while also making them prone to mistakes. Language models can, however, be trained on vast amounts of freely available unlabelled data and have recently emerged as successful language encoders and coherent text generators. Meanwhile, several unimodal and multimodal fusion techniques have been proven to work well for natural language generation and automatic speech recognition. Building on these recent developments, and with an aim of improving the quality of generated captions, the contribution of our work in this paper is two-fold: First, we propose a generic multimodal model fusion framework for caption generation as well as emendation where we utilize different fusion strategies to integrate a pretrained Auxiliary Language Model (AuxLM) within the traditional encoder-decoder visual captioning frameworks. Next, we employ the same fusion strategies to integrate a pretrained Masked Language Model (MLM), namely BERT, with a visual captioning model, viz. Show, Attend, and Tell, for emending both syntactic and semantic errors in captions. Our caption emendation experiments on three benchmark image captioning datasets, viz. Flickr8k, Flickr30k, and MSCOCO, show improvements over the baseline, indicating the usefulness of our proposed multimodal fusion strategies. Further, we perform a preliminary qualitative analysis on the emended captions and identify error categories based on the type of corrections.

[18]  arXiv:2010.15366 (cross-list from cs.SD) [pdf, other]
Title: Self-supervised Pre-training Reduces Label Permutation Instability of Speech Separation
Comments: submitted to ICASSP2021
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

Speech separation is well developed, but open problems remain. The main problem we focus on in this paper is the frequent label permutation switching of permutation invariant training (PIT). For N-speaker separation, there would be N! possible label permutations. How to stably select correct label permutations is a long-standing problem. In this paper, we utilize self-supervised pre-training to stabilize the label permutations. Among several types of self-supervised tasks, speech enhancement based pre-training tasks show significant effectiveness in our experiments. When using off-the-shelf pre-trained models, training duration could be shortened to between one-third and two-thirds. Furthermore, even taking pre-training time into account, the entire training process could still be shorter without a performance drop when using a larger batch size.

[19]  arXiv:2010.15600 (cross-list from cs.LO) [pdf, ps, other]
Title: Three computational models and its equivalence
Subjects: Logic in Computer Science (cs.LO); Computational Complexity (cs.CC); Computation and Language (cs.CL); General Literature (cs.GL)

The study of computability has its origin in Hilbert's conference of 1900, where a question adjacent to the ones he asked is to give a precise description of the notion of algorithm. In the search for a good definition arose three independent theories: Turing and the Turing machines, Gödel and the recursive functions, Church and the Lambda Calculus.
Later, Kleene established that the classic models of computation are equivalent. This fact is widely accepted in many textbooks, and the proof is omitted because it is tedious and unreadable. We intend to fill this gap by presenting the proof in a modern way, without forgetting the mathematical details.

[20]  arXiv:2010.15602 (cross-list from cs.CY) [pdf, other]
Title: Designing learning experiences for online teaching and learning
Subjects: Computers and Society (cs.CY); Computation and Language (cs.CL)

Teaching is about constantly innovating strategies, ways and means to engage diverse students in active and meaningful learning. In line with this, SUTD adopts various student-centric teaching and learning methods and approaches. This means that our graduate/undergraduate instructors have to be ready to teach using these student-centric teaching and learning pedagogies. In this article, I share my experiences of redesigning this teaching course, which is typically conducted face-to-face, as a synchronous online course, and also invite one of the participants in this course to reflect on his experience as a student.

[21]  arXiv:2010.15653 (cross-list from cs.LG) [pdf, other]
Title: Semi-Supervised Speech Recognition via Graph-based Temporal Classification
Comments: Submitted to ICASSP 2021
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Semi-supervised learning has demonstrated promising results in automatic speech recognition (ASR) by self-training using a seed ASR model with pseudo-labels generated for unlabeled data. The effectiveness of this approach largely relies on the pseudo-label accuracy, for which typically only the 1-best ASR hypothesis is used. However, alternative ASR hypotheses of an N-best list can provide more accurate labels for an unlabeled speech utterance and also reflect uncertainties of the seed ASR model. In this paper, we propose a generalized form of the connectionist temporal classification (CTC) objective that accepts a graph representation of the training targets. The newly proposed graph-based temporal classification (GTC) objective is applied for self-training with WFST-based supervision, which is generated from an N-best list of pseudo-labels. In this setup, GTC is used to learn not only a temporal alignment, similarly to CTC, but also a label alignment to obtain the optimal pseudo-label sequence from the weighted graph. Results show that this approach can effectively exploit an N-best list of pseudo-labels with associated scores, outperforming standard pseudo-labeling by a large margin, with ASR results close to an oracle experiment in which the best hypotheses of the N-best lists are selected manually.

Replacements for Fri, 30 Oct 20

[22]  arXiv:1907.06226 (replaced) [pdf, other]
Title: Lexical Simplification with Pretrained Encoders
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[23]  arXiv:1911.02711 (replaced) [pdf, other]
Title: Making the Best Use of Review Summary for Sentiment Analysis
Comments: To be published in COLING-2020
Subjects: Computation and Language (cs.CL)
[24]  arXiv:1911.03875 (replaced) [pdf, other]
Title: Rethinking Self-Attention: Towards Interpretability in Neural Parsing
Comments: EMNLP 2020
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[25]  arXiv:1912.05320 (replaced) [pdf, other]
Title: CoSimLex: A Resource for Evaluating Graded Word Similarity in Context
Journal-ref: Proceedings of the 12th Language Resources and Evaluation Conference (2020) 5878-5886
Subjects: Computation and Language (cs.CL)
[26]  arXiv:2004.00499 (replaced) [pdf, other]
Title: Unique Chinese Linguistic Phenomena
Authors: Shengbin Jia
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[27]  arXiv:2004.03096 (replaced) [pdf, other]
Title: Is Graph Structure Necessary for Multi-hop Question Answering?
Comments: 6 pages, to appear at EMNLP 2020
Subjects: Computation and Language (cs.CL)
[28]  arXiv:2005.12889 (replaced) [pdf, other]
Title: Refining Implicit Argument Annotation for UCCA
Comments: DMR 2020
Subjects: Computation and Language (cs.CL)
[29]  arXiv:2009.07253 (replaced) [pdf, other]
Title: Autoregressive Knowledge Distillation through Imitation Learning
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[30]  arXiv:2010.02480 (replaced) [pdf, other]
Title: Pretrained Language Model Embryology: The Birth of ALBERT
Comments: Accepted to EMNLP 2020, short paper
Subjects: Computation and Language (cs.CL)
[31]  arXiv:2010.02510 (replaced) [pdf, other]
Title: Investigating African-American Vernacular English in Transformer-Based Text Generation
Comments: 7 pages, EMNLP 2020
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
[32]  arXiv:2010.06351 (replaced) [pdf, other]
Title: CAPT: Contrastive Pre-Training for Learning Denoised Sequence Representations
Subjects: Computation and Language (cs.CL)
[33]  arXiv:2010.14571 (replaced) [pdf, other]
Title: Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus
Comments: Accepted to COLING 2020. 9 pages with 8 page abstract
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
[34]  arXiv:2010.14584 (replaced) [pdf, other]
Title: Predicting Themes within Complex Unstructured Texts: A Case Study on Safeguarding Reports
Comments: 10 pages, 5 figures, workshop
Subjects: Computation and Language (cs.CL)
[35]  arXiv:2005.14435 (replaced) [pdf, ps, other]
Title: Sub-Band Knowledge Distillation Framework for Speech Enhancement
Comments: Published in Interspeech 2020
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[36]  arXiv:2005.14441 (replaced) [pdf, other]
Title: SNR-Based Teachers-Student Technique for Speech Enhancement
Comments: Published in 2020 IEEE International Conference on Multimedia and Expo (ICME 2020)
Subjects: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD)
[37]  arXiv:2006.07214 (replaced) [pdf, other]
Title: Sparse and Continuous Attention Mechanisms
Comments: Accepted for spotlight presentation at NeurIPS 2020
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[38]  arXiv:2010.00182 (replaced) [pdf, other]
Title: Dual Attention Model for Citation Recommendation
Authors: Yang Zhang, Qiang Ma
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[39]  arXiv:2010.10759 (replaced) [pdf, other]
Title: Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition
Comments: 5 pages, 2 figures, submitted to ICASSP 2021
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[40]  arXiv:2010.15058 (replaced) [pdf, other]
Title: Measuring non-trivial compositionality in emergent communication
Comments: 4th Workshop on Emergent Communication, NeurIPS 2020
Subjects: Neural and Evolutionary Computing (cs.NE); Computation and Language (cs.CL); Machine Learning (cs.LG)