Publications

Explanations shed light on a machine learning model’s rationales and can aid in identifying deficiencies in its reasoning …

The goal of stance detection is to determine the viewpoint expressed in a piece of text towards a target. These viewpoints or contexts …

Truth can vary over time. Therefore, fact-checking decisions on claim veracity should take into account temporal information of both …

Public trust in science depends on honest and factual communication of scientific papers. However, recent studies have demonstrated a …

As NLP models are increasingly deployed in socially situated settings such as online abusive content detection, ensuring these models …

Stance detection concerns the classification of a writer’s viewpoint towards a target. There are different task variants, e.g., …

Alongside huge volumes of research on deep learning models in NLP in recent years, there has also been much work on benchmark …

Emotion lexica are commonly used resources to combat data poverty in automatic emotion detection. However, methodological issues emerge …

Most work on scholarly document processing assumes that the information processed is trustworthy and factually correct. However, this …

Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world. …

Sparse attention has been claimed to increase model interpretability under the assumption that it highlights influential inputs. Yet …

Scientific document understanding is challenging as the data is highly domain specific and diverse. However, datasets for tasks with …

Recently, novel multi-hop models and datasets have been introduced to achieve more complex natural language reasoning with neural …

While the prevalence of large pre-trained language models has led to significant improvements in the performance of NLP systems, recent …

We propose a novel interpretable framework for cross-lingual content flagging, which significantly outperforms prior work both in terms …

Abusive language on online platforms is a major societal problem, often leading to important societal problems such as the …

Detecting attitudes expressed in texts, also known as stance detection, has become an important task for the detection of false …

Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to …

Bridging the performance gap between high- and low-resource languages has been the focus of much previous work. Typological features …

The effectiveness of a language model is influenced by its token representations, which must encode contextual information and handle …

Citation count prediction is the task of predicting the number of citations a paper has gained after a period of time. Prior work …

In this paper, we describe our participation in the TREC Health Misinformation Track 2020. We submitted 11 runs to the Total Recall …

For natural language processing (NLP) tasks such as sentiment or topic classification, currently prevailing approaches heavily rely on …

Learning what to share between tasks has been a topic of high importance recently, as strategic sharing of knowledge has been shown to …

In practical machine learning settings, the data on which a model must make predictions often come from a different distribution than …

Subjectivity is the expression of internal opinions or beliefs which cannot be objectively observed or verified, and has been shown to …

Adversarial attacks reveal important vulnerabilities and flaws of trained models. One potent type of attack are universal adversarial …

Recent developments in machine learning have introduced models that approach human performance at the cost of increased architectural …

Peer review is our best tool for judging the quality of conference submissions, but it is becoming increasingly spurious. We argue that …

A critical component of automatically combating misinformation is the detection of fact check-worthiness, i.e. determining if a piece …

It is challenging to automatically evaluate the answer of a QA model at inference time. Although many models provide confidence scores, …

Typological knowledge bases (KBs) such as WALS contain information about linguistic properties of the world’s languages. They …

Machine Learning (ML) seeks to identify and encode bodies of knowledge within provided datasets. However, data encodes subjective …

While state-of-the-art NLP explainability (XAI) methods focus on supervised, per-instance end or diagnostic probing task evaluation[4, …

This paper provides the first study of how fact checking explanations can be generated automatically based on available claim context, …

We propose a novel Chinese character conversion model that can disambiguate between mappings and convert between the two scripts. The …

In this paper, we extend the task of semantic textual similarity to include sentences which contain emojis. Emojis are ubiquitous on …

Language evolves over time in many ways relevant to natural language processing tasks. For example, recent occurrences of tokens …

We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim …

Digital media enables not only fast sharing of information, but also disinformation. One prominent case of an event leading to …

Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in …

Task-oriented dialogue systems rely heavily on specialized dialogue state tracking (DST) modules for dynamically predicting user intent …

Multi-task learning and self-training are two common ways to improve a machine learning model’s performance in settings with …

Studying to what degree the language we use is gender-specific has long been an area of interest in socio-linguistics. Studies have …

The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages …

The workshop has a focus on vector space models of meaning, compositionality, and the application of deep neural networks and spectral …

When assigning quantitative labels to a dataset, different methodologies may rely on different scales. In particular, when assigning …

In the Principles and Parameters framework, the structural features of languages depend on parameters that may be toggled on or off, …

In online discussion fora, speakers often make arguments for or against something, say birth control, by highlighting certain aspects …

A neural language model trained on a text corpus can be used to induce distributed representations of words, such that similar words …

Multi-task learning (MTL) allows deep neural networks to learn from related tasks by sharing parameters with other networks. In …

The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks – a task that amounts to question …

Previous work has suggested that parameter sharing between transition-based neural dependency parsers for related languages can lead to …

This paper documents the Team Copenhagen system which placed first in the CoNLL–SIGMORPHON 2018 shared task on universal …

Punctuation is a strong indicator of syntactic structure, and parsers trained on text with punctuation often rely heavily on this …

Many Machine Reading and Natural Language Understanding tasks require reading supporting text in order to answer questions. For …

The workshop has a focus on vector space models of meaning, compositionality, and the application of deep neural networks and spectral …

Neural part-of-speech (POS) taggers are known to not perform well with little training data. As a step towards overcoming this problem, …

We combine multi-task learning and semisupervised learning by inducing a joint embedding space between disparate label spaces and …

A core part of linguistic typology is the classification of languages according to linguistic properties, such as those detailed in the …

We take a multi-task learning approach to the shared Task 1 at SemEval-2018. The general idea concerning the model structure is to use …

Rumour stance classification, defined as classifying the stance of specific social media posts into one of supporting, denying, …

Although linguistic typology has a long history, computational approaches have only recently gained popularity. The use of distributed …

Keyphrase boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to …

Named Entity Recognition (NER) is a key NLP task, which is all the more challenging on Web and user-generated content with their …

Identifying public misinformation is a complicated and challenging task. An important part of checking the veracity of a specific claim …

Automatic summarisation is a popular approach to reduce a document to its main arguments. Recent research in the area has focused on …

This paper describes team Turing’s submission to SemEval 2017 RumourEval: Determining rumour veracity and support for rumours …

Rumour stance classification is a task that involves identifying the attitude of Twitter users towards the truthfulness of the rumour …

We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for …

We propose a novel similarity measure able to cope with unbalanced population of schema elements, an unsupervised technique to …