Publications

Knowledge-intensive language understanding tasks require Language Models (LMs) to integrate relevant context, mitigating their inherent …

How much meaning influences gender assignment across languages is an active area of research in modern linguistics and cognitive …

The emergence of tools based on large language models (LLMs), like OpenAI’s ChatGPT and Google’s Gemini, has garnered immense public …

Uncovering latent values and opinions in large language models (LLMs) can help identify biases and mitigate potential harm. Recently, …

Question Answering (QA) datasets have been instrumental in developing and evaluating Large Language Model (LLM) capabilities. However, …

We are exposed to much information trying to influence us, such as teaser messages, debates, politically framed news, and propaganda - …

Explaining the decision-making process of machine learning models is crucial for ensuring their reliability and fairness. One popular …

Language Models (LMs) acquire parametric knowledge from their training process, embedding it within their weights. The increasing …

What can large language models learn? By definition, language models (LM) are distributions over strings. Therefore, an intuitive way …

Explainable AI methods facilitate the understanding of model behaviour, yet small, imperceptible perturbations to inputs can vastly …

Distorted science communication harms individuals and society as it can lead to unhealthy behavior change and decrease trust in …

Accurate assessments of symptoms and diagnoses are essential for health research and clinical practice but face many challenges. The …

Human values play a vital role as an analytical tool in social sciences, enabling the study of diverse dimensions within society as a …

Recent studies of the emergent capabilities of transformer-based Natural Language Understanding (NLU) models have indicated that they …

Despite mounting evidence that women in foreign policy often bear the brunt of online hostility, the extent of online gender bias …

Large language models have been shown to encode a variety of social biases, which carries the risk of downstream harms. While the …

The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the …

While the prevalence of large pre-trained language models has led to significant improvements in the performance of NLP systems, recent …

NLP models are used in a variety of critical social computing tasks, such as detecting sexist, racist, or otherwise hateful content. …

The digitisation of historical documents has provided historians with unprecedented research opportunities. Yet, the conventional …

Reasoning over spans of tokens from different parts of the input is essential for natural language understanding (NLU) tasks such as …

The moderation of content on online platforms is usually non-transparent. On Wikipedia, however, this discussion is carried out …

Dual use, the intentional, harmful reuse of technology and scientific artefacts, is a problem yet to be well-defined within the context …

Answering complex queries on incomplete knowledge graphs is a challenging task where a model needs to answer complex logical queries in …

Uncertainty approximation in text classification is an important area with applications in domain adaptation and interpretability. One …

The proliferation of harmful content on online platforms is a major societal problem, which comes in many different forms including …

The task of Stance Detection is concerned with identifying the attitudes expressed by an author towards a target of interest. This task …

NLP methods can aid historians in analyzing textual materials in greater volumes than manually feasible. Developing such methods poses …

Data-driven analyses of biases in historical texts can help illuminate the origin and development of biases prevailing in modern …

Explanations of neural models aim to reveal a model’s decision-making process for its predictions. However, recent work shows …

Language embeds information about social, cultural, and political values people hold. Prior work has explored social and potentially …

Pre-trained language models have been known to perpetuate biases from the underlying datasets to downstream tasks. However, these …

Selecting an effective training signal for tasks in natural language processing is difficult: expert annotations are expensive, and …

The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic …

Fact-checking systems have become important tools to verify fake and misguiding news. These systems become more trustworthy when …

Learning scientific document representations can be substantially improved through contrastive learning objectives, where the challenge …

Whether the media faithfully communicate scientific information has long been a core issue to the science community. Automatically …

In our continuously evolving world, entities change over time and new, previously non-existing or unknown, entities appear. We study …

Two of the most fundamental challenges in Natural Language Understanding (NLU) at present are: (a) how to establish whether deep …

There have been many efforts to try to understand what grammatical knowledge (e.g., ability to understand the part of speech of a …

Alongside huge volumes of research on deep learning models in NLP in recent years, there has also been much work on benchmark …

Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to …

Despite attempts to increase gender parity in politics, global efforts have struggled to ensure equal female representation. This is …

With the substantial rise in the amount of mis- and disinformation online, fact checking has become an important task to automate. This …

Detecting attitudes expressed in texts, also known as stance detection, has become an important task for the detection of false …

The success of multilingual pre-trained models is underpinned by their ability to learn representations shared by multiple languages …

Counterfactually Augmented Data (CAD) aims to improve out-of-domain generalizability, an indicator of model robustness. The improvement …

Automating the fact checking (FC) process relies on information obtained from external sources. In this work, we posit that it is …

Automated scientific fact checking is difficult due to the complexity of scientific language and a lack of significant amounts of …

This paper presents the Multitask, Multilingual, Multimodal Language Generation COST Action – Multi3Generation (CA18231), an …

The effectiveness of a language model is influenced by its token representations, which must encode contextual information and handle …

The goal of stance detection is to determine the viewpoint expressed in a piece of text towards a target. These viewpoints or contexts …

Explanations shed light on a machine learning model’s rationales and can aid in identifying deficiencies in its reasoning …

While the prevalence of large pre-trained language models has led to significant improvements in the performance of NLP systems, recent …

We propose a novel framework for cross-lingual content flagging with limited target-language data, which significantly outperforms …

For natural language processing (NLP) tasks such as sentiment or topic classification, currently prevailing approaches heavily rely on …

Citation count prediction is the task of predicting the number of citations a paper has gained after a period of time. Prior work …

Medical artificial intelligence (AI) systems have been remarkably successful, even outperforming human performance at certain tasks. …

Truth can vary over time. Therefore, fact-checking decisions on claim veracity should take into account temporal information of both …

Public trust in science depends on honest and factual communication of scientific papers. However, recent studies have demonstrated a …

As NLP models are increasingly deployed in socially situated settings such as online abusive content detection, ensuring these models …

Stance detection concerns the classification of a writer’s viewpoint towards a target. There are different task variants, e.g., …

Emotion lexica are commonly used resources to combat data poverty in automatic emotion detection. However, methodological issues emerge …

Most work on scholarly document processing assumes that the information processed is trustworthy and factually correct. However, this …

Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world. …

Sparse attention has been claimed to increase model interpretability under the assumption that it highlights influential inputs. Yet …

Scientific document understanding is challenging as the data is highly domain specific and diverse. However, datasets for tasks with …

Recently, novel multi-hop models and datasets have been introduced to achieve more complex natural language reasoning with neural …

Bridging the performance gap between high- and low-resource languages has been the focus of much previous work. Typological features …

The past decade has seen a substantial rise in the amount of mis- and disinformation online, from targeted disinformation campaigns to …

In this paper, we describe our participation in the TREC Health Misinformation Track 2020. We submitted 11 runs to the Total Recall …

Learning what to share between tasks has been a topic of high importance recently, as strategic sharing of knowledge has been shown to …

In practical machine learning settings, the data on which a model must make predictions often come from a different distribution than …

Subjectivity is the expression of internal opinions or beliefs which cannot be objectively observed or verified, and has been shown to …

Adversarial attacks reveal important vulnerabilities and flaws of trained models. One potent type of attack is universal adversarial …

Recent developments in machine learning have introduced models that approach human performance at the cost of increased architectural …

Peer review is our best tool for judging the quality of conference submissions, but it is becoming increasingly spurious. We argue that …

A critical component of automatically combating misinformation is the detection of fact check-worthiness, i.e. determining if a piece …

It is challenging to automatically evaluate the answer of a QA model at inference time. Although many models provide confidence scores, …

Typological knowledge bases (KBs) such as WALS contain information about linguistic properties of the world’s languages. They …

Machine Learning (ML) seeks to identify and encode bodies of knowledge within provided datasets. However, data encodes subjective …

While state-of-the-art NLP explainability (XAI) methods focus on supervised, per-instance end or diagnostic probing task evaluation[4, …

This paper provides the first study of how fact checking explanations can be generated automatically based on available claim context, …

We propose a novel Chinese character conversion model that can disambiguate between mappings and convert between the two scripts. The …

In this paper, we extend the task of semantic textual similarity to include sentences which contain emojis. Emojis are ubiquitous on …

Language evolves over time in many ways relevant to natural language processing tasks. For example, recent occurrences of tokens …

We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim …

Digital media enables not only fast sharing of information, but also disinformation. One prominent case of an event leading to …

Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in …

Task-oriented dialogue systems rely heavily on specialized dialogue state tracking (DST) modules for dynamically predicting user intent …

Multi-task learning and self-training are two common ways to improve a machine learning model’s performance in settings with …

Studying to what degree the language we use is gender-specific has long been an area of interest in socio-linguistics. Studies have …

The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages …

The workshop has a focus on vector space models of meaning, compositionality, and the application of deep neural networks and spectral …

When assigning quantitative labels to a dataset, different methodologies may rely on different scales. In particular, when assigning …

In the Principles and Parameters framework, the structural features of languages depend on parameters that may be toggled on or off, …

In online discussion fora, speakers often make arguments for or against something, say birth control, by highlighting certain aspects …

A neural language model trained on a text corpus can be used to induce distributed representations of words, such that similar words …

Multi-task learning (MTL) allows deep neural networks to learn from related tasks by sharing parameters with other networks. In …

The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks – a task that amounts to question …

Previous work has suggested that parameter sharing between transition-based neural dependency parsers for related languages can lead to …

This paper documents the Team Copenhagen system which placed first in the CoNLL–SIGMORPHON 2018 shared task on universal …

Punctuation is a strong indicator of syntactic structure, and parsers trained on text with punctuation often rely heavily on this …

Many Machine Reading and Natural Language Understanding tasks require reading supporting text in order to answer questions. For …

The workshop has a focus on vector space models of meaning, compositionality, and the application of deep neural networks and spectral …

Neural part-of-speech (POS) taggers are known to not perform well with little training data. As a step towards overcoming this problem, …

We combine multi-task learning and semisupervised learning by inducing a joint embedding space between disparate label spaces and …

A core part of linguistic typology is the classification of languages according to linguistic properties, such as those detailed in the …

We take a multi-task learning approach to the shared Task 1 at SemEval-2018. The general idea concerning the model structure is to use …

Rumour stance classification, defined as classifying the stance of specific social media posts into one of supporting, denying, …

Although linguistic typology has a long history, computational approaches have only recently gained popularity. The use of distributed …

Keyphrase boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to …

Named Entity Recognition (NER) is a key NLP task, which is all the more challenging on Web and user-generated content with their …

Identifying public misinformation is a complicated and challenging task. An important part of checking the veracity of a specific claim …

Automatic summarisation is a popular approach to reduce a document to its main arguments. Recent research in the area has focused on …

This paper describes team Turing’s submission to SemEval 2017 RumourEval: Determining rumour veracity and support for rumours …

Rumour stance classification is a task that involves identifying the attitude of Twitter users towards the truthfulness of the rumour …

We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for …

We propose a novel similarity measure able to cope with unbalanced population of schema elements, an unsupervised technique to …