Publications | CopeNLU

Modeling Public Perceptions of Science in Media

Effectively engaging the public with science is vital for fostering trust and understanding in our scientific community. Yet, with an …

Jiaxin Pei, Dustin Wright, Isabelle Augenstein, David Jurgens


Explainability and Interpretability of Multilingual Large Language Models: A Survey

Multilingual large language models (MLLMs) demonstrate state-of-the-art capabilities across diverse cross-lingual and multilingual …

Lucas Resck, Isabelle Augenstein, Anna Korhonen


Community Moderation and the New Epistemology of Fact Checking on Social Media

Social media platforms have traditionally relied on internal moderation teams and partnerships with independent fact-checking …

Isabelle Augenstein, Michiel Bakker, Tanmoy Chakraborty, David Corney, Emilio Ferrara, Iryna Gurevych, Scott Hale, Eduard Hovy, Heng Ji, Irene Larraz, Filippo Menczer, Preslav Nakov, Paolo Papotti, Dhruv Sahnan, Greta Warren, Giovanni Zagni


Explaining Sources of Uncertainty in Automated Fact-Checking

Understanding sources of a model’s uncertainty regarding its predictions is crucial for effective human-AI collaboration. Prior …

Jingyi Sun, Greta Warren, Irina Shklovski, Isabelle Augenstein


CUB: Benchmarking Context Utilisation Techniques for Language Models

Incorporating external knowledge is crucial for knowledge-intensive tasks, such as question answering and fact checking. However, …

Lovisa Hagström, Youna Kim, Haeun Yu, Sang-goo Lee, Richard Johansson, Hyunsoo Cho, Isabelle Augenstein


Can Community Notes Replace Professional Fact-Checkers?

Two commonly-employed strategies to combat the rise of misinformation on social media are (i) fact-checking by professional …

Greta Warren, Nadav Borenstein, Desmond Elliott, Isabelle Augenstein


A Reality Check on Context Utilisation for Retrieval-Augmented Generation

Retrieval-augmented generation (RAG) helps address the limitations of the parametric knowledge embedded within a language model (LM). …

Lovisa Hagström, Sara Vera Marjanović, Haeun Yu, Arnav Arora, Christina Lioma, Maria Maistro, Pepa Atanasova, Isabelle Augenstein


The LEADING Guideline Reporting Standards for Expert Panel, Best-Estimate Diagnosis, and Longitudinal Expert All Data (LEAD) Studies

Accurate assessments of symptoms and diagnoses are essential for health research and clinical practice but face many challenges. The …

Veerle C Eijsbroek, Katarina Kjell, H Andrew Schwartz, Jan R Boehnke, Eiko I Fried, Daniel N Klein, Peik Gustafsson, Isabelle Augenstein, Patrick M M Bossuyt, Oscar Kjell


Survey of Cultural Awareness in Language Models: Text and Beyond

Large-scale deployment of large language models (LLMs) in various applications, such as chatbots and virtual assistants, requires LLMs …

Siddhesh Milind Pawar, Junyeong Park, Jiho Jin, Arnav Arora, Junho Myung, Srishti Yadav, Faiz Ghifari Haznitrama, Inhwa Song, Alice Oh, Isabelle Augenstein


Collecting Cost-Effective, High-Quality Truthfulness Assessments with LLM Summarized Evidence

With the degradation of guardrails against mis- and disinformation online, it is more critical than ever to be able to effectively …

Kevin Roitero, Dustin Wright, Michael Soprano, Isabelle Augenstein, Stefano Mizzaro


Multi-View Knowledge Distillation from Crowd Annotations for Out-of-Domain Generalization

Selecting an effective training signal for tasks in natural language processing is difficult: expert annotations are expensive, and …

Dustin Wright, Isabelle Augenstein


Multi-Modal Framing Analysis of News

Automated frame analysis of political communication is a popular task in computational social science that is used to study how authors …

Arnav Arora, Srishti Yadav, Maria Antoniak, Serge Belongie, Isabelle Augenstein


A Meta-Evaluation of Style and Attribute Transfer Metrics

LLMs make it easy to rewrite text in any style, be it more polite, persuasive, or more positive. We present a large-scale study of …

Amalie Brogaard Pauli, Isabelle Augenstein, Ira Assent


Unstructured Evidence Attribution for Long Context Query Focused Summarization

Large language models (LLMs) are capable of generating coherent summaries from very long contexts given a user query. Extracting and …

Dustin Wright, Zain Muhammad Mujahid, Lu Wang, Isabelle Augenstein, David Jurgens


Presumed Cultural Identity: How Names Shape LLM Responses

Names are deeply tied to human identity. They can serve as markers of individuality, cultural heritage, and personal history. However, …

Siddhesh Pawar, Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein


Show Me the Work: Fact-Checkers' Requirements for Explainable Automated Fact-Checking

The pervasiveness of large language models and generative AI in online media has amplified the need for effective automated …

Greta Warren, Irina Shklovski, Isabelle Augenstein


Specializing Large Language Models to Simulate Survey Response Distributions for Global Populations

Large-scale surveys are essential tools for informing social science research and policy, but running surveys is costly and …

Yong Cao, Haijiang Liu, Arnav Arora, Isabelle Augenstein, Paul Röttger, Daniel Hershcovich


Measuring and Benchmarking Large Language Models' Capabilities to Generate Persuasive Language

We are exposed to much information trying to influence us, such as teaser messages, debates, politically framed news, and propaganda - …

Amalie Brogaard Pauli, Isabelle Augenstein, Ira Assent


Investigating Human Values in Online Communities

Studying human values is instrumental for cross-cultural research, enabling a better understanding of preferences and behaviour of …

Nadav Borenstein, Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein


A Unified Framework for Input Feature Attribution Analysis

Explaining the decision-making process of machine learning models is crucial for ensuring their reliability and fairness. One popular …

Jingyi Sun, Pepa Atanasova, Isabelle Augenstein


With Great Backbones Comes Great Adversarial Transferability

The large and ever-increasing amount of data available on the Internet coupled with the laborious task of manual claim and fact …

Erik Arakelyan, Karen Hambardzumyan, Davit Papikyan, Pasquale Minervini, Albert Gordo, Aram H. Markosyan, Isabelle Augenstein


SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages

Question Answering (QA) datasets have been instrumental in developing and evaluating Large Language Model (LLM) capabilities. However, …

Gayane Ghazaryan, Erik Arakelyan, Pasquale Minervini, Isabelle Augenstein


FLARE: Faithful Logic-Aided Reasoning and Exploration

Modern Question Answering (QA) and Reasoning approaches based on Large Language Models (LLMs) commonly use prompting techniques, such …

Erik Arakelyan, Pasquale Minervini, Pat Verga, Patrick Lewis, Isabelle Augenstein


Social Bias Probing: Fairness Benchmarking for Language Models

Large language models have been shown to encode a variety of social biases, which carries the risk of downstream harms. While the …

Marta Marchiori Manerba, Karolina Stańczak, Riccardo Guidotti, Isabelle Augenstein


Revealing Fine-Grained Values and Opinions in Large Language Models

Uncovering latent values and opinions in large language models (LLMs) can help identify biases and mitigate potential harm. Recently, …

Dustin Wright, Arnav Arora, Nadav Borenstein, Srishti Yadav, Serge Belongie, Isabelle Augenstein


Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-Checkers

The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the …

Yuxia Wang, Revanth Gangi Reddy, Zain Muhammad Mujahid, Arnav Arora, Aleksandr Rubashevskii, Jiahui Geng, Osama Mohammed Afzal, Liangming Pan, Nadav Borenstein, Aditya Pillai, Isabelle Augenstein, Iryna Gurevych, Preslav Nakov


DYNAMICQA: Tracing Internal Knowledge Conflicts in Language Models

Knowledge-intensive language understanding tasks require Language Models (LMs) to integrate relevant context, mitigating their inherent …

Sara Vera Marjanović, Haeun Yu, Pepa Atanasova, Maria Maistro, Christina Lioma, Isabelle Augenstein


Can Transformers Learn n-gram Language Models?

Much theoretical work has described the ability of transformer language models (LMs) to represent formal languages. However, linking …

Anej Svete, Nadav Borenstein, Mike Zhou, Isabelle Augenstein, Ryan Cotterell


Claim Verification in the Age of Large Language Models: A Survey

The large and ever-increasing amount of data available on the Internet coupled with the laborious task of manual claim and fact …

Alphaeus Dmonte, Roland Oruche, Marcos Zampieri, Prasad Calyam, Isabelle Augenstein


Grammatical Gender's Influence on Distributional Semantics: A Causal Perspective

How much meaning influences gender assignment across languages is an active area of research in modern linguistics and cognitive …

Karolina Stańczak, Kevin Du, Adina Williams, Isabelle Augenstein, Ryan Cotterell


Factuality Challenges in the Era of Large Language Models

The emergence of tools based on large language models (LLMs), like OpenAI’s ChatGPT and Google’s Gemini, has garnered immense public …

Isabelle Augenstein, Timothy Baldwin, Meeyoung Cha, Tanmoy Chakraborty, Giovanni Luca Ciampaglia, David Corney, Renee DiResta, Emilio Ferrara, Scott Hale, Alon Halevy, Eduard Hovy, Heng Ji, Filippo Menczer, Ruben Miguez, Preslav Nakov, Dietram Scheufele, Shivam Sharma, Giovanni Zagni


Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods

Language Models (LMs) acquire parametric knowledge from their training process, embedding it within their weights. The increasing …

Haeun Yu, Pepa Atanasova, Isabelle Augenstein


What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages

What can large language models learn? By definition, language models (LM) are distributions over strings. Therefore, an intuitive way …

Nadav Borenstein, Anej Svete, Robin Chan, Josef Valvoda, Franz Nowak, Isabelle Augenstein, Eleanor Chodroff, Ryan Cotterell


Investigating the Impact of Model Instability on Explanations and Uncertainty

Explainable AI methods facilitate the understanding of model behaviour, yet, small, imperceptible perturbations to inputs can vastly …

Sara Vera Marjanović, Isabelle Augenstein, Christina Lioma


Understanding Fine-grained Distortions in Reports of Scientific Findings

Distorted science communication harms individuals and society as it can lead to unhealthy behavior change and decrease trust in …

Amelie Wührl, Dustin Wright, Roman Klinger, Isabelle Augenstein


Semantic Sensitivities and Inconsistent Predictions: Measuring the Fragility of NLI Models

Recent studies of the emergent capabilities of transformer-based Natural Language Understanding (NLU) models have indicated that they …

Erik Arakelyan, Zhaoqi Liu, Isabelle Augenstein


Invisible Women in Digital Diplomacy: A Multidimensional Framework for Online Gender Bias Against Women Ambassadors Worldwide

Despite mounting evidence that women in foreign policy often bear the brunt of online hostility, the extent of online gender bias …

Yevgeniy Golovchenko, Karolina Stańczak, Rebecca Adler-Nissen, Patrice Wangen, Isabelle Augenstein


Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models

While the prevalence of large pre-trained language models has led to significant improvements in the performance of NLP systems, recent …

Karolina Stańczak, Sagnik Ray Choudhury, Tiago Pimentel, Ryan Cotterell, Isabelle Augenstein


People Make Better Edits: Measuring the Efficacy of LLM-Generated Counterfactually Augmented Data for Harmful Language Detection

NLP models are used in a variety of critical social computing tasks, such as detecting sexist, racist, or otherwise hateful content. …

Indira Sen, Dennis Assenmacher, Mattia Samory, Isabelle Augenstein, Wil van der Aalst, Claudia Wagner


PHD: Pixel-Based Language Modeling of Historical Documents

The digitisation of historical documents has provided historians with unprecedented research opportunities. Yet, the conventional …

Nadav Borenstein, Phillip Rust, Desmond Elliott, Isabelle Augenstein


Explaining Interactions Between Text Spans

Reasoning over spans of tokens from different parts of the input is essential for natural language understanding (NLU) tasks such as …

Sagnik Ray Choudhury, Pepa Atanasova, Isabelle Augenstein


Why Should This Article Be Deleted? Transparent Stance Detection in Multilingual Wikipedia Editor Discussions

The moderation of content on online platforms is usually non-transparent. On Wikipedia, however, this discussion is carried out …

Lucie-Aimée Kaffee, Arnav Arora, Isabelle Augenstein


Thorny Roses: Investigating the Dual Use Dilemma in Natural Language Processing

Dual use, the intentional, harmful reuse of technology and scientific artefacts, is a problem yet to be well-defined within the context …

Lucie-Aimée Kaffee, Arnav Arora, Zeerak Talat, Isabelle Augenstein


Adapting Neural Link Predictors for Complex Query Answering

Answering complex queries on incomplete knowledge graphs is a challenging task where a model needs to answer complex logical queries in …

Erik Arakelyan, Pasquale Minervini, Isabelle Augenstein


Revisiting Softmax for Uncertainty Approximation in Text Classification

Uncertainty approximation in text classification is an important area with applications in domain adaptation and interpretability. One …

Andreas Nugaard Holm, Dustin Wright, Isabelle Augenstein


Detecting Harmful Content on Online Platforms: What Platforms Need vs. Where Research Efforts Go

The proliferation of harmful content on online platforms is a major societal problem, which comes in many different forms including …

Arnav Arora, Preslav Nakov, Vibha Nayak, Kyle Dent, Ameya Bhatawdekar, Sheikh Muhammad Sarwar, Momchil Hardalov, Yoan Dinkov, Dimitrina Zlatkova, Guillaume Bouchard, Isabelle Augenstein


Topic-Guided Sampling For Data-Efficient Multi-Domain Stance Detection

The task of Stance Detection is concerned with identifying the attitudes expressed by an author towards a target of interest. This task …

Erik Arakelyan, Arnav Arora, Isabelle Augenstein


Multilingual Event Extraction from Historical Newspaper Adverts

NLP methods can aid historians in analyzing textual materials in greater volumes than manually feasible. Developing such methods poses …

Nadav Borenstein, Natália da Silva Perez, Isabelle Augenstein


Measuring Intersectional Biases in Historical Documents

Data-driven analyses of biases in historical texts can help illuminate the origin and development of biases prevailing in modern …

Nadav Borenstein, Karolina Stańczak, Thea Rolskov, Natacha Klein Käfer, Natália da Silva Perez, Isabelle Augenstein


Faithfulness Tests for Natural Language Explanations

Explanations of neural models aim to reveal a model’s decision-making process for its predictions. However, recent work shows …

Pepa Atanasova, Oana-Maria Camburu, Christina Lioma, Thomas Lukasiewicz, Jakob Grue Simonsen, Isabelle Augenstein


Probing Pre-Trained Language Models for Cross-Cultural Differences in Values

Language embeds information about social, cultural, and political values people hold. Prior work has explored social and potentially …

Arnav Arora, Lucie-Aimée Kaffee, Isabelle Augenstein


Measuring Gender Bias in West Slavic Language Models

Pre-trained language models have been known to perpetuate biases from the underlying datasets to downstream tasks. However, these …

Sandra Martinková, Karolina Stańczak, Isabelle Augenstein


A Latent-Variable Model for Intrinsic Probing

The success of pre-trained contextualized representations has prompted researchers to analyze them for the presence of linguistic …

Karolina Stańczak, Lucas Torroba Hennigen, Adina Williams, Ryan Cotterell, Isabelle Augenstein


Generating Fluent Fact Checking Explanations with Unsupervised Post-Editing

Fact-checking systems have become important tools to verify fake and misguiding news. These systems become more trustworthy when …

Shailza Jolly, Pepa Atanasova, Isabelle Augenstein


Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings

Learning scientific document representations can be substantially improved through contrastive learning objectives, where the challenge …

Malte Ostendorff, Nils Rethmeier, Isabelle Augenstein, Bela Gipp, Georg Rehm


Modeling Information Change in Science Communication with Semantically Matched Paraphrases

Whether the media faithfully communicate scientific information has long been a core issue to the science community. Automatically …

Dustin Wright, Jiaxin Pei, David Jurgens, Isabelle Augenstein


TempEL: Linking Dynamically Evolving and Newly Emerging Entities

In our continuously evolving world, entities change over time and new, previously non-existing or unknown, entities appear. We study …

Klim Zaporojets, Lucie-Aimée Kaffee, Johannes Deleu, Thomas Demeester, Chris Develder, Isabelle Augenstein


Machine Reading, Fast and Slow: When Do Models 'Understand' Language?

Two of the most fundamental challenges in Natural Language Understanding (NLU) at present are: (a) how to establish whether deep …

Sagnik Ray Choudhury, Anna Rogers, Isabelle Augenstein


Can Edge Probing Tasks Reveal Linguistic Knowledge in QA Models?

There have been many efforts to try to understand what grammatical knowledge (e.g., ability to understand the part of speech of a …

Sagnik Ray Choudhury, Nikita Bhutani, Isabelle Augenstein


QA Dataset Explosion: A Taxonomy of NLP Resources for Question Answering and Reading Comprehension

Alongside huge volumes of research on deep learning models in NLP in the recent years, there has been also much work on benchmark …

Anna Rogers, Matt Gardner, Isabelle Augenstein


A Primer on Contrastive Pretraining in Language Processing: Methods, Lessons Learned and Perspectives

Modern natural language processing (NLP) methods employ self-supervised pretraining objectives such as masked language modeling to …

Nils Rethmeier, Isabelle Augenstein


Quantifying Gender Biases Towards Politicians on Reddit

Despite attempts to increase gender parity in politics, global efforts have struggled to ensure equal female representation. This is …

Sara Marjanovic, Karolina Stańczak, Isabelle Augenstein


Habilitation Abstract: Towards Explainable Fact Checking

With the substantial rise in the amount of mis- and disinformation online, fact checking has become an important task to automate. This …

Isabelle Augenstein


A Survey on Stance Detection for Mis- and Disinformation Identification

Detecting attitudes expressed in texts, also known as stance detection, has become an important task for the detection of false …

Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein


Same Neurons, Different Languages: Probing Morphosyntax in Multilingual Pre-trained Models

The success of multilingual pre-trained models is underpinned by their ability to learn representations shared by multiple languages …

Karolina Stańczak, Edoardo Ponti, Lucas Torroba Hennigen, Ryan Cotterell, Isabelle Augenstein


Counterfactually Augmented Data and Unintended Bias: The Case of Sexism and Hate Speech Detection

Counterfactually Augmented Data (CAD) aims to improve out-of-domain generalizability, an indicator of model robustness. The improvement …

Indira Sen, Mattia Samory, Claudia Wagner, Isabelle Augenstein


Fact Checking with Insufficient Evidence

Automating the fact checking (FC) process relies on information obtained from external sources. In this work, we posit that it is …

Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein


Generating Scientific Claims for Zero-Shot Scientific Fact Checking

Automated scientific fact checking is difficult due to the complexity of scientific language and a lack of significant amounts of …

Dustin Wright, David Wadden, Kyle Lo, Bailey Kuehl, Isabelle Augenstein, Lucy Lu Wang


Multi3Generation: Multi-task, Multilingual, Multi-Modal Language Generation

This paper presents the Multitask, Multilingual, Multimodal Language Generation COST Action – Multi3Generation (CA18231), an …

Anabela Barreiro, José G. C. de Souza, Albert Gatt, Mehul Bhatt, Elena Lloret, Aykut Erdem, Dimitra Gkatzia, Helena Moniz, Irene Russo, Fabio Kepler, Iacer Calixto, Marcin Paprzycki, François Portet, Isabelle Augenstein, Mirela Alhasani


Multi-Sense Language Modelling

The effectiveness of a language model is influenced by its token representations, which must encode contextual information and handle …

Andrea Lekkas, Peter Schneider-Kamp, Isabelle Augenstein


Few-Shot Cross-Lingual Stance Detection with Sentiment-Based Pre-Training

The goal of stance detection is to determine the viewpoint expressed in a piece of text towards a target. These viewpoints or contexts …

Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein


Diagnostics-Guided Explanation Generation

Explanations shed light on a machine learning model’s rationales and can aid in identifying deficiencies in its reasoning …

Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein


A Survey on Gender Bias in Natural Language Processing

While the prevalence of large pre-trained language models has led to significant improvements in the performance of NLP systems, recent …

Karolina Stańczak, Isabelle Augenstein


A Neighbourhood Framework for Resource-Lean Content Flagging

We propose a novel framework for cross-lingual content flagging with limited target-language data, which significantly outperforms …

Sheikh Muhammad Sarwar, Dimitrina Zlatkova, Momchil Hardalov, Yoan Dinkov, Isabelle Augenstein, Preslav Nakov


Contrastive Text Pretraining for Zero to Few-Shot Long-Tail Learning

For natural language processing (NLP) tasks such as sentiment or topic classification, currently prevailing approaches heavily rely on …

Nils Rethmeier, Isabelle Augenstein


Longitudinal Citation Prediction using Temporal Graph Neural Networks

Citation count prediction is the task of predicting the number of citations a paper has gained after a period of time. Prior work …

Andreas Nugaard Holm, Barbara Plank, Dustin Wright, Isabelle Augenstein


Information fusion as an integrative cross-cutting enabler to achieve robust, explainable, and trustworthy medical artificial intelligence

Medical artificial intelligence (AI) systems have been remarkably successful, even outperforming human performance at certain tasks. …

Andreas Holzinger, Matthias Dehmer, Frank Emmert-Streib, Rita Cucchiara, Isabelle Augenstein, Javier Del Ser, Wojciech Samek, Igor Jurisica, Natalia Díaz-Rodríguez


Time-Aware Evidence Ranking for Fact-Checking

Truth can vary over time. Therefore, fact-checking decisions on claim veracity should take into account temporal information of both …

Liesbeth Allein, Isabelle Augenstein, Marie-Francine Moens


Semi-Supervised Exaggeration Detection of Health Science Press Releases

Public trust in science depends on honest and factual communication of scientific papers. However, recent studies have demonstrated a …

Dustin Wright, Isabelle Augenstein


How Does Counterfactually Augmented Data Impact Models for Social Computing Constructs?

As NLP models are increasingly deployed in socially situated settings such as online abusive content detection, ensuring these models …

Indira Sen, Mattia Samory, Fabian Flöck, Claudia Wagner, Isabelle Augenstein


Cross-Domain Label-Adaptive Stance Detection

Stance detection concerns the classification of a writer’s viewpoint towards a target. There are different task variants, e.g., …

Momchil Hardalov, Arnav Arora, Preslav Nakov, Isabelle Augenstein


Joint Emotion Label Space Modelling for Affect Lexica

Emotion lexica are commonly used resources to combat data poverty in automatic emotion detection. However, methodological issues emerge …

Luna De Bruyne, Pepa Atanasova, Isabelle Augenstein


Determining the Credibility of Science Communication

Most work on scholarly document processing assumes that the information processed is trust-worthy and factually correct. However, this …

Isabelle Augenstein


Inducing Language-Agnostic Multilingual Representations

Cross-lingual representations have the potential to make NLP techniques available to the vast majority of languages in the world. …

Wei Zhao, Steffen Eger, Johannes Bjerva, Isabelle Augenstein


Is Sparse Attention more Interpretable?

Sparse attention has been claimed to increase model interpretability under the assumption that it highlights influential inputs. Yet …

Clara Meister, Stefan Lazov, Isabelle Augenstein, Ryan Cotterell


CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding

Scientific document understanding is challenging as the data is highly domain specific and diverse. However, datasets for tasks with …

Dustin Wright, Isabelle Augenstein


Multi-Hop Fact Checking of Political Claims

Recently, novel multi-hop models and datasets have been introduced to achieve more complex natural language reasoning with neural …

Wojciech Ostrowski, Arnav Arora, Pepa Atanasova, Isabelle Augenstein


White Paper - Creating a Repository of Objectionable Online Content: Addressing Undesirable Biases and Ethical Considerations

This white paper summarizes the authors’ structured brainstorming regarding ethical considerations for creating an extensive …

Thamar Solorio, Mahsa Shafaei, Christos Smailis, Isabelle Augenstein, Margaret Mitchell, Ingrid Stapf, Ioannis Kakadiaris


Does Typological Blinding Impede Cross-Lingual Sharing?

Bridging the performance gap between high- and low-resource languages has been the focus of much previous work. Typological features …

Johannes Bjerva, Isabelle Augenstein


Towards Explainable Fact Checking

The past decade has seen a substantial rise in the amount of mis- and disinformation online, from targeted disinformation campaigns to …

Isabelle Augenstein


University of Copenhagen Participation in TREC Health Misinformation Track 2020

In this paper, we describe our participation in the TREC Health Misinformation Track 2020. We submitted 11 runs to the Total Recall …

Lucas Chaves Lima, Dustin Wright, Isabelle Augenstein, Maria Maistro


Zero-Shot Cross-Lingual Transfer with Meta Learning

Learning what to share between tasks has been a topic of high importance recently, as strategic sharing of knowledge has been shown to …

Farhad Nooralahzadeh, Giannis Bekoulis, Johannes Bjerva, Isabelle Augenstein


Transformer Based Multi-Source Domain Adaptation

In practical machine learning settings, the data on which a model must make predictions often come from a different distribution than …

Dustin Wright, Isabelle Augenstein


SubjQA: A Dataset for Subjectivity and Review Comprehension

Subjectivity is the expression of internal opinions or beliefs which cannot be objectively observed or verified, and has been shown to …

Johannes Bjerva, Nikita Bhutani, Behzad Golshan, Wang-Chiew Tan, Isabelle Augenstein


Generating Label Cohesive and Well-Formed Adversarial Claims

Adversarial attacks reveal important vulnerabilities and flaws of trained models. One potent type of attack are universal adversarial …

Pepa Atanasova, Dustin Wright, Isabelle Augenstein


A Diagnostic Study of Explainability Techniques for Text Classification

Recent developments in machine learning have introduced models that approach human performance at the cost of increased architectural …

Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein


What Can We Do to Improve Peer Review in NLP?

Peer review is our best tool for judging the quality of conference submissions, but it is becoming increasingly spurious. We argue that …

Anna Rogers, Isabelle Augenstein


Claim Check-Worthiness Detection as Positive Unlabelled Learning

A critical component of automatically combating misinformation is the detection of fact check-worthiness, i.e. determining if a piece …

Dustin Wright, Isabelle Augenstein


Unsupervised Evaluation for Question Answering with Transformers

It is challenging to automatically evaluate the answer of a QA model at inference time. Although many models provide confidence scores, …

Lukas Muttenthaler, Isabelle Augenstein, Johannes Bjerva


SIGTYP 2020 Shared Task: Prediction of Typological Features

Typological knowledge bases (KBs) such as WALS contain information about linguistic properties of the world’s languages. They …

Johannes Bjerva, Elizabeth Salesky, Sabrina J. Mielke, Aditi Chaudhary, Giuseppe G. A. Celano, Edoardo M. Ponti, Ekaterina Vylomova, Ryan Cotterell, Isabelle Augenstein


Disembodied Machine Learning: On the Illusion of Objectivity in NLP

Machine Learning (ML) seeks to identify and encode bodies of knowledge within provided datasets. However, data encodes subjective …

Zeerak Waseem, Smarika Lulz, Joachim Bingel, Isabelle Augenstein


TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP

While state-of-the-art NLP explainability (XAI) methods focus on supervised, per-instance end or diagnostic probing task evaluation …

Nils Rethmeier, Vageesh Kumar Saxena, Isabelle Augenstein


Generating Fact Checking Explanations

This paper provides the first study of how fact checking explanations can be generated automatically based on available claim context, …

Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein


2kenize: Tying Subword Sequences for Chinese Script Conversion

We propose a novel Chinese character conversion model that can disambiguate between mappings and convert between the two scripts. The …

Pranav A, Isabelle Augenstein


Semantic Textual Similarity of Sentences with Emojis

In this paper, we extend the task of semantic textual similarity to include sentences which contain emojis. Emojis are ubiquitous on …

Alok Debnath, Nikhil Pinnaparaju, Manish Shrivastava, Vasudeva Varma, Isabelle Augenstein

PDF Project

Back to the Future – Sequential Alignment of Text Representations

Language evolves over time in many ways relevant to natural language processing tasks. For example, recent occurrences of tokens …

Johannes Bjerva, Wouter Kouw, Isabelle Augenstein

PDF Code Project Project

MultiFC: A Real-World Multi-Domain Dataset for Evidence-Based Fact Checking of Claims

We contribute the largest publicly available dataset of naturally occurring factual claims for the purpose of automatic claim …

Isabelle Augenstein, Christina Lioma, Dongsheng Wang, Lucas Chaves Lima, Casper Hansen, Christian Hansen, Jakob Grue Simonsen

PDF Code Dataset Project Poster

Mapping (Dis-)Information Flow about the MH17 Plane Crash

Digital media enables not only fast sharing of information, but also disinformation. One prominent case of an event leading to …

Mareike Hartmann, Yevgeniy Golovchenko, Isabelle Augenstein

PDF Code Project

X-WikiRE: A Large, Multilingual Resource for Relation Extraction as Machine Comprehension

Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in …

Mostafa Abdou, Cezar Sas, Rahul Aralikatte, Isabelle Augenstein, Anders Søgaard

PDF Project Project Project Project

Retrieval-Based Goal-Oriented Dialogue Generation

Task oriented dialogue systems rely heavily on specialized dialogue state tracking (DST) modules for dynamically predicting user intent …

Ana Valeria Gonzalez, Isabelle Augenstein, Anders Søgaard

PDF Project

Domain Transfer in Dialogue Systems without Turn-Level Supervision

Task oriented dialogue systems rely heavily on specialized dialogue state tracking (DST) modules for dynamically predicting user intent …

Joachim Bingel, Victor Petrén Bach Hansen, Ana Valeria Gonzalez, Paweł Budzianowski, Isabelle Augenstein, Anders Søgaard

PDF Project Project

Transductive Auxiliary Task Self-Training for Neural Multi-Task Models

Multi-task learning and self-training are two common ways to improve a machine learning model’s performance in settings with …

Johannes Bjerva, Katharina Kann, Isabelle Augenstein

PDF Project

Unsupervised Discovery of Gendered Language through Latent-Variable Modeling

Studying to what degree the language we use is gender-specific has long been an area of interest in socio-linguistics. Studies have …

Alexander Hoyle, Lawrence Wolf-Sonkin, Hanna Wallach, Isabelle Augenstein, Ryan Cotterell

PDF Code Project Project Slides

Uncovering Probabilistic Implications in Typological Knowledge Bases

The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages …

Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein

PDF Project Project

Proceedings of The Fourth Workshop on Representation Learning for NLP

The workshop has a focus on vector space models of meaning, compositionality, and the application of deep neural networks and spectral …

Isabelle Augenstein, Spandana Gella, Sebastian Ruder, Katharina Kann, Burcu Can, Alexis Conneau, Johannes Welbl, Xian Ren, Marek Rei

PDF Project Slides

Combining Sentiment Lexica with a Multi-View Variational Autoencoder

When assigning quantitative labels to a dataset, different methodologies may rely on different scales. In particular, when assigning …

Alexander Hoyle, Lawrence Wolf-Sonkin, Hanna Wallach, Ryan Cotterell, Isabelle Augenstein

PDF Code Project Slides Video

A Probabilistic Generative Model of Linguistic Typology

In the Principles and Parameters framework, the structural features of languages depend on parameters that may be toggled on or off, …

Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein

PDF Project Project Slides Video

Issue Framing in Online Discussion Fora

In online discussion fora, speakers often make arguments for or against something, say birth control, by highlighting certain aspects …

Mareike Hartmann, Tallulah Jansen, Isabelle Augenstein, Anders Søgaard

PDF Project Project

What do Language Representations Really Represent?

A neural language model trained on a text corpus can be used to induce distributed representations of words, such that similar words …

Johannes Bjerva, Robert Östling, Maria Han Veiga, Jörg Tiedemann, Isabelle Augenstein

PDF Project

Latent multi-task architecture learning

Multi-task learning (MTL) allows deep neural networks to learn from related tasks by sharing parameters with other networks. In …

Sebastian Ruder, Joachim Bingel, Isabelle Augenstein, Anders Søgaard

PDF Code Project Slides

A strong baseline for question relevancy ranking

The best systems at the SemEval-16 and SemEval-17 community question answering shared tasks – a task that amounts to question …

Ana V. González-Garduño, Isabelle Augenstein, Anders Søgaard

PDF Project Video

Parameter sharing between dependency parsers for related languages

Previous work has suggested that parameter sharing between transition-based neural dependency parsers for related languages can lead to …

Miryam de Lhoneux, Johannes Bjerva, Isabelle Augenstein, Anders Søgaard

PDF Project Project Poster

Copenhagen at CoNLL–SIGMORPHON 2018: Multilingual Inflection in Context with Explicit Morphosyntactic Decoding

This paper documents the Team Copenhagen system which placed first in the CoNLL–SIGMORPHON 2018 shared task on universal …

Yova Kementchedjhieva, Johannes Bjerva, Isabelle Augenstein

PDF Project Project

Nightmare at test time: How punctuation prevents parsers from generalizing

Punctuation is a strong indicator of syntactic structure, and parsers trained on text with punctuation often rely heavily on this …

Anders Søgaard, Miryam de Lhoneux, Isabelle Augenstein

PDF

Jack the Reader – A Machine Reading Framework

Many Machine Reading and Natural Language Understanding tasks require reading supporting text in order to answer questions. For …

Dirk Weissenborn, Pasquale Minervini, Tim Dettmers, Isabelle Augenstein, Johannes Welbl, Tim Rocktäschel, Matko Bošnjak, Jeff Mitchell, Thomas Demeester, Pontus Stenetorp, Sebastian Riedel

PDF Code Project

Proceedings of The Third Workshop on Representation Learning for NLP

The workshop has a focus on vector space models of meaning, compositionality, and the application of deep neural networks and spectral …

Isabelle Augenstein, Kris Cao, He He, Felix Hill, Spandana Gella, Jamie Kiros, Hongyuan Mei, Dipendra Misra

PDF Project

Character-level Supervision for Low-resource POS Tagging

Neural part-of-speech (POS) taggers are known to not perform well with little training data. As a step towards overcoming this problem, …

Katharina Kann, Johannes Bjerva, Isabelle Augenstein, Barbara Plank, Anders Søgaard

PDF Project

Multi-Task Learning of Pairwise Sequence Classification Tasks over Disparate Label Spaces

We combine multi-task learning and semi-supervised learning by inducing a joint embedding space between disparate label spaces and …

Isabelle Augenstein, Sebastian Ruder, Anders Søgaard

PDF Code Project Project Slides Video

From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings

A core part of linguistic typology is the classification of languages according to linguistic properties, such as those detailed in the …

Johannes Bjerva, Isabelle Augenstein

PDF Code Project

KU-MTL at SemEval-2018 Task 1: Multi-task Identification of Affect in Tweets

We take a multi-task learning approach to the shared Task 1 at SemEval-2018. The general idea concerning the model structure is to use …

Thomas Nyegaard-Signori, Casper Veistrup Helms, Johannes Bjerva, Isabelle Augenstein

PDF Project

Discourse-Aware Rumour Stance Classification in Social Media Using Sequential Classifiers

Rumour stance classification, defined as classifying the stance of specific social media posts into one of supporting, denying, …

Arkaitz Zubiaga, Elena Kochkina, Maria Liakata, Rob Procter, Michal Lukasik, Kalina Bontcheva, Trevor Cohn, Isabelle Augenstein

PDF Project

Tracking Typological Traits of Uralic Languages in Distributed Language Representations

Although linguistic typology has a long history, computational approaches have only recently gained popularity. The use of distributed …

Johannes Bjerva, Isabelle Augenstein

PDF Project Slides

Multi-Task Learning of Keyphrase Boundary Classification

Keyphrase boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to …

Isabelle Augenstein, Anders Søgaard

PDF Project Project Project Poster

Generalisation in Named Entity Recognition: A Quantitative Analysis

Named Entity Recognition (NER) is a key NLP task, which is all the more challenging on Web and user-generated content with their …

Isabelle Augenstein, Leon Derczynski, Kalina Bontcheva

PDF Project

A simple but tough-to-beat baseline for the Fake News Challenge stance detection task

Identifying public misinformation is a complicated and challenging task. An important part of checking the veracity of a specific claim …

Benjamin Riedel, Isabelle Augenstein, Georgios Spithourakis, Sebastian Riedel

PDF Code Dataset Project

A Supervised Approach to Extractive Summarisation of Scientific Papers

Automatic summarisation is a popular approach to reduce a document to its main arguments. Recent research in the area has focused on …

Ed Collins, Isabelle Augenstein, Sebastian Riedel

PDF Code Dataset Project Project Poster

Turing at SemEval-2017 Task 8: Sequential Approach to Rumour Stance Classification with Branch-LSTM

This paper describes team Turing’s submission to SemEval 2017 RumourEval: Determining rumour veracity and support for rumours …

Elena Kochkina, Maria Liakata, Isabelle Augenstein

PDF Code Dataset Project

Sequential Approach to Rumour Stance Classification

Rumour stance classification is a task that involves identifying the attitude of Twitter users towards the truthfulness of the rumour …

Elena Kochkina, Maria Liakata, Isabelle Augenstein

PDF Code Dataset Project

SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications

We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for …

Isabelle Augenstein, Mrinal Das, Sebastian Riedel, Lakshmi Vikraman, Andrew McCallum

PDF Code Dataset Project Slides

An Unsupervised Data-driven Method to Discover Equivalent Relations in Large Linked Datasets

We propose a novel similarity measure able to cope with unbalanced population of schema elements, an unsupervised technique to …

Ziqi Zhang, Anna Lisa Gentile, Isabelle Augenstein, Eva Blomqvist, Fabio Ciravegna

PDF Project