Scholarly Data Processing

26 Apr 2017

We study methods for automatically processing scholarly data. The aim is to assist researchers in finding publications (e.g. by automatically extracting content from papers, which can be used to populate knowledge bases), writing better papers (e.g. by suggesting which sentences need citations, or by improving peer review), and tracking their impact (e.g. by tracking which papers are highly cited and how this relates to metadata such as venues or authors).

nlu scholarly-data

Publications

Modeling Public Perceptions of Science in Media

Effectively engaging the public with science is vital for fostering trust and understanding in our scientific community. Yet, with an …

Jiaxin Pei, Dustin Wright, Isabelle Augenstein, David Jurgens

PDF Scholarly Data Project

Understanding Fine-grained Distortions in Reports of Scientific Findings

Distorted science communication harms individuals and society as it can lead to unhealthy behavior change and decrease trust in …

Amelie Wührl, Dustin Wright, Roman Klinger, Isabelle Augenstein

PDF Fact Checking Project Scholarly Data Project

Modeling Information Change in Science Communication with Semantically Matched Paraphrases

Whether the media faithfully communicate scientific information has long been a core issue to the science community. Automatically …

Dustin Wright, Jiaxin Pei, David Jurgens, Isabelle Augenstein

PDF Code Dataset Scholarly Data Project Fact Checking Project Huggingface Model

Neighborhood Contrastive Learning for Scientific Document Representations with Citation Embeddings

Learning scientific document representations can be substantially improved through contrastive learning objectives, where the challenge …

Malte Ostendorff, Nils Rethmeier, Isabelle Augenstein, Bela Gipp, Georg Rehm

PDF Scholarly Data Project

Generating Scientific Claims for Zero-Shot Scientific Fact Checking

Automated scientific fact checking is difficult due to the complexity of scientific language and a lack of significant amounts of …

Dustin Wright, David Wadden, Kyle Lo, Bailey Kuehl, Isabelle Augenstein, Lucy Lu Wang

PDF Fact Checking Project Limited Data Project Scholarly Data Project

Longitudinal Citation Prediction using Temporal Graph Neural Networks

Citation count prediction is the task of predicting the number of citations a paper has gained after a period of time. Prior work …

Andreas Nugaard Holm, Barbara Plank, Dustin Wright, Isabelle Augenstein

PDF Limited Data Project Scholarly Data Project

Semi-Supervised Exaggeration Detection of Health Science Press Releases

Public trust in science depends on honest and factual communication of scientific papers. However, recent studies have demonstrated a …

Dustin Wright, Isabelle Augenstein

PDF Limited Data Project Scholarly Data Project Fact Checking Project

Determining the Credibility of Science Communication

Most work on scholarly document processing assumes that the information processed is trustworthy and factually correct. However, this …

Isabelle Augenstein

PDF Fact Checking Project Scholarly Data Project

CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding

Scientific document understanding is challenging as the data is highly domain specific and diverse. However, datasets for tasks with …

Dustin Wright, Isabelle Augenstein

Limited Data Project Scholarly Data Project

Claim Check-Worthiness Detection as Positive Unlabelled Learning

A critical component of automatically combating misinformation is the detection of fact check-worthiness, i.e. determining if a piece …

Dustin Wright, Isabelle Augenstein

PDF Code Limited Data Project Fact Checking Project Scholarly Data Project

What Can We Do to Improve Peer Review in NLP?

Peer review is our best tool for judging the quality of conference submissions, but it is becoming increasingly spurious. We argue that …

Anna Rogers, Isabelle Augenstein

PDF Scholarly Data Project

Back to the Future -- Sequential Alignment of Text Representations

Language evolves over time in many ways relevant to natural language processing tasks. For example, recent occurrences of tokens …

Johannes Bjerva, Wouter Kouw, Isabelle Augenstein

PDF Code Limited Data Project Scholarly Data Project

A Supervised Approach to Extractive Summarisation of Scientific Papers

Automatic summarisation is a popular approach to reduce a document to its main arguments. Recent research in the area has focused on …

Ed Collins, Isabelle Augenstein, Sebastian Riedel

PDF Code Dataset Knowledge Bases Project Scholarly Data Project Poster

Multi-Task Learning of Keyphrase Boundary Classification

Keyphrase boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to …

Isabelle Augenstein, Anders Søgaard

PDF Knowledge Bases Project Limited Data Project Scholarly Data Project Poster