Scholarly Data Processing

We study methods for automatically processing scholarly data. The aim is to assist researchers in finding publications (e.g. by automatically extracting content from papers to populate knowledge bases), writing better papers (e.g. by suggesting which sentences need citations, or by improving peer review), and tracking their impact (e.g. by monitoring which papers are highly cited and how this relates to metadata such as venues or authors).

Publications

Most work on scholarly document processing assumes that the information processed is trustworthy and factually correct. However, this …

Scientific document understanding is challenging as the data is highly domain-specific and diverse. However, datasets for tasks with …

Citation count prediction is the task of predicting the number of citations a paper will have gained after a given period of time. Prior work …
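Because citation counts are heavy-tailed, models for this task typically regress on log-transformed counts. The following is a minimal illustrative sketch of that idea, fitting a least-squares line to log1p-transformed counts against paper age; the function names and the single age feature are illustrative assumptions, not the method of any particular paper listed here.

```python
import math

def fit_citation_trend(ages, citations):
    """Least-squares fit of log1p(citations) against paper age (toy sketch).

    Real models use richer features (venue, authors, text), but the
    log transform is the key device for handling heavy-tailed counts.
    """
    ys = [math.log1p(c) for c in citations]
    n = len(ages)
    mean_x = sum(ages) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, ys))
    var = sum((x - mean_x) ** 2 for x in ages)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict_citations(age, slope, intercept):
    """Invert the log1p transform to recover a count-scale prediction."""
    return math.expm1(slope * age + intercept)
```

Evaluation in this setting is usually done on the log scale as well, so that a few runaway highly-cited papers do not dominate the error.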

A critical component of automatically combating misinformation is the detection of fact check-worthiness, i.e. determining if a piece …
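Check-worthiness detection is commonly framed as scoring or ranking sentences by how much they resemble verifiable factual claims. The sketch below uses a hand-picked lexical cue list as a stand-in for learned features; the cue list, function names, and scoring scheme are illustrative assumptions, not a description of any deployed system.

```python
import re

# Toy lexical cues for factual claims (numbers, quantities, comparatives).
# Real systems learn such signals from annotated corpora; this list is
# purely illustrative.
CLAIM_CUES = re.compile(
    r"\b(\d[\d,.%]*|percent|million|billion|increase|decrease|highest|lowest)\b",
    re.IGNORECASE,
)

def check_worthiness_score(sentence):
    """Score a sentence by its density of factual-claim cues."""
    tokens = sentence.split()
    if not tokens:
        return 0.0
    hits = len(CLAIM_CUES.findall(sentence))
    return hits / len(tokens)

def rank_sentences(sentences):
    """Return sentences ordered most check-worthy first."""
    return sorted(sentences, key=check_worthiness_score, reverse=True)
```

Ranking (rather than hard classification) matches how the output is consumed: fact-checkers triage the top of the list first.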

Peer review is our best tool for judging the quality of conference submissions, but it is becoming increasingly spurious. We argue that …

Language evolves over time in many ways relevant to natural language processing tasks. For example, recent occurrences of tokens …

Automatic summarisation is a popular approach for reducing a document to its main arguments. Recent research in the area has focused on …
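The simplest family of summarisers is extractive: score each sentence and keep the highest-scoring ones in their original order. Below is a minimal Luhn-style frequency sketch, assuming a toy stopword list and naive sentence splitting; it illustrates the extractive setup only, not the methods studied in the work above.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; real systems use fuller lists.
STOPWORDS = {"the", "a", "an", "is", "of", "to", "and", "in", "on", "that", "it"}

def summarise(text, n_sentences=1):
    """Frequency-based extractive summary (Luhn-style toy sketch)."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence):
        toks = [w for w in re.findall(r"[a-z']+", sentence.lower())
                if w not in STOPWORDS]
        # Average content-word frequency; avoid division by zero.
        return sum(freq[t] for t in toks) / (len(toks) or 1)

    chosen = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    # Emit selected sentences in their original document order.
    return " ".join(s for s in sentences if s in chosen)
```

Abstractive systems instead generate new text, which is where much recent research effort lies, but extractive scoring remains a strong and cheap baseline.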

Keyphrase boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to …
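KBC is typically framed as sequence labelling over tokens, with keyphrase spans encoded as BIO tags. The helper below converts token-level spans into BIO labels; the span type names (e.g. TASK) follow a ScienceIE-style scheme and are an illustrative assumption.

```python
def spans_to_bio(tokens, spans):
    """Convert (start, end, label) token spans into BIO tags.

    `end` is exclusive, as in Python slicing. Spans are assumed
    non-overlapping, which is the usual KBC setting.
    """
    tags = ["O"] * len(tokens)
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the keyphrase
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # continuation tokens
    return tags
```

A tagger trained on such labels predicts both the keyphrase boundaries and their types in a single pass.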