Knowledge Base Population

27 Mar 2016

Information extraction is concerned with extracting information about entities, phrases and the relations between them from text in order to populate knowledge bases, for example by extracting “employee-at” relations. Within this context, we have worked on automatic knowledge base completion, knowledge base cleansing, and the detection of scientific keyphrases in text, as well as the automatic completion of typological knowledge bases.

We are currently involved in one longer-term project in this area: a research project funded by the Swedish Research Council and coordinated by Robert Östling. Its goal is to study structured multilinguality, i.e. the idea of using language representations and typological knowledge bases to guide which information to share between specific languages.

Publications

SynDARin: Synthesising Datasets for Automated Reasoning in Low-Resource Languages

Question Answering (QA) datasets have been instrumental in developing and evaluating Large Language Model (LLM) capabilities. However, …

Gayane Ghazaryan, Erik Arakelyan, Pasquale Minervini, Isabelle Augenstein

Adapting Neural Link Predictors for Complex Query Answering

Answering complex queries on incomplete knowledge graphs is a challenging task where a model needs to answer complex logical queries in …

Erik Arakelyan, Pasquale Minervini, Isabelle Augenstein

TempEL: Linking Dynamically Evolving and Newly Emerging Entities

In our continuously evolving world, entities change over time and new, previously non-existing or unknown, entities appear. We study …

Klim Zaporojets, Lucie-Aimée Kaffee, Johannes Deleu, Thomas Demeester, Chris Develder, Isabelle Augenstein

Zero-Shot Cross-Lingual Transfer with Meta Learning

Learning what to share between tasks has been a topic of high importance recently, as strategic sharing of knowledge has been shown to …

Farhad Nooralahzadeh, Giannis Bekoulis, Johannes Bjerva, Isabelle Augenstein

X-WikiRE: A Large, Multilingual Resource for Relation Extraction as Machine Comprehension

Although the vast majority of knowledge bases (KBs) are heavily biased towards English, Wikipedias do cover very different topics in …

Mostafa Abdou, Cezar Sas, Rahul Aralikatte, Isabelle Augenstein, Anders Søgaard

Uncovering Probabilistic Implications in Typological Knowledge Bases

The study of linguistic typology is rooted in the implications we find between linguistic features, such as the fact that languages …

Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein

A Probabilistic Generative Model of Linguistic Typology

In the Principles and Parameters framework, the structural features of languages depend on parameters that may be toggled on or off, …

Johannes Bjerva, Yova Kementchedjhieva, Ryan Cotterell, Isabelle Augenstein

A Supervised Approach to Extractive Summarisation of Scientific Papers

Automatic summarisation is a popular approach to reduce a document to its main arguments. Recent research in the area has focused on …

Ed Collins, Isabelle Augenstein, Sebastian Riedel

Generalisation in Named Entity Recognition: A Quantitative Analysis

Named Entity Recognition (NER) is a key NLP task, which is all the more challenging on Web and user-generated content with their …

Isabelle Augenstein, Leon Derczynski, Kalina Bontcheva

Multi-Task Learning of Keyphrase Boundary Classification

Keyphrase boundary classification (KBC) is the task of detecting keyphrases in scientific articles and labelling them with respect to …

Isabelle Augenstein, Anders Søgaard

SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications

We describe the SemEval task of extracting keyphrases and relations between them from scientific documents, which is crucial for …

Isabelle Augenstein, Mrinal Das, Sebastian Riedel, Lakshmi Vikraman, Andrew McCallum

An Unsupervised Data-driven Method to Discover Equivalent Relations in Large Linked Datasets

We propose a novel similarity measure able to cope with unbalanced population of schema elements, an unsupervised technique to …

Ziqi Zhang, Anna Lisa Gentile, Isabelle Augenstein, Eva Blomqvist, Fabio Ciravegna
