Unknown Data – Mining and consolidating research dataset metadata on the Web

At a glance

Project duration
10/2021 – 06/2025
DFG classification of subject areas

Information Systems, Process and Knowledge Management

Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing

Funded by

DFG other programmes

Project description

Research data is essential to scientific progress, yet many valuable datasets are hidden on websites and in small repositories, or are hard to find because of insufficient metadata. Only a fraction of researchers proactively share dataset metadata through public portals, and curating such metadata collections is costly. Unknown Data will provide means to automatically discover, extract, and publish metadata about research data that is hidden on the Web or in scholarly publications. The project’s goal is thus to improve the findability and reusability of research data by (a) improving metadata quality, in particular with respect to the authority and use of existing datasets, and (b) uncovering datasets that are not yet reflected in public data repositories and registries.

Cooperation partners

  • Cooperation partner
    Non-university research institution, Germany

    GESIS – Leibniz Institute for the Social Sciences

  • Cooperation partner
    Non-university research institution, Germany

    Leibniz Center for Informatics