Data curators have a key role in ensuring that data remain understandable and usable in the future. In the absence of a previous data management plan, they are expected to add the necessary metadata to datasets that have little or no contextualization or description.
The main activities entrusted to data curators relate to data that will be published, or are already published, in data repositories:
- Promotion of best practices throughout the research process
- Helping researchers to decide how to publish data at the end of projects
- Definition of the required quality control steps to integrate data in a repository
- Creation and management of the metadata necessary to describe data
- Making data visible and available to defined target audiences
- Ensuring that access to data complies with the licenses chosen by the authors
- Giving advice on data that have a long-term value
- Definition and implementation of rules and procedures to guarantee the long-term preservation of data
- Dissemination of success stories to encourage the practice of sharing data
Which metadata standards can I recommend to researchers?
There are many metadata standards available to promote data discovery, preservation and reuse. Metadata standards can either be generic or developed with a specific community in mind.
The following are some of the metadata standards that data curators may recommend for data description:
CERIF
– Common European Research Information Format – This standard is an EU recommendation to its member states for recording information related to research activity. It is developed and maintained by EuroCRIS.
Darwin Core
– A body of standards, including a glossary of terms, that targets data sharing in the biological diversity domain.
DDI Data Documentation Initiative
– A widely used standard for describing data from the social, behavioural and economic sciences.
DIF Directory Interchange Format
– An initiative from the Earth sciences community for the description of scientific data, with elements that represent the instruments used to capture data, temporal and spatial data, and projects.
Dublin Core
– One of the best known and widely used domain-agnostic metadata standards.
Ecological Metadata Language
– A standard developed by and for the ecology discipline. The Global Biodiversity Information Facility (GBIF) metadata application profile is primarily based on EML.
ISO 19115
– An internationally adopted schema for describing geographic information and services. Its core elements cover identification, extent, quality, and temporal and spatial reference information. The EU INSPIRE metadata recommendations are based on ISO 19115.
PREMIS Data Dictionary for Preservation Metadata
– International standard to support the preservation of digital objects and ensure their long-term usability.
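
To give a sense of what a description in one of these standards can look like in practice, the sketch below builds a minimal Dublin Core record in Python. It is an illustration under stated assumptions, not a prescribed workflow: the `build_dc_record` helper, the `<metadata>` wrapper element and all field values are hypothetical, while the element names (`title`, `creator`, `date`, `description`, `rights`) and the namespace URI come from the Dublin Core Metadata Element Set.

```python
import xml.etree.ElementTree as ET

# Namespace URI of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields):
    """Return a simple Dublin Core record as an XML string.

    `fields` maps Dublin Core element names (e.g. "title") to a single
    value or a list of values. The <metadata> wrapper is a hypothetical
    container chosen for this sketch; repositories define their own.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")
    for name, values in fields.items():
        if isinstance(values, str):
            values = [values]
        # Dublin Core elements are repeatable, so emit one per value.
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical dataset description, for illustration only.
example = {
    "title": "Stream temperature measurements, 2015-2018",
    "creator": ["Doe, Jane", "Roe, Richard"],
    "date": "2019-03-01",
    "description": "Hourly water temperature readings from three sites.",
    "rights": "CC BY 4.0",
}

record = build_dc_record(example)
print(record)
```

Domain-specific standards such as DDI or EML follow the same idea but define richer, discipline-oriented elements, so a curator would typically rely on the schema and tooling published by the relevant community rather than on a hand-written helper like this one.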