
Excursus: Data documentation

Documenting research data involves naming files consistently and adding information that describes the research project and the collection process (known as metadata). Accompanying materials relating to the research process are also an essential part of the documentation. In essence, the documentation describes the data and works much like an instruction manual.

Why?

The following objectives should be achieved with data documentation [1]:

  • Preservation of the interpretability and traceability of data
  • Visibility and retrievability of data (e.g., in data catalogs)

Documentation enables third parties to assess the reusability of research data. It ensures that decisions can be traced and reconstructed. For this reason, documentation is also indispensable for project staff, especially in projects with several team members and when members leave the project. Because questions may still arise, a permanent point of contact for those responsible for the data must be provided.

The use of comprehensive data descriptions enables more effective working methods and increases the findability of research data. It creates clarity and accountability. The use of metadata standards and controlled vocabulary (e.g., through the use of a thesaurus) as well as the implementation of data management software, such as an electronic lab notebook, offers the following advantages, among others:

  • significant time savings in the archiving and publication process
  • minimization of file mix-ups
  • ensuring appropriate and careful documentation that complies with the principles of good scientific practice

How?

Data documentation includes a detailed description of how data is collected, processed, analyzed, and archived. This description also contains information about the use of metadata standards and the vocabulary used. In addition, an explanation of the coding of the data is provided. These descriptions are based on the requirements and standards of the respective discipline.

Data documentation can be achieved in various ways. These include, for example, an accompanying ReadMe file, a metadata database, an internal project wiki, an (electronic) lab notebook, a data management plan (DMP), appropriate file naming within the folder structure, or appropriate documentation within the research data file itself or in the file’s meta information.

FAIR Research Data Management

The FAIR principles provide guidelines for organizing data. FAIR stands for Findable, Accessible, Interoperable, and Reusable. The FAIR principles apply to data management, infrastructure, and services. The first step toward (re)using data and complying with the FAIR principles is to make it possible to find this data.

A systematic filing system ensures that files can be found easily in different systems. Accessibility is regulated by defining clear access rules so that only authorized persons can access the files. Interoperability is ensured by allowing files and their contents to be used in different systems without impairment. Finally, reusability is increased by preparing the files in such a way that they can be easily reused by other users.

All relevant documents and files must therefore be carefully structured. It is not only about FAIR data, but also about FAIR files. The following are therefore particularly important:

  • a clear folder structure
  • clear naming conventions
  • consistent versioning

Folder and file management

Folder and file management is an important step for successful research data management. The following key questions regarding the establishment of a folder and file naming system should be answered within a research project:

  • What should be considered or is helpful when defining a folder structure?
  • What types of data should be stored?
  • Are there subprojects that require their own folders?

A folder and file naming system is planned at the beginning of the project (jointly by the team) and should be meaningful and as simple as possible. Below are some general tips for establishing a folder structure:

  • Strike a balance between a flat and a deep folder structure. A deep folder structure requires many clicks to reach the required file, while a structure that is too flat can lead to overcrowded folders.
  • Overlapping categories should be avoided and understandable folder names should be used.

The following key questions for documentation provide guidance and should be answered before data collection begins:

  • What information do third parties who are not involved in the project need in order to use the research data for secondary use?
  • What information is needed to replicate the analyses carried out in the future?
  • What information do project staff who were not involved in data collection need?
  • What information do project staff who were not involved in data preparation and/or data analysis need? [1]

Notes on the 7-folder system

  • A maximum of seven folders per level (short-term memory can hold only about seven (±2) pieces of information at any one time).
  • Maximum of three folder levels (= speed)
  • Folder levels can be identified by numbering, for example: 1_Project_A, 2_Project_B, 3_Project_C, …
  • Files are always stored at the lowest folder level.
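
The first two notes above can be checked automatically. The following is a minimal sketch, assuming a project directory on a local file system; the helper name `check_folder_rules` and the two constants are hypothetical and only illustrate the idea. It does not check where files are stored.

```python
from pathlib import Path

MAX_FOLDERS_PER_LEVEL = 7  # "a maximum of seven folders per level"
MAX_LEVELS = 3             # "maximum of three folder levels"


def check_folder_rules(root: Path) -> list[str]:
    """Return a list of rule violations found below `root` (hypothetical helper)."""
    problems: list[str] = []

    def visit(folder: Path, level: int) -> None:
        # `level` is the depth of `folder`; the project root itself is level 0.
        subfolders = sorted(p for p in folder.iterdir() if p.is_dir())
        if len(subfolders) > MAX_FOLDERS_PER_LEVEL:
            problems.append(
                f"{folder}: {len(subfolders)} subfolders (max {MAX_FOLDERS_PER_LEVEL})"
            )
        for sub in subfolders:
            if level + 1 > MAX_LEVELS:
                # Report the folder that is too deep and do not descend further.
                problems.append(f"{sub}: deeper than {MAX_LEVELS} levels")
            else:
                visit(sub, level + 1)

    visit(root, 0)
    return problems
```

Calling `check_folder_rules(Path("my_project"))` returns an empty list when both rules are met; each entry in a non-empty list names the folder that breaks a rule.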

Tips for naming folders and files

  • Do not use special characters.

  • Avoid umlauts: ä, ö, ü → these are displayed incorrectly in some programs; use ae, oe, ue instead.

  • Write dates in the format YYYYMMDD → the operating system can then sort the files chronologically in ascending or descending order.

  • Work with versions and subversions (increase only the subversion if no significant changes have been made).

Example: 20240628_Student survey_V02_02
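
The naming tips can also be applied programmatically. The sketch below is one possible implementation, not a prescribed one: the function name `build_file_name` is hypothetical, the `ß` → `ss` replacement is an added assumption, and so is the choice to replace spaces and other non-alphanumeric characters with a hyphen (the example above keeps the space).

```python
import re
from datetime import date

# Transliterations from the tips above (ä -> ae, ö -> oe, ü -> ue);
# the ß -> ss entry is an assumption added for completeness.
UMLAUTS = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss",
})


def build_file_name(day: date, title: str, version: int, subversion: int = 0) -> str:
    """Return a name of the form YYYYMMDD_Title_Vxx_yy (hypothetical helper)."""
    cleaned = title.translate(UMLAUTS)
    # Assumption: every run of characters other than ASCII letters and digits
    # (spaces, special characters) becomes a single hyphen.
    cleaned = re.sub(r"[^A-Za-z0-9]+", "-", cleaned).strip("-")
    return f"{day:%Y%m%d}_{cleaned}_V{version:02d}_{subversion:02d}"


print(build_file_name(date(2024, 6, 28), "Student survey", 2, 2))
# prints "20240628_Student-survey_V02_02"
```

Because the date comes first and is zero-padded, sorting the resulting names alphabetically also sorts them chronologically.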

These aspects should be taken into account in the documentation:

  • Research project (project title, persons involved)
  • Context of the survey (project objectives, hypotheses)
  • Survey method (sampling, instruments, hardware and software, secondary data sources, location and time period of the survey)
  • Structure of the data and its relationships (data structure and content, relationships between data sets, data formats)
  • Quality measures (cleaning, weighting, data verification)
  • Explanations for codes and labels (codebook)
  • Data versions and changes
  • Information on access, terms of use, and confidentiality [2]

Suggestions for the concrete implementation of a folder structure

  • Create directories and (empty) folders at the beginning
  • as many as necessary and as few as possible
  • meaningful and short names (+ numbering)
  • possibly establish a 3-level 7-folder system
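
As an illustration of creating the (empty) folders at the beginning of a project, the following sketch builds a numbered two-level skeleton. The folder names in `LAYOUT` and the function name `create_skeleton` are placeholders chosen for this example, not recommendations from the text.

```python
from pathlib import Path

# Hypothetical layout: top-level folder -> subfolders. Replace with the
# structure your team has agreed on.
LAYOUT = {
    "Administration": [],
    "Data": ["Raw", "Processed"],
    "Analysis": ["Scripts", "Results"],
    "Publications": [],
}


def create_skeleton(root: Path, layout: dict[str, list[str]]) -> list[Path]:
    """Create numbered, empty folders below `root` and return their paths."""
    created: list[Path] = []
    for i, (name, children) in enumerate(layout.items(), start=1):
        top = root / f"{i}_{name}"  # e.g. 1_Administration, 2_Data, ...
        top.mkdir(parents=True, exist_ok=True)
        created.append(top)
        for j, child in enumerate(children, start=1):
            sub = top / f"{j}_{child}"  # e.g. 2_Data/1_Raw
            sub.mkdir(exist_ok=True)
            created.append(sub)
    return created
```

Running `create_skeleton(Path("my_project"), LAYOUT)` is safe to repeat, since existing folders are left untouched.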

ReadMe file

A ReadMe file can also be useful for data documentation. Among other things, it helps to ensure that data can be interpreted correctly and that essential data processing steps are recorded. Furthermore, a ReadMe file contains information about the data files in the respective folder and how they should be structured and named. A ReadMe file should be available as a text document (e.g., Markdown format .md or as plain text .txt).
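
A ReadMe skeleton can be generated so that every folder starts with the same headings. The sketch below is only one way to do this: the section headings are adapted from the documentation aspects listed earlier, while the function names, the `TODO` placeholder, and the file name `README.txt` are assumptions made for this example.

```python
from pathlib import Path

# Headings adapted from the aspects the documentation should cover.
SECTIONS = [
    "Research project (title, persons involved)",
    "Context of the survey",
    "Survey method",
    "Structure of the data and its relationships",
    "Quality measures",
    "Codes and labels",
    "Data versions and changes",
    "Access, terms of use, and confidentiality",
]


def readme_skeleton(title: str) -> str:
    """Return a plain-text ReadMe template with one placeholder per section."""
    lines = [title, "=" * len(title), ""]
    for section in SECTIONS:
        lines += [section, "-" * len(section), "TODO", ""]
    return "\n".join(lines)


def write_readme(folder: Path, title: str) -> Path:
    """Write the template to README.txt in `folder` and return its path."""
    target = folder / "README.txt"
    target.write_text(readme_skeleton(title), encoding="utf-8")
    return target
```

The template is plain text, so it can be opened on any system; the placeholders are then replaced by hand with the project-specific information.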

[1] https://www.forschungsdaten-bildung.de/daten-dokumentieren
[2] https://forschungsdaten.info/themen/beschreiben-und-dokumentieren/datendokumentation

Further information

Checklist FAIR Data

Biernacka K, Dolzycka D, Buchholz P, and Helbig K (2019): Wie fair sind deine Forschungsdaten? Informationsposter. Zenodo, doi.org/10.5281/zenodo.2547339

Electronic Lab Notebook (ELN): Chemotion

Chemotion is a free, open-source product. The ELN and repository are particularly suitable for use in the field of chemistry; beyond that, they can be configured and used only to a limited extent. Various partners of the Lower Saxony FDM state initiative are already using Chemotion. The integration of Chemotion into the Academic Cloud for use by all universities in Lower Saxony is to be developed as part of the FDM state initiative.

https://chemotion.net

Electronic Lab Notebook (ELN): eLabFTW

eLabFTW is a free, open-source product. The ELN can be configured and used generically for various subject areas. Various partners of the Lower Saxony FDM state initiative are already using eLabFTW or are currently introducing it for the working groups at their institutions.

https://www.elabftw.net

 

Legal notice (Impressum)

Privacy policy

Funded by:
