6. Data Sharing

Metadata

Metadata is data about data. Metadata needed to understand basic traits of the data should be easily readable in file names and structures. Those working with the data should take care to document all steps implemented in data processing. A lack of this information may impede replication. A list of dependencies should be made explicit in the metadata files for your project (Wilson et al., 2017).

Researchers must record exactly which steps were involved in preprocessing the data, which software was used, which operating system the preprocessing was performed on, and, ideally, the order in which the steps were performed. Any anomalies, such as drop-out that could not be corrected or a high degree of movement in a certain subject, should also be recorded in the metadata pertaining to that subject’s scans, as should any occasion on which the subject left the scanner during the session (for example, to use the bathroom). If data is compared across scanners, or if significant upgrades or maintenance were performed on a scanner during the study, B0 correction may help correct for the resulting differences. These circumstances should nevertheless be noted, as they can affect the data in significant ways.
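As an illustration of how such per-subject notes could be kept in a machine-readable form, the following is a minimal sketch in Python. The class name SessionNotes, its field names and the example values are hypothetical and are not part of any standard; adapt them to the conventions of your own project.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SessionNotes:
    """Hypothetical per-session record of anomalies for one participant."""

    participant_id: str
    scanner: str
    left_scanner_during_session: bool = False
    anomalies: list = field(default_factory=list)

    def add_anomaly(self, scan, description):
        # Store which scan was affected and a free-text description.
        self.anomalies.append({"scan": scan, "description": description})

    def to_json(self):
        # Serialize the record so it can be saved next to the subject's scans.
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    notes = SessionNotes(participant_id="sub-01", scanner="scanner-A")
    notes.add_anomaly("run-1", "high degree of movement")
    notes.left_scanner_during_session = True
    print(notes.to_json())
```

Keeping the notes as structured fields rather than free text makes it easier to search across subjects later, for example to exclude every session in which the participant left the scanner.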

Researchers must be aware that the broad range of methods and software employed in analysis may create problems for reproducibility. For example, the software, the software version or even the operating system used can sometimes produce wide variation in results (Bowring, Maumet & Nichols, 2019). Programming languages and their packages also change over time, which can make it difficult to run the same analysis with the same scripts at a later date (Nichols et al., 2017). Thus, it is important to document the programming language and its version, any software and its version, and the operating system used, and to include this information in the metadata.
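For analyses run in Python, much of this information can be collected automatically. The sketch below uses only the standard library and is one possible approach rather than a prescribed method; the function names, the output file name and the package list in the usage example are assumptions chosen for illustration.

```python
import json
import platform
from importlib import metadata


def describe_environment(packages):
    """Collect Python, operating system and package versions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed here
    return {
        "python_version": platform.python_version(),
        "operating_system": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "packages": versions,
    }


def write_environment(path, packages):
    """Write the environment description to a JSON file and return it."""
    env = describe_environment(packages)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(env, handle, indent=2, sort_keys=True)
    return env


if __name__ == "__main__":
    # Hypothetical file name and package list; replace with your own.
    write_environment("environment.json", ["numpy", "nibabel"])
```

Running this at the start of each analysis and storing the resulting file with the outputs gives a dated record of the environment that produced them.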

BIDS

The Brain Imaging Data Structure (BIDS) (Pernet et al., 2018) is a format created to facilitate the sharing of neuroimaging data by using agreed-upon standards developed by the neuroimaging community itself. BIDS offers a systematic way to organize data into folders with dedicated names, accompanied by text files that store metadata, either as tab-separated values files (.tsv) or JavaScript Object Notation files (.json). We encourage the local neuroimaging research community to share their data using this data structure, as it makes communication, reproducibility and the development of data analysis pipelines easier. It also facilitates compliance with the FAIR principles of Findability, Accessibility, Interoperability, and Reusability. See our handbook Structuring Data with BIDS for more information on how to use BIDS for fMRI and EEG data.

Image source: The BIDS Starter Kit
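To make the folder and file naming described above more concrete, here is a minimal sketch in Python that builds the path of a functional run and writes a dataset_description.json file with its two required fields, Name and BIDSVersion. The helper function names, the dataset name and the version string in the usage example are illustrative assumptions; consult the BIDS specification for the full set of rules.

```python
import json
from pathlib import Path


def bold_filename(subject, task, session=None, run=None):
    """Build a BIDS-style file name for a functional (BOLD) run."""
    parts = [f"sub-{subject}"]
    if session is not None:
        parts.append(f"ses-{session}")
    parts.append(f"task-{task}")
    if run is not None:
        parts.append(f"run-{run}")
    parts.append("bold")
    return "_".join(parts) + ".nii.gz"


def bold_path(root, subject, task, session=None, run=None):
    """Place the file in sub-<label>/[ses-<label>/]func/ under the root."""
    folder = Path(root) / f"sub-{subject}"
    if session is not None:
        folder = folder / f"ses-{session}"
    return folder / "func" / bold_filename(subject, task, session, run)


def write_dataset_description(root, name, bids_version):
    """Write the dataset_description.json file at the dataset root."""
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    description = {"Name": name, "BIDSVersion": bids_version}
    target = root / "dataset_description.json"
    target.write_text(json.dumps(description, indent=2), encoding="utf-8")
    return target


if __name__ == "__main__":
    # Hypothetical dataset name and version string.
    write_dataset_description("my_dataset", "Example study", "1.8.0")
    print(bold_path("my_dataset", "01", "rest", run=1).as_posix())
```

Because subject, session, task and run appear both in the folder structure and in the file name, a file remains identifiable even when it is copied out of its folder.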

Anonymization

Neuroimaging data is inherently sensitive and requires special care to ensure that it is handled safely. Data is removed from the scanner on password-protected, encrypted hard drives. When transferring data from the scanner to an external encrypted hard drive, always check the box for anonymization at the scanner console and designate a participant number or alias instead of the participant’s name. This is necessary because DICOM files have headers that contain identifying patient data. If anonymization is not performed at the source, the DICOM headers must be anonymized by hand, which can be time-consuming. Storage of neuroimaging data is permitted only in TSD or Lagringshotellet at UiO. Data from some patient groups may be stored only in TSD. Prior to data archival and sharing, neuroimaging data should be defaced or skull-stripped. Some may opt instead to make preprocessed data available; however, this raises the question of whether better preprocessing methods might be applied in the future (Nichols et al., 2017).

It is important to anonymize data from the start of data collection by assigning participant numbers and keeping any identifying information, such as name, address, phone number, birthday or national identification number, separate from the neuroimaging data. Name, contact information and the subject’s ID are not stored together. Documents in which ID and name are linked are stored on encrypted storage media. Those devices should be kept in locked cabinets, separate from the data. These steps are of even greater importance when working with patient populations.
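The separation of names and participant numbers described above can be sketched in Python as follows. This is an illustration under stated assumptions, not an approved procedure: the function names, the sub-001 label format and the file names are hypothetical, names are assumed to be unique, and the key file must still be written to encrypted storage that is kept apart from the imaging data.

```python
import csv
import secrets


def assign_participant_ids(names, width=3):
    """Map each (unique) name to a label such as 'sub-001'.

    Names are shuffled first so that the numbering does not reveal
    the order in which participants were enrolled.
    """
    shuffled = list(names)
    secrets.SystemRandom().shuffle(shuffled)
    return {
        name: f"sub-{index:0{width}d}"
        for index, name in enumerate(shuffled, start=1)
    }


def write_key_and_participants(mapping, key_path, participants_path):
    """Write the linking key and the shareable ID list to separate files."""
    # The key file links names to IDs and belongs on encrypted storage only.
    with open(key_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["name", "participant_id"])
        for name, pid in sorted(mapping.items(), key=lambda item: item[1]):
            writer.writerow([name, pid])
    # The participants file contains IDs only and can accompany the data.
    with open(participants_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["participant_id"])
        for pid in sorted(mapping.values()):
            writer.writerow([pid])


if __name__ == "__main__":
    mapping = assign_participant_ids(["Participant A", "Participant B"])
    write_key_and_participants(mapping, "key.tsv", "participants.tsv")
```

Writing the two files in one step, but to different locations, reduces the chance that a name is ever saved alongside the imaging data by mistake.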

Research data archives

There are national, international, and domain-specific archives that meet international standards for archiving research data and making it accessible. UiO’s researchers can choose the archiving solutions that are most appropriate to their discipline and that meet the conditions of applicable legal frameworks. Depositing data resources within a trusted digital archive can ensure that they are curated and handled according to best practices in digital preservation.

Neuroimaging is moving toward open science, but several hurdles remain. These include policies regarding ownership of the data, individual attitudes toward sharing and ownership, fears that errors will be revealed, the resources involved in curating, storing and sharing data, and concerns about anonymity (Nichols et al., 2017).

Some archival resources for MRI data are:

Re3data.org (a global list of archives)
Zenodo (EU’s archive)
OpenfMRI (domain specific)
OpenNeuro (domain specific)
Neurodata (domain specific)
Neurovault (domain specific)
The fMRI Data Center (domain specific)
NIRD/Sigma 2 (national archives)
DataverseNO (national archives)

For more information, see our guide Data Management: DMPs & Best Practices.

By Elian E. Jentoft, Andreas Voldstad & Rene Skukies

Published Aug. 31, 2020 6:38 PM - Last modified Aug. 31, 2020 11:18 PM