7. Data sharing

When sharing (or intending to share) data, scientists should keep a few things in mind, both to make their colleagues' lives easier and to protect the privacy of participants.

Metadata

The Brain Imaging Data Structure (BIDS) (Pernet et al., 2018) is a format for sharing neuroimaging data using agreed-upon standards created by the neuroimaging community. BIDS offers a systematic way to organize data into folders with dedicated names, accompanied by text files that store metadata, either as tab-separated values (.tsv) files or JavaScript Object Notation (.json) files. We encourage the EEG community to share data using this structure, as it facilitates communication, increases reproducibility and makes it easier to develop data analysis pipelines. See our handbook Structuring Data with BIDS for more information.
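As an illustration of the folder-plus-text-file layout described above, here is a minimal sketch in Python. The function name create_bids_skeleton, the dataset name, the task label and the BIDS version string are all placeholders chosen for this example. The result is only a skeleton and is not guaranteed to pass a BIDS validator, since a real dataset needs the recordings themselves and more complete metadata.

```python
import csv
import json
from pathlib import Path


def create_bids_skeleton(root, participant_labels, task="rest"):
    """Create a minimal BIDS-style folder tree with metadata text files.

    All names and values are illustrative placeholders, not a complete dataset.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)

    # Dataset-level metadata stored as JSON (placeholder values).
    description = {"Name": "Example EEG study", "BIDSVersion": "1.8.0"}
    (root / "dataset_description.json").write_text(json.dumps(description, indent=2))

    # Participant-level metadata stored as tab-separated values, one row each.
    with open(root / "participants.tsv", "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["participant_id"])
        for label in participant_labels:
            writer.writerow([f"sub-{label}"])

    # One eeg/ folder per participant, with a JSON sidecar for the recording.
    sidecars = []
    for label in participant_labels:
        eeg_dir = root / f"sub-{label}" / "eeg"
        eeg_dir.mkdir(parents=True, exist_ok=True)
        sidecar = eeg_dir / f"sub-{label}_task-{task}_eeg.json"
        sidecar.write_text(json.dumps({"TaskName": task}, indent=2))
        sidecars.append(sidecar)
    return sidecars
```

Calling create_bids_skeleton("my_study", ["01", "02"]) would produce folders sub-01/eeg and sub-02/eeg next to participants.tsv and dataset_description.json. The recorded EEG files would then be placed alongside each sidecar.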

Research data archival

There are national, international and domain-specific archives that meet international standards for archiving research data and making it accessible. UiO’s researchers can choose the archiving solutions that are most appropriate to their discipline and that meet the conditions of applicable legal frameworks. Depositing data resources in a trusted digital archive helps ensure that they are curated and handled according to best practices in digital preservation.

Some archival resources are:

  • Re3data.org (a global list of archives)
  • Zenodo (EU’s archive)
  • NSD (national archives)
  • NIRD/Sigma 2 (national archives)
  • DataverseNO (national archives)

For more information, see our guide Data Management: DMPs & Best Practices.

Anonymization

EEG data itself (as well as the related behavioral data) does not constitute sensitive data. However, it is important to anonymize data from the start of data collection by assigning each participant a participant number and keeping any identifying information, such as name, address, phone number, birthday or national identification number, separate from the EEG data. Names, contact information and participant IDs should not be stored together. Documents that link ID and name should be stored on encrypted storage media, and those devices should be kept in locked cabinets, separate from the data. These steps are even more important when working with patient populations.
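The separation between participant numbers and identifying information can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function names, the "sub-001" numbering scheme and the field names are assumptions made for the example. The code does not encrypt anything itself, so the key file path is assumed to point to an encrypted storage medium.

```python
import csv
from pathlib import Path


def assign_participant_numbers(people, start=1):
    """Split identifying records into a linking key and data-side IDs.

    `people` is a list of dicts with identifying fields (hypothetical layout).
    """
    key_rows = []         # identity <-> ID link, for encrypted storage only
    participant_ids = []  # the only label that accompanies the EEG data
    for offset, person in enumerate(people):
        participant_id = f"sub-{start + offset:03d}"
        key_rows.append({"participant_id": participant_id, **person})
        participant_ids.append(participant_id)
    return key_rows, participant_ids


def write_key_file(key_rows, key_path):
    """Write the linking key as CSV to a location separate from the data."""
    if not key_rows:
        raise ValueError("no participants to write")
    key_path = Path(key_path)
    with open(key_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(key_rows[0].keys()))
        writer.writeheader()
        writer.writerows(key_rows)
    return key_path
```

Only the returned participant IDs would be used when naming and storing EEG files. The key rows would be written once, to the encrypted device, and never copied into the data folder.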

MRI or fMRI data used in conjunction with EEG data is another story. Neuroimaging data is inherently sensitive and requires safe handling. Data is removed from the scanner on password-protected, encrypted hard drives. When transferring data from the scanner to the external encrypted hard drive, always remember to check the box for anonymization at the scanner console and designate a participant number or alias instead. This is because DICOM files have headers that contain identifying patient data. If anonymization is not performed at the source, the DICOM headers must be anonymized by hand, which can be time-consuming. Storage of neuroimaging data is permitted only in TSD or Lagringshotellet at UiO, and data from some patient groups may only be stored in TSD. Prior to data archival and sharing, neuroimaging data must be defaced. Defacing can impede locating the nasion, so it is often done only in the context of data sharing.

Creating a Data Management Plan for EEG Research

Now that you are familiar with some of the data-related topics and concerns to consider when conducting an EEG study, you may feel better equipped to complete a data management plan. The following questions, specific to EEG data, are meant to guide the creation of such a document. For a more in-depth exploration of how to create a data management plan, please see the HTD guide Data Management: DMPs & Best Practices.

  • Who is the study’s P.I.? What other team members will be involved in data collection and analysis?
  • What software will you use for stimuli/experiment presentation and behavioral data collection?
  • What software do you foresee using for analysis? 
    • Knowing what software will be used ahead of data collection will help you to make an appropriate data management plan from the start.
  • How will your data be handled in the curation phase, once the study is over? What data repository will you use? What requirements do they have for data management? Can you maintain this standard throughout the study? What will data curation cost?
    • Many data repositories have requirements for file structure and naming conventions, as well as the file types that are preferred. If you are aware of these requirements beforehand, you can save time once the time for curation arrives by using those guidelines over the course of the study. You will need to know and plan for the costs that will be incurred for data sharing and curation when applying for funding.
  • Will you share your data? Under what license will you share your data? What limitations will be placed upon access to your data?
    • For a comprehensive description of the different licenses that can be applied to data sharing and usage, see https://www.ucl.ac.uk/library/research-support/research-datamanagement/licenses-data-sharing-creative-commons
  • How will you pilot your project? What will be done with the data from the pilot study?
  • What file types do your behavioral data programs generate?
  • What EEG system will be used in data collection? What file types does it generate?
  • What electrode setup will you use?
  • How many participants will your study need? Are any of them in a patient group?
    • Special considerations must be taken when handling data related to studies involving patient groups to ensure anonymity.
  • How many runs or sessions will the study have per participant?
  • Is the study longitudinal? How often will participants be called to return? How will the data be maintained over the course of the study? Who will be responsible for the data? Will you have to periodically re-apply for ethical approval?
    • Some studies will need to re-apply to the applicable ethics boards every 5-10 years if they are longitudinal studies.
  • Will you collect sensitive classes of data?
    • Sensitive classes of data include identification numbers, birthdates, neuroimaging, information related to health conditions, information about race/ethnicity, political affiliation, sexual orientation or in some cases gender identity/biological sex (for example when working with transgender or intersex populations).
  • What ethics committees will you need to apply to? What are their data management requirements?
    • Most studies conducted under PSI will be required to submit applications to REK, NSD or both. Their data management requirements are detailed on the committees’ websites.
  • Who is funding the study? What are the funder’s requirements for data management plans?
    • Most funding entities require at least a basic data management and data sharing plan to accompany funding applications.
  • What information will be provided to participants in advance of participation and in what form? What are your plans for collecting informed consent? How will you store the consent forms once they are collected?
  • How will you store the data key linking participants to their data?
  • If you will collect sensitive data, how will it be stored and analyzed? What protections will be put in place to ensure anonymity is maintained?
    • At UiO, sensitive data can be stored and analyzed in TSD or on Lagringshotellet, depending on the degree of sensitivity of the data.
  • How will you store non-sensitive data? In what environment do you plan to analyze it?
  • Who will have access to the data? Who is responsible for transferring data to storage once it is collected?
  • What file naming conventions will be used for the various file types that make up your dataset?
  • Will you use in-house written code? What programming language will you use? Will this code be made available to the scientific community? If so, where will it be made available?
  • Will you maintain a lab notebook? How will the lab notebook be used and who will have access?
    • Digital lab notebooks are now quite common, but can raise questions of data security.
  • Will you convert proprietary file types to more standardized file types prior to sharing your data?
  • How will the findings be disseminated?
  • How will you record metadata related to the study for future sharing?
    • Metadata is data about your data that can help researchers who later want to access your data to understand how the study was conducted. It may also help members of your current team understand the data they are working with.
  • Who will be responsible for the data in the long term once the study is completed?
    • This is important to consider for all studies, but especially for longitudinal studies. Personnel and staff may leave your institution or be difficult to contact in the future. A primary contact person who is responsible for the data should be designated. If they leave the institution or retire and a new responsible party is designated, the repository should be updated with the new contact information.
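For the question above about file naming conventions, one option is to write the convention down as a pattern and check file names against it before data is archived. The sketch below assumes a BIDS-like convention with sub, ses, task and run labels. The pattern, the list of allowed suffixes and the function name parse_name are illustrative choices, not a complete implementation of any standard.

```python
import re

# Hypothetical convention: sub-<label>[_ses-<label>]_task-<label>[_run-<index>]_<suffix>.<ext>
NAME_PATTERN = re.compile(
    r"sub-(?P<sub>[0-9A-Za-z]+)"
    r"(?:_ses-(?P<ses>[0-9A-Za-z]+))?"
    r"_task-(?P<task>[0-9A-Za-z]+)"
    r"(?:_run-(?P<run>[0-9]+))?"
    r"_(?P<suffix>eeg|events|channels)"
    r"\.(?P<ext>[a-z]+)"
)


def parse_name(filename):
    """Return the labels encoded in a file name, or None if it breaks the convention."""
    match = NAME_PATTERN.fullmatch(filename)
    return match.groupdict() if match else None
```

A name such as sub-01_ses-02_task-oddball_run-1_eeg.vhdr is accepted and split into its labels, while a name such as participant1_final.vhdr is rejected. Running a check like this at the time of data transfer catches inconsistent names early, when they are still easy to fix.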

 


By Rene S. Skukies, Elian E. Jentoft & Olga Asko
Published Aug. 21, 2020 6:27 PM - Last modified Aug. 31, 2020 10:44 PM