Anonymisation/De-Identification in the Transcribing Process

It can be challenging to determine what constitutes too much or too little information for identifying an individual. With the advent of readily available artificial intelligence, it's becoming even more crucial to adequately de-identify data.

Information you should consider removing from the data material includes personal names (of participants, third parties, pets, etc.), age, irrelevant identifying stories, place names, and the names of workplaces, clubs, or other communities and interests the person is associated with. You should also consider removing sociolect, dialect, and repeated use of certain words and expressions if these may be identifying in your case. Stories and experiences can be important for your data. Instead of removing them, you can generalize them so they are not too detailed (for example, 'Sande in Vestfold' could be changed to 'a small place in Eastern Norway'). This process is especially important if you are considering sharing interview data with consent (as part of an open science practice). There is no definitive answer as to what is too little or too much information. You must consider all factors that could contribute to identifying a person.
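If you transcribe digitally, the generalization step can be partly supported by a small script. The following is a minimal sketch in Python, not a complete de-identification tool. The replacement table, the name 'Kari', and the function name generalize are hypothetical and chosen only for illustration; you would need to build your own list of terms for each transcript.

```python
import re

# Hypothetical replacement table: identifying term -> generalized placeholder.
# Both the name "Kari" and the placeholders are illustrative assumptions.
REPLACEMENTS = {
    "Sande in Vestfold": "a small place in Eastern Norway",
    "Kari": "[participant]",
}

def generalize(text, replacements):
    """Replace each listed term with its placeholder (whole words, any case)."""
    # Handle longer terms first so multi-word terms win over shorter overlaps.
    for term in sorted(replacements, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        # A function is used as replacement so the placeholder is inserted literally.
        text = pattern.sub(lambda match, new=replacements[term]: new, text)
    return text

line = "Kari grew up in Sande in Vestfold."
print(generalize(line, REPLACEMENTS))
# prints: [participant] grew up in a small place in Eastern Norway.
```

A script like this only catches the terms you have listed. Dialect, characteristic expressions, and identifying stories still require a manual read-through of the transcript.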

Example: Your participant mentions that they volunteer at a youth club in the municipality you're writing about. If a Google search shows that the municipality has only one youth club, with 20 volunteers, the participant is already one of just 20 people, and each additional detail (such as age or workplace) makes identification considerably more likely.
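The way combined details narrow down a group can be illustrated with a short Python sketch. It assumes you keep a simple table of background attributes for your own participants. The records, attribute names, and function names below are made up for illustration and do not describe real data.

```python
from collections import Counter

# Hypothetical, made-up participant attributes kept alongside the transcripts.
participants = [
    {"age_group": "20-29", "activity": "youth club volunteer"},
    {"age_group": "20-29", "activity": "football team"},
    {"age_group": "20-29", "activity": "football team"},
    {"age_group": "30-39", "activity": "football team"},
]

def group_sizes(records, keys):
    """Count how many records share each combination of the given attributes."""
    return Counter(tuple(record[key] for key in keys) for record in records)

def unique_combinations(records, keys):
    """Return the attribute combinations that match exactly one record."""
    return [combo for combo, count in group_sizes(records, keys).items() if count == 1]

# Age group alone singles out one participant ...
print(unique_combinations(participants, ["age_group"]))
# ... and adding a second attribute singles out two.
print(unique_combinations(participants, ["age_group", "activity"]))
```

A combination that matches only one person is a sign that the details should be generalized or removed. A larger group does not by itself guarantee anonymity, because a reader may hold information that is not in your table.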

For more information: General about privacy – What are de-identified and anonymous data?

Published May 23, 2024 12:25 PM - Last modified June 3, 2024 11:25 AM