University of Manchester
37 files

Research data for "Subjective data models in bioinformatics: Do wet-lab and computational biologists comprehend data differently?"

Version 2 2023-04-24, 15:58
Version 1 2022-08-26, 13:07
posted on 2023-04-24, 15:58 authored by Yochannah Yehudi, Carole Goble, Caroline Jay, Lukas Hughes-Noehrer

Subjective data models dataset

This dataset comprises data collected from participants in a study of how people working with biological data perceive that data, and whether this perception aligns with a person's experiential and educational background. We call an individual's concept of what data looks like a "subjective data model".

Todo: link paper/preprint once published.

Computational python analysis code: and


  • **Transcripts** of the recorded sessions are attached and have been verified by a second researcher. These files are all in plain text .txt format. Note that participant 3 did not agree to sharing the transcript of their interview.
  • **Interview paper files** This folder has digital and photographed versions of the files shown to the participants for the file mapping task. Note that the original files are from the NCBI and from FlyBase.
  • Videos and stills from the recordings have been deleted in line with the Data Management Plan and Ethical Review.
  • `anonymous_participant_list.csv` shows which files have associated transcripts (not all participants agreed to share transcripts), the order in which Tasks A and B were presented, the date of the interview, and which entities, if any, participants added to the set provided. See the paper's methods section for why entities were added to the set.
  • `cards.txt` is a full list of the cards presented in the tasks.
  • `background survey` and `background manual annotations` contain selected survey data about participant background, plus manual additions where necessary, e.g. to interpret free text.
  • `codes.csv` shows the qualitative codes used within the transcripts.
  • `entry_point.csv` is a record of participants' identified entry points into the data.
  • `file_mapping_responses` shows a record of responses to the file mapping task.
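
For readers who want a quick look at the tabular files listed above, the following is a minimal sketch of one way to inspect them in Python. It is not the authors' analysis code. The helper names `load_csv_rows` and `summarise` are hypothetical, the column names are not documented on this page so none are assumed, and UTF-8 encoding is an assumption.

```python
import csv
from pathlib import Path


def load_csv_rows(path):
    """Read a CSV file and return (column_names, rows), with each row as a dict.

    Assumption: the file is UTF-8 encoded and its first line is a header row.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        return list(reader.fieldnames or []), rows


def summarise(path):
    """Return the file name, its column names, and the number of data rows."""
    columns, rows = load_csv_rows(path)
    return {"file": Path(path).name, "columns": columns, "n_rows": len(rows)}


if __name__ == "__main__":
    # File names are taken from the list above; files that are not present
    # in the working directory are skipped.
    for name in ("anonymous_participant_list.csv", "codes.csv", "entry_point.csv"):
        if Path(name).exists():
            print(summarise(name))
```

Printing the column names first is a deliberate choice: it lets you check the real headers in your downloaded copy before writing any analysis that depends on them.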


Research ethics approval number

