There is growing support for the stance that patients and research participants should have better and easier access to their raw (uninterpreted) genomic sequence data in both clinical and research contexts.
We review legal frameworks and literature on the benefits, risks, and practical barriers of providing individuals access to their data. We also survey genomic sequencing initiatives that provide or plan to provide individual access. Many patients and research participants expect to be able to access their health and genomic data. Individuals have a legal right to access their genomic data in some countries and contexts. Moreover, increasing numbers of participatory research projects, direct-to-consumer genetic testing companies, and now major national sequencing initiatives grant individuals access to their genomic sequence data upon request.
Drawing on current practice and regulatory analysis, we outline legal, ethical, and practical guidance for genomic sequencing initiatives seeking to offer interested patients and participants access to their raw genomic data.
The quantity of genomic data generated about individual patients, research participants, and consumers is rapidly increasing. The Global Alliance for Genomics and Health (GA4GH), an international public-private consortium, develops technical standards and frames policy to facilitate the sharing of health and genomic data between health care, research, and individuals. Analyzing and sharing these data leads to novel health insights and opportunities, but it raises ethical questions about the flow of data back to individuals. Debate has centered on what types of individual findings should be reported from testing or research and has tended to focus on the clinical validity and actionability of results, and whether or not individuals want to receive them [3, 4]. A distinct but equally important question is whether or not patients or research participants should be able to access their “raw” (uninterpreted) genomic sequence data [5, 6].
A task team of the GA4GH on individual access was established to explore how genomic data generated in both clinical and health research contexts can be more readily shared with individual patients and participants. Research participants primarily want data that is clinically relevant to them or their families [7, 8]. They also attach intrinsic value to genomic data and expect to be able to access data that “belongs to them.” Of 4140 individuals participating in an ongoing international GA4GH survey, 61% would want to be able to access their raw sequence data (with most intending to use the data as the basis of further exploration). Our task team envisages a standard system that allows interested patients and participants to “pull” their genomic data from clinical laboratories or research projects on request. Processes allowing individuals to access uninterpreted data are different from policies or processes on the return of individual findings. The latter are premised on the information’s clinical relevance and/or actionability. Where uninterpreted data are provided only on request, the right to access them does not undermine the right not to know. Even so, there are concerns over the accuracy and utility of uninterpreted data, and fears that misuse by individuals or third party services may result in psychological harms or wasted health care resources. Regardless, various research initiatives are opting to provide individual access, most notably the US “All of Us” and UK 100,000 Genomes initiatives, and participatory research projects such as the Personal Genome Project. Drawing on a review of current practice and analysis of the legal right to access personal health data, this paper supplies practical guidance for clinical laboratories or research projects seeking to provide participants access to uninterpreted genomic data.
We recognize that it may not always be feasible or appropriate to provide individual access, especially in some (e.g., legacy) research contexts. We predict, however, that individual access will become expected or required as genomics becomes more clinically oriented and the public begins to insist on participatory data governance.
The projects providing or planning to provide individual access to uninterpreted genomic data are listed in Table 1 (adapted from ). We were able to identify only one such genomic sequencing project outside of the USA. Data types and formats may differ depending on the context, sequencing platform, analysis pipelines, and evolution of common file formats. Genomic data formats currently provided to participants include reduced BAM, VCF, and FASTQ. The usefulness of the data is enhanced where it is accompanied by rich, standard metadata. Genomic sequencing initiatives may also provide individuals access to their associated health data (phenotypic, clinical, environmental). The choice of file format and the choice of when to provide access should be considered from the perspective of both the project and the individual.
A legal right to access?
In many countries, individuals have a legal right to access their personal data held by government bodies and commercial entities [16,17,18]. A general right to access personal data is included in the EU General Data Protection Regulation (GDPR) (art 15), which comes into force in May 2018. This internationally recognized right empowers individuals to ascertain what data these entities have about them and how their personal data are used. The right also enables individuals to ensure their data are accurate, up to date, and used in a transparent, fair, and lawful manner. Upon request, individuals must be provided with a copy of their data in a reasonable timeframe, in a useful format, and for a reasonable cost. There is considerable uncertainty and jurisdictional variation over whether or not genetic data is legally considered inherently identifiable. Regardless, genomic data will still fall under broad definitions of personal data used in many jurisdictions (e.g., GDPR art 4(1)), as long as it “relates to” an identifiable individual, which is increasingly the case for linked genomic data in clinical, commercial, and translational research contexts.
Similarly, patients have a legal right to access their health record (, art. 19). This ensures transparency in the physician-patient relationship and allows patients to correct inaccurate information (which may be used by third parties such as insurers) or transfer records when changing physicians. Access to health data also empowers patients to take an active role in their health care. Though raw laboratory data are not typically considered part of the health record, this is changing for genomics. In the USA, recent legislative amendments and interpretive guidance extend the right to access under the US federal health privacy law to a broad range of records that may be used to make decisions about individuals, including information generated as part of a laboratory test. For genetic sequencing, this might include “the full gene variant information generated by the test”; for genomic sequencing, the raw sequence data. Genomic sequencing initiatives providing a right to access should indicate this in the consent form, along with the basic information on what is available and how to request access. Consent forms should clearly distinguish between access rights and other communication policies, such as the return of individual findings of clinical relevance. As we discuss below, more detailed guidance can be provided to those individuals requesting access at the point of implementation.
The right of access is generally subject to narrow exceptions: where it would reveal confidential information (about other patients or health professionals), risk serious harm to the individual, or involve disproportionate effort. Providing an individual access to her own genomic data would not generally breach professionals’ legal duties of confidentiality to third parties or present serious risks to the individual. An important legal distinction for research contexts is that many countries limit individual access to research data, usually to protect commercial interests and scientific validity. It is often unclear, however, if research exceptions in general access to information provisions were meant to restrict participants from accessing their own data. International and national research ethics guidelines are largely silent about individual access to health data. This is surprising, given that many incorporate other data protection principles [26,27,28]. Some mention that participants have the right to access their clinical data on demand, unless temporary or permanent non-disclosure is approved by a research ethics committee with reasons (, Table 2). Regardless, research exceptions are unlikely to apply as sequencing moves to clinical or hybrid clinical-research contexts. Researchers seeking to provide individuals with access to genomic data may also have to contend with clinical services, clinical laboratory, and/or medical product regulations. US regulations, for example, require any test whose results are used for clinical decision-making to be performed in a certified laboratory. While these restrictions may block the return of clinically relevant individual findings from research laboratories, it is not clear why they would also apply to uninterpreted genomic data.
In conclusion, it is likely that clinical laboratories have, or will soon have, a legal obligation to provide individuals their raw genomic data upon request. While it is less likely that a legal right applies in research contexts, we propose that projects should still consider providing a default right of participants to access their own individual-level genomic data upon request. Any exceptions to access should be transparently stated, clearly justified, and approved by a research ethics committee or similar body. If access compromises the primary objective of the study, it could be withheld until the objective is achieved. In both research and clinical contexts, data stewards providing individual access should make efforts to ensure data is of high quality and interoperable. Standard use agreements could accompany access explaining that the data is provided “as is,” without implied or express warranties (e.g., that the data is fit for a particular purpose––namely clinical interpretation or decision-making), and disclaiming liability for any harm resulting from the individual’s use of the data.
Handling ethical and practical concerns
There are many good reasons for researchers to provide access to individual-level uninterpreted data. Empirical studies show that many people believe that their genomic data belongs to them (that they have a right to access, use, and distribute their data as they see fit), even if this contradicts laws or consent forms [32, 33]. Providing access may also build trust and incentivize participation. Moreover, patients are often experts in their condition and may be more motivated to determine the relevance of their health data than researchers focused on discovery. Access will enable curious citizen scientists to explore the myriad meanings of their DNA. Research may even thrive when individuals themselves share data with patient-led registries [36, 37], research projects, or public repositories like openSNP [38, 39] or Open Humans. The usefulness of raw genomic data for the individual will also increase with improvements in data quality and interoperability, expansion of the knowledge base of genotype-phenotype relationships, and the availability of reliable third party services. The more data that is held by individuals, the more portals to connect users to research initiatives [40, 41]; interpretation services to provide ancestry, genealogy, and health or wellness information; and tools to facilitate citizen science and self-driven interpretation.
There are, however, concerns that third party interpretation services may provide uncertain, potentially inaccurate information of little benefit, which may lead to anxiety or unnecessary medical follow-up. To promote responsible use, data stewards could provide individuals who request access with information about the limitations of data quality, the limitations of self-directed or third party interpretations, and the importance of secure storage and responsible sharing. In particular, clarity is needed that the data should not be used as a basis for clinical interpretation or decision-making without seeking medical advice and confirmatory testing in an accredited laboratory. User portals could facilitate download and communication, or even direct transfer/donation to trusted storage platforms or research projects. Data stewards should also ensure access processes are privacy protective and secure. These processes require basic authentication (is this actually the participant?); tracking (is this actually the participant’s genome?); and a means of re-identifying a genome (how do I break the code?). Researcher confidentiality may be breached if requestors are not properly authenticated, or if data from the wrong genome is returned. Privacy concerns persist after data has been accessed. Individuals may be ill-prepared to keep their own data secure, and third party services may not offer comparable privacy and security protections. Again, research projects could provide individuals with tips on how to safeguard their data. While researchers should do their best to encourage individuals to store and use their data carefully, the ultimate responsibility to do so will rest with the individual.
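As an illustration only, the authentication, tracking, and re-identification steps described above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a description of any initiative's actual system: the names AccessDesk and SampleRecord, the token scheme, and the in-memory lookup tables are all hypothetical, and a production service would more likely rely on an established identity provider and managed key storage.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleRecord:
    """Hypothetical tracking record kept by the data steward."""
    sample_code: str   # pseudonymous code attached to the genome file
    file_sha256: str   # checksum recorded when the file was generated


class AccessDesk:
    """Hypothetical workflow for handling an individual access request."""

    def __init__(self, key_table, samples, credentials):
        self._key_table = key_table      # participant ID -> sample code ("the code")
        self._samples = samples          # sample code -> SampleRecord
        self._credentials = credentials  # participant ID -> SHA-256 hex of access token

    def authenticate(self, participant_id, token):
        # Is this actually the participant? Compare token hashes in constant time.
        expected = self._credentials.get(participant_id)
        if expected is None:
            return False
        supplied = hashlib.sha256(token.encode()).hexdigest()
        return hmac.compare_digest(expected, supplied)

    def reidentify(self, participant_id):
        # How do I break the code? Map the participant to the pseudonymous sample.
        return self._samples[self._key_table[participant_id]]

    def verify_file(self, record, file_bytes):
        # Is this actually the participant's genome? Check the recorded checksum.
        actual = hashlib.sha256(file_bytes).hexdigest()
        return hmac.compare_digest(record.file_sha256, actual)

    def release(self, participant_id, token, storage):
        # storage is an assumed mapping of sample code -> raw file bytes.
        if not self.authenticate(participant_id, token):
            raise PermissionError("requestor could not be authenticated")
        record = self.reidentify(participant_id)
        file_bytes = storage[record.sample_code]
        if not self.verify_file(record, file_bytes):
            raise ValueError("file does not match the tracked sample")
        return file_bytes
```

In this sketch, data are released only when all three checks pass, so an unauthenticated requestor or a mismatched file stops the request before anything is returned. Keeping the participant-to-sample key table separate from the data store reflects the coded (pseudonymized) storage common in research settings; that separation is a design assumption here, not a requirement stated in the text.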
There are also fears that access may divert resources away from clinical or research activities. Moreover, individuals seeking professional interpretation of their data could be a drain on primary care and genetic services within the health system. This could waste public health system resources and unfairly divert resources to the most proactive, healthy, and educated individuals. Providing access should not, however, necessitate expensive interpretation or counseling, as may be the case for the return of individual results. Costs would be limited to basic tracking, authentication, and communication processes––already common in many laboratory contexts and clinical practices––and download costs.
Currently, many researchers feel they should provide access to individual-level data to patients and participants, but do not have the appropriate resources to do so. To address this problem, research funding bodies could help by providing resources, infrastructure, and incentives. Instead of each project establishing its own system, common data management platforms could be developed to enable individual access (such as those already offered to researchers by direct-to-consumer companies). Data sharing repositories enabling broad research community access could be modified to enable individual access. Individual access endorsements or badges could recognize laboratory or researcher efforts to share data with interested participants and patients.
We provide a summary of recommendations for sequencing initiatives providing individual access to uninterpreted genomic data in Table 2. More data and experience are needed to definitively refute paternalist concerns about individuals managing their own genomic data. This will only happen if researchers do what they do best: experiment in a responsible manner to understand how to most appropriately support and enable individual access to genomic data. Here, the variable to tweak is not the data analysis, but the participant communication pipeline. The experiment is off to a promising start.
GA4GH: Global Alliance for Genomics and Health
Chatzimichali EA, Brent S, Hutton B, Perrett D, Wright CF, Bevan AP, et al. Facilitating collaboration in rare genetic disorders through effective matchmaking in DECIPHER. Hum Mutat. 2015;36:941–9.
Jarvik GP, Amendola LM, Berg JS, Brothers K, Clayton EW, Chung W, et al. Return of genomic results to research participants: the floor, the ceiling, and the choices in between. Am J Hum Genet. 2014;94:818–26.
Wolf SM, Crock BN, Van Ness B, Lawrenz F, Kahn JP, Beskow LM, et al. Managing incidental findings and research results in genomic research involving biobanks and archived data sets. Genet Med. 2012;14:361–84.
Holm IA, Iles BR, Ziniel SI, Bacon PL, Savage SK, Christensen KD, et al. Participant satisfaction with a preference-setting tool for the return of individual research results in pediatric genomic research. J Empir Res Hum Res Ethics. 2015;10:414–26.
Fernandez CV, Bouffet E, Malkin D, Jabado N, O’Connell C, Avard D, et al. Attitudes of parents toward the return of targeted and incidental genomic research findings in children. Genet Med. 2014;16:633–40.
Facio FM, Eidem H, Fisher T, Brooks S, Linn A, Kaphingst KA, et al. Intentions to receive individual results from whole-genome sequencing among participants in the ClinSeq study. Eur J Hum Genet. 2013;21:261–5.
Sanderson SC, Linderman MD, Suckiel SA, Zinberg R, Wasserstein M, Kasarskis A, et al. Psychological and behavioural impact of returning personal results from whole-genome sequencing: the HealthSeq project. Eur J Hum Genet. 2017;25:280–92.
Allen C, Gabriel J, Norkunas Cunningham T, Flynn M, Wang C. The impact of raw DNA availability and corresponding online interpretation services: a mixed-methods study. Transl Behav Med. 2017;(in press).
Badalato L, Kalokairinou L, Borry P. Third party interpretation of raw genetic data: an ethical exploration. Eur J Hum Genet. 2017;25:1189.
We would like to thank Gratien Dalpé, Academic Associate at the Centre of Genomics and Policy, McGill University, for his assistance preparing the final manuscript.
Adrian Thorogood and Erika Kleiderman were funded by the CanSHARE project, which is supported by Genome Quebec, Genome Canada, the government of Canada, the Ministère de l’Économie, Innovation et Exportation du Québec, and the Canadian Institutes of Health Research (fund no. 141210).
Availability of data and materials
Data sharing is not applicable to this article as no datasets were generated or analyzed during the current study.
Authors and Affiliations
Centre of Genomics and Policy, Department of Human Genetics, McGill University Faculty of Medicine, Montreal, Quebec, H3A 0G1, Canada
Adrian Thorogood & Erika Kleiderman
Icahn School of Medicine at Mount Sinai, New York, USA
Department of Political Science, University of Vienna, Vienna, Austria
Society and Ethics Research, Connecting Science, Wellcome Genome Campus, Hinxton, UK
Icahn Institute for Genomics & Multiscale Biology, New York, USA
University of Washington, Seattle, USA
Cambridge Precision Medicine, Cambridge, UK
Genetic Alliance, Washington DC, USA
National Human Genome Research Institute, National Institutes of Health, Bethesda, USA
Laura Lyman Rodriguez
Newcastle University, Newcastle upon Tyne, UK
Department of Global Health & Social Medicine, King’s College London, London, UK
Faculty of Education, University of Cambridge, Cambridge, UK
on behalf of the Participant Values Task Team of the Global Alliance for Genomics and Health
JB led the work on the survey of projects. AT and SN led the comparative legal review. All authors contributed substantially to the conceptual design of the paper; identification of legal, ethical, and practical challenges and to the recommendations; and drafting of the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Thorogood, A., Bobe, J., Prainsack, B. et al. APPLaUD: access for patients and participants to individual level uninterpreted genomic data.
Hum Genomics 12, 7 (2018). https://doi.org/10.1186/s40246-018-0139-5