Human genomic diversity, viral genomics and proteomics, as exemplified by human papillomaviruses and H5N1 influenza viruses
© Henry Stewart Publications 2009
Received: 28 January 2009
Accepted: 28 January 2009
Published: 1 July 2009
The diversity of hosts, pathogens and host-pathogen relationships reflects the influence of selective pressures that fuel diversity through ongoing interactions with other rapidly evolving molecules in the environment. This paper discusses specific examples illustrating the phenomenon of diversity of hosts and pathogens, with special reference to human papillomaviruses and H5N1 influenza viruses. We also review the influence of diverse host-pathogen interactions that determine the pathophysiology of infections, and their responses to drugs or vaccines.
The availability of complete genome sequences and the wealth of large-scale biological datasets provide an unprecedented opportunity to elucidate the genetic bases of human diseases and host-pathogen interactions. The diversity of hosts, pathogens and host-pathogen relationships reflects the influence of selective pressures that fuel diversity through ongoing interactions with other rapidly evolving molecules in the environment. Such influences add another source of genetic adaptability as cells adjust to new environments and out-manoeuvre pathogenic threats. The study of genetic variation in pathogens and hosts has practical significance for developing strategies to combat and control infectious diseases. Vaccines based on highly polymorphic antigens may be confounded by allelic restriction of the host immune response. In addition, the study of the distribution of genomic polymorphisms among different hosts may provide information on responses to drugs or vaccines. Based on the concept that host damage is the most relevant outcome of the host-pathogen interaction, we need to understand both host and pathogen polymorphisms better in drug or vaccine design. A deeper knowledge of the diversity and nature of pathogens will provide valuable insights into genetic markers that may be useful for detection, identification and forensics. The ability to discriminate between virulent pathogens and their counterparts that are either less virulent or not virulent is another major challenge. It is important to discover genetic factors that mediate the virulence process in order to devise novel methods to prevent or treat disease. Similarly, understanding how the host responds to microbial invasion, and how the pathogen evades or manipulates the immune response to subvert the host, will contribute to the development of vaccines and other prophylactic strategies.
Concomitantly, the unravelling of associations between hosts and pathogens will be highly relevant to the modelling of the population biology of multi-host pathogens and their impact on co-infections. This paper reviews and discusses specific examples concerning the issue of diversity of hosts and pathogens, and the influence of diverse host-pathogen interactions that determine the pathophysiology of infections, and their responses to drugs or vaccines.
Human genome diversity and 'SNPshots'
The genetic blueprint of an individual determines not only disease susceptibility, but also his/her response to drug treatment. Numerous genes are involved in drug response and toxicity, introducing a daunting level of complexity in the search for candidate genes. Thus, genetics, and gene polymorphisms in particular, exerts a significant impact on target discovery.
The HapMap constitutes a catalogue of common genetic variants that exist in humans. It describes what these variants are, where they occur in our DNA and how they are distributed among individuals within a specific population and among populations in different parts of the world. This project provides information that can link genetic variants to the risks for specific illnesses, which can lead to new methods for preventing, diagnosing and treating diseases.[1, 2] Differences in individual bases are by far the most common type of genetic variation. These genetic differences, known as single nucleotide polymorphisms (SNPs), represent DNA sequence variations that occur when a single nucleotide in the genome sequence is altered. SNPs are more common than other types of polymorphisms, and occur at a frequency of approximately one in 1,000 base pairs throughout the genome (including promoter regions and coding and intronic sequences). Some of these differences may alter gene products in ways that confer susceptibility or resistance to diseases, or contribute to disease severity or progression. Although over 99 per cent of human DNA sequences are the same across the human population, DNA sequence variations affect how humans respond to disease; to environmental stresses such as infections, toxins and chemicals; and to drugs and other therapies. Since genetic factors also affect response to drug therapy, SNPs can help to determine why individuals differ in their abilities to absorb or clear certain drugs, as well as to ascertain the mechanisms of adverse drug effects. Moreover, by affecting drug-target proteins such as G-protein-coupled receptors, enzymes, ion channels and proteins involved in detoxification pathways, non-synonymous coding SNPs (cSNPs) (namely substitutions resulting in alterations of encoded amino acids) significantly influence the diverse responses of efficacy and toxicity of therapeutic agents in the human population.
Some non-synonymous cSNPs associated with human disorders disrupt important structural features of the affected protein. For example, a polymorphic variant that disrupts a critical disulphide bond is C260Y in HLA-H, culminating in hereditary haemochromatosis (Figure 1d).
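The distinction between synonymous and non-synonymous cSNPs can be made concrete with a short sketch. The Python example below is illustrative only: it assumes the standard genetic code, and the function name `classify_coding_snp` and the example codons are hypothetical choices that do not come from the studies cited here. The codon pair TGC/TAC is used simply because it encodes a cysteine-to-tyrosine change of the same kind as C260Y.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_coding_snp(ref_codon, alt_codon):
    """Classify a codon change and return (kind, ref_aa, alt_aa).

    kind is 'synonymous' when the encoded amino acid is unchanged,
    'nonsense' when the variant codon is a stop codon ('*'), and
    'non-synonymous' when one amino acid replaces another.
    """
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        kind = "synonymous"
    elif alt_aa == "*":
        kind = "nonsense"
    else:
        kind = "non-synonymous"
    return kind, ref_aa, alt_aa


if __name__ == "__main__":
    # Illustrative codons only: a cysteine codon changed at its second base.
    print(classify_coding_snp("TGC", "TAC"))
    # The same cysteine codon changed at its third (wobble) base.
    print(classify_coding_snp("TGC", "TGT"))
```

Mapping a change of this kind onto a three-dimensional structure, for instance to see whether the affected cysteine takes part in a disulphide bond, requires structural data that this sketch does not attempt to model.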
Human genome diversity and drug responses
The total number of SNPs reported in public SNP databases currently exceeds 9 million. Occasionally, an SNP may actually cause a disease and can therefore be exploited to search for and isolate the disease-causing gene. Since SNPs are genetic variations that occur at regular intervals and at high frequency throughout the human genome, they can be used as markers within the genome. If a particular marker is found to be common among individuals with a particular disease, it suggests that the gene involved is probably located near the marker. This renders SNPs of great value to biomedical research, and for developing pharmaceutical products or diagnostics.
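As a rough illustration of how a marker can be judged "common among individuals with a particular disease", the following Python sketch compares allele counts in cases and controls using an odds ratio and a Pearson chi-square statistic. It is a minimal sketch under stated assumptions: the function name `allele_association` and all counts are hypothetical, and real association studies add corrections (for example, for multiple testing and population structure) that are omitted here.

```python
def allele_association(case_alt, case_ref, control_alt, control_ref):
    """Return (odds_ratio, chi_square) for a 2x2 table of allele counts.

    Rows are cases and controls; columns are the marker (alt) allele and
    the reference allele. The chi-square statistic has 1 degree of freedom.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_totals = [sum(row) for row in table]
    col_totals = [case_alt + control_alt, case_ref + control_ref]

    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (table[i][j] - expected) ** 2 / expected

    odds_ratio = (case_alt * control_ref) / (case_ref * control_alt)
    return odds_ratio, chi_square


if __name__ == "__main__":
    # Hypothetical allele counts, invented purely for illustration.
    odds_ratio, chi_square = allele_association(60, 140, 30, 170)
    # 3.841 is the 5 per cent critical value of chi-square with 1 degree of freedom.
    print(f"odds ratio = {odds_ratio:.2f}, chi-square = {chi_square:.2f}")
    print("marker enriched in cases" if chi_square > 3.841 and odds_ratio > 1
          else "no evidence of enrichment")
```

An association of this kind indicates only that the marker and a disease-related variant tend to be inherited together; it does not by itself identify the causal gene.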
For virtually all medications, inter-patient variability in response to drug therapy is the rule rather than the exception. Tremendous progress has been achieved in understanding the molecular basis of drug action, and in elucidating genetic determinants of disease pathogenesis and drug response.[16, 17] This inter-patient variability is potentially regulated by processes such as drug transport, drug metabolism, cellular signalling pathways (eg G-protein-coupled receptors) and response pathways (eg apoptosis, cell cycle control). Polymorphisms of drug-metabolising enzymes and/or pharmacological targets are frequently associated with adverse drug reactions or failure of efficacy.[17–20] Drug responses may also be modulated by non-genetic factors, however, especially co-medications and co-morbidities.
Thus, SNPs potentially can be applied in the development of individualised medicine and can provide an important source of information for studying the relationship between genotypes and phenotypes of human diseases. A bottleneck occurs when linking information about the variation in human genes to the variation in drug responses (pharmacogenetics) and understanding how interacting systems of genes determine individual drug responses (pharmacogenomics). A fundamental challenge in analysing disease cSNPs is the relative scarcity of alleles that can be mapped to three-dimensional protein structures. In the future, it is envisioned that knowledge of an individual's SNP genotype may provide a basis for assessing susceptibility to a disease and the optimal choice of therapies.
Viral genome diversity exemplified by papillomaviruses and influenza viruses
Viruses are exceptionally diverse in morphology, genetic organisation, replication strategies, virulence and many other characteristics. Viral genome sequences represent a treasure trove of essential information for understanding pathogenesis better, as well as developing novel diagnostics and antiviral therapies. The ability of various medically important viruses to develop high degrees of genetic diversity, and to acquire mutations to escape immune pressures, contributes to the difficulties in vaccine development.
Diversity of human papillomaviruses
There are more than 100 different types of human papillomavirus (HPV), the causative agent of benign papillomas or warts, and a cofactor in the development of carcinomas of the genital tract, skin, head and neck. HPVs are broadly divided into cutaneous and mucosal HPV types. Eight major proteins, designated as early (E) or late (L) gene products, are encoded by the HPV DNA genome. Proteins E1 and E2 are involved in viral replication, as well as the regulation of early transcription. E1 binds to the origin of viral replication (ORI) and exhibits ATPase as well as helicase activity, whereas E2 forms a complex with E1, facilitating its binding to the ORI. Furthermore, E2 acts as a transcription factor that regulates early gene expression by binding to specific E2 recognition sites. E4 plays important roles in promoting the differentiation-dependent productive phase of the viral life cycle. The E5 protein supports HPV late functions and disrupts major histocompatibility complex (MHC) class II maturation. The E6 and E7 oncoproteins are mainly responsible for HPV-mediated malignant cell progression, leading ultimately to invasive carcinoma. Finally, L1 and L2 are the major and minor capsid proteins, respectively, and HPV vaccines based on L1 are already in clinical use.
In 1999, we determined the complete nucleotide sequence of a novel genital HPV type, HLT7474-S, from a female sex worker with a wart virus infection in Singapore. It was designated as candidate HPV type 85 (HPV-85) by the Reference Center for Papillomaviruses, German Cancer Research Center, Heidelberg, Germany. Its genomic organisation and phylogenetic relationships were analysed.[30, 31] The DNA sequence of the L1 open reading frame (ORF) of HPV-85 shares similarities of 78.3 per cent, 78.1 per cent and 78.0 per cent with those of the most closely related known types (HPV types 39, 70 and 45, respectively), thus satisfying the criteria for a new HPV type, which is defined on the basis of a dissimilarity exceeding 10 per cent in the L1 gene. In addition, the E6 and E7 ORFs of HPV-85 exhibit highest percentage similarities of 79.7 per cent and 77.9 per cent to E6 of HPV-18 and E7 of HPV-59, respectively, thus reiterating the relatedness between HPV-85 and known genital HPVs belonging to group A7.
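The type-assignment rule described above can be written out as a small Python sketch. It is illustrative only: the function names `percent_identity` and `is_new_type` are hypothetical, the alignment is assumed to have been produced beforehand by a separate tool, and the gap-handling convention (a gap opposite a base counts as a mismatch) is an assumption rather than the rule used in the cited studies. The similarity values in the usage example are the L1 figures reported in the text.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical positions between two pre-aligned sequences.

    Columns in which both sequences carry a gap ('-') are skipped; a gap
    opposite a base is counted as a mismatch (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def is_new_type(l1_similarities, threshold=10.0):
    """True if the closest known type differs by more than `threshold` per cent in L1."""
    closest = max(l1_similarities.values())
    return (100.0 - closest) > threshold


if __name__ == "__main__":
    # L1 ORF similarities of HPV-85 to its closest relatives, as reported in the text.
    reported = {"HPV-39": 78.3, "HPV-70": 78.1, "HPV-45": 78.0}
    print(is_new_type(reported))
    # Toy aligned fragments, invented to exercise percent_identity.
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

With the reported values, the closest relative (HPV-39) differs by 21.7 per cent in L1, well beyond the 10 per cent criterion.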
Phylogenetic trees[34, 35] based on the individual ORFs, putative proteins and long control regions (LCRs) reveal the relationships between HPV-85 and the high-risk HPVs from group A7. The E1, E2, E5 and L2 proteins of HPV-85 are more closely associated with those of HPV-70 and HPV-39. The E4, E6 and L1 proteins of HPV-85, however, are more similar to those of HPV-18 and HPV-45, and the HPV-85 E7 protein and LCR are more similar to their HPV-59 counterparts. These data exemplify the diversity of HPVs, with particular reference to HPV-85 as a co-evolved member of the A7 group of genital HPVs.
HPVs and protein disorder
It is now recognised that many functional proteins, or long segments of them, are devoid of stable secondary and/or tertiary structure, and exist instead as very dynamic ensembles of conformations. They are known by different names, including natively unfolded, intrinsically disordered, intrinsically unstructured, rheomorphic, pliable and different combinations thereof. Disordered proteins have high flexibility and are reported to be involved in regulation, signalling and control pathways in which interactions with multiple partners, as well as high-specificity and low-affinity interactions, are often required.
We conducted Tukey's multiple comparison test to compare the residue disorder values for each of the two HPV groups. The one-way analysis of variance (ANOVA) indicates that the E6 oncoproteins of oncogenic HPVs (HPV-16, HPV-18 and HPV-85) are significantly more disordered (p < 0.001) than those of non-oncogenic HPVs (HPV-6 and HPV-11). Thus, the results of this analysis are consistent with the conclusion that high-risk HPVs are characterised by an increased degree of intrinsic disorder of the E6 protein. The molecular basis of this disorder, in terms of protein sequence variation of virulent HPV types, is supported by experimental evidence for the transforming ability of E6 proteins of oncogenic HPVs. Furthermore, the disorder trend is more significant for E6 than for E7, consistent with a previous report using the commercial software PONDR. The data also highlight the diversity of high-risk and low-risk HPVs at the protein structure level. These intrinsic differences in E6 protein dynamics may be exploited and targeted pharmacologically, for example, using monoclonal antibodies or chemical inhibitors.
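The group comparison described above rests on a one-way ANOVA. The Python sketch below computes only the ANOVA F statistic from per-residue disorder scores; it does not implement Tukey's post-hoc test, and it is not the analysis pipeline used for the results reported here. The function name `one_way_anova_f` and every score in the example are hypothetical placeholders, not values from the study.

```python
def mean(values):
    return sum(values) / len(values)


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores.

    F is the ratio of the between-group mean square to the within-group
    mean square, as in a standard one-way analysis of variance.
    """
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]

    ss_between = sum(
        len(group) * (group_mean - grand_mean) ** 2
        for group, group_mean in zip(groups, group_means)
    )
    ss_within = sum(
        (value - group_mean) ** 2
        for group, group_mean in zip(groups, group_means)
        for value in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


if __name__ == "__main__":
    # Hypothetical per-residue disorder scores (0 = ordered, 1 = disordered).
    high_risk_e6 = [0.62, 0.58, 0.66, 0.71, 0.60]
    low_risk_e6 = [0.41, 0.45, 0.38, 0.47, 0.43]
    f_statistic, df_b, df_w = one_way_anova_f([high_risk_e6, low_risk_e6])
    print(f"F({df_b}, {df_w}) = {f_statistic:.1f}")
```

Converting F to a p-value requires the F distribution with the returned degrees of freedom, which a statistics package would normally supply.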
Certain non-oncoproteins of non-oncogenic HPVs exhibit greater 'disorder', for example, L1 and L2 of HPV-6 and HPV-11, and E2 of HPV-6. This observation suggests that it is the disorder of critical viral protein(s) that defines the oncogenic capability and risk level of HPV. Our data illustrate an alternative computational approach to distinguishing the non-oncogenic from the high-risk oncogenic HPV types. It will be interesting to determine whether the phenomenon of increased protein disorder of more virulent virus types can be generalised beyond the HPV family.
Genetic diversity of H5N1 influenza viruses
Genetic determinants of host susceptibility to infection
Advances in genomics and in the understanding of pathogen variability, as well as of the diversity of human immune systems, have led to new trends in vaccine development that focus on epitope-based vaccines. An epitope is a small peptide fragment from an infectious agent that can induce a host immune response to eliminate the pathogen. Such a vaccine strategy shows promise in dealing with host and pathogen diversity. Compared with traditional vaccines, epitope-based vaccines are more specific, safer and more easily produced and controlled. The keys to the success of such approaches are the prediction models for rapidly scanning pathogen genomes to identify effective T-cell epitopes. A recent review article focuses on different methods available for MHC-peptide binding prediction for epitope-based vaccine design.
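The idea of scanning a pathogen protein for candidate T-cell epitopes can be sketched as a sliding window over the sequence, with each fixed-length peptide scored by a binding model. The Python example below is a toy illustration under loud assumptions: the nine-residue window, the function names, the `anchor_scores` table and the example sequence are all hypothetical, and the scoring scheme is a placeholder that stands in for the trained MHC-peptide binding predictors discussed in the review cited above.

```python
def candidate_peptides(protein, length=9):
    """Return (1-based start position, peptide) for every window of `length` residues."""
    return [
        (start + 1, protein[start:start + length])
        for start in range(len(protein) - length + 1)
    ]


def score_peptide(peptide, anchor_scores):
    """Sum position-specific residue scores; unlisted residues contribute 0.

    anchor_scores maps a 0-based peptide position to {residue: score}.
    """
    return sum(
        anchor_scores.get(position, {}).get(residue, 0.0)
        for position, residue in enumerate(peptide)
    )


def rank_epitopes(protein, anchor_scores, length=9, top=3):
    """Return the `top` highest-scoring (score, start, peptide) tuples."""
    scored = [
        (score_peptide(peptide, anchor_scores), start, peptide)
        for start, peptide in candidate_peptides(protein, length)
    ]
    # Highest score first; ties are broken by earliest start position.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]


if __name__ == "__main__":
    # Toy scoring table (hypothetical): favours L at peptide position 2
    # and V at peptide position 9.
    toy_anchor_scores = {1: {"L": 1.0}, 8: {"V": 1.0}}
    # Made-up sequence, not taken from any real pathogen.
    toy_protein = "MKTLLVAGAVLLSVALAAAAAAVG"
    for score, start, peptide in rank_epitopes(toy_protein, toy_anchor_scores):
        print(start, peptide, score)
```

In practice the scoring function would be replaced by a predictor trained on binding data for specific MHC alleles, so that the ranking reflects host diversity as well as pathogen sequence.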
Conclusions and future prospects
The pharmaceutical industry is currently grappling with a tremendous number of potential drug targets against infectious diseases identified through sequence data for the human genome and many important pathogens. The challenge ahead is to delineate the factors involved in host-pathogen interactions, and to translate the enormous discovery potential of human and pathogen genomes into real products for therapeutic intervention. One of the major bottlenecks in bringing new drugs to market is our incomplete understanding of the genes and proteins central to host-pathogen interactions and the mechanisms underlying certain human diseases. Other limitations that need to be addressed include patient heterogeneity and pathogen polymorphisms in clinical trials, the existence of multiple molecular targets and the shortage of experimental models of therapeutic efficacy with good predictive validity and objective surrogate measurements of disease progression. Integration of computational biology with experimental approaches can accelerate our ability to identify, rapidly and reliably, target proteins that can be harnessed for therapeutic intervention. The effective application of such complementary analytical methods will blaze trails for the exploitation of available genetic and molecular information. When supplemented with 'individualised therapy' based on a patient's genetic profile, these developments may lead to fewer serious adverse effects and improved responses to drug treatments and vaccine regimens. Finally, the ultimate goal must be to bridge theoretical biological, genetic and molecular phenomena, and cellular and organism biology, as well as medicinal chemistry in the relentless search for new cures in the battle against pathogens that plague humankind.
- Goldstein DB, Cavalleri GL: Genomics: Understanding human diversity. Nature. 2005, 437: 1241-1242. 10.1038/4371241a.
- International HapMap Consortium: A haplotype map of the human genome. Nature. 2005, 437: 1299-1320. 10.1038/nature04226.
- Brookes AJ: The essence of SNPs. Gene. 1999, 234: 177-186. 10.1016/S0378-1119(99)00219-X.
- Hoehe MR, Timmermann B, Lehrach H: Human inter-individual DNA sequence variation in candidate genes, drug targets, the importance of haplotypes and pharmacogenomics. Curr Pharm Biotechnol. 2003, 4: 351-378. 10.2174/1389201033377300.
- Maitland ML, DiRienzo A, Ratain MJ: Interpreting disparate responses to cancer therapy: The role of human population genetics. J Clin Oncol. 2006, 24: 2151-2157. 10.1200/JCO.2005.05.2282.
- Masood E: As consortium plans free SNP map of human genome. Nature. 1999, 398: 545-546.
- Vitkup D, Sander C, Church GM: The amino-acid mutational spectrum of human genetic disease. Genome Biol. 2003, 4: R72. 10.1186/gb-2003-4-11-r72.
- Wang Z, Moult J: SNPs, protein structure, and disease. Hum Mutat. 2001, 17: 263-270. 10.1002/humu.22.
- Zanotti G, Berni R: Plasma retinol-binding protein: structure and interactions with retinol, retinoids, and transthyretin. Vitam Horm. 2004, 69: 271-295.
- Seelinger MW, Biesalski HK, Wissinger B, Gollnick H, et al: Phenotype in retinol deficiency due to a hereditary defect in the retinol binding protein synthesis. Invest Ophthalmol Vis Sci. 1999, 40: 3-11.
- Yao-Borengasser A, Varma V, Bodles AM, Rasouli N, et al: Retinol binding protein 4 expression in humans: Relationship to insulin resistance, inflammation, and response to pioglitazone. J Clin Endocrinol Metab. 2007, 92: 2590-2597. 10.1210/jc.2006-0816.
- Hash RB: Hereditary hemochromatosis. J Am Board Fam Pract. 2001, 14: 266-273.
- Kim S, Misra A: SNP genotyping: Technologies and bio-medical applications. Annu Rev Biomed Eng. 2007, 9: 289-320. 10.1146/annurev.bioeng.9.060906.152037.
- Delrieu O, Bowman C: Visualizing gene determinants of disease in drug discovery. Pharmacogenomics. 2006, 7: 311-329. 10.2217/146224220.127.116.111.
- Giacomini KM, Brett CM, Altman RB, Benowitz NL, et al: The pharmacogenetics research network: From SNP discovery to clinical drug response. Clin Pharmacol Ther. 2007, 81: 328-345. 10.1038/sj.clpt.6100087.
- McLeod HL, Evans WE: Pharmacogenomics: Unlocking the human genome for better drug therapy. Annu Rev Pharmacol Toxicol. 2001, 41: 101-121. 10.1146/annurev.pharmtox.41.1.101.
- Evans WE, Johnson JA: Pharmacogenomics: The inherited basis for interindividual differences in drug response. Annu Rev Genomics Hum Genet. 2001, 2: 9-39. 10.1146/annurev.genom.2.1.9.
- Shah RR: Mechanistic basis of adverse drug reactions: The perils of inappropriate dose schedules. Expert Opin Drug Saf. 2005, 4: 103-128. 10.1517/14740318.104.22.168.
- Rane A: Postgenomic prospects of success in drug development and pharmacotherapy. Pharmacogenomics J. 2001, 1: 6-9. 10.1038/sj.tpj.6500013.
- Bosch TM, Meijerman I, Beijnen JH, Schellens JH: Genetic polymorphisms of drug-metabolising enzymes and drug transporters in the chemotherapeutic treatment of cancer. Clin Pharmacokinet. 2006, 45: 253-285. 10.2165/00003088-200645030-00003.
- Lai E: Application of SNP technologies in medicine: Lessons learned and future challenges. Genome Res. 2001, 11: 927-929. 10.1101/gr.192301.
- Klein TE, Chang JT, Cho MK, Easton KL, et al: Integrating genotype and phenotype information: An overview of the PharmGKB project. Pharmacogenetics Research Network and Knowledge Base. Pharmacogenomics J. 2001, 1: 167-170. 10.1038/sj.tpj.6500035.
- Ustav M, Stenlund A: Transient replication of BPV-1 requires two viral polypeptides encoded by the E1 and E2 open reading frames. EMBO J. 1991, 10: 449-457.
- Gloss B, Bernard HU, Seedorf K, Klock G: The upstream regulatory region of the human papilloma virus-16 contains an E2 protein-independent enhancer which is specific for cervical carcinoma cells and regulated by glucocorticoid hormones. EMBO J. 1987, 6: 3735-3743.
- Wilson R, Fehrmann F, Laimins LA: Role of the E1-E4 protein in the differentiation-dependent life cycle of human papillomavirus type 31. J Virol. 2005, 79: 6732-6740. 10.1128/JVI.79.11.6732-6740.2005.
- Fehrmann F, Klumpp DJ, Laimins LA: Human papillomavirus type 31 E5 protein supports cell cycle progression and activates late viral functions upon epithelial differentiation. J Virol. 2003, 77: 2819-2831. 10.1128/JVI.77.5.2819-2831.2003.
- Zhang B, Li P, Wang E, Brahmi Z, et al: The E5 protein of human papillomavirus type 16 perturbs MHC class II antigen maturation in human foreskin keratinocytes treated with interferon-gamma. Virology. 2003, 310: 100-108. 10.1016/S0042-6822(03)00103-X.
- Snijders PJ, Steenbergen RD, Heideman DA, Meijer CJ: HPV-mediated cervical carcinogenesis: Concepts and clinical implications. J Pathol. 2006, 208: 152-164. 10.1002/path.1866.
- Stanley M: Immunobiology of HPV and HPV vaccines. Gynecol Oncol. 2008, 109: S15-S21. 10.1016/j.ygyno.2008.02.003.
- Chow VT, Leong PW: Complete nucleotide sequence, genomic organization and phylogenetic analysis of a novel genital human papillomavirus type, HLT7474-S. J Gen Virol. 1999, 80: 2923-2929.
- de Villiers EM, Fauquet C, Broker TR, Bernard HU, et al: Classification of papillomaviruses. Virology. 2004, 324: 17-27. 10.1016/j.virol.2004.03.033.
- Delius H, Saegling B, Bergmann K, Shamanin V, et al: The genomes of three of four novel HPV types, defined by differences of their L1 genes, show high conservation of the E7 gene and the URR. Virology. 1998, 240: 359-365. 10.1006/viro.1997.8943.
- Forslund O, Hansson BG: Human papillomavirus type 70 genome cloned from overlapping PCR products: Complete nucleotide sequence and genomic organization. J Clin Microbiol. 1996, 34: 802-809.
- Ho L, Chan SY, Chow V, Chong T, et al: Sequence variants of human papillomavirus type 16 in clinical samples permit verification and extension of epidemiological studies and construction of a phylogenetic tree. J Clin Microbiol. 1991, 29: 1765-1772.
- Chan SY, Ho L, Ong CK, Chow V, et al: Molecular variants of human papillomavirus type 16 from four continents suggest ancient pandemic spread of the virus and its coevolution with humankind. J Virol. 1992, 66: 2057-2066.
- Radivojac P, Iakoucheva LM, Oldfield CJ, Obradovic Z, et al: Intrinsic disorder and functional proteomics. Biophys J. 2007, 92: 1439-1456. 10.1529/biophysj.106.094045.
- Shimizu K, Hirose S, Noguchi T: POODLE-S: Web application for predicting protein disorder by using physicochemical features and reduced amino acid set of a position-specific scoring matrix. Bioinformatics. 2007, 23: 2337-2338. 10.1093/bioinformatics/btm330.
- Hirose S, Shimizu K, Kanai S, Kuroda Y, Noguchi T: POODLE-L: A two-level SVM prediction system for reliably predicting long disordered regions. Bioinformatics. 2007, 23: 2046-2053. 10.1093/bioinformatics/btm302.
- Uversky VN, Roman A, Oldfield CJ, Dunker AK: Protein intrinsic disorder and human papillomaviruses: Increased amount of disorder in E6 and E7 oncoproteins from high risk HPVs. J Proteome Res. 2006, 5: 1829-1842. 10.1021/pr0602388.
- Ma HC, Chen JM, Chen JW, Sun YX, et al: The panorama of the diversity of H5 subtype influenza viruses. Virus Genes. 2007, 34: 283-287. 10.1007/s11262-006-0018-3.
- Chow VT, Tambyah PA, Goh KT: To kill a mocking bird flu?. Ann Acad Med. 2008, 37: 451-453.
- Wallace RG, Hodac H, Lathrop RH, Fitch WM: A statistical phylogeography of influenza A H5N1. Proc Natl Acad Sci USA. 2007, 104: 4473-4478. 10.1073/pnas.0700435104.
- Narasaraju T, Sim MK, Ng HH, Phoon MC, et al: Adaptation of human influenza H3N2 virus in a mouse pneumonitis model: Insights into viral virulence, tissue tropism and host pathogenesis. Microbes Infect. 2009, 11: 2-11. 10.1016/j.micinf.2008.09.013.
- Zhao B, Sakharkar KR, Lim CS, Kangueane P, et al: MHC-peptide binding prediction for epitope based vaccine design. Int J Integr Biol. 2007, 1: 127-140.
- Sakharkar MK, Sakharkar KR, Pervaiz S: Druggability of human disease genes. Int J Biochem Cell Biol. 2007, 39: 1156-1164. 10.1016/j.biocel.2007.02.018.