Human genomic diversity, viral genomics and proteomics, as exemplified by human papillomaviruses and H5N1 influenza viruses


The diversity of hosts, pathogens and host-pathogen relationships reflects the influence of selective pressures that fuel diversity through ongoing interactions with other rapidly evolving molecules in the environment. This paper discusses specific examples illustrating the phenomenon of diversity of hosts and pathogens, with special reference to human papillomaviruses and H5N1 influenza viruses. We also review the influence of diverse host-pathogen interactions that determine the pathophysiology of infections, and their responses to drugs or vaccines.


The availability of complete genome sequences and the wealth of large-scale biological datasets provide an unprecedented opportunity to elucidate the genetic bases of human diseases and host-pathogen interactions. The diversity of hosts, pathogens and host-pathogen relationships reflects the influence of selective pressures that fuel diversity through ongoing interactions with other rapidly evolving molecules in the environment. Such influences add another source of genetic adaptability as cells adjust to new environments and out-manoeuvre pathogenic threats. The study of genetic variation in pathogens and hosts has practical significance for developing strategies to combat and control infectious diseases. Vaccines based on highly polymorphic antigens may be confounded by allelic restriction of the host immune response. In addition, the study of the distribution of genomic polymorphisms among different hosts may provide information on responses to drugs or vaccines. Based on the concept that host damage is the most relevant outcome of the host-pathogen interaction, we need better to understand both host and pathogen polymorphisms in drug or vaccine design. A deeper knowledge of the diversity and nature of pathogens will provide valuable insights into genetic markers that may be useful for detection, identification and forensics. The ability to discriminate between virulent pathogens and their counterparts that are either less or not virulent is another major challenge. It is important to discover genetic factors that mediate the virulence process in order to devise novel methods to prevent or treat disease. Similarly, understanding how the host responds to microbial invasion, and how the pathogen evades or manipulates the immune response to subvert the host, will contribute to the development of vaccines and other prophylactic strategies. 
Concomitantly, the unravelling of associations between hosts and pathogens will be highly relevant to the modelling of the population biology of multi-host pathogens and their impact on co-infections. This paper reviews and discusses specific examples concerning the issue of diversity of hosts and pathogens, and the influence of diverse host-pathogen interactions that determine the pathophysiology of infections, and their responses to drugs or vaccines.

Human genome diversity and 'SNPshots'

The genetic blueprint of an individual determines not only disease susceptibility, but also his/her response to drug treatment. Numerous genes are involved in drug response and toxicity, introducing a daunting level of complexity in the search for candidate genes. Thus, genetics, particularly gene polymorphisms, exerts a significant impact on target discovery.

The HapMap constitutes a catalogue of common genetic variants that exist in humans. It describes what these variants are, where they occur in our DNA and how they are distributed among individuals within a specific population and among populations in different parts of the world. This project provides information that can link genetic variants to the risks for specific illnesses, which can lead to new methods for preventing, diagnosing and treating diseases.[1, 2] Differences in individual bases are by far the most common type of genetic variation. These genetic differences, known as single nucleotide polymorphisms (SNPs), represent DNA sequence variations that occur when a single nucleotide in the genome sequence is altered. SNPs are more common than other types of polymorphisms, and occur at a frequency of approximately one in 1,000 base pairs[3] throughout the genome (including promoter regions and coding and intronic sequences). Some of these differences may alter gene products in ways that confer susceptibility or resistance to diseases, or contribute to disease severity or progression. Although over 99 per cent of human DNA sequences are the same across the human population, DNA sequence variations affect how humans respond to disease; to environmental stresses such as infections, toxins and chemicals; and to drugs and other therapies.[4] Since genetic factors also affect response to drug therapy, SNPs can help to determine why individuals differ in their abilities to absorb or clear certain drugs, as well as to ascertain the mechanisms of adverse drug effects. Moreover, by affecting drug-target proteins such as G-protein-coupled receptors, enzymes, ion channels and proteins involved in detoxification pathways, non-synonymous coding SNPs (cSNPs) (namely substitutions resulting in alterations of encoded amino acids) significantly influence the diverse responses of efficacy and toxicity of therapeutic agents in the human population.[5]

In general, only about 5 per cent of the disease-causing non-synonymous mutations hitherto identified have direct effects on the catalytic or ligand-binding properties of enzymes and receptors.[6-8] An interesting example is retinol-binding protein 4 (RBP4), the retinol-specific transport protein present in plasma. Elucidation of the crystal structures of different forms of RBPs has revealed their interactions with retinol, retinoids and transthyretin (TTR; one of the plasma carriers of thyroid hormones).[9] The core of RBP is a beta-barrel whose cavity accommodates retinol. The retinol hydroxyl group is near the protein surface, in the region of the entrance loops surrounding the opening of the binding cavity, and participates in polar interactions. The G75D mutant introduces a negative charge into the cavity, thereby interfering with retinol binding both electrostatically and sterically. The result is vitamin A deficiency with a phenotype of night blindness (Figure 1a, b).[10] Furthermore, RBP4 is expressed and secreted by adipose tissue and is strongly associated with insulin resistance. A strong positive correlation also exists between RBP4 mRNA and adipose inflammation (monocyte chemoattractant protein-1 and CD68), and glucose transporter 4 mRNA.[11]

Figure 1

Examples of SNP genotypes and disease susceptibility. (a,b) The mutant G75D in RBP4 introduces a negative charge into the cavity, thereby interfering with retinol binding both electrostatically and sterically. The result is vitamin A deficiency, with a phenotype of night blindness. Panels a and b show the normal and mutant forms of RBP4 protein. (c,d) An example of a polymorphic variant that disrupts a critical disulphide bond is C260Y in HLA-H, culminating in hereditary haemochromatosis. Panels c and d show the normal and mutant forms of HLA-H protein.

Some non-synonymous cSNPs associated with human disorders disrupt important structural features of the affected protein. For example, a polymorphic variant that disrupts a critical disulphide bond is C260Y in HLA-H, culminating in hereditary haemochromatosis (Figure 1d).[12]
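Both substitutions discussed above (G75D and C260Y) are non-synonymous, in that a single base change alters the encoded amino acid. The following minimal Python sketch shows how such a change can be distinguished from a synonymous one. The function name, the partial codon table and the specific codons are assumptions chosen for illustration; the nucleotide changes underlying G75D and C260Y are not given in the text.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODON_TABLE = {
    "GGC": "G", "GGT": "G",   # glycine
    "GAC": "D",               # aspartate
    "TGC": "C", "TGT": "C",   # cysteine
    "TAC": "Y",               # tyrosine
}


def classify_csnp(ref_codon: str, alt_codon: str) -> str:
    """Label a single-base codon change as synonymous or non-synonymous."""
    diffs = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if len(ref_codon) != 3 or len(alt_codon) != 3 or diffs != 1:
        raise ValueError("expected two codons differing at exactly one base")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return f"synonymous ({ref_aa})"
    return f"non-synonymous ({ref_aa}->{alt_aa})"


# Hypothetical codons consistent with the amino acid changes named in the text.
print(classify_csnp("GGC", "GGT"))  # synonymous (G)
print(classify_csnp("GGC", "GAC"))  # non-synonymous (G->D)
print(classify_csnp("TGC", "TAC"))  # non-synonymous (C->Y)
```

A full analysis would use the complete genetic code and the actual reference transcript rather than this reduced table.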

Human genome diversity and drug responses

The total number of SNPs reported in public SNP databases currently exceeds 9 million.[13] Occasionally, an SNP may actually cause a disease and can therefore be exploited to search and isolate the disease causing gene. Since SNPs are genetic variations that occur at regular intervals and at high frequency throughout the human genome, they can be used as markers within the genome. If a particular marker is found to be common among individuals with a particular disease, it suggests that the gene involved is probably located near the marker.[14] This renders SNPs of great value to biomedical research, and for developing pharmaceutical products or diagnostics.[15]
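The marker-based reasoning above can be illustrated with a simple case-control comparison. The sketch below, in Python, computes an odds ratio and a Pearson chi-square statistic for a 2x2 table of marker carriers versus non-carriers. The function names and the counts are hypothetical and serve only to show the arithmetic; real association studies also require correction for multiple testing and population structure.

```python
def odds_ratio(case_carr, case_non, ctrl_carr, ctrl_non):
    """Odds of carrying the marker in cases relative to controls."""
    return (case_carr * ctrl_non) / (case_non * ctrl_carr)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 degree of freedom, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts: 60 of 100 cases and 30 of 100 controls carry the marker allele.
print(odds_ratio(60, 40, 30, 70))                 # 3.5
print(round(chi_square_2x2(60, 40, 30, 70), 2))   # 18.18
```

An odds ratio well above 1, together with a large chi-square value, would suggest that the marker lies near a gene contributing to the disease, which is the inference described in the paragraph above.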

For virtually all medications, inter-patient variability in response to drug therapy is the rule rather than the exception. Tremendous progress has been achieved in understanding the molecular basis of drug action, and in elucidating genetic determinants of disease pathogenesis and drug response.[16, 17] This inter-patient variability is potentially regulated by processes such as drug transport, drug metabolism, cellular signalling pathways (eg G-protein-coupled receptors) and response pathways (eg apoptosis, cell cycle control). Polymorphisms of drug-metabolising enzymes and/or pharmacological targets are frequently associated with adverse drug reactions or failure of efficacy.[17-20] Drug responses may also be modulated by non-genetic factors, however, especially co-medications and co-morbidities.

Thus, SNPs potentially can be applied in the development of individualised medicine and can provide an important source of information for studying the relationship between genotypes and phenotypes of human diseases.[21] A bottleneck occurs when linking information about the variation in human genes to the variation in drug responses (pharmacogenetics) and understanding how interacting systems of genes determine individual drug responses (pharmacogenomics).[22] A fundamental challenge in analysing disease cSNPs is the relative scarcity of alleles that can be mapped to three-dimensional protein structures. In the future, it is envisioned that knowledge of an individual's SNP genotype may provide a basis for assessing susceptibility to a disease and the optimal choice of therapies.[6]

Viral genome diversity exemplified by papillomaviruses and influenza viruses

Viruses are exceptionally diverse in morphology, genetic organisation, replication strategies, virulence and many other characteristics. Viral genome sequences represent a treasure trove of essential information for understanding pathogenesis better, as well as developing novel diagnostics and antiviral therapies. The ability of various medically important viruses to develop high degrees of genetic diversity, and to acquire mutations to escape immune pressures, contributes to the difficulties in vaccine development.

Diversity of human papillomaviruses

There are more than 100 different types of human papillomavirus (HPV), the causative agent of benign papillomas or warts, and a cofactor in the development of carcinomas of the genital tract, skin, head and neck. HPVs are broadly divided into cutaneous and mucosal HPV types. Eight major proteins, designated as early (E) or late (L) gene products, are encoded by the HPV DNA genome. Proteins E1 and E2 are involved in viral replication, as well as the regulation of early transcription. E1 binds to the origin of viral replication (ORI) and exhibits ATPase as well as helicase activity, whereas E2 forms a complex with E1, facilitating its binding to the ORI.[23] Furthermore, E2 acts as a transcription factor that regulates early gene expression by binding to specific E2 recognition sites.[24] E4 plays important roles in promoting the differentiation-dependent productive phase of the viral life cycle.[25] The E5 protein supports HPV late functions[26] and disrupts major histocompatibility complex (MHC) class II maturation.[27] The E6 and E7 oncoproteins are mainly responsible for HPV-mediated malignant cell progression, leading ultimately to invasive carcinoma.[28] Finally, L1 and L2 are the major and minor capsid proteins, respectively, and HPV vaccines based on L1 are already in clinical use.[29]

In 1999, we determined the complete nucleotide sequence of a novel genital HPV type from a female sex worker with a wart virus infection in Singapore -- namely, HLT7474-S -- which was designated as candidate HPV type 85 (HPV-85) by the Reference Center for Papillomaviruses, German Cancer Research Center, Heidelberg, Germany. Its genomic organisation and phylogenetic relationships were analysed.[30, 31] The DNA sequence of the L1 open reading frame (ORF) of HPV-85 shares similarities of 78.3 per cent, 78.1 per cent and 78.0 per cent with those of the most closely related known types (HPV types 39, 70 and 45, respectively), thus satisfying the criteria for a new HPV type, which is defined on the basis of a dissimilarity exceeding 10 per cent in the L1 gene.[32] In addition, the E6 and E7 ORFs of HPV-85 exhibit highest percentage similarities of 79.7 per cent and 77.9 per cent to E6 of HPV-18 and E7 of HPV-59, respectively, thus reiterating the relatedness between HPV-85 and known genital HPVs belonging to group A7 [33].
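The typing criterion described above (L1 dissimilarity exceeding 10 per cent relative to the closest known types) reduces to a short calculation. The Python sketch below applies it to the similarity values reported in the text. The function names and the short aligned fragments are illustrative assumptions, and the identity calculation simply ignores gapped columns, which may differ from the procedure actually used for HPV-85.

```python
NEW_TYPE_THRESHOLD = 10.0  # per cent L1 dissimilarity, as stated in the text


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Per cent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def is_new_type(l1_similarities) -> bool:
    """True if dissimilarity to every compared type exceeds the threshold."""
    return all(100.0 - s > NEW_TYPE_THRESHOLD for s in l1_similarities)


# L1 ORF similarities of HPV-85 to its closest relatives, as reported in the text.
reported = {"HPV-39": 78.3, "HPV-70": 78.1, "HPV-45": 78.0}
print(is_new_type(reported.values()))  # True

# Hypothetical aligned fragments, to show how a similarity value is obtained.
print(round(percent_identity("ATGGCA-TTA", "ATGACAGTTA"), 1))
```

Because the closest known types already differ by more than 20 per cent, the criterion is satisfied for all more distant types as well.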

Phylogenetic trees[34, 35] based on the individual ORFs, putative proteins and long control regions (LCRs) reveal the relationships between HPV-85 and the high-risk HPVs from group A7. The E1, E2, E5 and L2 proteins of HPV-85 are more closely associated with those of HPV-70 and HPV-39. Greater similarities are observed for the E4, E6 and L1 proteins of HPV-85, however, compared with those of HPV-18 and HPV-45, and for the HPV-85 E7 protein and LCR compared with its HPV-59 counterparts. These data exemplify the diversity of HPV viruses, with particular reference to HPV-85 as a co-evolved member of the A7 group of genital HPVs.

HPVs and protein disorder

It is now recognised that many functional proteins or their long segments are devoid of stable secondary and/or tertiary structure, and exist instead as very dynamic ensembles of conformations. They are known by different names, including natively unfolded, intrinsically disordered, intrinsically unstructured, rheomorphic, pliable and different combinations thereof. Disordered proteins have high flexibility and are reported to be involved in regulation, signalling and control pathways in which interactions with multiple partners, as well as high-specificity and low-affinity interactions, are often required [36].

To elucidate whether intrinsic disorder plays a role in the oncogenic potential of different HPV types, we performed a detailed bioinformatics analysis concentrating on the E6 and E7 oncoproteins of high-risk and low-risk HPVs. Three high-risk (HPV-16, HPV-18 and HPV-85) and two low-risk (HPV-6 and HPV-11) HPV types were analysed in order to compare the extent of intrinsic protein disorder in these virus types. The amino acid sequences of the different HPV types were extracted from the Los Alamos National Laboratory HPV sequence database. Predictions of intrinsic disorder in HPV proteins were performed using a set of disorder predictors -- that is, POODLE-S[37] and POODLE-L.[38] We employed POODLE-S to analyse the E6, E7 and L2 proteins, since their sequences are only ~100 residues long, whereas POODLE-L was used for other proteins. The results are presented in Figure 2.

Figure 2

Protein disorder probability based on the results of POODLE prediction. The table illustrates Tukey's multiple test for comparing low-risk and high-risk HPVs. One-way ANOVA reveals the differences in disorder scores for the HPV proteins.

We conducted Tukey's multiple comparison test to compare the residue disorder values for each of the two HPV groups. The one-way analysis of variance (ANOVA) indicates that the E6 oncoproteins of oncogenic HPVs (HPV-16, HPV-18 and HPV-85) are significantly more disordered (p <0.001) than those of non-oncogenic HPVs (HPV-6 and HPV-11). Thus, the results of this analysis are consistent with the conclusion that high-risk HPVs are characterised by an increased degree of intrinsic disorder of the E6 protein. The molecular basis of this disorder, in terms of protein sequence variation of virulent HPV types, is supported by experimental evidence for the transforming ability of E6 proteins of oncogenic HPVs.[28] Furthermore, the disorder trend is more significant for E6 than for E7, consistent with a previous report using the commercial software PONDR. The data also highlight the diversity of high-risk and low-risk HPVs at the protein structure level.[39] These intrinsic differences in E6 protein dynamics may be exploited and targeted pharmacologically -- for example, using monoclonal antibodies or chemical inhibitors.
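The comparison of per-residue disorder values between groups rests on a one-way ANOVA. As a rough illustration, the Python sketch below computes the F statistic from first principles for two groups of scores. The scores are invented toy numbers, not POODLE output, and the function name is an assumption. Obtaining p-values and running Tukey's multiple comparison test would normally be done with a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy per-residue disorder probabilities (hypothetical values for illustration only).
high_risk_e6 = [0.6, 0.7, 0.8]
low_risk_e6 = [0.2, 0.3, 0.4]

f_stat, df_b, df_w = one_way_anova_f([high_risk_e6, low_risk_e6])
print(f"F = {f_stat:.1f} on ({df_b}, {df_w}) degrees of freedom")  # F = 24.0 on (1, 4) ...
```

A large F value indicates that the difference between group means is large relative to the spread within each group, which is the pattern reported for E6 of high-risk versus low-risk HPVs.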

Certain non-oncoproteins of non-oncogenic HPVs exhibit greater 'disorder' -- for example, L1 and L2 of HPV-6 and HPV-11, and E2 of HPV-6. This observation suggests that it is the disorder of critical viral protein(s) that defines the oncogenic capability and risk level of HPV. Our data illustrate an alternative computational approach to distinguishing the non-oncogenic from the high-risk oncogenic HPV types. It will be interesting to determine whether the phenomenon of increased protein disorder of more virulent virus types can be generalised beyond the HPV family.

Genetic diversity of H5N1 influenza viruses

Since 2003, highly pathogenic avian influenza A H5N1 viruses have spread from Asia to Europe and Africa, infecting wild birds, commercial poultry and humans with alarming fatality rates. Scrupulous surveillance and multidisciplinary interrogation of H5N1 evolution and 'migration patterns' are crucial for preventing further casualties in humans and poultry.[40] The escalating number of human cases of H5N1 virus infection has raised serious concerns about the potential emergence of an influenza pandemic attributed to a mutated H5N1 strain with efficient human-to-human transmissibility.[41] The two major surface glycoproteins encoded by the segmented influenza virus RNA genome are haemagglutinin (HA) and neuraminidase (NA). HA is the major antigen for neutralising antibodies and is involved in the binding of virions to sialic acid-linked receptors on host cells. Infectivity of influenza virus depends on the cleavage of HA by specific host proteases, whereas NA mediates the release of progeny virions from the cell surface and prevents clumping of newly formed virus particles.[42, 43] HA is a ~550 amino acid polypeptide that forms homo-trimers (spikes) on the exterior of the influenza virus particle. Nascent HA is directed to the cell membrane in an infected host cell and is anchored to the cell membrane by a short transmembrane region at the C-terminus. Its biological activation involves proteolytic cleavage of a specific region by host enzymes. The nascent HA is also subject to extensive post-translational glycosylation that serves as a mechanism for immune evasion. Introducing new mutations in these two proteins represents the major strategy used by H5N1 to expand its host range and to avoid recognition by the host immune system. Here, we illustrate this diversity of influenza viruses by comparing HA proteins from over 270 H5N1 strains.
A total of 272 HA sequences were downloaded from the National Center for Biotechnology Information (NCBI) Influenza Resource database, and phylogenetic analyses were performed using the PHYLIP package and the neighbour-joining (NJ) method, with 1,000 bootstrap replicates. This phyloinformatics analysis revealed a distinct pattern of spatial clustering of the strains based on their geographical origin, rather than temporal clustering or clustering according to host range (Figure 3). The dataset includes samples isolated up to 2006, and represents globally distributed locations ranging from Thailand, Vietnam, Indonesia, Japan, Mongolia, Russia and Europe to Africa. Interestingly, the hosts across these H5N1 clades range from chickens to humans and include several mammalian species. The clustering of strains does not show any bias towards the host species -- for example, the Thailand clade includes isolates from chickens, cats and tigers.
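Two ingredients of the neighbour-joining analysis with bootstrap support described above are a pairwise distance matrix and resampling of alignment columns. The Python sketch below illustrates both on a toy alignment. The strain names, the sequences and the use of a simple p-distance are assumptions for illustration; the tree-building step itself, which the study carried out with PHYLIP, is not reproduced here.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs: dict) -> dict:
    """Pairwise p-distances keyed by (name, name)."""
    return {(m, n): p_distance(seqs[m], seqs[n]) for m in seqs for n in seqs}


def bootstrap_alignment(seqs: dict, rng: random.Random) -> dict:
    """One bootstrap replicate: resample alignment columns with replacement."""
    length = len(next(iter(seqs.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(s[c] for c in cols) for name, s in seqs.items()}


# Hypothetical aligned HA fragments (not real H5N1 sequences).
toy_alignment = {
    "strain_A": "MEKIVLLF",
    "strain_B": "MEKIVLLL",
    "strain_C": "MERIVLLL",
}

dist = distance_matrix(toy_alignment)
print(dist[("strain_A", "strain_B")])  # 0.125
print(dist[("strain_A", "strain_C")])  # 0.25

# Each replicate would be passed to a tree-building step; 1,000 were used in the text.
replicate = bootstrap_alignment(toy_alignment, random.Random(0))
print(distance_matrix(replicate)[("strain_A", "strain_C")])
```

Bootstrap support for a clade is then the fraction of replicate trees in which that clade is recovered.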

Figure 3

Phylogenetic tree constructed for a total of 272 HA sequences of H5N1 virus strains. The HA sequences were downloaded from the NCBI Influenza Resource database.

Multiple sequence alignments using the ClustalW program reveal that HA is highly polymorphic. A total of 312 positions exhibit polymorphisms over two-thirds of the protein length (Figure 4a). Mapping of these positions onto the Protein Data Bank (PDB) structure 2FK0, used as the template, is depicted in Figure 4b. Polymorphisms in the various residues of H5N1 HA occur individually and not in tandem -- that is, two or more polymorphic residues may change independently in different HA variants; however, it is noteworthy that the polymorphisms are concentrated in the receptor-binding domain of HA. These data can facilitate monitoring of the receptor-binding specificity of modern influenza viruses, especially H5N1. The reasons why H5N1 HA mutants segregate among more than one host species, and why there are differences in causing widespread disease, may be due in part to the differences in occurrence and distribution of cellular receptors for H5N1 (ie α2,3 or α2,6 sialic acid-linked receptors). The differences in the infectivity of various H5N1 strains in different hosts are ultimately determined by the compatibility and binding between the virus strain and the host cellular receptor. The polymorphism in the receptor-binding region also reflects the host immune response against the virus and the high mutating capacity of the virus that spawns escape mutants. These polymorphisms contribute to a greater diversity in viral virulence and to the expanded host range of H5N1. Such molecular-based surveillance can aid the design of potential influenza vaccines, as well as a better understanding of the mechanisms of virus evolution and inter-species transfer.
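The per-residue polymorphism counts summarised above can be derived directly from a multiple sequence alignment. The Python sketch below shows one way to do so on a toy alignment. It assumes that the number of polymorphisms at a position means the number of distinct residues observed beyond one, with gaps ignored; that definition, the function names and the sequences are illustrative assumptions rather than the exact procedure used for the 272 HA proteins.

```python
def polymorphism_counts(aligned):
    """For each alignment column, count distinct residues beyond the first one."""
    counts = []
    for column in zip(*aligned):
        residues = {r for r in column if r != "-"}  # ignore gap characters
        counts.append(max(len(residues) - 1, 0))
    return counts


def polymorphic_positions(aligned):
    """1-based positions showing at least one polymorphism."""
    return [i + 1 for i, n in enumerate(polymorphism_counts(aligned)) if n > 0]


# Hypothetical aligned HA fragments (not real H5N1 sequences).
toy_alignment = ["MEKIVLLF", "MEKIVLLL", "MERIVLLL", "MEQIVLLL"]

print(polymorphism_counts(toy_alignment))    # [0, 0, 2, 0, 0, 0, 0, 1]
print(polymorphic_positions(toy_alignment))  # [3, 8]
```

Plotting these counts against residue number gives a profile of the kind shown in Figure 4a, and the same values can be used to colour residues on a structure.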

Figure 4a

(a) The 272 HA proteins of H5N1 under investigation were optimally aligned both manually and using ClustalW, and the frequency of polymorphisms for each residue was calculated. The amino acid number was plotted against the frequency of polymorphisms of the HA proteins. A total of 312 residues exhibit at least one polymorphism, while three residues display seven polymorphisms.

Figure 4b

(b) Mutation mapping on the HA protein structure. The frequency of polymorphism is depicted in colour.

Genetic determinants of host susceptibility to infection

Advances in genomics and in the understanding of pathogen variability, as well as of the diversity of human immune systems, have led to new trends in vaccine development that focus on epitope-based vaccines. An epitope is a small peptide fragment from an infectious agent that can induce a host immune response to eliminate the pathogen. Such a vaccine strategy shows promise in dealing with host and pathogen diversity. Compared with traditional vaccines, epitope-based vaccines are more specific, safer and more easily produced and controlled. The keys to the success of such approaches are the prediction models for rapidly scanning pathogen genomes to identify effective T-cell epitopes. A recent review article focuses on different methods available for MHC-peptide binding prediction for epitope-based vaccine design.[44]

The virulence of the pathogen and the susceptibility of the host determine the occurrence and severity of an infectious disease. The highly polymorphic glycoproteins of the MHC are the key proteins involved in the host immune response. MHC class I glycoproteins are expressed on the surface of every nucleated human cell and play important roles in viral infections. They present endogenous peptides derived from the cell itself to cytotoxic T cells. Since human viruses use their host's cellular machinery for replication, infected cells present viral proteins on their surfaces by using human leukocyte antigen (HLA) class I glycoproteins. This co-presentation of viral peptides elicits a cell-mediated immune response that destroys the virally infected cell. Conversely, HLA class II glycoproteins expressed on antigen-presenting cells display antigenic peptides derived from the pathogen. T cells recognise these antigenic peptides as foreign and initiate an immune response to the antigen. Each antigenic peptide must fit into the peptide-binding cleft; both peptide size and composition determine the fit. Each peptide is typically nine to 14 amino acids long, and its sequence is determined by the pathogen. At the host level, depending on the particular surface of the peptide-binding cleft, some antigenic peptides may be preferentially presented, while others may not be presented at all. The great diversity of clefts across the human population translates into the ability to recognise and generate an immune response to virtually any pathogen. The type of antigenic peptide displayed in the cleft is an important factor in the immune response generated. Thus, several physical, chemical and genetic factors determine whether a given peptide will fit into the peptide-binding cleft and elicit an immune response (Figure 5). Furthermore, class I and class II MHC molecules are the most polymorphic human proteins; some of these have over 200 allelic variants.
This extreme polymorphism is driven and maintained by the long-standing battle for supremacy between our immune system and infectious pathogens. The polymorphisms within both HLA class I and class II glycoproteins occur almost exclusively in the region of the glycoprotein that constitutes the peptide-binding cleft.
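The idea that peptide size and composition determine the fit into the peptide-binding cleft can be sketched as a sliding-window scan of a pathogen protein. The Python example below enumerates nine-residue peptides and keeps those matching a simple anchor motif. The motif, the function names and the protein sequence are hypothetical; real MHC-peptide binding predictors use allele-specific scoring models rather than a fixed pattern of this kind.

```python
def candidate_peptides(protein: str, length: int = 9):
    """All contiguous peptides of the given length (sliding window)."""
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


# Hypothetical anchor motif: allowed residues at 0-based positions 1 and 8.
ANCHORS = {1: set("LM"), 8: set("VL")}


def fits_cleft(peptide: str, anchors=ANCHORS) -> bool:
    """True if every anchor position carries an allowed residue."""
    return all(peptide[pos] in allowed for pos, allowed in anchors.items())


# Hypothetical pathogen protein fragment (not a real viral sequence).
protein = "AGLMNPQRSTVKE"

peptides = candidate_peptides(protein)
print(len(peptides))                            # 5
print([p for p in peptides if fits_cleft(p)])   # ['LMNPQRSTV']
```

Changing the anchor sets mimics a different cleft: the same protein then yields a different set of presentable peptides, which is how cleft polymorphism shapes the peptides displayed by different individuals.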

Figure 5

MHC-peptide combinatorics is influenced by physical, chemical and genetic properties of the peptide and the corresponding MHC.

Conclusions and future prospects

The pharmaceutical industry is currently grappling with a tremendous number of potential drug targets against infectious diseases identified through sequence data for the human genome and many important pathogens. The challenge ahead is to delineate the factors involved in host-pathogen interactions, and to translate the enormous discovery potential of human and pathogen genomes into real products for therapeutic intervention. One of the major bottlenecks in bringing new drugs to market is our incomplete understanding of the genes and proteins central to host-pathogen interactions and the mechanisms underlying certain human diseases. Other limitations that need to be addressed include patient heterogeneity and pathogen polymorphisms in clinical trials, the existence of multiple molecular targets and the shortage of experimental models of therapeutic efficacy with good predictive validity and objective surrogate measurements of disease progression. Integration of computational biology, together with experimental approaches, can accelerate our ability rapidly and reliably to identify target proteins that can be harnessed for therapeutic intervention.[45] The effective application of such complementary analytical methods will blaze trails for the exploitation of available genetic and molecular information. When supplemented with 'individualised therapy' based on a patient's genetic profile, these developments may lead to fewer serious adverse effects and improved responses to drug treatments and vaccine regimens. Finally, the ultimate goal must be to bridge theoretical biological, genetic and molecular phenomena, and cellular and organism biology, as well as medicinal chemistry in the relentless search for new cures in the battle against pathogens that plague humankind.


  1. Goldstein DB, Cavalleri GL: Genomics: Understanding human diversity. Nature. 2005, 437: 1241-1242. 10.1038/4371241a.

  2. International HapMap Consortium: A haplotype map of the human genome. Nature. 2005, 437: 1299-1320. 10.1038/nature04226.

  3. Brookes AJ: The essence of SNPs. Gene. 1999, 234: 177-186. 10.1016/S0378-1119(99)00219-X.

  4. Hoehe MR, Timmermann B, Lehrach H: Human inter-individual DNA sequence variation in candidate genes, drug targets, the importance of haplotypes and pharmacogenomics. Curr Pharm Biotechnol. 2003, 4: 351-378. 10.2174/1389201033377300.

  5. Maitland ML, DiRienzo A, Ratain MJ: Interpreting disparate responses to cancer therapy: The role of human population genetics. J Clin Oncol. 2006, 24: 2151-2157. 10.1200/JCO.2005.05.2282.

  6. Masood E: As consortium plans free SNP map of human genome. Nature. 1999, 398: 545-546.

  7. Vitkup D, Sander C, Church GM: The amino-acid mutational spectrum of human genetic disease. Genome Biol. 2003, 4: R72-10.1186/gb-2003-4-11-r72.

  8. Wang Z, Moult J: SNPs, protein structure, and disease. Hum Mutat. 2001, 17: 263-270. 10.1002/humu.22.

  9. Zanotti G, Berni R: Plasma retinol-binding protein: structure and interactions with retinol, retinoids, and transthyretin. Vitam Horm. 2004, 69: 271-295.

  10. Seeliger MW, Biesalski HK, Wissinger B, Gollnick H, et al: Phenotype in retinol deficiency due to a hereditary defect in the retinol binding protein synthesis. Invest Ophthalmol Vis Sci. 1999, 40: 3-11.

  11. Yao-Borengasser A, Varma V, Bodles AM, Rasouli N, et al: Retinol binding protein 4 expression in humans: Relationship to insulin resistance, inflammation, and response to pioglitazone. J Clin Endocrinol Metab. 2007, 92: 2590-2597. 10.1210/jc.2006-0816.

  12. Hash RB: Hereditary hemochromatosis. J Am Board Fam Pract. 2001, 14: 266-273.

  13. Kim S, Misra A: SNP genotyping: Technologies and biomedical applications. Annu Rev Biomed Eng. 2007, 9: 289-320. 10.1146/annurev.bioeng.9.060906.152037.

  14. Delrieu O, Bowman C: Visualizing gene determinants of disease in drug discovery. Pharmacogenomics. 2006, 7: 311-329. 10.2217/14622416.7.3.311.

  15. Giacomini KM, Brett CM, Altman RB, Benowitz NL, et al: The pharmacogenetics research network: From SNP discovery to clinical drug response. Clin Pharmacol Ther. 2007, 81: 328-345. 10.1038/sj.clpt.6100087.

  16. McLeod HL, Evans WE: Pharmacogenomics: Unlocking the human genome for better drug therapy. Annu Rev Pharmacol Toxicol. 2001, 41: 101-121. 10.1146/annurev.pharmtox.41.1.101.

  17. Evans WE, Johnson JA: Pharmacogenomics: The inherited basis for interindividual differences in drug response. Annu Rev Genomics Hum Genet. 2001, 2: 9-39. 10.1146/annurev.genom.2.1.9.

  18. Shah RR: Mechanistic basis of adverse drug reactions: The perils of inappropriate dose schedules. Expert Opin Drug Saf. 2005, 4: 103-128. 10.1517/14740338.4.1.103.

  19. Rane A: Postgenomic prospects of success in drug development and pharmacotherapy. Pharmacogenomics J. 2001, 1: 6-9. 10.1038/sj.tpj.6500013.

  20. Bosch TM, Meijerman I, Beijnen JH, Schellens JH: Genetic polymorphisms of drug-metabolising enzymes and drug transporters in the chemotherapeutic treatment of cancer. Clin Pharmacokinet. 2006, 45: 253-285. 10.2165/00003088-200645030-00003.

  21. Lai E: Application of SNP technologies in medicine: Lessons learned and future challenges. Genome Res. 2001, 11: 927-929. 10.1101/gr.192301.

  22. Klein TE, Chang JT, Cho MK, Easton KL, et al: Integrating genotype and phenotype information: An overview of the PharmGKB project. Pharmacogenetics Research Network and Knowledge Base. Pharmacogenomics J. 2001, 1: 167-170. 10.1038/sj.tpj.6500035.

  23. Ustav M, Stenlund A: Transient replication of BPV-1 requires two viral polypeptides encoded by the E1 and E2 open reading frames. EMBO J. 1991, 10: 449-457.

  24. Gloss B, Bernard HU, Seedorf K, Klock G: The upstream regulatory region of the human papilloma virus-16 contains an E2 protein-independent enhancer which is specific for cervical carcinoma cells and regulated by glucocorticoid hormones. EMBO J. 1987, 6: 3735-3743.

  25. Wilson R, Fehrmann F, Laimins LA: Role of the E1-E4 protein in the differentiation-dependent life cycle of human papillomavirus type 31. J Virol. 2005, 79: 6732-6740. 10.1128/JVI.79.11.6732-6740.2005.

  26. Fehrmann F, Klumpp DJ, Laimins LA: Human papillomavirus type 31 E5 protein supports cell cycle progression and activates late viral functions upon epithelial differentiation. J Virol. 2003, 77: 2819-2831. 10.1128/JVI.77.5.2819-2831.2003.

  27. Zhang B, Li P, Wang E, Brahmi Z, et al: The E5 protein of human papillomavirus type 16 perturbs MHC class II antigen maturation in human foreskin keratinocytes treated with interferon-gamma. Virology. 2003, 310: 100-108. 10.1016/S0042-6822(03)00103-X.

  28. Snijders PJ, Steenbergen RD, Heideman DA, Meijer CJ: HPV-mediated cervical carcinogenesis: Concepts and clinical implications. J Pathol. 2006, 208: 152-164. 10.1002/path.1866.

  29. Stanley M: Immunobiology of HPV and HPV vaccines. Gynecol Oncol. 2008, 109: S15-S21. 10.1016/j.ygyno.2008.02.003.

  30. Chow VT, Leong PW: Complete nucleotide sequence, genomic organization and phylogenetic analysis of a novel genital human papillomavirus type, HLT7474-S. J Gen Virol. 1999, 80: 2923-2929.

  31. de Villiers EM, Fauquet C, Broker TR, Bernard HU, et al: Classification of papillomaviruses. Virology. 2004, 324: 17-27. 10.1016/j.virol.2004.03.033.

  32. Delius H, Saegling B, Bergmann K, Shamanin V, et al: The genomes of three of four novel HPV types, defined by differences of their L1 genes, show high conservation of the E7 gene and the URR. Virology. 1998, 240: 359-365. 10.1006/viro.1997.8943.

  33. Forslund O, Hansson BG: Human papillomavirus type 70 genome cloned from overlapping PCR products: Complete nucleotide sequence and genomic organization. J Clin Microbiol. 1996, 34: 802-809.

  34. Ho L, Chan SY, Chow V, Chong T, et al: Sequence variants of human papillomavirus type 16 in clinical samples permit verification and extension of epidemiological studies and construction of a phylogenetic tree. J Clin Microbiol. 1991, 29: 1765-1772.

  35. Chan SY, Ho L, Ong CK, Chow V, et al: Molecular variants of human papillomavirus type 16 from four continents suggest ancient pandemic spread of the virus and its coevolution with humankind. J Virol. 1992, 66: 2057-2066.

  36. Radivojac P, Iakoucheva LM, Oldfield CJ, Obradovic Z, et al: Intrinsic disorder and functional proteomics. Biophys J. 2007, 92: 1439-1456. 10.1529/biophysj.106.094045.

  37. Shimizu K, Hirose S, Noguchi T: POODLE-S: Web application for predicting protein disorder by using physicochemical features and reduced amino acid set of a position-specific scoring matrix. Bioinformatics. 2007, 23: 2337-2338. 10.1093/bioinformatics/btm330.

  38. Hirose S, Shimizu K, Kanai S, Kuroda Y, Noguchi T: POODLE-L: A two-level SVM prediction system for reliably predicting long disordered regions. Bioinformatics. 2007, 23: 2046-2053. 10.1093/bioinformatics/btm302.

  39. Uversky VN, Roman A, Oldfield CJ, Dunker AK: Protein intrinsic disorder and human papillomaviruses: Increased amount of disorder in E6 and E7 oncoproteins from high risk HPVs. J Proteome Res. 2006, 5: 1829-1842. 10.1021/pr0602388.

  40. Ma HC, Chen JM, Chen JW, Sun YX, et al: The panorama of the diversity of H5 subtype influenza viruses. Virus Genes. 2007, 34: 283-287. 10.1007/s11262-006-0018-3.

  41. Chow VT, Tambyah PA, Goh KT: To kill a mocking bird flu?. Ann Acad Med Singapore. 2008, 37: 451-453.

  42. Wallace RG, Hodac H, Lathrop RH, Fitch WM: A statistical phylogeography of influenza A H5N1. Proc Natl Acad Sci USA. 2007, 104: 4473-4478. 10.1073/pnas.0700435104.

  43. Narasaraju T, Sim MK, Ng HH, Phoon MC, et al: Adaptation of human influenza H3N2 virus in a mouse pneumonitis model: Insights into viral virulence, tissue tropism and host pathogenesis. Microbes Infect. 2009, 11: 2-11. 10.1016/j.micinf.2008.09.013.

  44. Zhao B, Sakharkar KR, Lim CS, Kangueane P, et al: MHC-peptide binding prediction for epitope based vaccine design. Int J Integr Biol. 2007, 1: 127-140.

  45. Sakharkar MK, Sakharkar KR, Pervaiz S: Druggability of human disease genes. Int J Biochem Cell Biol. 2007, 39: 1156-1164. 10.1016/j.biocel.2007.02.018.

Author information

Corresponding author

Correspondence to Vincent T.K. Chow.

About this article

Cite this article

Sakharkar, M.K., Sakharkar, K.R. & Chow, V.T. Human genomic diversity, viral genomics and proteomics, as exemplified by human papillomaviruses and H5N1 influenza viruses. Hum Genomics 3, 320 (2009).
