  • Primary research

Genetic factors leading to chronic Epstein-Barr virus infection and nasopharyngeal carcinoma in South East China: Study design, methods and feasibility


Abstract

Nasopharyngeal carcinoma (NPC) is a complex disease caused by a combination of chronic Epstein-Barr virus infection, environmental factors and host genes in a multi-step process of carcinogenesis. The identity of the genetic factors involved in the development of chronic Epstein-Barr virus infection and NPC remains elusive, however. Here, we describe a two-phase, population-based, case-control study of Han Chinese from Guangxi province, where the NPC incidence rate reaches 25-50 per 100,000 individuals. Phase I, powered to detect single gene associations, enrolled 984 subjects to determine feasibility, to develop infrastructure and logistics and to determine error rates in sample handling. A microsatellite screen of Phase I study participants, genotyped for 319 alleles from 34 microsatellites spanning an 18-megabase region of chromosome 4 (4p15.1-q12) previously implicated by a linkage analysis of familial NPC, found 14 alleles marginally associated with developing NPC or chronic immunoglobulin A production (p = 0.001-0.03). These associations lost significance after applying a correction for multiple tests. Although the present results await confirmation, the Phase II study population has tripled patient enrolment and has included environmental covariates, offering the potential to validate this and other genomic regions that influence the onset of NPC.


Introduction

Nasopharyngeal carcinoma (NPC) is a disease with distinct racial and geographical distributions. In southern China, Taiwan, Vietnam and the Philippines, the incidence of NPC is 15-20 per 100,000 individuals per year, and in some local Chinese regions bordering the Xijiang River drainage in Guangdong and Guangxi provinces, the incidence is as high as 25-50 per 100,000 individuals [1, 2]. An intermediate incidence is observed among the Arab populations of Northern Africa [3] and Saudi Arabia [4]; in the Caribbean; and in the Eskimo populations of Alaska and Greenland [5]. Elsewhere, NPC is rare, with an incidence of less than 1 per 100,000. In the USA, NPC comprises only 0.2 per cent of all malignancies, with an incidence of 1 per 100,000. The male:female ratio for NPC is usually 2 or 3:1, with an incidence peak between 50 and 59 years of age [6].

A link between NPC and Epstein-Barr virus (EBV) was reported in 1966 [7]. Ten years later, the presence of immunoglobulin (Ig) A antibodies to EBV viral capsid antigens (EBV/IgA/VCA) was found to serve as a predictive marker for the development of NPC in Chinese populations [8]. More than 95 per cent of adults in all ethnic groups across the world are healthy carriers of EBV. In high NPC incidence regions, EBV infection of the nasopharyngeal epithelium induces IgA antibodies against VCA, suggesting that reactivation of EBV replication at the mucosal surface precedes the development of NPC. Consistent with this, approximately 2.5 per cent of the general population are EBV/IgA/VCA antibody positive. Of these, fewer than 3 per cent will develop NPC, while > 95 per cent of all NPC patients are EBV/IgA/VCA antibody positive [9-14]. In addition to EBV infection, case-control studies have indicated a role for environmental factors, including food preservatives (carcinogenic nitrosamines), salt-preserved fish and phorbol esters in herbs and plants that are commonly consumed among ethnic populations with the highest NPC rates [15, 16].

Evidence for genetic modulation of NPC risk has accumulated recently. Familial aggregation of NPC has been observed in China and in other countries [17-19]. Familial aggregation of NPC is uncommon in low-risk or non-Chinese populations. The proportion of NPC with affected first-degree family history is > 5 per cent in south China, 7.2 per cent in Hong Kong, 6.0 per cent in Yulin and 5.9 per cent in Guangzhou [20]. Descendants of south Chinese immigrants to western countries show progressively lower risk, but their NPC incidence remains higher than that of the indigenous population [21], suggesting both environmental and genetic components to disease susceptibility. Several studies have shown associations between HLA genes and NPC [22-28], and the D6S1624 microsatellite within the HLA class I region has been associated with NPC [29]. Studies comparing age of NPC onset report conflicting results for familial versus sporadic NPC. In a study comparing 200 probands with and without NPC-affected first-degree relatives from Singapore, the age of onset was 48 and 49 years, respectively [30]. In another Chinese study, the average age of onset was 35.5 years in 32 Guangdong families with 4-5 relatives with NPC, compared with 46.6 years for sporadic cases [20]. In a third study, however, the age of onset decreased from 44.5 years to 40.4 years as the number of NPC-affected relatives increased from one to four [31]. There is, therefore, some suggestion that age of onset may be lower in families with one or more NPC-affected first-degree relatives.

A genome-wide linkage analysis of 20 NPC families from a high incidence region in Guangdong identified a susceptibility region on the short arm of chromosome 4 [32]. Two chromosome 4p15.1-q12 markers, D4S405 and D4S3002, yielded high logarithm of the odds (LOD) scores (> 3.5) by both parametric and multipoint non-parametric analysis in 70 per cent of the NPC families studied. A subsequent study of 18 families from Hunan province genotyped a panel of markers on the short arms of chromosomes 3, 9 and 4 that included D4S405 and D4S3002, and failed to detect an obvious susceptibility locus on 4p15.1-q12 [33]. A region on chromosome 3p21.31-21.2 containing a tumour suppressor gene cluster, however, showed a modest association with NPC incidence [33].

Here, we describe the design of a new case-control study population recruited for the discovery of genetic factors that are involved in the development of chronic EBV infection and in the development of NPC. In a preliminary test to resolve the discrepancy between the two family-based studies, we performed a population-based case-control association analysis of 34 microsatellite markers within 4p15.1-q12 (Figure 1) to determine if specific alleles within the region: 1) were associated with a propensity to develop chronic EBV replication, as evidenced by IgA antibodies against EBV viral capsid antigen (EBV/IgA/VCA); or 2) were associated with NPC susceptibility.

Figure 1

Map of markers for the nasopharyngeal carcinoma susceptibility locus on chromosome 4 [28]. Thirty-four short tandem repeat markers distributed across an 18 megabase region were selected between D4S2950 and D4S2916, with intervals of 10-3,500 kilobases. The positional relationships among markers and selected genes are indicated below the map. Asterisks indicate levels of statistical significance: *p < 0.05 and **p < 0.01.

Materials and methods

Study design

Enrolment into the study occurred in two collection phases. The Phase I pilot was powered to detect single gene associations and was designed to determine the feasibility of meeting recruitment goals, to assess the accuracy of data collection and sample handling, and to develop the infrastructure for a large international collaboration. Cases and controls (n = 984) were recruited in 2000 from Wuzhou City and Cangwu County, bordering the Xijiang River in the Guangxi province of South East China. An effort was made to enrol triads consisting of a proband, an unaffected spouse and an adult child or parent. Family triads were enrolled for haplotype inference and for quality control assessment. Three clinically defined categories were collected: 1) incident or prevalent NPC biopsy-confirmed (NPC+) cases (n = 350) who were EBV/IgA/VCA antibody positive (IgA+); 2) IgA+ cases (n = 288) who were defined as EBV/IgA/VCA antibody positive and NPC free at the time of study enrolment (EBV/IgA/VCA titres were confirmed by serological testing at the time of study enrolment); and 3) IgA- controls (n = 346). For each case, his or her spouse was tested for EBV/IgA/VCA antibodies, and the spouse and parent or adult child were invited to enrol. The IgA- group consisted of 346 spouses who were IgA- at the time of study enrolment (Table 1). A dominant model was selected for power calculations for two reasons: 1) if the true model is additive, there is little difference in power using either an additive model or a dominant model for power calculations (data not shown); 2) if, however, the true model is dominant, a dominant model is the most powerful. Assuming a dominant genetic model and at least a 10 per cent allele frequency, this number of NPC, IgA+ and IgA- cases and controls provided > 90 per cent power to detect associations with an odds ratio (OR) ≥ 3 at the p = 0.01 level for a two-tailed test (Table 2).
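The power statement above can be illustrated with a rough calculation. The sketch below is not the authors' method (the text does not say which software or formula produced Table 2). It assumes a normal-approximation two-proportion test, a carrier frequency in controls derived from Hardy-Weinberg proportions, and a comparison of 350 NPC cases with 346 IgA- controls. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def carrier_frequency(allele_freq):
    """Dominant-model carrier frequency, assuming Hardy-Weinberg proportions."""
    return 1.0 - (1.0 - allele_freq) ** 2


def case_exposure(control_exposure, odds_ratio):
    """Carrier frequency in cases implied by the control frequency and the OR."""
    odds = odds_ratio * control_exposure / (1.0 - control_exposure)
    return odds / (1.0 + odds)


def power_two_proportions(p_cases, p_controls, n_cases, n_controls, alpha):
    """Approximate two-sided power of a two-proportion z-test."""
    norm = NormalDist()
    z = norm.inv_cdf(1.0 - alpha / 2.0)
    pooled = (n_cases * p_cases + n_controls * p_controls) / (n_cases + n_controls)
    se_null = sqrt(pooled * (1.0 - pooled) * (1.0 / n_cases + 1.0 / n_controls))
    se_alt = sqrt(p_cases * (1.0 - p_cases) / n_cases
                  + p_controls * (1.0 - p_controls) / n_controls)
    diff = p_cases - p_controls
    # Sum both rejection tails so that power equals alpha when diff is zero.
    return (norm.cdf((diff - z * se_null) / se_alt)
            + norm.cdf((-diff - z * se_null) / se_alt))


# Illustrative inputs taken from the text: allele frequency 0.10, OR = 3, alpha = 0.01.
p0 = carrier_frequency(0.10)
p1 = case_exposure(p0, 3.0)
power = power_two_proportions(p1, p0, n_cases=350, n_controls=346, alpha=0.01)
print(f"carrier frequency: controls {p0:.3f}, cases {p1:.3f}; power {power:.3f}")
```

Under these assumptions the approximate power exceeds 90 per cent, in line with the statement in the text; the exact figures in Table 2 may have been obtained differently.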

Table 1 Characteristics of the Phase I study groups.
Table 2 Phase I and Phase II sample power.

Phase II enrolment was initiated in 2004, after the completion of Phase I collection. The Phase II design is a cross-sectional, case-control study: family members were not recruited. A questionnaire capturing environmental factors, including occupational, dietary and tobacco exposures, was administered to each study participant at enrolment (Table 3). NPC cases were recruited from the Wuzhou Red Cross Hospital in collaboration with the Cancer Institute of Wuzhou (Wuzhou City) and the Cangwu Institute for Nasopharyngeal Carcinoma Control and Prevention (Cangwu County). NPC cases, IgA+ subjects and IgA- participants were recruited from cities and villages bordering the Xijiang River. Power was determined for single-gene and gene-environment interactions for participants in each group (Table 2). For single-gene associations at a 10 per cent allele frequency, power will range from 83 per cent to > 99 per cent and from 35 per cent to > 99 per cent to detect associations with an odds ratio (OR) of 1.5-3.0 at p < 0.05 and p < 0.001, respectively, for the dominant genetic model and a two-sided significance level. For gene-environment interactions, there is power to detect interaction effects when the genotype and the exposure each have a frequency ≥ 0.1, the main exposure and genotype effects each have an OR ≥ 1, and the interaction effect has an OR ≥ 3 [34].

Exclusion criteria for Phases I and II were ethnicity other than Han Chinese, birth or residency for more than six months outside of the NPC endemic region, or failure to provide informed consent. Institutional review board approval was obtained from all participating institutions, and informed consent was obtained from each study participant, or from their guardian for subjects between 16 and 18 years of age.

Sample and data handling

A total of 10-20 ml of blood was collected in acid citrate dextrose (ACD) vacutainers for serology testing, direct DNA extraction and cryopreservation of peripheral blood mononuclear cells (PBMCs). Blood samples were separated into plasma and PBMCs. Serum was tested at the Wuzhou Centre for EBV/IgA/VCA antibodies and antibodies to EBV early antigen by immunoenzymatic assay. The PBMCs were EBV-transformed to establish lymphoblastoid cell lines (LCLs) as a renewable DNA source. In addition, 3 ml of whole blood was preserved in a Tris, ethylene diamine tetraacetic acid and sodium dodecyl sulphate DNA extraction buffer as a back-up DNA source. Questionnaires capturing demographic, laboratory and social history were administered at enrolment. Two individuals independently entered the questionnaire responses and laboratory results into a FileMaker Pro database as a method of capturing data entry errors.
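The double-entry check described above can be sketched as a field-by-field comparison of the two independent entries. This is only an illustration in Python, not the FileMaker Pro procedure the study used. The record layout, subject identifiers and field names are hypothetical.

```python
def find_entry_discrepancies(entry_a, entry_b):
    """Compare two independent data entries keyed by subject ID.

    Returns {(subject_id, field): (value_a, value_b)} for every disagreement,
    including subjects or fields that appear in only one of the two entries.
    """
    discrepancies = {}
    for subject_id in sorted(set(entry_a) | set(entry_b)):
        record_a = entry_a.get(subject_id, {})
        record_b = entry_b.get(subject_id, {})
        for field in sorted(set(record_a) | set(record_b)):
            value_a = record_a.get(field)
            value_b = record_b.get(field)
            if value_a != value_b:
                discrepancies[(subject_id, field)] = (value_a, value_b)
    return discrepancies


# Hypothetical records typed in by two different people.
first_entry = {
    "S001": {"sex": "M", "age": 52, "iga_vca": "positive"},
    "S002": {"sex": "F", "age": 47, "iga_vca": "negative"},
}
second_entry = {
    "S001": {"sex": "M", "age": 25, "iga_vca": "positive"},  # transposed digits
    "S002": {"sex": "F", "age": 47, "iga_vca": "negative"},
}

for key, values in find_entry_discrepancies(first_entry, second_entry).items():
    print(key, values)
```

Each flagged pair would then be resolved against the paper questionnaire or the laboratory report.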

Genomic DNA extraction

DNA was extracted from whole blood or lymphoblastoid cell lines using the QIAamp DNA blood maxi kit (Qiagen, Valencia, CA, USA; catalogue #51194). More than 80 per cent of the genotypes were determined from DNA directly extracted from whole blood.

Microsatellite genotyping

Microsatellite loci (n = 34) containing 319 alleles were selected between D4S2950 and D4S2916 (18 megabases [Mb]) on chromosome 4 (Figure 1). The markers consist of 22 dinucleotide repeats, two trinucleotide repeats and ten tetranucleotide repeats. The genetic and physical distances between marker pairs are as follows: mean = 0.51 centimorgans (cM), 562 kilobases (kb); median = 0.34 cM, 230 kb; and range = 0.00-2.79 cM, 11-4,185 kb. The primer sequences were obtained from the University of California, Santa Cruz (UCSC) Genome Bioinformatics database [35]. All of the forward primers were 5'-tailed with the M13 sequence 5'-CACGACGTTGTAAAACGAC-3'. The M13-tailed forward primers were used in combination with an M13 primer that had the same sequence but was labelled at its 5' end with a fluorescent dye, such as 6-FAM, VIC or NED (Applied Biosystems [ABI], Foster City, CA, USA). The latter primer is the sole source of label and can be used with any M13-tailed forward primer to generate a labelled amplified allele [36]. The polymerase chain reaction (PCR) amplifications of individual microsatellite loci were performed in 10 μl volumes containing 10 mM Tris-HCl (pH 8.3), 50 mM KCl, 2.0 mM MgCl2, 0.2 mM of each deoxynucleotide triphosphate, 1 μM labelled M13 primer and reverse primer, 0.07 μM M13-tailed forward primer (approximately a 15:1 molar ratio of labelled M13 primer to M13-tailed forward primer), 25 ng genomic DNA and 0.5 U TaqGold DNA polymerase (Applied Biosystems, Foster City, CA, USA). PCR amplification was performed in a PE Applied Biosystems Model 9700 using 384-well high-throughput format plates. The PCR conditions followed a modified touchdown procedure: 95°C for 10 minutes; two cycles at each annealing temperature of 60°C, 58°C, 56°C, 54°C and 52°C (95°C for 15 seconds, annealing for 30 seconds, 72°C for 45 seconds); 30 cycles at an annealing temperature of 50°C; and 72°C for 30 minutes. Six PCR products were pooled together for multiplex loading according to the label colour and marker size. Samples were diluted appropriately and pooled, and 3 μl of sample was then mixed with 9 μl of formamide containing Liz 350 size standard (ABI). Samples were electrophoresed in a 22 cm capillary array using POP5 polymer and 3700 running buffer (ABI) on an ABI Model 3100 Automated DNA Sequencer using data collection software version 1.0.1 and Genescan Analysis software version 3.7. Genotyping was performed using Genotyper version 2.5, and allele sizes were binned using Allelogram (Carl Manaster). For quality control between plates, DNAs from 22 per cent of subjects were duplicated across plates. Mendelian errors were tested within the triad families using the PedCheck program [37].
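The idea behind a Mendelian error check can be sketched in a few lines. The code below is not the algorithm of the pedigree-checking program cited above. It is a minimal illustration that assumes a single marker, two genotyped parents and one child, with each genotype written as a pair of allele sizes. The function name and example genotypes are hypothetical.

```python
def is_mendelian_consistent(parent1, parent2, child):
    """Return True if the child's genotype can be explained by the two parents.

    Each genotype is a 2-tuple of allele sizes in base pairs; None marks a
    missing genotype, which is treated as untestable (no error reported).
    """
    if parent1 is None or parent2 is None or child is None:
        return True
    first, second = child
    # One child allele must come from each parent, in either order.
    return ((first in parent1 and second in parent2)
            or (second in parent1 and first in parent2))


# Hypothetical genotypes at one microsatellite locus.
print(is_mendelian_consistent((213, 215), (213, 217), (215, 217)))  # True
print(is_mendelian_consistent((213, 215), (213, 217), (219, 213)))  # False
```

An inconsistency flagged this way may reflect a genotyping or binning error, a sample mix-up or a misreported relationship, so flagged trios need to be reviewed rather than discarded automatically.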

Genetic association analyses

Allele frequencies were computed and compared between cases and controls using Pearson's χ2 test or Fisher's exact test. ORs, 95 per cent confidence intervals (CIs) and p values were computed for dominant and recessive genetic models by logistic regression adjusted for age and sex, using SAS PROC LOGISTIC software (SAS Institute, Cary, NC, USA). For the dominant model, ORs were computed by comparing the combined homozygous and heterozygous genotypes against all other genotypes. When the frequency of the minor allele was ≥ 5 per cent, ORs were also calculated for the recessive model, comparing the homozygous genotype against all other genotypes. Conformance to Hardy-Weinberg equilibrium expectations was calculated for all loci. Tests for D' as a measure of linkage disequilibrium (LD) were conducted for allele pairs using SAS Genetics software (SAS Institute, Cary, NC, USA).
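The dominant and recessive codings described above can be illustrated with an unadjusted odds ratio from a 2 x 2 table. The sketch below is a simplification: the study used logistic regression adjusted for age and sex in SAS, whereas this example uses Woolf's log-based confidence interval with no covariates. The function names and counts are hypothetical.

```python
from math import exp, log, sqrt


def is_exposed(genotype, allele, model):
    """Code a genotype (2-tuple of allele sizes) for one allele of interest.

    dominant:  one or two copies of the allele count as exposed.
    recessive: only two copies count as exposed.
    """
    copies = sum(1 for a in genotype if a == allele)
    if model == "dominant":
        return copies >= 1
    if model == "recessive":
        return copies == 2
    raise ValueError("model must be 'dominant' or 'recessive'")


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-scale) confidence interval."""
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for this estimator")
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    se_log_or = sqrt(sum(1.0 / c for c in cells))
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 100 cases and 10 of 100 controls carry the allele.
estimate, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that includes 1, as in this made-up example, corresponds to a non-significant association at the 5 per cent level.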


Results

The Phase I pilot study enrolled participants from the Cancer Institute in Wuzhou City and the Cangwu Institute for Nasopharyngeal Carcinoma Control and Prevention (Cangwu County) in Guangxi province in the autumn of 2000. For NPC cases, 71.3 per cent of spouses and 81 per cent of adult children were enrolled. For cases with EBV/IgA/VCA titres consistent with chronic EBV infection, 72.4 per cent of spouses and 67.4 per cent of adult children or parents were enrolled. Complete triad sets were available for 366 NPC probands. As predicted for this highly endemic NPC region, 71.8 per cent of the NPC cases were male. PBMCs cryopreserved on-site were transported to the Laboratory of Genomic Diversity, National Cancer Institute (LGD-NCI) for EBV immortalisation: 83 per cent of 633 transformation attempts resulted in LCLs.

Sample and genotyping errors were estimated by including 10 per cent duplicate sampling, with one sample derived from DNA isolated directly from peripheral blood and the second from DNA isolated from LCLs. Mismatches were observed in less than 0.5 per cent of duplicate samples, all of which were resolved using family trios, indicating that tubes collected from a single individual were appropriately labelled (data not shown) and that error was not introduced during cell line development or sample handling. A second test for Mendelian errors using PedCheck was performed using the chromosome 4 microsatellite data (described below). Two unresolved Mendelian errors were observed within the 366 family triads. Near-complete genotyping and complete clinical data were available for 350 NPC cases, 288 IgA seropositives and 346 IgA seronegatives (Table 1).

Phase II enrolment occurred between November 2004 and July 2005 in Guangxi province. Subjects were enrolled if at least one parent was from the Guangxi or Guangdong provinces. NPC cases were identified as seroincident or seroprevalent cases presenting at Red Cross hospitals and IgA+ and IgA- controls were identified from field stations in cities and villages bordering the Xijiang River drainage. Table 3 presents summary data of environmental exposures for the Phase II NPC+, IgA+ and IgA- groups and the numbers of participants enrolled.

Table 3 Phase II study design and non-genetic covariates.

Using the Phase I cases and controls, we asked whether a locus within the chromosome 4p15.1-q12 region contributes to the development of NPC or to the development of EBV/IgA/VCA in response to EBV replication. Microsatellite loci (n = 34) were distributed over an 18 Mb region on chromosome 4p15.1-q12, with intervals of 10-3,500 kb and an average distance of 530 kb. Four Phase I genetic association comparisons were made: 1) NPC cases versus EBV/IgA/VCA seropositive controls (Table 4); 2) EBV/IgA/VCA seropositive cases without NPC versus EBV/IgA/VCA seronegative controls (Table 5); 3) NPC cases plus EBV/IgA/VCA seropositive cases versus EBV/IgA/VCA seronegative controls (Table 6); and 4) NPC cases versus EBV/IgA/VCA seronegative controls (data not shown). No distortions in Hardy-Weinberg equilibrium were observed. Alleles with at least one significant result (p < 0.05) for either the dominant or recessive genetic model are reported in Tables 4, 5 and 6. The results are presented without correction for multiple comparisons because the interrogated 4p15.1-q12 region was previously implicated as a susceptibility locus in a family-based study, and we were specifically testing the prior hypothesis that markers within the region would also be associated with NPC in a population-based study [32]. It should be noted that associations with p > 0.0015 would not remain significant after correction for multiple comparisons considering the 34 independent loci.
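The 0.0015 cut-off quoted above is consistent with a Bonferroni correction of a 0.05 significance level across 34 loci. The short sketch below assumes that this is how the figure was derived; the text does not name the correction method. The function name is hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 34)
print(f"{threshold:.4f}")  # prints 0.0015
```

Treating each of the 34 loci as one test is the simplification used in the text; counting every allele or every genetic model as a separate test would give a stricter threshold.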

Table 4 Significant allele frequencies between NPC versus IgA+ groups.
Table 5 Group 2: Markers and alleles showing significant allele frequencies among IgA+ cases without NPC and IgA- subjects.
Table 6 Group 3: Markers and alleles showing significant allele frequencies among NPC cases plus IgA+ cases and IgA- subjects.

Linkage disequilibrium among the 34 loci

The spacing of markers varied from 10 to 3,500 kb, with denser coverage flanking the microsatellite markers with the highest LOD scores from the family study (Figure 1). We calculated two-point D' as a measure of LD between all alleles at neighbouring short tandem repeat loci; however, a D' value of 1 (complete LD) was observed for only 60 two-point allele combinations. Using HapMap single nucleotide polymorphism (SNP) data, we examined whether the microsatellites were included in reasonably strong LD blocks. An r2 cut-off threshold of 0.8 between marker pairs was used to define the LD blocks. Only 11 of the 34 microsatellite markers occurred within an LD block: D4S396 and D4S401 occurred within the same 17 kb block. Of the two NPC-linked markers [32], D4S405 was not within a block and D4S3002 occurred within an 8 kb block. The mean size of the blocks was 17.9 kb (range 8-50 kb).
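The two LD measures used above, D' and r2, can both be computed from the frequency of a two-allele haplotype and the frequencies of its component alleles. The sketch below uses the standard textbook definitions and assumes that haplotype frequencies are already known. It does not reproduce the SAS Genetics or HapMap-based analyses. The function name and input frequencies are hypothetical.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D', r^2) for allele A at one locus and allele B at another.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: frequencies of alleles A and B (each strictly between 0 and 1)
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = d / d_max  # signed; its absolute value is what is usually reported
    r_squared = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d_prime, r_squared


# Hypothetical case: a rare allele A (10%) always found with a common allele B (50%).
d_prime, r_squared = ld_measures(p_ab=0.10, p_a=0.10, p_b=0.50)
print(f"D' = {d_prime:.2f}, r2 = {r_squared:.2f}")
```

The example shows why the two measures can disagree: a rare allele that always travels with a common one gives complete LD by D' while r2 stays far below a 0.8 threshold.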

Genetic association with NPC

Table 4 presents the locus name, location, allele length, allele frequencies, ORs, p values and 95 per cent CIs for 350 NPC cases and 288 EBV/IgA/VCA seropositive controls (93.0-99.7 per cent of NPC cases and 91.0-97.6 per cent of IgA+ subjects were genotyped successfully). The genotype frequencies among NPC cases were significantly higher than those among control subjects for five alleles (OR 1.51-5.36; p = 0.01-0.03): for the recessive model, D4S3040-215 and D4S1547-251; and for the dominant model, D4S3040-213, D4S2974-137 and D4S2916-204. The genotype frequencies among NPC cases were significantly lower than those among control subjects for four alleles (OR 0.3-0.71; p = 0.02-0.045): for the recessive model, D4S2950-141 and D4S2974-135; and for the dominant model, D4S3357-271 and D4S2381-277.

Genetic association with persistent IgA+ status

To test the hypothesis that genetic factors may influence EBV/IgA/VCA formation in response to EBV infection, we compared genotype frequencies between 288 IgA+ cases and 346 IgA- controls (Table 1). Table 5 provides the allele frequencies, p values, ORs and 95 per cent CIs in cases and controls for significant results. Eleven alleles were significantly associated with IgA+ persistence: five risk alleles (OR 1.51-2.38; p = 0.004-0.040) and six protective alleles (OR 0.33-0.70; p = 0.002-0.050).

Because all NPC cases in our study were IgA+, we then pooled NPC and IgA+ cases to increase power, on the hypothesis that the alleles associated with IgA+ serostatus would be shared among NPC+ IgA+ and NPC- IgA+ individuals. Significant associations are presented in Table 6: four alleles were associated with risk for IgA+ (OR 1.5-1.63; p = 0.001-0.030) and seven were protective (OR 0.46-0.76; p = 0.001-0.050). Ten alleles associated with IgA were shared between the two comparisons (Tables 5 and 6). Five of these associations with IgA+ serostatus were highly significant (p < 0.01). Alleles D4S190-170 (p = 0.005; OR 1.5, 95 per cent CI 1.13-2.0), D4S3241-136 (p = 0.004; OR 1.91, 95 per cent CI 1.2-3.0) and D4S3347-213 (p = 0.001; OR 1.58, 95 per cent CI 1.2-2.1) significantly increased the risk of developing EBV/IgA/VCA. Alleles D4S174-202 (p = 0.001; OR 0.46, 95 per cent CI 0.3-0.7) and D4S2950-137 (p = 0.0036; OR 0.56, 95 per cent CI 0.38-0.83) significantly decreased the risk of EBV/IgA/VCA. Within a single locus (D4S3357), one allele increased susceptibility (D4S3357-271) and the other allele was protective (D4S3357-275).


Discussion

We have described the design and recruitment efforts for a genetic association study to investigate the role of host genetic factors in the development of chronic EBV infection leading to NPC in subjects born and living in a region with one of the world's highest incidence rates of NPC. This study was conducted in two phases. Phase I was a pilot study to explore the feasibility of conducting a cross-sectional study in China (Table 1). The pilot provided strong support for expanding the study in several important ways: export permits for genetic material were obtained, sample handling was excellent (with few detectable errors) and recruitment goals were attainable. Upon the successful completion of Phase I, we increased the catchment area for IgA+ cases to cities and villages along the Xijiang River and its tributaries, expanded the study to include more subjects and added a detailed questionnaire to capture environmental exposures that may interact with host genes in the development of NPC (Table 2). Complementing previous studies, we also assessed whether Phase II of the study was powered to detect both gene-gene and gene-environment interactions.

To revisit the recent linkage analysis in NPC families implicating a susceptibility locus linked to chromosome 4p15.1-q12, we selected 34 microsatellite loci spanning the 18 Mb region at intervals of 10-3,500 kb. Unlike previous studies, we attempted to determine, first, whether the chromosome 4 region was associated with EBV/IgA/VCA antibody formation and, second, whether it was associated with NPC incidence in the setting of EBV replication, as indicated by EBV/IgA/VCA. We identified several loci that showed significant associations with either EBV/IgA/VCA or NPC status. The associations tended to be marginally significant for NPC (Table 4), with somewhat stronger associations observed for EBV/IgA/VCA (IgA+) (Tables 5 and 6).

Few NPC families have been identified outside of NPC endemic areas. More than 90 per cent of all NPC cases do not show familial aggregation or family history, implying either environmental causes or geographical family clustering. Two family-based NPC linkage studies implicated different chromosomes as harbouring an NPC susceptibility locus [32, 33]. Although the studies differed in strategy, both used multiple families with two or more NPC cases from two separate high NPC incidence provinces in China and included similar numbers of families and affected cases. Although it is possible that environmental exposures may differ between the two provinces, it is unlikely that different environmental factors account for the lack of concordance between the studies. More likely, multiple genes predispose to chronic EBV replication and the development of NPC, each of which may contribute only a small part of the total genetic influence. Family-based linkage studies are ideal for identifying single genes with large effects, but are relatively insensitive for localising genetic factors with small effects. By contrast, case-control association studies are ideal for identifying genetic factors with small or moderate effects once a candidate gene or region has been identified [38].

We cannot exclude the possibility that there are causal alleles in the chromosome 4 region that are associated with chronic EBV replication or a predisposition to develop NPC. Marker associations within this region (Tables 4, 5 and 6) may be tracking a susceptibility locus through LD. Because included alleles predominantly occurred at very low frequencies and haplotype inferences were unreliable, we could not reliably assess associations with either EBV persistence or NPC (data not shown). A denser placement of polymorphic markers is required to survey the genetic variation content of the region more thoroughly.

Although this study did not find associations with robust p values for NPC, making conclusions tentative, a number of loci did show moderate to strong risk, suggesting that this region warrants further attention, particularly for chronic EBV replication. For one of the microsatellite loci, D4S3347 (Tables 5 and 6), two alleles were associated with EBV/IgA/VCA, suggesting that these alleles may be tracking a potential causative allele (see Figure 1). Of potential interest is the association of two microsatellites with IgA incidence: D4S3347, which shows three significant associations with p < 0.01 and one with p < 0.05 for two alleles (213 and 217), and the tightly linked (< 20 kb) D4S1577 locus, which also shows four significant associations (p < 0.05) (Tables 5 and 6, Figure 1). Microsatellite D4S190 occurs within the oncogene ARHH. D4S190 was associated with risk for EBV/IgA/VCA seropositive status but not with NPC. ARHH, a member of the ras homolog gene family, encodes a small GTP-binding protein belonging to the RAS superfamily and is transcribed only in haemopoietic cells. ARHH non-coding variants that may affect expression are observed in 46 per cent of diffuse large-cell lymphomas [39]. It is possible that one or more variant alleles of ARHH in LD with the associated D4S190-170 allele may modify EBV replication.

Given the similar geographical distribution of familial and non-familial NPC, it is likely that both forms share similar aetiological risk factors, particularly environmental and viral factors; the genetic factors underpinning familial, early-onset and non-familial NPC susceptibility may also overlap. It is also possible that different genes contribute to familial NPC cases, analogous to the situation in breast cancer, where BRCA1 and BRCA2 account for only a small proportion of non-familial breast cancer cases [40, 41]. The best approach to identifying NPC susceptibility factors may be the organisation of well-designed and highly powered case-control studies for whole-genome and targeted candidate gene association investigations, as we describe here.


References

  1. deThe G: Epidemiology of Epstein-Barr virus and associated diseases in man. The Herpesviruses. Edited by: Roizman B. 1982, Springer, New York, NY, 25-103.

  2. deThe G: Viruses and human cancers: Challenges for preventive strategies. Environ Health Perspect. 1995, 103 (Suppl 8): 269-273. 10.1289/ehp.95103s8269.

  3. Jeannel D, Hubert A, de Vathaire F, et al: Diet, living conditions and nasopharyngeal carcinoma in Tunisia -- A case-control study. Int J Cancer. 1990, 46: 421-425. 10.1002/ijc.2910460316.

  4. Laramore GE, Clubb B, Quick C, et al: Nasopharyngeal carcinoma in Saudi Arabia: A retrospective study of 166 cases treated with curative intent. Int J Radiat Oncol Biol Phys. 1988, 15: 1119-1127. 10.1016/0360-3016(88)90193-9.

  5. Johansen LV, Mestre M, Overgaard J: Carcinoma of the nasopharynx: Analysis of treatment results in 167 consecutively admitted patients. Head Neck. 1992, 14: 200-207. 10.1002/hed.2880140307.

  6. Lee AW, Foo W, Mang O, et al: Changing epidemiology of nasopharyngeal carcinoma in Hong Kong over a 20-year period (1980-99): An encouraging reduction in both incidence and mortality. Int J Cancer. 2003, 103: 680-685. 10.1002/ijc.10894.

  7. Old LJ, Boyse EA, Oettgen HF, et al: Precipitating antibody in human serum to an antigen present in cultured Burkitt's lymphoma cells. Proc Natl Acad Sci USA. 1966, 56: 1699-1704. 10.1073/pnas.56.6.1699.

  8. Henle G, Henle W: Epstein-Barr virus-specific IgA serum antibodies as an outstanding feature of nasopharyngeal carcinoma. Int J Cancer. 1976, 17: 1-7. 10.1002/ijc.2910170102.

  9. Deng H, Zeng Y, Lei Y, et al: Serological survey of nasopharyngeal carcinoma in 21 cities of south China. Chin Med J (Engl). 1995, 108: 300-303.

  10. Sham JS, Wei WI, Zong YS, et al: Detection of subclinical nasopharyngeal carcinoma by fibreoptic endoscopy and multiple biopsy. Lancet. 1990, 335: 371-374. 10.1016/0140-6736(90)90206-K.

  11. Zeng Y, Zhang LG, Li HY, et al: Serological mass survey for early detection of nasopharyngeal carcinoma in Wuzhou City, China. Int J Cancer. 1982, 29: 139-141. 10.1002/ijc.2910290204.

  12. Zeng Y, Zhong JM, Li LY, et al: Follow-up studies on Epstein-Barr virus IgA/VCA antibody-positive persons in Zangwu County, China. Intervirology. 1983, 20: 190-194. 10.1159/000149391.

  13. Zeng Y, Zhang LG, Wu YC, et al: Prospective studies on nasopharyngeal carcinoma in Epstein-Barr virus IgA/VCA antibody-positive persons in Wuzhou City, China. Int J Cancer. 1985, 36: 545-547. 10.1002/ijc.2910360505.

  14. Zong YS, Sham JS, Ng MH, et al: Immunoglobulin A against viral capsid antigen of Epstein-Barr virus and indirect mirror examination of the nasopharynx in the detection of asymptomatic nasopharyngeal carcinoma. Cancer. 1992, 69: 3-7. 10.1002/1097-0142(19920101)69:1<3::AID-CNCR2820690104>3.0.CO;2-7.

  15. Jalbout M, Bel Hadj Jrad B, Bouaouina N, et al: Autoantibodies to tubulin are specifically associated with the young age onset of the nasopharyngeal carcinoma. Int J Cancer. 2002, 101: 146-150. 10.1002/ijc.10586.

  16. Yu MC, Yuan JM: Epidemiology of nasopharyngeal carcinoma. Semin Cancer Biol. 2002, 12: 421-429. 10.1016/S1044579X02000858.

  17. Brown TM, Heath CW, Lang RM, et al: Nasopharyngeal cancer in Bermuda. Cancer. 1976, 37: 1464-1468. 10.1002/1097-0142(197603)37:3<1464::AID-CNCR2820370331>3.0.CO;2-Z.

  18. Coffin CM, Rich SS, Dehner LP: Familial aggregation of nasopharyngeal carcinoma and other malignancies. A clinicopathologic description. Cancer. 1991, 68: 1323-1328. 10.1002/1097-0142(19910915)68:6<1323::AID-CNCR2820680623>3.0.CO;2-S.

  19. Yu MC, Garabrant DH, Huang TB, et al: Occupational and other non-dietary risk factors for nasopharyngeal carcinoma in Guangzhou, China. Int J Cancer. 1990, 45: 1033-1039. 10.1002/ijc.2910450609.

  20. Jia WH, Feng BJ, Xu ZL, et al: Familial risk and clustering of nasopharyngeal carcinoma in Guangdong, China. Cancer. 2004, 101: 363-369. 10.1002/cncr.20372.

  21. Buell P: The effect of migration on the risk of nasopharyngeal cancer among Chinese. Cancer Res. 1974, 34: 1189-1191.

  22. Hildesheim A, Apple RJ, Chen CJ, et al: Association of HLA class I and II alleles and extended haplotypes with nasopharyngeal carcinoma in Taiwan. J Natl Cancer Inst. 2002, 94: 1780-1789. 10.1093/jnci/94.23.1780.

  23. Li PK, Poon AS, Tsao SY, et al: No association between HLA-DQ and -DR genotypes with nasopharyngeal carcinoma in southern Chinese. Cancer Genet Cytogenet. 1995, 81: 42-45. 10.1016/0165-4608(94)00205-3.

  24. Lu CC, Chen JC, Jin YT: Genetic susceptibility to nasopharyngeal carcinoma within the HLA-A locus in Taiwanese. Int J Cancer. 2003, 103: 745-751. 10.1002/ijc.10861.

  25. Mokni-Baizig N, Ayed K, Ayed FB, et al: Association between HLA-A/-B antigens and -DRB1 alleles and nasopharyngeal carcinoma in Tunisia. Oncology. 2001, 61: 55-58.

  26. Pimtanothai N, Charoenwongse P, Mutirangura A, Hurley CK: Distribution of HLA-B alleles in nasopharyngeal carcinoma patients and normal controls in Thailand. Tissue Antigens. 2002, 59: 223-225. 10.1034/j.1399-0039.2002.590308.x.

  27. Thomas JA, Iliescu V, Crawford DH, et al: Expression of HLA-DR antigens in nasopharyngeal carcinoma: An immunohistological analysis of the tumour cells and infiltrating lymphocytes. Int J Cancer. 1984, 33: 813-819. 10.1002/ijc.2910330616.

  28. Wu SB, Hwang SJ, Chang AS, et al: Human leukocyte antigen (HLA) frequency among patients with nasopharyngeal carcinoma in Taiwan. Anticancer Res. 1989, 9: 1649-1653.

  29. Ooi EE, Ren EC, Chan SH: Association between microsatellites within the human MHC and nasopharyngeal carcinoma. Int J Cancer. 1997, 74: 229-232. 10.1002/(SICI)1097-0215(19970422)74:2<229::AID-IJC16>3.0.CO;2-8.

  30. Loh KS, Goh BC, Lu J, et al: Familial nasopharyngeal carcinoma in a cohort of 200 patients. Arch Otolaryngol Head Neck Surg. 2006, 132: 82-85. 10.1001/archotol.132.1.82.

  31. Zeng YX, Jia WH: Familial nasopharyngeal carcinoma. Semin Cancer Biol. 2002, 12: 443-450. 10.1016/S1044579X02000871.

  32. Feng BJ, Huang W, Shugart YY, et al: Genome-wide scan for familial nasopharyngeal carcinoma reveals evidence of linkage to chromosome 4. Nat Genet. 2002, 31: 395-399.

  33. Xiong W, Zeng ZY, Xia JH, et al: A susceptibility locus at chromosome 3p21 linked to familial nasopharyngeal carcinoma. Cancer Res. 2004, 64: 1972-1974. 10.1158/0008-5472.CAN-03-3253.

  34. Saunders CL, Barrett JH: Flexible matching in case-control studies of gene-environment interactions. Am J Epidemiol. 2004, 159: 17-22. 10.1093/aje/kwg250.

  35. []

  36. Boutin-Ganache I, Raposo M, Raymond M, Deschepper CF: M13-tailed primers improve the readability and usability of microsatellite analyses performed with two different allele-sizing methods. Biotechniques. 2001, 31: 24-26, 28.

  37. O'Connell JR, Weeks DE: PedCheck: A program for identification of genotype incompatibilities in linkage analysis. Am J Hum Genet. 1998, 63: 259-266. 10.1086/301904.

  38. Risch N, Merikangas K: The future of genetic studies of complex human diseases. Science. 1996, 273: 1516-1517. 10.1126/science.273.5281.1516.

  39. Preudhomme C, Roumier C, Hildebrand MP, et al: Nonrandom 4p13 rearrangements of the RhoH/TTF gene, encoding a GTP-binding protein, in non-Hodgkin's lymphoma and multiple myeloma. Oncogene. 2000, 19: 2023-2032. 10.1038/sj.onc.1203521.

  40. Malone KE, Daling JR, Neal C, et al: Frequency of BRCA1/BRCA2 mutations in a population-based sample of young breast carcinoma cases. Cancer. 2000, 88: 1393-1402. 10.1002/(SICI)1097-0142(20000315)88:6<1393::AID-CNCR17>3.0.CO;2-P.

  41. Peto J, Collins N, Barfoot R, et al: Prevalence of BRCA1 and BRCA2 gene mutations in patients with early-onset breast cancer. J Natl Cancer Inst. 1999, 91: 943-949. 10.1093/jnci/91.11.943.


We gratefully acknowledge Beth Binns-Roemer and Maidar Jamba for excellent technical assistance and Dr Michael Smith for valuable discussions. This project has been funded, in whole or in part, by federal funds from the National Cancer Institute, National Institutes of Health, under contract N01-CO-12400. The content of this paper does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products or organisations imply endorsement by the US Government. The publisher or recipient acknowledges the right of the US Government to retain a non-exclusive, royalty-free licence in and to any copyright covering the paper.

Author information

Corresponding authors

Correspondence to Yi Zeng or Stephen J. O'Brien.

About this article

Cite this article

Guo, X.C., Scott, K., Liu, Y. et al. Genetic factors leading to chronic Epstein-Barr virus infection and nasopharyngeal carcinoma in South East China: Study design, methods and feasibility. Hum Genomics 2, 365 (2006).
