Combination of 16S rRNA variable regions provides a detailed analysis of bacterial community dynamics in the lungs of cystic fibrosis patients

Chronic bronchopulmonary bacterial infections remain the most common cause of morbidity and mortality among patients with cystic fibrosis (CF). Recent community sequencing work has now shown that the bacterial community in the CF lung is polymicrobial. Identifying bacteria in the CF lung through sequencing can be costly and is not practical for many laboratories. Molecular techniques such as terminal restriction fragment length polymorphism or amplicon length heterogeneity-polymerase chain reaction (LH-PCR) can provide many laboratories with the ability to study CF bacterial communities without costly sequencing. The aim of this study was to determine if the use of LH-PCR with multiple hypervariable regions of the 16S rRNA gene could be used to identify organisms found in sputum DNA. This work also determined if LH-PCR could be used to observe the dynamics of lung infections over a period of time. Nineteen samples were analysed with the V1 and the V1_V2 region of the 16S rRNA gene. Based on the amplicon size present in the V1_V2 region, Pseudomonas aeruginosa was confirmed to be in all 19 samples obtained from the patients. The V1 region provided a higher power of discrimination between bacterial profiles of patients. Both regions were able to identify trends in the bacterial population over a period of time. LH profiles showed that the CF lung community is dynamic and that changes in the community may in part be driven by the patient's antibiotic treatment. LH-PCR is a tool that is well suited for studying bacterial communities and their dynamics.


Introduction
Cystic fibrosis (CF) is an autosomal recessive disease affecting one in 3,500 Caucasian live births in the USA. 1 CF results from a mutation in the gene that encodes the CF transmembrane conductance regulator (CFTR) protein. 2,3 A defect in the CFTR protein leads to a malfunctioning cyclic AMP-activated chloride channel in secretory epithelia. 4 This defect in the lung leads to the inability to secrete chloride and to the excess re-absorption of sodium. 4 Thus, there is decreased fluid secretion and the mucus becomes immobilised and adheres to the epithelial cells. Overproduction of mucus in the airway results in congestion of the respiratory tract and increases susceptibility to bronchopulmonary infection. CF patients often suffer from infections with Staphylococcus aureus, Haemophilus influenzae, Pseudomonas aeruginosa and Burkholderia cenocepacia. 5 These chronic infections by highly adapted lung microbes cause inflammation and, eventually, lung damage; therefore, lung disease is the major cause of morbidity and mortality among these patients. 6,7 It has also been estimated that less than 1 per cent of eubacteria in the environment can be cultured. 8,9 Thus, identification methods that rely on culturing will fail to detect all pathogens that might be causing the lung infection. Fortunately, with the advent of molecular techniques, culturing for identification purposes can be circumvented. 10 Recent molecular studies using terminal restriction fragment length analysis and sequencing have shown that the lung community is complex. Achromobacter (Alcaligenes) xylosoxidans, Rhodotorula mucilaginosa, Abiotrophia spp, Bacteroides gracilis, Eubacterium brachy, Mycobacterium mucilaginosus, Mycoplasma salivarium, Porphyromonas salivae, Ralstonia spp, Staphylococcus hominis, Streptococcus anginosus, Treponema vincentii, Veillonella spp, Burkholderia gladioli, Stenotrophomonas maltophilia and Pandoraea atypical have all been demonstrated to be members of this polymicrobial infection. 11-18
Amplicon length heterogeneity-polymerase chain reaction (LH-PCR) is another molecular method that has been used to study various diverse microbial communities, 19,20 including those of CF sputum. 18 LH-PCR utilises the 16S rRNA molecular marker to analyse microbial populations. 20-22 The sequence and length of the variable regions within the 16S rRNA gene are often used to determine phylogeny. 23 It should be noted that some bacteria have multiple copies of the 16S rRNA gene, which can show intragenomic heterogeneity in sequence and gene length. 23 This variation is unlikely to have an effect on classification at the genus and species level, and the marker is routinely used for identification. 24 LH-PCR exploits length variation in the hypervariable regions of the gene for identification. In this technique, forward and reverse primers bind to the conserved regions of the 16S rRNA gene and amplify the hypervariable regions between them, which vary in length between organisms. LH-PCR has proven to be a robust, time-efficient and reproducible method. 20,21,25 The main advantages of LH-PCR are that it surveys relative gene frequencies within complex mixtures of DNA, is reproducible, requires small sample sizes and can be performed with many samples simultaneously. 20,25 Furthermore, some of the size classes emerging from LH-PCR analyses can be related at the genus level to archived database sequences. 20 Overall, the attributes of LH-PCR make it useful for quick assessment of the diversity of microbial communities for comparative purposes. 20 Prior studies have shown that LH-PCR and the hypervariable regions 1 and 2 (V1_V2) of the 16S rRNA gene can be used to detect eubacteria in sputum from CF patients. 18,26 The technique had limited efficacy when determining overall diversity and attaining organism identification, and it was concluded that the small differences in amplicon lengths generated from the organisms were at fault. 18
In this study, community profiles of CF sputa were obtained using two regions of the 16S rRNA gene (hypervariable region 1 [V1] and V1_V2). Bacterial identification based on amplicon length was attempted using a computer program, AmpliQué, which determined the theoretical V1 and V1_V2 fragment lengths for known organisms. Community profiles and identification were analysed using each region separately and combined to determine if a combinatorial approach to LH-PCR: 1) could provide a more detailed analysis of the bacterial community in the CF lung and 2) could be used to observe the dynamics of this community over a period of time. Our work has shown that the V1 region provides a more detailed community profile, which leads to a higher power of discrimination between profiles than the V1_V2 region. The LH analyses provided snapshots of the CF bacterial communities; however, the analyses are limited in their ability to identify community members, whether the V1 and V1_V2 regions are used separately or combined. In addition, through the comparisons of V1_V2 or V1 LH profiles over time, it was established that the CF lung community is not only diverse, but also dynamic.

Sample collection
Nineteen sputum samples were obtained from CF patients attending the University of Miami Adult CF Clinic (Miami, FL). Samples are referred to by UM followed by a letter. The study was carried out in accordance with the Declaration of Helsinki (2000) of the World Medical Association and was approved by the appropriate institutional review board (FIU IRB approval # 033004-02). Appropriate consent was obtained from human subjects. All 19 expectorated sputum samples were frozen at −20°C prior to DNA extraction. Sputum was liquefied using the method of Reischl et al. 27 and the metagenomic DNA was extracted and cleaned using the GeneClean Spin Kit (Qbiogene, Irvine, CA, USA) according to the manufacturer's protocol. The eluted DNA was quantified and diluted to 10 ng/µl and refrigerated at 4°C until further use.

Control bacteria isolates
Standard DNA manipulations were used to extract chromosomal DNA from CF-related pathogens, which served as controls in the LH-PCR experiments: P. aeruginosa strains PAO1 and PA14 (burn wound isolates), four additional P. aeruginosa strains isolated from CF patients, 28 B. cenocepacia (ATCC BAA-246; CF isolate), S. aureus (ATCC 12600; pleural fluid isolate) and S. maltophilia (ATCC 17672; sputum isolate).

LH-PCR
The metagenomic DNA extracted from the sputum samples was amplified in triplicate using eubacterial primers for the 16S rRNA gene. The V1_V2 region of the 16S rRNA gene was amplified using the 27F-6FAM forward primer (5′-6FAM-AGA GTT TGA TCM TGG-3′) and the 355R reverse primer (5′-GCT GCC TCC CGT AGG AGT-3′). 29 The V1 region was amplified using the fluorescent primer P1F-6FAM (5′-6FAM-GCG GCG TGC CTA ATA CAT GC-3′) and the reverse primer P2R (5′-TTC CCC ACG CGT TAC TCA CC-3′). 29,30 All forward primers were fluorescently labelled with 6-FAM™ (Integrated DNA Technologies, Skokie, IL, USA). The final V1_V2 PCR reaction mixture was composed of: 1X PCR buffer (supplied with the enzyme by Applied Biosystems, Foster City, CA, USA), 2.5 mM MgCl2, 0.25 mM dNTPs, 0.5 µM forward and reverse primers, 0.1 per cent bovine serum albumin (BSA), 0.025 U of AmpliTaq Gold LD DNA polymerase™ (Applied Biosystems) and diethylpyrocarbonate (DEPC) water for a final volume of 20 µl. Half a nanogram per microlitre of DNA template was added per reaction. The following parameters were used to amplify the selected fragments on an MJ Research Peltier Thermal Cycler 200 (Waltham, MA, USA): initial denaturation at 95°C for 11 minutes, 25 cycles of denaturation at 95°C, annealing at 55°C and extension at 72°C, each for 1 minute, and a final elongation at 72°C for 10 minutes. A negative control containing water was amplified for every PCR master mix to check for contamination. Agarose gel electrophoresis was run to confirm the success of amplification.
LH-PCR analysis
LH-PCR analysis was performed by the Forensic DNA Profiling Facility (Florida International University, Miami, FL, USA). The PCR products for the three replicate reactions were detected using the ABI Prism 310 Genetic Analyzer (Applied Biosystems). A formamide-size standard mix was prepared using a 96:1 ratio of Hi-Di™ (highly deionised) formamide and GeneScan™ 500 ROX™ internal standard (Applied Biosystems). Each PCR product was denatured by adding 0.50 µl to 9.5 µl of the formamide-size standard mix, heating for 2 minutes at 95°C and snap cooling for 5 minutes on ice. Each sample was run for 28 minutes on the ABI Prism 310. Fragments were separated by capillary electrophoresis using polymer POP-4, matrix DS-30_6FAM_HEX_NED_ROX and filter D (Applied Biosystems).

Electropherogram analysis
The output was collected and analysed using GeneMapper® Software Version 3.7 (Applied Biosystems) and Microsoft Excel (Microsoft, Redmond, WA, USA). The ABI Prism™ Genotyper software analysis parameters were set to the Local Southern size-calling method with no correction, and the minimum noise threshold was set at 70 fluorescent units. 21 For the V1_V2 region, amplicons were called if they were between 300 and 400 base pairs (bp). For the V1 region, amplicons were called if they were between 60 and 120 bp.
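The calling rules above can be illustrated with a short sketch. This is not the GeneMapper or Genotyper workflow; it is a minimal Python illustration that assumes peaks are available as (size, height) pairs, that a peak exactly at 70 fluorescent units is retained, and that sizes are rounded to the nearest whole bp. The function names `call_amplicons` and `relative_abundance` are hypothetical.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (fragment size in bp, peak height in fluorescent units)

# Values taken from the text: size window per region and the noise threshold.
SIZE_WINDOWS: Dict[str, Tuple[int, int]] = {"V1": (60, 120), "V1_V2": (300, 400)}
NOISE_THRESHOLD = 70


def call_amplicons(peaks: List[Peak], region: str) -> List[Peak]:
    """Keep peaks inside the region's size window and at or above the threshold.

    Treating the threshold as inclusive is an assumption of this sketch.
    """
    low, high = SIZE_WINDOWS[region]
    return [
        (size, height)
        for size, height in peaks
        if low <= size <= high and height >= NOISE_THRESHOLD
    ]


def relative_abundance(called: List[Peak]) -> Dict[int, float]:
    """Convert called peak heights to percentages of the total called signal.

    Sizes are rounded to whole bp (an assumption); peaks that round to the
    same size are summed.
    """
    total = sum(height for _, height in called)
    profile: Dict[int, float] = {}
    for size, height in called:
        key = round(size)
        profile[key] = profile.get(key, 0.0) + 100.0 * height / total
    return profile


# Hypothetical V1_V2 trace: one peak below threshold, one outside the window.
peaks = [(342.1, 1500.0), (349.0, 500.0), (355.2, 60.0), (250.0, 900.0)]
profile = relative_abundance(call_amplicons(peaks, "V1_V2"))
```

In this hypothetical trace only the 342 and 349 bp peaks survive, and the resulting profile expresses each as a percentage of the retained signal, which is the form of data compared between samples later in the analysis.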

Statistical analysis
Data were imported into MS Excel (Microsoft) to determine mean relative ratios for each triplicate PCR reaction. The normalised data for each sample were exported into PRIMER 5 statistical software for further analyses (PRIMER E Ltd., Plymouth Marine Laboratory, Plymouth, UK). The Bray-Curtis similarity index and non-metric multi-dimensional scaling (MDS) were used to compare sample data between patients and patient profiles over time. 31,32

Presumptive identity analysis
GeneScan™ 500 ROX™ internal standard was used to determine amplicon sizes in bp. Several approaches were used to presumptively identify an organism in the CF community based upon the length of the amplicon generated. Chromosomal DNA from isolates of B. cenocepacia, Staphylococcus aureus, Stenotrophomonas maltophilia and P. aeruginosa was used to determine the experimental amplicon length for the V1_V2 and V1 regions. Next, in silico analysis was performed, in which the expected amplicon length was determined based on where the primers of interest would theoretically bind to the published sequence. Lastly, the lengths of the expected amplicons for all bacteria were determined using AmpliQué, a newly-designed bioinformatics program. 29,33 The Perl scripts are available at http://biorg.cs.fiu.edu/AmpliQue/. AmpliQué uses the 16S rRNA sequences downloaded from the Ribosomal Database Project II (Release 9.60). It takes a forward and a reverse primer (both of which may be potentially degenerate) as input. The stringency of the primer binding can be set in the program to control the exactness of the primer binding site within the 16S rRNA sequences. The stringency was first set to 100 per cent (E = 10) to identify the organisms that have exactly the same conserved sequences as the primer and subsequently lowered to E = 5,000, which allowed a larger database to be created as the conserved regions of the 16S rRNA gene are not 100 per cent identical in sequence. 
The resulting output shows the identity of all the 16S rRNA sequences that will produce an amplicon with a particular pair of primers, along with the length(s) of the amplicons. Thus, one can query the database using any primer to determine the theoretical fragment lengths that would be generated from an organism when analysed. In addition, a list of all bacteria that produce a specific fragment length can be generated. One amplicon length may represent more than one bacterium, and thus only a presumptive identity of that amplicon can be achieved.
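The idea of predicting an amplicon length from primer binding sites can be sketched as follows. This is not AmpliQué (which is written in Perl and uses an E-value stringency setting); it is a simplified Python illustration that substitutes a maximum mismatch count for the stringency parameter, handles degenerate primers through the standard IUPAC nucleotide codes, and reports only the first matching site. The function names are hypothetical, and the template in the usage lines is synthetic.

```python
from typing import Optional

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a (possibly degenerate) DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_site(template: str, primer: str, max_mismatches: int = 0,
              start_at: int = 0) -> Optional[int]:
    """Index of the first site matching the primer, or None if there is none."""
    for start in range(start_at, len(template) - len(primer) + 1):
        mismatches = 0
        for base, code in zip(template[start:start + len(primer)], primer):
            if base not in IUPAC.get(code, ""):
                mismatches += 1
                if mismatches > max_mismatches:
                    break
        else:  # inner loop finished without exceeding the mismatch limit
            return start
    return None


def amplicon_length(template: str, forward: str, reverse: str,
                    max_mismatches: int = 0) -> Optional[int]:
    """Predicted amplicon length in bp, primers included, or None."""
    template = template.upper()
    fwd_start = find_site(template, forward, max_mismatches)
    if fwd_start is None:
        return None
    # The reverse primer anneals to the opposite strand, so search the
    # template for its reverse complement downstream of the forward site.
    rev_start = find_site(template, reverse_complement(reverse),
                          max_mismatches, fwd_start + len(forward))
    if rev_start is None:
        return None
    return rev_start + len(reverse) - fwd_start


# 27F and 355R primer sequences as given in the text; synthetic template.
FWD_27F = "AGAGTTTGATCMTGG"
REV_355R = "GCTGCCTCCCGTAGGAGT"
toy = "TT" + "AGAGTTTGATCCTGG" + "A" * 20 + reverse_complement(REV_355R) + "GG"
toy_length = amplicon_length(toy, FWD_27F, REV_355R)
```

Raising `max_mismatches` plays a role loosely analogous to lowering the stringency: more sequences yield a predicted amplicon, at the cost of less exact binding sites.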

Results
PCR amplification of 16S rRNA genes for metagenome profiling
Variable regions V1 and V1_V2 were successfully amplified using universal eubacterial primers from the metagenomic DNA samples (see Methods). As expected, the amplicon sizes from the 19 samples ranged from 335 to 365 bp for the V1_V2 region (Table 1 and Supplementary Figure 1). Sample UMC had the most complex V1_V2 profile, with eight amplicons (Figure 1a and Table 1). Samples UMF, UMG, UMH, UMJ, UMK, UMP and UMT had the least complex LH profiles, with only one amplicon present at 342 bp (Figure 1b and Table 1). The remaining 11 samples produced LH profiles containing multiple amplicons (Table 1 and Supplementary Figure 1). Amplicon 342 was observed in all 19 samples (Table 1). Out of these 19 samples, 16 samples (84.2 per cent) had amplicon 342 as the dominant member of the profile. The most abundant amplicons in UMA, UML and UMR were 349, 360 and 360, respectively (Table 1). As predicted, V1 profiles ranged from 60 to 120 bp in length (Table 1 and Supplementary Figure 2). Unlike the V1_V2 profiles, the V1 profiles of all 19 samples were more diverse, each containing multiple amplicons (Table 1 and Supplementary Figure 2). The most complex profile contained 30 amplicons for UME (Figure 2 and Table 1). The least diverse profile with ten amplicons belonged to UMD (Table 1 and Supplementary Figure 2). All the samples contained a 98 bp amplicon, although it was not the most abundant amplicon (Table 1). The 86 bp amplicon was present in ten out of the 19 profiles and was the most dominant peak in UMA (17 per cent) (Table 1). The 84 bp amplicon was found in five profiles. Seventeen out of 19 samples contained a 73 bp amplicon.

Comparison of LH eubacterial profiles within CF centre
The profiles for each sample were compared with each other to determine if the bacterial communities were similar between patients that were treated at the same CF centre. The normalised abundance data from the V1_V2 region were analysed using the Bray-Curtis similarity index. The similarity between samples was shown in a dendrogram using group linkage clustering. The UMF profile had 100 per cent similarity to UMG, UMH, UMJ, UMK, UMP and UMT as they all produced 100 per cent abundance for amplicon 342 (Table 1 and Figure 3a). The UMR profile had 36.0 per cent or less similarity with 13 of the samples, but had a 91.0 per cent similarity with UML (Figure 3a). The UMR and UML profiles had two amplicons in common. The V1 region was able to discriminate between all samples; no two profiles were exactly the same (Table 1 and Figure 3b). The highest similarity between two profiles was 79.2 per cent between UMF and UMG due to the presence of many shared amplicons (Figure 3b). The most dissimilar profiles were those of UMD and UMQ, with just 6.2 per cent similarity between them. Four out of the ten amplicons were common between UMD and UMQ (Table 1). The similarity of the patients' bacterial population with both the V1_V2 and the V1 data was also compared. The combination of regions decreased the degree of similarity between V1_V2 profiles, yet the LH profiles were most dissimilar to the V1 region data (data not shown).
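The comparison described above can be sketched in a few lines. For two profiles x and y, the Bray-Curtis similarity is 100 × (1 − Σ|x_i − y_i| / Σ(x_i + y_i)), summed over all amplicon sizes. The Python sketch below assumes profiles are stored as dictionaries mapping amplicon size to relative abundance and applies no data transformation; whether any transformation was applied in PRIMER 5 is not stated in the text, and the second example profile is hypothetical.

```python
from typing import Dict


def bray_curtis_similarity(profile_a: Dict[int, float],
                           profile_b: Dict[int, float]) -> float:
    """Bray-Curtis similarity (0 to 100) between two abundance profiles.

    Amplicon sizes missing from a profile count as zero abundance.
    """
    sizes = set(profile_a) | set(profile_b)
    difference = sum(
        abs(profile_a.get(size, 0.0) - profile_b.get(size, 0.0)) for size in sizes
    )
    total = sum(profile_a.values()) + sum(profile_b.values())
    if total == 0:
        raise ValueError("both profiles are empty")
    return 100.0 * (1.0 - difference / total)


# Two profiles consisting only of the 342 bp amplicon, as reported for
# samples such as UMF and UMG in the V1_V2 region.
single_peak_a = {342: 100.0}
single_peak_b = {342: 100.0}
# Hypothetical mixed profile, for illustration only.
mixed = {342: 60.0, 349: 40.0}

identical = bray_curtis_similarity(single_peak_a, single_peak_b)
partial = bray_curtis_similarity(single_peak_a, mixed)
```

Two single-peak profiles come out as 100 per cent similar regardless of what lies below the detection threshold, which is why samples dominated by the 342 bp amplicon cannot be separated by the V1_V2 region alone.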

Patient CF flora changes over time
The lung flora changes over time in the CF lung. It is also known that the composition of the bacterial community in the lung at any one time varies due to the compartmentalised structure of the organ. 34 When working with expectorated sputum, it was assumed that the bacterial community within a given patient may vary from sample to sample. In order to study the dynamics of lung flora over a period of time, sampling bias was addressed; that is, any changes seen in community profiles over time may potentially be an artefact of the sampling method. A control experiment was performed to understand the effect of sampling. Five samples were taken on the same day, every three hours for 12 hours (M22.1-M22.5). The sampling scheme was designed to account for the potential variability contributed by the lung structure. All samples were analysed using the V1 and V1_V2 regions. The Bray-Curtis similarity index was used to determine the variation between the community profiles of the patient samples. The relatedness of bacterial communities was shown with dendrograms created from the calculated similarity (Figure 4). In addition, the relationship between profiles was graphically depicted using non-metric MDS, which represents the samples as points in a low-dimensional space, so that the relative distances between the symbols follow the same rank order as that determined by the Bray-Curtis similarity index (Figure 5). These analyses showed that samples taken on the same day were more similar to each other than to LH profiles from earlier time points (Figures 4 and 5). For the V1_V2 region, the samples in M22 ranged from 70.6 per cent to 93.4 per cent in similarity. The first three samples M1, M3 and M16 had less than 42.9 per cent similarity to any profile produced from the hourly samples. The LH profile M17 and the five profiles in M22 were related, with a range of 52.8 per cent to 78.3 per cent similarity. M22.4 was more similar to M17 than the other same-day samples (Figure 4a).
The V1 profiles for these samples were also analysed and compared. Overall, the similarities between samplings were much lower when using the V1 versus the V1_V2 region (Figures 4b and 5b). The highest profile similarities were seen within M22 samples; similarities as high as 96.0 per cent were seen between M22.3 and M22.5. M1 and M3 had a similarity of 48.1 per cent; M16 and M17 were 47.1 per cent similar. Some of the samplings taken months prior to the daily sampling in M22 were completely different (Figure 4b). The MDS analysis clearly showed that samples taken on the same day (M22) clustered (Figure 5b).
Combining LH data from both the V1 and the V1_V2 regions further indicated the overall similarity of the bacterial community within one day versus over a period of time (Figures 4c and 5c). The additive effect of the regions created a tighter clustering of the same-day samples and increased distances between samples taken in different months (Figure 4c).
Antibiotics drive the CF community
After the initial time study with patient UML, two more patients were followed for a period of time to determine the effect of antibiotics on the lung microbial flora. Azithromycin, co-trimoxazole (Bactrim), tobramycin and polymyxin E (Colistin) were administered at different time points for each patient. Azithromycin, which is most often used to treat Gram-negative bacteria, is a macrolide antibiotic which interferes with protein synthesis, preventing growth of bacteria. 35 Co-trimoxazole (Bactrim) is a combination of sulfamethoxazole and trimethoprim and interferes with DNA synthesis in bacteria that cause respiratory infections, especially in Stenotrophomonas maltophilia. 36,37 Tobramycin, an aminoglycoside, prevents translation in bacteria. The inhaled form of the drug targets P. aeruginosa in the lungs. 38 Polymyxin E is effective against most Gram-negative bacilli and is often used against multidrug-resistant strains of P. aeruginosa. 39 The drug regime and sputa samples from two patients, referred to as patient UMX and patient UMY, were analysed every other month for five months. The three time points are referred to as M0, M3 and M5. The amplicons of size 60-120 bp and 300-400 bp were obtained from the V1 and the V1_V2 regions, respectively (Table 2).
Patient UMX was under triple combination therapy with azithromycin, co-trimoxazole (Bactrim) and tobramycin at least two weeks prior and up to the sampling at month 0 (M0) and month 3 (M3). Two weeks prior to month 5 (M5), the patient stopped taking tobramycin but    Table 2).

Presumptive identity analysis
Based on amplicon lengths for both the V1_V2 and V1 regions, separately or combined, a presumptive identity for an organism can be determined. In an attempt to achieve identification of an amplicon, chromosomal DNA from several CF-related pathogens was amplified using V1_V2 primers and V1 primers. All strains of P. aeruginosa used in this study amplified a 342 and an 83 bp amplicon for the V1_V2 and V1 regions, respectively. Stenotrophomonas maltophilia amplified a 350 and an 87 bp amplicon for the V1_V2 and V1 regions, respectively. A 353 bp amplicon was generated for Staphylococcus aureus for the V1_V2 region; however, V1 primers amplified two peaks at 77 and 88 bp for S. aureus. B. cenocepacia produced a 339 bp and an 85 bp amplicon for V1_V2 and V1, respectively.
Many LH profiles contained amplicons that correspond to these bacterial isolates' fragment sizes. Nineteen LH V1_V2 profiles may have contained the pathogen P. aeruginosa but only one sample contained the corresponding V1 amplicon (83 bp) (Table 1). Five samples had the 350 bp amplicon, which may be from Stenotrophomonas maltophilia, but only one sample had the corresponding V1 amplicon. None of the samples had the Staphylococcus aureus-specific V1_V2 fragment of 353 bp; however, the V1-specific fragments (77/88) were present in 16 samples; nine of these samples contained both fragments. The 339 and 85 bp amplicons, which may correspond to B. cenocepacia, were present in two and ten samples, respectively. Two samples contained both amplicons.
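The two-region comparison against the control isolates can be sketched as a simple lookup. The control lengths below are the experimentally determined values reported above; the sample profile is hypothetical, and counting a V1 hit when any one of an organism's V1 peaks is present (relevant to the two S. aureus peaks) is an assumption of this Python sketch.

```python
from typing import Dict, Set

# Experimentally determined control amplicon lengths (bp) reported in the text.
CONTROLS: Dict[str, Dict[str, Set[int]]] = {
    "Pseudomonas aeruginosa": {"V1_V2": {342}, "V1": {83}},
    "Stenotrophomonas maltophilia": {"V1_V2": {350}, "V1": {87}},
    "Staphylococcus aureus": {"V1_V2": {353}, "V1": {77, 88}},
    "Burkholderia cenocepacia": {"V1_V2": {339}, "V1": {85}},
}


def presumptive_support(v1v2_peaks: Set[int], v1_peaks: Set[int]) -> Dict[str, str]:
    """Report which regions contain a peak matching each control organism.

    A match is only presumptive: one length can represent several organisms.
    """
    result: Dict[str, str] = {}
    for organism, lengths in CONTROLS.items():
        hit_v1v2 = bool(lengths["V1_V2"] & v1v2_peaks)
        hit_v1 = bool(lengths["V1"] & v1_peaks)  # any V1 peak counts (assumption)
        if hit_v1v2 and hit_v1:
            result[organism] = "both regions"
        elif hit_v1v2:
            result[organism] = "V1_V2 only"
        elif hit_v1:
            result[organism] = "V1 only"
        else:
            result[organism] = "neither"
    return result


# Hypothetical sample, for illustration only.
support = presumptive_support(v1v2_peaks={342, 350}, v1_peaks={77, 87, 98})
```

Separating "both regions" from single-region hits makes explicit the pattern seen in the data, where a V1_V2 match is frequently not corroborated by the corresponding V1 amplicon.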
The theoretical amplicon lengths for bacteria were determined manually using in silico analysis and AmpliQué and then compared with the LH profiles of the strains. The manual in silico analysis determined the V1_V2 fragment length for one specific strain of a CF pathogen. These results agreed with the experimental data. The AmpliQué program was developed to determine the V1_V2 fragment lengths for all bacterial strains which had a 16S rRNA sequence in Ribosomal Database Project II (RDPII). The data generated from the program agreed with the experimental data for some strains, yet showed different fragments for other strains. For example, in the database, some strains of Stenotrophomonas maltophilia produced the same 350 bp fragment as our LH result. In addition, many more fragment lengths (304-344 and 346-351 bp) were identified for various strains of Stenotrophomonas maltophilia. Various isolates of P. aeruginosa produced theoretical fragments ranging from 337-345 bp. Only two fragments, 347 bp and 348 bp, were produced for this region based on H. influenzae sequences. AmpliQué determined that different isolates of Staphylococcus aureus have a V1_V2 of 340-344 and 346-351 bp. Based on B. cenocepacia sequences, any of the lengths 327, 335, 337, 338, 340, 344 or 346 bp fragments could be generated using the bacterial primers. Most bacterial species produced multiple hits for the V1_V2 region (data not shown).
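The ambiguity created by these overlapping theoretical lengths can be made concrete with a reverse lookup from amplicon length to candidate species. The Python sketch below uses only the V1_V2 values quoted in this paragraph and assumes that each quoted range includes every whole-number length between its end points; the real AmpliQué output covers far more organisms. The function name `candidates` is hypothetical.

```python
from typing import Dict, List, Set

# Theoretical V1_V2 lengths (bp) quoted in the text; ranges are expanded to
# every integer between their end points, which is an assumption.
THEORETICAL_V1_V2: Dict[str, Set[int]] = {
    "Stenotrophomonas maltophilia": set(range(304, 345)) | set(range(346, 352)),
    "Pseudomonas aeruginosa": set(range(337, 346)),
    "Haemophilus influenzae": {347, 348},
    "Staphylococcus aureus": set(range(340, 345)) | set(range(346, 352)),
    "Burkholderia cenocepacia": {327, 335, 337, 338, 340, 344, 346},
}


def candidates(length_bp: int) -> List[str]:
    """Species whose theoretical V1_V2 lengths include the observed length."""
    return sorted(
        species
        for species, lengths in THEORETICAL_V1_V2.items()
        if length_bp in lengths
    )


# The 342 bp amplicon found in all 19 samples.
candidates_342 = candidates(342)
```

Under these assumptions the 342 bp amplicon is compatible with three of the five species listed, which illustrates why a single V1_V2 length supports only a presumptive identity.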
AmpliQué was unable to determine the fragment sizes for many CF-related pathogens, including P. aeruginosa. In silico analysis was able to determine the fragment lengths of CF pathogens, as the site of primer binding could be detected manually. Again, fragment lengths generated from sequences of individual strains varied from the actual LH results for the specific control strain. Four different isolates of P. aeruginosa produced an 86 bp fragment for the V1 region, while experimentally the region was 83 bp long. Some strains of Stenotrophomonas maltophilia produced the same 87 bp fragment as the control strain, while others were determined to have a fragment of 85 or 88 bp in length. Staphylococcus aureus strains were 91 bp long in that region, but the test strain did not produce this fragment. The fragment from B. cenocepacia was three bp longer (88 bp) than the LH isolate's profile (85 bp).

Discussion
LH analysis of 16S rRNA genes has been used to detect both known and novel organisms that may be present in many complex microbial communities, including sputa from CF patients. 18,20,21,26 In these previous studies, only one region of the 16S rRNA was used to identify bacteria present in the community at one time point. 18,26 Identification based on one region is difficult, and it is hypothesised that the use of multiple regions might provide more information about the community. In this study, the V1 and V1_V2 regions (separately and combined) were analysed with LH-PCR to determine the diversity and dynamics of the CF lung. Identification of the most abundant amplicon lengths in the LH profiles was attempted through in silico analysis and AmpliQué, a newly designed bioinformatics program. Lastly, the dynamics of the eubacterial community in the sputum over a period of time was further studied using data generated from the LH profiles.
The V1 region of the 16S rRNA gene provides a more detailed look at the complex bacterial community in the CF lung
The 16S rRNA gene contains nine variable regions. In this study, the V1 region and the V1_V2 regions were used to produce bacterial LH profiles of the CF sputa samples (Table 1). The V1_V2 profiles were identical for seven patients and hence were unable to discriminate between patients. The 342 bp amplicon produced a high relative fluorescence in the electropherograms, which may have caused the less dominant amplicons to be below the threshold limit and inadvertently ignored (Table 1). PCR amplification bias may also have decreased the detection of other bacteria that were less abundant than the organism(s) represented by the 342 bp amplicon. Therefore, it appeared that seven CF samples had identical bacterial communities. Statistically, the V1_V2 profiles cannot discriminate between patients at the UM centre (Figure 3a). The V1 LH profiles were more complex, and statistically discriminated between samples to a higher degree than the V1_V2 region (Figure 3). The V1 profiles, which were unique to each patient, contained multiple amplicons per sample (Table 1).
The use of multiple regions to increase discrimination between samples was not beneficial. The identical communities shown in the V1_V2 profiles negated the discriminatory power of the V1 region when analysed in conjunction with each other. The discriminatory strength of the V1 region, as compared with the V1_V2 region, was also evident in a soil community LH study. 25 Prior LH analysis on CF sputa used only the V1_V2 region, and more information may have been gained by using the V1 region. 18 Using the profiles from the two regions together did not differentiate profiles to the same extent as the V1 region. The combinatorial approach may be of more use in amplicon identification or observing community dynamics.

CF patients harbour diverse bacterial communities
The 19 sputum samples obtained from South Florida patients produced unique profiles. No two LH profiles for the V1 region were the same (Table 1 and Figure 3b). The V1_V2 region profiles all contained an amplicon of length 342 bp, which presumptively could be identified as Pseudomonas species, Burkholderia species (but not B. cenocepacia) or Ralstonia species. The 342 bp amplicon is indicative of P. aeruginosa, since control CF isolates produced a fragment of this size. In addition, the patients were clinically diagnosed with P. aeruginosa. The high relative fluorescence of the 342 bp amplicon also indicated a severe infection, which would be in agreement with culturing results. The intense fluorescence of the fragment in seven samples may have skewed the community profile, causing it to appear as if there was only one type of bacterium present. The V1 profiles for these seven samples showed some shared amplicons, but the communities were largely very diverse. Although some amplicons are common to patients (in presence, not abundance), no two overall profiles were identical to each other (Table 1 and Figure 3b).
It was thought that there are only a few pathogens commonly found in the lungs of CF patients, due to clinical diagnostic procedures. 40 Molecular techniques have revealed that many more organisms plague the CF lung, which is in agreement with our findings. 17,18 Many of the samples in this study produced LH profiles that contained numerous amplicons, some in low abundance. These less dominant amplicons may represent bacteria that are not routinely cultured and are potentially responsible for some of the differences seen in disease manifestation between CF patients with the same genetic mutations. Recent Sanger sequencing projects have identified uncultured bacteria in the CF lung. 11,17 Deeper sequencing techniques, such as 454, could be used to identify other, less abundant and potentially uncultured organisms. 41 In order to understand the significance of these organisms within the CF lung, proper statistical tools for comparative metagenomics need to be developed.

Challenges of identifying organisms in complex communities
One of the downsides to the LH technique is that the organism found in a sample cannot be conclusively identified. We hypothesised that organisms could be presumptively identified based on the V1_V2 and V1 amplicon lengths that would be generated from LH-PCR. In this study, the fragment lengths for both regions were experimentally determined for one strain of four known CF pathogens. In addition, in silico analysis and AmpliQué were used to determine the expected amplicon length of a variable region for any given organism. This information was correlated with the LH profile in an attempt to identify the bacteria present in the CF sputum. The bioinformatics approach highlighted the presence of intra-species length variation in both regions. The high degree of variation seen in different isolates drastically decreased the ability to identify a bacterium based on the length of one hypervariable region. It was previously known that one amplicon length could represent multiple species, thereby allowing only a presumptive identification. Based on the absence of a peak, a specific genus or species could be ruled out from being a member of the community. The AmpliQué output clearly demonstrates that one V1_V2 fragment could represent a multitude of species or genera. The wide range of fragment lengths generated for different isolates of the species may be accurate or it may be due to problems related to the database used in the program. The database contains sequences which are determined by various users. It is possible that some of the length heterogeneity at the isolate level may arise from poor sequencing reads, incorrect trimming of sequences, and old taxonomy references (some organism names have changed). Database issues will change as more genome-wide sequencing is performed. As it stands currently, identification is nearly impossible based solely on amplicon length from the V1_V2 region.
The V1 region, which was more informative when looking at the community profile as a whole, proved to be more problematic when analysing individual peaks. Chromosomal DNA of P. aeruginosa, Staphylococcus aureus, Stenotrophomonas maltophilia and B. cenocepacia was amplified using the V1 primers and the fragment lengths were detected. In silico analysis was used to determine fragment lengths for the CF-associated lung pathogens, including those listed above (although not always the same isolate). The initial analysis indicated the presence of length heterogeneity within species. AmpliQué was used to determine the degree of strain variation present in the V1 region. The primer binding parameter was initially set to 100 per cent stringency, which resulted in a very small database of organisms and their corresponding fragment lengths. Many CF pathogens, including P. aeruginosa, were not present at this high stringency. Lowering the stringency generated more results, yet still no information was given for P. aeruginosa, even though the V1 region primers (P1F and P2R) experimentally amplified P. aeruginosa PAO1. To determine the putative primer binding site, the 16S rRNA sequence for PAO1 was manually examined. Both the forward and reverse primer sequences show some degeneracy, and lowering the stringency parameter was unable to correct for it. This issue can be resolved by determining the binding motif of the primers using programs such as iterative enhancement of motifs (IEM) 42 and then using that motif in AmpliQué. Although P1F is a common V1 primer used extensively in the laboratory, other, less degenerate V1 forward primers, such as 27F (which was used to amplify V1_V2), could be used when attempting to identify organisms based on amplicon length. Selection of primers for LH-PCR is critical to ensure that all bacteria are amplified and to increase the success of identification based on fragment length.
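To make the effect of the stringency setting concrete, the following minimal sketch computes an expected amplicon length in silico. It assumes that "stringency" means the fraction of primer positions that must match the template, which may differ from how AmpliQué defines its parameter. The template and primer sequences are short toy sequences, not P1F, P2R or a real 16S rRNA gene, and all function names are illustrative.

```python
# IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def identity(primer, site):
    """Fraction of primer positions compatible with the template site."""
    matches = sum(base in IUPAC[p] for p, base in zip(primer, site))
    return matches / len(primer)


def find_site(template, primer, stringency, start=0):
    """First index >= start where the primer binds at the given stringency."""
    for i in range(start, len(template) - len(primer) + 1):
        if identity(primer, template[i:i + len(primer)]) >= stringency:
            return i
    return None


def amplicon_length(template, forward, reverse, stringency=1.0):
    """Length from the start of the forward site to the end of the reverse site."""
    fwd = find_site(template, forward, stringency)
    if fwd is None:
        return None
    rev_site = reverse_complement(reverse)
    rev = find_site(template, rev_site, stringency, start=fwd + len(forward))
    if rev is None:
        return None
    return rev + len(rev_site) - fwd


# Toy sequences (assumptions for illustration only).
FORWARD = "AGAGTTTGAT"
REVERSE = "GCTGCCTCCC"
EXACT = "TT" + "AGAGTTTGAT" + "ACGT" * 5 + "GGGAGGCAGC" + "AA"
VARIANT = "TT" + "AGAGTTCGAT" + "ACGT" * 5 + "GGGAGGCAGC" + "AA"  # one mismatch

if __name__ == "__main__":
    print(amplicon_length(EXACT, FORWARD, REVERSE, 1.0))
    print(amplicon_length(VARIANT, FORWARD, REVERSE, 1.0))       # no binding site found
    print(amplicon_length(VARIANT, FORWARD, REVERSE, 0.9))       # tolerated mismatch
    print(amplicon_length(VARIANT, "AGAGTTYGAT", REVERSE, 1.0))  # degenerate primer
```

In this toy case a template with a single mismatch drops out at 100 per cent stringency and is recovered either by relaxing the threshold or by encoding the variable position in the primer. Real binding sites can diverge in ways that a simple identity threshold does not capture, which is consistent with the difficulty reported above.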
The widely used primers, 27F and 355R, are based on the Escherichia coli sequence and should be thought of as generalised primers, not universal primers. 43 Primers that have been developed with a bioinformatics approach may prove to be more useful in future LH studies. 43 Due to the limitations of the V1 AmpliQué database, identification using two regions simultaneously could not be performed. Therefore, identification based on two regions has yet to be proven. Further modification of AmpliQué and/or the use of different primers or variable regions may eventually lead to LH-PCR being used as an identification technique. At this time, LH-PCR can still be useful when looking at a bacterial community as a whole.
The eubacterial communities in CF lungs are dynamic
The LH-PCR profiles provided a view of the CF bacterial community as a whole. It is known that in soil and water samples, the members of a community and their abundances change, and these changes can be driven by external factors. 44,45 Thus, it is hypothesised that the bacterial community within the CF lung also changes based on external factors. To understand the changes in the community, LH-PCR was used to profile three patients over a period of time. Samples from patient UML were used to determine the bacterial community variation over a short period of time (three hours) and over a long period of time (two years). Samples from patients UMY and UMX were used to study how antibiotics may affect the lung community.
Sputum samples from patient UML were analysed over the course of two years (M0, M3, M16, M17, M22.1, M22.2, M22.3, M22.4 and M22.5). Five samples were taken on the same day, every three hours for 12 hours (M22.1-M22.5). For both the V1 and the V1_V2 LH profiles, the overall bacterial community was more similar within one day than across multiple time points over two years (Figures 4 and 5). The LH profiles from the samples taken in one day were not identical. These variations may arise from transient bacteria that were present in the oral cavity at the time of sampling (such as Lactobacillus from yogurt). Analysing the samples with both regions further demonstrated that bacterial communities change over time. When the regions were combined, the samples taken in one day appeared more similar; this was due to the more abundant organisms rather than the less dominant amplicons, which may represent transient bacteria in the oral community. The dissimilarity between the samples taken months apart increased, owing to the presence of the more dominant, possibly disease-causing, organisms (Figures 4c and 5c). This LH analysis is the first to show that colonisation of the CF lung is dynamic and that these changes can be tracked.
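As an illustration of how two LH profiles might be compared, the sketch below converts peak areas to relative abundances and computes a Bray-Curtis similarity. The choice of Bray-Curtis is an assumption made for this example; the similarity measure underlying Figures 4 and 5 is not restated in this section. The peak areas are invented toy values.

```python
def relative_abundance(peak_areas):
    """Convert {amplicon length (bp): peak area} to fractions that sum to 1."""
    total = sum(peak_areas.values())
    return {length: area / total for length, area in peak_areas.items()}


def bray_curtis_similarity(profile_a, profile_b):
    """1.0 for identical profiles, 0.0 for profiles that share no amplicon."""
    lengths = set(profile_a) | set(profile_b)
    shared = sum(
        min(profile_a.get(n, 0.0), profile_b.get(n, 0.0)) for n in lengths
    )
    total = sum(profile_a.values()) + sum(profile_b.values())
    return 2.0 * shared / total


if __name__ == "__main__":
    # Toy peak areas (hypothetical, for illustration only).
    sample_1 = relative_abundance({342: 80, 348: 20})
    sample_2 = relative_abundance({342: 40, 348: 60})
    print(bray_curtis_similarity(sample_1, sample_2))
```

Because the measure weights each amplicon by its abundance, dominant peaks contribute most to the similarity between samples, in line with the observation above that the abundant organisms accounted for the similarity among same-day samples.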
To date, the dynamics of the bacterial community in the CF lung in response to external drivers, such as the presence of antibiotics, have yet to be studied. Thus, a short-term study was performed in an attempt to understand how the overall community in the lung changes when under attack by antibiotics. To begin, CF sputum samples from patient UMX and patient UMY were analysed every other month for five months. During this time, the patients were taking a combination of antibiotics.
During the five-month period, patient UMX was taking azithromycin, co-trimoxazole and tobramycin. Shortly before the third time point, the patient stopped taking tobramycin. The patient's profiles contained a large number of V1 amplicons, whose presence and abundance changed throughout the time course (Table 2). For V1_V2, patient UMX's profiles consisted predominantly of amplicon 342 at all time points (Table 2). This fragment was presumptively identified as P. aeruginosa, which concurred with the patient's clinical diagnosis. Interestingly, UMX had been prescribed tobramycin (which targets P. aeruginosa) for the majority of the sampling period, yet the 342 bp amplicon was always present in large abundance (74-100 per cent). LH analysis with both the V1 and V1_V2 regions indicated a dynamic environment within the lungs of UMX. These changes seen over the time period are likely to be an effect of the antibiotic regime.
Patient UMY's profile was more dynamic than that of patient UMX. Patient UMY's sputum showed a high abundance of the 342 bp amplicon in the first two samplings, followed by a drastic decrease in the abundance of this amplicon in the profile from the third sampling (Table 2). This change in profile may correspond to a change in the patient's antibiotic regime, specifically polymyxin E replacing levofloxacin. This may have caused the pathogen associated with that amplicon to disappear. As the abundance of the 342 bp amplicon decreased, a 348 bp amplicon emerged. It is possible that as one pathogen was cleared, it was replaced with another bacterium that fitted that niche. 46 The decreasing 342 bp amplicon may represent P. aeruginosa, which had been cultured from the patient. Levofloxacin and polymyxin E are both used to target this pathogen.
Studies have demonstrated that levofloxacin is less effective in treating P. aeruginosa than the first-generation fluoroquinolones. 47 Antibiotic-resistant strains of P. aeruginosa are often treated with polymyxin E. 48,49 It could be hypothesised that patient UMY was infected with an antibiotic-resistant strain of P. aeruginosa which may have been susceptible to the stronger antibiotic. The patient's lung infection was then dominated by a bacterium represented by the 348 bp amplicon, which may belong to H. influenzae or Stenotrophomonas maltophilia. Changes in the bacterial community were also seen in the V1 profiles (Table 2). Due to the limitations of AmpliQué, it is not clear if similar changes in abundance were seen in peaks that may represent P. aeruginosa.
The microbial community was more dynamic in UMY, a 60-year-old Caucasian man, than in UMX, a 23-year-old Hispanic woman. The numbers of unique amplicons in UMX and UMY were eight and 14, respectively. UMX had five (73, 77, 103, 112 and 342 bp) and UMY had 11 (62, 72, 76, 78, 81, 91, 94, 98, 102, 112 and 342 bp) amplicons that were present at all three time points. These constant amplicons may come from chronic colonisers that no longer respond to the treatment regimen. The uniqueness, diversity and selective abundance of certain amplicons could be attributed to the patients' age, gender and ethnicity. The patients' respective lung flora may have adapted differently to their immune response and to the presence of antibiotics. Variations in profiles due to age and antibiotic treatment have been previously demonstrated. 5,50 Clearly, LH can be used to analyse changes in a microbial community. Although this study used a small sample size over a relatively short period of time, changes in the presence and abundance of amplicons in the LH profiles could be seen. To identify the drivers of the community accurately, a more comprehensive long-term study needs to be performed. Armed with the patient's antibiotic information and long-term sampling, it should be possible to determine the effectiveness of a drug against pathogens in the CF lung. This could eventually lead to more effective treatments.
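The distinction between constant and transient amplicons can be expressed as simple set operations over the profiles from each time point. In the sketch below, the five constant lengths for UMX are those reported above; the remaining lengths (85, 90 and 120 bp) and their distribution across time points are placeholders, since the per-time-point data are given in Table 2 and not reproduced here.

```python
def persistent_amplicons(profiles):
    """Amplicon lengths (bp) detected at every time point."""
    sets = [set(p) for p in profiles]
    return sorted(set.intersection(*sets)) if sets else []


def all_amplicons(profiles):
    """Every distinct amplicon length (bp) detected at any time point."""
    return sorted(set().union(*profiles))


# Constant amplicons reported for UMX in the text.
CONSTANT_UMX = [73, 77, 103, 112, 342]

# Three time points; 85, 90 and 120 bp are hypothetical placeholders.
UMX_TIMEPOINTS = [
    CONSTANT_UMX + [85],
    CONSTANT_UMX + [85, 90],
    CONSTANT_UMX + [120],
]

if __name__ == "__main__":
    print(persistent_amplicons(UMX_TIMEPOINTS))
    print(len(all_amplicons(UMX_TIMEPOINTS)))
```

Amplicons in the intersection are candidates for chronic colonisers, while those that appear at only some time points are candidates for organisms that respond to treatment or are transient.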
In conclusion, LH-PCR analysis detected the dynamic and complex flora that is present in South Florida CF patients. Although LH-PCR could not be used to identify bacteria in this study, it still proved to be a useful community profiling technique. LH-PCR of the V1_V2 and, to a greater extent, the V1 region of the 16S rRNA gene can be used to compare microbial communities between samples. Interpreting LH profiles may provide insight into the evolution of a microbial community and identify the factors that drive these changes. Technologies such as pyrosequencing will probably be used to identify the members present in the bacterial community; however, LH-PCR is still an accessible tool that can be implemented to study microbial communities.