Salamander Hox clusters contain repetitive DNA and expanded non-coding regions: a typical Hox structure for non-mammalian tetrapod vertebrates?
© Voss et al.; licensee BioMed Central Ltd. 2013
Received: 10 January 2013
Accepted: 25 January 2013
Published: 5 April 2013
Hox genes encode transcription factors that regulate embryonic and post-embryonic developmental processes. The expression of Hox genes is regulated in part by the tight spatial arrangement of conserved coding and non-coding sequences. The potential for evolutionary changes in Hox cluster structure is thought to be low among vertebrates; however, recent studies of a few non-mammalian taxa suggest greater variation than originally thought. Using next-generation sequencing of large genomic fragments (>100 kb) from the red-spotted newt (Notophthalmus viridescens), we found that the arrangement of Hox cluster genes was conserved relative to orthologous regions from other vertebrates, but the lengths of introns and intergenic regions varied. In particular, the distance between hoxd13 and hoxd11 is longer in newt than in orthologous regions from vertebrate species with expanded Hox clusters and is predicted to exceed the length of the entire HoxD clusters (hoxd13–hoxd4) of humans, mice, and frogs. Many repetitive DNA sequences were identified for newt Hox clusters, including an enrichment of DNA transposon-like sequences relative to non-coding genomic fragments. Our results suggest that Hox cluster expansion and transposon accumulation are common features of non-mammalian tetrapod vertebrates.
Keywords: Hox, Salamander, Genome, Evolution
Bilaterian body plans are determined in part by transcription factors encoded by Hox genes [1–4]. Excepting fish, vertebrate Hox genes are ordered among four unlinked clusters that each span relatively short segments of genomic DNA (generally 100–200 kb). The arrangement of Hox genes on chromosomes is co-linear with their pattern of transcription along the anterior-posterior and proximal-distal body axes during embryonic development [5, 6]. The organization and structure of Hox gene clusters and associated non-coding regulatory elements are mostly conserved across vertebrates [7, 8]. However, as genomic studies extend to non-genetic model organisms, variations in Hox cluster structure are being discovered, including variations in gene number, repetitive sequence content, cluster length, and non-coding sequence conservation [9–15]. These variations suggest that the evolution of Hox cluster structure may correlate with phylogeny, unique modes of vertebrate development, and/or derived morphological characteristics.
In tetrapod vertebrates, stereotypic patterns of Hox expression are observed along the proximal-distal axes of developing limbs. In most species, Hox developmental genetic programs are only expressed during limb development. However, salamanders reactivate Hox gene expression throughout life to correctly pattern tissues within regenerating limbs [17–21]. While some patterns of Hox expression in regenerating limbs recapitulate the expression pattern in developing limbs, spatial and temporal differences are observed [18–21]. This raises the possibility that salamander Hox clusters may contain non-coding elements that uniquely regulate post-embryonic tissue regeneration; such elements may not be expected within Hox clusters of vertebrates incapable of limb regeneration. There is another reason to suspect that salamander Hox clusters may differ from those of other vertebrate taxa: salamanders as a group have extremely large genomes. An average-sized salamander genome is approximately 10× larger than the Homo sapiens genome; some salamanders have genomes that are 30× larger. This larger genome size is reflected in the structure of genes, as salamander introns are longer on average than orthologous introns in other vertebrates [23, 24].
Belleville et al. reported that two pairs of adjacent Hox cluster genes from the red-spotted newt (Notophthalmus viridescens) presented highly conserved coding and non-coding sequences relative to orthologous mammalian Hox sequences. These results suggested that Hox cluster evolution is constrained even within the context of a very large vertebrate genome (>20 pg/haploid nucleus). However, the results that we present below show that newt Hox clusters are more variable than originally thought. Sequencing of large genomic fragments (>100 kb) reveals regional variation in length across newt Hox cluster regions and higher proportions of DNA transposon-like sequences within Hox introns and intergenic sequences than in non-coding genomic regions. Our results show that expanded non-coding regions and relatively high repetitive DNA sequence content are typical of Hox clusters in amphibians and other non-mammalian tetrapod vertebrates.
Results and discussion
BAC library screening, sequencing, assembly, and annotation
Species comparison of HoxC and HoxD intron lengths
In annotating HoxD cluster genes, we discovered that hoxd11 was located approximately 73 kb from the terminus of NV_H385F1. This distance, which provides a minimum estimate of the distance to the expected position of hoxd13 (hoxd12 is not known for amphibians [15, 26]), predicts the newt hoxd13–11 segment to be >4.5× and 1.5× longer than orthologous HoxD regions from frog and lizard, respectively. It also exceeds the lengths of the hoxd13–11 segments in the coelacanth and a caecilian amphibian (Typhlonectes natans), which until this study were thought to be the longest among vertebrates (Figure 4). While it is possible that the expanded region is explained by an evolutionary loss of the newt hoxd13 gene, this seems unlikely because hoxd13 orthologs are known for related salamanders, and we did not detect the nucleotide signature of a pseudogene. Because expanded Hox clusters have been shown for a representative caecilian and an anuran species, parsimony suggests that the expansion of the hoxd13–11 region is a shared derived characteristic of amphibians, with convergent expansion of the same region in lizard.
Interspersed repeat sequences in BACs
Table: Percent coverage of salamander genomic sequences by newt-specific and RepBase repetitive elements. Row categories: total RepBase repeats; long terminal repeat retrotransposons; non-long terminal repeat retrotransposons.
Salamander Hox genomic regions show elements of conservation and diversity in comparison to other vertebrate species. Whereas the structure and organization of Hox coding genes are conserved, newt Hox clusters show variation in the lengths of introns and intergenic regions, and the hoxd13–11 region exceeds the lengths of orthologous segments even among vertebrate species with expanded Hox clusters. We posit that the hoxd13–11 expansion predated a basal salamander genome size increase that occurred approximately 180 million years ago, as it is preserved in all three extant amphibian groups. Over more recent timescales, additional evidence supports the idea that Hox clusters are amenable to structural evolution: there is variation in the lengths of introns and intergenic regions, relatively high numbers of repetitive sequences, and non-random accumulations of DNA transposons in newts and lizards. The non-random accumulation of DNA transposon-like sequences could potentially alter developmental programming by creating sequence motifs for transcriptional regulation [33–35]. Overall, available data from several non-mammalian tetrapods suggest that Hox structural flexibility is the rule, not the exception. We speculate that such flexibility may contribute to developmental variation across non-mammalian taxa, both in embryogenesis and during the re-deployment of Hox genes in post-embryonic developmental processes, such as metamorphosis and regeneration.
BAC library construction, screening, and sequencing
The Clemson University Genomics Institute constructed a BAC library from partially restriction-digested and size-selected genomic DNA that was isolated from the erythrocytes of a single Notophthalmus viridescens female (University of Dayton Institutional Animal Care and Use Committee Protocol # 011–12). A total of 41,472 clones were arrayed in 108 384-well plates. Superpools of clones were made by combining clones from twelve 384-well plates into a single pool. DNA was extracted from 400 ml of overnight cultures of superpools using the Plasmid MaxiPrep kit (Qiagen, Valencia, CA, USA), and the DNA pellet was re-suspended in 250 μl of water. PCR primers for newt hoxc10 (forward: CAAAGAGAAAACGCGGAAAG; reverse: CGATACCGTCCCTTCCATAA) and hoxd10 (forward: TTTCCATTGTCGGTTTTTCC; reverse: TCCTACCACGGACATTACCC) were used to identify two Hox gene-containing BACs; two BACs that did not contain protein-coding sequence were also selected. The four clones were grown in 400 ml L-broth, and DNA was isolated using the Qiagen Large-Construct Kit (Qiagen); genomic DNA contamination was reduced using Plasmid-Safe DNase treatment (Epicentre Biotechnologies, Madison, USA). The Roche GS FLX Titanium platform (Basel, Switzerland) was used to sequence BACs; the work was accomplished by the staff of the University of Iowa Sequencing Core Facility. The termini of BAC inserts were end-sequenced using Sanger technology and ABI Big-Dye 3.1 (Invitrogen, Grand Island, USA).
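The library dimensions given above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' procedure: it assumes that plates are numbered from 1 and that consecutive plates were grouped into superpools, neither of which is stated in the text. All function names are hypothetical.

```python
# Illustrative bookkeeping for the BAC library layout described above.
# Assumptions (not stated in the text): plates are numbered from 1 and
# consecutive plates are grouped into the same superpool.

PLATES = 108               # number of 384-well plates in the library
WELLS_PER_PLATE = 384      # clones arrayed per plate
PLATES_PER_SUPERPOOL = 12  # plates combined into one superpool


def total_clones(plates=PLATES, wells=WELLS_PER_PLATE):
    """Number of arrayed clones in the library."""
    return plates * wells


def superpool_count(plates=PLATES, per_pool=PLATES_PER_SUPERPOOL):
    """Number of superpools needed to cover every plate (ceiling division)."""
    return -(-plates // per_pool)


def superpool_of_plate(plate, per_pool=PLATES_PER_SUPERPOOL):
    """1-based superpool index holding a given 1-based plate number."""
    if plate < 1:
        raise ValueError("plate numbers start at 1")
    return (plate - 1) // per_pool + 1


def plates_in_superpool(pool, plates=PLATES, per_pool=PLATES_PER_SUPERPOOL):
    """Plate numbers to re-screen after a PCR-positive superpool."""
    first = (pool - 1) * per_pool + 1
    last = min(pool * per_pool, plates)
    return list(range(first, last + 1))


if __name__ == "__main__":
    print(total_clones())          # 41472, matching the reported library size
    print(superpool_count())       # 9
    print(plates_in_superpool(9))  # plates 97 through 108
```

Under these assumptions, a PCR-positive superpool narrows the search from 108 plates to 12, which is the practical reason for pooling before clone-level screening.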
DNA sequence assembly and annotation
Sequences were screened to trim vector, adapters, and contaminating Escherichia coli sequences. After an initial assembly using GS De Novo Assembler (454 Life Sciences, Branford, USA), contigs and singletons were assembled further using DNASTAR SeqMan (DNASTAR, Inc., Madison, USA). Contiguous sequences of assembled BACs were searched (blastn) against salamander expressed sequence tag contigs at Sal-Site; non-redundant nucleotide and protein databases at NCBI (blastx and tblastp) were used to identify and annotate gene regions. For multispecies comparisons, genomic sequences for H. sapiens (GRCh37.10) and M. musculus (GRCm38.1) were obtained from NCBI. Anolis carolinensis (AnoCar 2.0) and D. rerio (Zv9) were obtained from Ensembl. X. tropicalis (build 7.1) was obtained from Xenbase. Sequences were aligned using MultiPipMaker. Annotated repeats were identified by searching re-assembled BAC clones against all deposited repeats in RepBase. Newt-specific repeats were identified using MultiPipMaker by aligning re-assembled BAC clones against each other and by performing self-self BAC alignments. The “search both strands” and “high sensitivity” options were used in MultiPipMaker to identify significantly similar non-coding sequences located at different positions either within or between BACs. The terminal base pair positions for these alignments were recorded to denote the positions of repetitive sequences within BACs. If two repeats occurred within 50 bp of each other, they were compiled as a single repetitive sequence, with the most terminal base positions denoting the repeat span. The base pair coordinates for newt-specific repetitive sequences were combined with base pair coordinates for RepBase repetitive sequences to generate an underlay file (Additional file 1: Table S1), and this was used to create maps of repetitive elements for the HoxD and HoxC genomic regions.
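The 50 bp compilation rule and the percent-coverage summary can be illustrated with a short interval-merging routine. This is a sketch under stated assumptions rather than the authors' code: it assumes 1-based inclusive coordinates and reads "within 50 bp" as a gap of at most 50 bp between the end of one repeat and the start of the next. The function names and the example coordinates are hypothetical.

```python
# Illustrative merge of repeat coordinates on a single BAC.
# Assumptions (not from the text): coordinates are 1-based and inclusive,
# and "within 50 bp" means next_start - previous_end <= 50.

def merge_repeats(intervals, max_gap=50):
    """Merge (start, end) repeats whose gap is at most max_gap bp.

    The merged span runs from the smallest start to the largest end,
    i.e. the most terminal base positions of the compiled repeats.
    """
    merged = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            # Close enough to the previous repeat: extend its span.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(span) for span in merged]


def percent_coverage(intervals, sequence_length):
    """Percent of a sequence covered by repeats, counting each base once."""
    # max_gap=0 merges only overlapping intervals, so no base is double counted
    # and unannotated gaps between repeats are not counted as covered.
    covered = sum(end - start + 1
                  for start, end in merge_repeats(intervals, max_gap=0))
    return 100.0 * covered / sequence_length


if __name__ == "__main__":
    hits = [(100, 200), (240, 300), (400, 500)]  # made-up coordinates
    print(merge_repeats(hits))                   # [(100, 300), (400, 500)]
    print(percent_coverage([(1, 100), (51, 150)], 1000))  # 15.0
```

Coverage is computed from overlap-merged intervals rather than from the 50 bp compiled spans, so that the short gaps absorbed by the compilation rule do not inflate the percentage; whether the original analysis made the same choice is not stated.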
The research was supported by grant R24-OD010435 (SRV) and EY-10540 (PAT) from the National Center for Research Resources (NCRR), a component of the National Institutes of Health (NIH). The project also used resources developed under Multidisciplinary University Research Initiative grant (W911NF-09-1-0305) from the Army Research Office (SRV) and resources from the Ambystoma Genetic Stock Center, which is funded by grant DBI-0951484 from the National Science Foundation (SRV). The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of NCRR, NIH, ARO, or NSF.
1. Lewis EB: A gene complex controlling segmentation in Drosophila. Nature. 1978, 276: 565-570. 10.1038/276565a0.
2. Krumlauf R: Hox genes in vertebrate development. Cell. 1994, 78: 191-201. 10.1016/0092-8674(94)90290-9.
3. Duboule D: The rise and fall of Hox gene clusters. Development. 2007, 134: 2549-2560. 10.1242/dev.001065.
4. Lemons D, McGinnis W: Genomic evolution of Hox gene clusters. Science. 2006, 313: 1918-1922. 10.1126/science.1132040.
5. Kmita M, Duboule D: Organizing axes in time and space; 25 years of colinear tinkering. Science. 2003, 301: 331-333. 10.1126/science.1085753.
6. Tschopp P, Duboule D: A genetic approach to the transcriptional regulation of Hox gene clusters. Annu Rev Genet. 2011, 45: 145-166. 10.1146/annurev-genet-102209-163429.
7. Garcia-Fernandez J: The genesis and evolution of homeobox gene clusters. Nat Rev Genet. 2005, 6: 881-892.
8. Duboule D: Temporal colinearity and the phylotypic progression: a basis for the stability of a vertebrate Bauplan and the evolution of morphologies through heterochrony. Dev Suppl. 1994, 1994: 135-142.
9. Aparicio S, Chapman J, Stupka E, Putnam N, Chia JM, Dehal P, Christoffels A, Rash S, Hoon S, Smit A, Gelpke MD, Roach J, Oh T, Ho IY, Wong M, Detter C, Verhoef F, Predki P, Tay A, Lucas S, Richardson P, Smith SF, Clark MS, Edwards YJ, Doggett N, Zharkikh A, Tavtigian SV, Pruss D, Barnstead M, Evans C: Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science. 2002, 297: 1301-1310. 10.1126/science.1072104.
10. Hoegg S, Meyer A: Hox clusters as models for vertebrate genome evolution. Trends Genet. 2005, 21: 421-424. 10.1016/j.tig.2005.06.004.
11. Kurosawa G, Takamatsu N, Takahashi M, Sumitomo M, Sanaka E, Yamada K, Nishii K, Matsuda M, Asakawa S, Ishiguro H, Miura K, Kurosawa Y, Shimizu N, Kohara Y, Hori H: Organization and structure of Hox gene loci in Medaka genome and comparison with those of pufferfish and zebrafish genomes. Gene. 2006, 370: 75-82.
12. Woltering JM, Vonk FJ, Müller H, Bardine N, Tuduce IL, de Bakker MAG, Knöchel W, Sirbu IO, Durston AJ, Richardson MK: Axial patterning in snakes and caecilians: evidence for an alternative interpretation of the Hox code. Dev Biol. 2009, 332: 82-89. 10.1016/j.ydbio.2009.04.031.
13. Di-Poi N, Montoya-Burgos JI, Duboule D: Atypical relaxation of structural constraints in Hox gene clusters of the green anole lizard. Genome Res. 2009, 19: 602-610. 10.1101/gr.087932.108.
14. Matsunami M, Sumiyama K, Saitou N: Evolution of conserved non-coding sequences within the vertebrate Hox clusters through the two-round whole genome duplications revealed by phylogenetic footprinting analysis. J Mol Evol. 2010, 71: 427-436. 10.1007/s00239-010-9396-1.
15. Liang D, Wu R, Geng J, Wang C, Zhang P: A general scenario of Hox gene inventory variation among major sarcopterygian lineages. BMC Evol Biol. 2011, 11: 25. 10.1186/1471-2148-11-25.
16. Zakany J, Duboule D: The role of Hox genes during vertebrate limb development. Curr Opin Genet Dev. 2007, 17: 359-366. 10.1016/j.gde.2007.05.011.
17. Savard P, Gates PB, Brockes JP: Position dependent expression of a homeobox gene transcript in relation to amphibian limb regeneration. EMBO J. 1988, 7: 4275-4282.
18. Gardiner DM, Blumberg B, Komine Y, Bryant SV: Regulation of HoxA expression in developing and regenerating axolotl limbs. Development. 1995, 121: 1731-1741.
19. Torok MA, Gardiner DM, Shubin NH, Bryant SV: Expression of HoxD genes in developing and regenerating axolotl limbs. Dev Biol. 1998, 200: 225-233. 10.1006/dbio.1998.8956.
20. Khan PA, Tsilfidis C, Liversage RA: Hox C6 expression during development and regeneration of forelimbs in larval Notophthalmus viridescens. Dev Genes Evol. 1999, 209: 323-329. 10.1007/s004270050260.
21. Carlson MR, Komine Y, Bryant SV, Gardiner DM: Expression of Hoxb13 and Hoxc10 in developing and regenerating axolotl limbs and tails. Dev Biol. 2001, 229: 396-406. 10.1006/dbio.2000.0104.
22. Gregory TR: Animal Genome Size Database. 2010, http://www.genomesize.com.
23. Casimir CM, Gates PB, Ross-Macdonald PB, Jackson JF, Patient RK, Brockes JP: Structure and expression of a newt cardio-skeletal myosin gene: implications for the C value paradox. J Mol Biol. 1992, 202: 287-296.
24. Smith JJ, Putta S, Zhu W, Pao GM, Verma IM, Hunter T, Bryant SV, Gardiner DM, Harkins TT, Voss SR: Genic regions of a large salamander genome contain long introns and novel genes. BMC Genomics. 2009, 10: 19. 10.1186/1471-2164-10-19.
25. Belleville S, Beauchemin M, Tremblay M, Noiseux N, Savard P: Homeobox-containing genes in the newt are organized in clusters similar to other vertebrates. Gene. 1992, 114: 179-186. 10.1016/0378-1119(92)90572-7.
26. Gerard M, Duboule D, Zakany J: Structure and activity of regulatory elements involved in the activation of the Hoxd-11 gene during late gastrulation. EMBO J. 1993, 12: 3539-3550.
27. Shashikant CS, Bolanowsky SA, Anand S, Anderson SM: Comparison of diverged Hoxc8 early enhancer activities reveals modification of regulatory interactions at conserved cis-acting elements. J Exp Zool B Mol Dev Evol. 2007, 308: 242-249.
28. Hornstein E, Mansfield JH, Yekta S, Hu JK, Harfe BD, McManus MT, Baskerville S, Bartel DP, Tabin CJ: The microRNA miR-196 acts upstream of Hoxb8 and Shh in limb development. Nature. 2005, 438: 671-674. 10.1038/nature04138.
29. Mannaert A, Amemiya CT, Bossuyt F: Comparative analyses of vertebrate posterior HoxD clusters reveal atypical cluster architecture in the caecilian Typhlonectes natans. BMC Genomics. 2010, 11: 658. 10.1186/1471-2164-11-658.
30. Smit AFA, Hubley R, Green P: RepeatMasker Open-3.0. 1996–2010.
31. Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R, Hardison R, Miller W: PipMaker: a resource for aligning two genomic sequences. Genome Res. 2000, 10: 577-586. 10.1101/gr.10.4.577.
32. Zhang P, Wake DB: Higher-level salamander relationships and divergence dates inferred from complete mitochondrial genomes. Mol Phylogenet Evol. 2009, 53: 492-508. 10.1016/j.ympev.2009.07.010.
33. Lowe CB, Bejerano G, Haussler D: Thousands of human mobile element fragments undergo strong purifying selection near developmental genes. Proc Natl Acad Sci USA. 2007, 104: 8005-8010. 10.1073/pnas.0611223104.
34. Mikkelsen TS, Wakefield MJ, Aken B, Amemiya CT, Chang JL, Duke S, Garber M, Gentles AJ, Goodstadt L, Heger A, Jurka J, Kamal M, Mauceli E, Searle SM, Sharpe T, Baker ML, Batzer MA, Benos PV, Belov K, Clamp M, Cook A, Cuff J, Das R, Davidow L, Deakin JE, Fazzari MJ, Glass JL, Grabherr M, Greally JM, Gu W: Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences. Nature. 2007, 447: 167-177. 10.1038/nature05805.
35. Bourque G: Transposable elements in gene regulation and in the evolution of vertebrate genomes. Curr Opin Genet Dev. 2009, 19: 607-612. 10.1016/j.gde.2009.10.013.
36. Smith JJ, Putta S, Walker JA, Kump DK, Samuels AK, Monaghan JR, Weisrock DW, Staben C, Voss SR: Sal-Site: integrating new and existing ambystomatid salamander research and informational resources. BMC Genomics. 2005, 6: 181. 10.1186/1471-2164-6-181.
37. National Center for Biotechnology Information. http://www.ncbi.nlm.nih.gov.
38. Ensembl. http://www.ensembl.org.
39. Xenbase: A Xenopus Laevis and Xenopus Tropicalis Resource. http://www.xenbase.org.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.