Update of human and mouse forkhead box (FOX) gene families

The forkhead box (FOX) proteins are transcription factors that play complex and important roles in processes from development and organogenesis to regulation of metabolism and the immune system. There are 50 FOX genes in the human genome and 44 in the mouse, divided into 19 subfamilies. All human FOX genes have close mouse orthologues, with one exception: the mouse has a single Foxd4, whereas the human gene has undergone a recent duplication to a total of seven (FOXD4 and FOXD4L1 to FOXD4L6). Evolutionarily ancient family members can be found as far back as the fungi and metazoans. The DNA-binding domain, the forkhead domain, is an example of the winged-helix domain, and is very well conserved across the FOX family and across species, with a few notable exceptions in which divergence has created new functionality. Mutations in FOX genes have been implicated in at least four familial human diseases, and differential expression may play a role in a number of other pathologies, ranging from metabolic disorders to autoimmunity. Furthermore, FOX genes are differentially expressed in a large number of cancers; they can act either as oncogenes or as tumour suppressors, depending on the family member and cell type. Although some drugs that target FOX gene expression or activity, notably proteasome inhibitors, appear to work well, much more basic research is needed to unlock the complex interplay of upstream and downstream interactions with FOX family transcription factors.


Introduction
Gene expression is controlled at multiple levels, including modulation of transcriptional activity, mRNA processing, and post-translational modification of proteins. The forkhead box (FOX) gene family encodes proteins that regulate the transcription of genes participating in a number of functions, including development of various organs, regulation of senescence or proliferation, and metabolic homeostasis. The first FOX gene to be discovered was forkhead (fkh) in Drosophila, which, when mutated, gives the insect a fork-headed appearance. 1 Independently, another group characterised FOXA1 in the rat. 2 In 1990, Weigel et al. discovered that these two proteins shared a similar DNA-binding domain, and named this domain the forkhead domain (also referred to as the winged-helix domain); this domain is well conserved among all FOX family members. 3 At about 100 amino acids in length, the prototypical forkhead domain is monomeric and consists of three alpha-helices, three beta-strands and two large loops ('wing' regions) that flank the third beta-strand. 4 In 1993, the crystal structure of the forkhead domain bound to DNA was solved for FOXA1. 5 Since then, several other structures have been solved, including the DNA-binding domains of FOXK1, FOXK2, FOXM1, FOXO1, FOXO3, FOXO4, FOXP1 and FOXP2 (Protein Data Bank search 6 ).
Before 2000, FOX genes lacked a unified naming convention and were assigned a confusing array of names by the researchers who discovered them. The winged helix/forkhead nomenclature committee defined the FOX family as all genes/proteins having sequence homology to the canonical winged helix/forkhead DNA-binding domain; subclasses FOXA to FOXO were defined based on a phylogenetic analysis of the forkhead domain (other domains were highly divergent, and alignment was unclear between subclasses). 7 Since then, the family has been expanded to include subclasses FOXP to FOXS.

The FOX gene family
The FOX family consists of 50 members in the human genome (plus two known pseudogenes, FOXO1B and FOXO3B) and 44 members in the mouse genome (Table 1). Ancient and diverse, the FOX family plays a role in a wide array of developmental, general and tissue-specific functions, and thousands of papers and hundreds of reviews have been published on the topic (PubMed search, April 2010). In 2007, Tuteja and Kaestner published a snapshot of human FOX genes, including a table of potential regulatory partners, cellular and developmental roles, known mouse mutant phenotypes and roles in human disease. 8,9

FOXO subfamily
A great deal of attention has been given to the FOXO subfamily, which plays a role in the regulation of metabolism, oxidative stress resistance and cell-cycle arrest. Under fasting conditions, FOXO transcriptionally activates insulin-responsive genes, which include genes encoding enzymes responsible for gluconeogenesis (including glucose-6-phosphatase and phosphoenolpyruvate carboxykinase) in the liver. If nutrients are plentiful, however, insulin activates the phosphatidylinositol 3-kinase (PI3K) pathway, which causes AKT/protein kinase B to phosphorylate FOXO, excluding it from the nucleus and turning off insulin-responsive genes. 10 FOXO1 appears to be the primary protein in this pathway, but there is overlap in many FOXO functions. In pancreatic beta cells, FOXO1 has been shown to protect against glucotoxic and lipotoxic oxidative stress by upregulating manganese superoxide dismutase and catalase. 11 FOXO1 has also been shown to be a cell-cycle inhibitor in beta cells 12 and in lipocytes with a correlated increased expression of p2. 13 The post-translational regulation of the FOXO family allows for complex and sensitive actions from each family member. The variety of possible modifications (phosphorylation, acetylation, methylation and O-linked glycosylation) has been dubbed the FOXO code. 14

FOXA subfamily
FOXA proteins have been shown to act as 'pioneer' factors, that is, proteins that can open tightly compacted chromatin without the involvement of the switching-sucrose non-fermenting (SWI-SNF) chromatin remodelling complex. 15 This is accomplished by interaction of the C-terminal domain of FOXA proteins with histones H3 and H4, and it helps FOXA factors to regulate the development of multiple organ systems, including liver, pancreas, lung, prostate and kidney. 16 The 'pioneer' function of FOXA has also been shown to facilitate the binding of nuclear hormone receptors, including the glucocorticoid receptor and oestrogen receptor (ER). In the adult, FOXA proteins have been shown to play a role in metabolism; for example, in the expression of gluconeogenic enzymes in the liver in response to fasting, and in energy utilisation by adipose tissue in response to excess caloric intake.

FOXP subfamily
The FOXP subfamily plays a role in immune response; specifically, constitutively expressed FOXP3 is considered a critical biomarker for thymus-derived natural Treg cells. FOXP3 expression is required for self-tolerance and immune homeostasis. 17 The forkhead domain of FOXP members is different from that in other FOX family members: wing 1 is truncated and wing 2 forms a helix rather than a loop; in addition, the forkhead domain is located near the C-terminus, rather than the N-terminus, as in most subclasses. FOXP members are also unusual in that they can form dimers by domain swapping (two monomers interact by exchanging helix H3 and strands S2 and S3). 18 The orientation of the dimers requires the protein to bind opposing (non-adjacent) DNA sites; the result is that FOXP proteins may participate in the regulation of higher-order protein-DNA complexes. FOXP2 is involved in the development of speech, and a mutation in this gene has been linked to speech and language disorders. 19

Evolution of the FOX genes
As mentioned previously, the 100-amino acid forkhead DNA-binding domain is well conserved across species and families, and it is often the only region used for phylogenetic analysis. Figure 1 shows a neighbour-joining dendrogram of human and mouse forkhead domains. Nineteen subfamilies, denoted by different letters (A, B, C, etc.), can be distinguished based on evolutionary divergence. Note that FOXN proteins are split into two distinct subgroups, at the top and bottom of the dendrogram, and that the FOXL proteins are also split into different branches. Figure 2 shows the same analysis using the full FOX protein sequence. Due to high divergence on either side of the forkhead domain, the full sequence is difficult to align between subfamilies, but has been helpful in defining each subfamily. 7 In Figure 2, one can again see the splitting of the FOXN proteins and the FOXL proteins into distinctly separate branches. This global alignment also places the FOXJ members on widely separated branches. One can conclude that alignment of the forkhead domain alone provides a better assessment of evolutionary divergence.
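The effect of comparing only the conserved domain rather than the full sequence can be illustrated with a minimal Python sketch. It is not the method used to build Figures 1 and 2: it computes a simple pairwise p-distance (the fraction of differing aligned positions), which is one common input to neighbour-joining, and it assumes the sequences are already aligned. The names `p_distance`, `distance_matrix`, `TOY_ALIGNMENT` and `DOMAIN` are hypothetical, and the three sequences are invented strings, not real FOX proteins.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of aligned, ungapped columns at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore any column in which either sequence has a gap character.
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(seqs, region=None):
    """Pairwise p-distances, optionally restricted to a (start, stop) slice."""
    start, stop = region if region else (0, None)
    return {
        (n1, n2): p_distance(seqs[n1][start:stop], seqs[n2][start:stop])
        for n1, n2 in combinations(sorted(seqs), 2)
    }


# Invented toy alignment (assumption for illustration only):
# a 4-residue flank, a 10-residue "domain", and another 4-residue flank.
TOY_ALIGNMENT = {
    "toyA": "MSTQ" + "KLVETWAGRD" + "GHWE",
    "toyB": "AGLP" + "KLVETWAGRE" + "RDNC",  # near-identical domain, divergent flanks
    "toyC": "MSTQ" + "RIAQSFCGKD" + "GHWE",  # identical flanks, divergent domain
}
DOMAIN = (4, 14)  # slice covering the hypothetical conserved domain

domain_only = distance_matrix(TOY_ALIGNMENT, DOMAIN)
full_length = distance_matrix(TOY_ALIGNMENT)
```

In this toy case, `toyA` is closest to `toyB` when only the domain is compared, but closest to `toyC` over the full length, so the choice of region changes which sequences would be grouped together. Real analyses would use a substitution model and a proper alignment rather than raw mismatch counts.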
Where possible, the nomenclature committee gave the same name to orthologues across species. Based on analysis of full sequences, mice have orthologues of all human FOX genes with high sequence similarity, with one exception. The murine FOXD4 protein clusters together with seven human proteins (FOXD4 and the FOXD4-like proteins FOXD4L1 to FOXD4L6) and shares the most identity with FOXD4L1. The duplications that gave rise to the FOXD4 group appear to be relatively recent, that is, to have occurred during hominid evolution. Very little information is available about these proteins, but researchers have shown that at least two FOXD4L genes are transcriptionally active; furthermore, evidence of purifying selection in the forkhead domain of these proteins suggests that they may play a physiological role. 24
Many authors have performed detailed phylogenetic analyses, but the first analysis and naming scheme have been generally upheld, 4 with a few criticisms illustrated in Figures 1 and 2. In analysis of the forkhead domain, FOXR falls within the FOXN subclass and some authors have proposed combining these two groups. In analysis of the full sequence, however, FOXR1 and FOXR2 are associated closely with FOXN1 and FOXN4, but not with FOXN2 and FOXN3. Thus, some researchers have proposed splitting these into three subclasses. Finally, in many analyses, FOXL1 and FOXL2 do not cluster together.
In an analysis of the origin and expansion of early transcription factors, it was found that FOXJ1 was probably the oldest family member, present in the opisthokont last common ancestor of the fungi and metazoans. 25 Expansion occurred early, with bilaterians having 19 FOX genes and most mammals more than 40.

FOXs in disease
Immune dysregulation/polyendocrinopathy/enteropathy/X-linked syndrome (IPEX) is a human disease caused by a mutation in FOXP3. The disorder is characterised by a wide range of autoimmune symptoms, including type 1 diabetes, eczema, food allergy, thyroid disorders and inflammatory bowel disease. 26 The link between FOXP3 and autoimmune disease has been ascribed to the function of Treg cells; replacement or stimulation of these cells has been suggested as a therapeutic approach for a number of autoimmune disorders. 17
Given their key role in the expression of many genes that affect cell proliferation and survival, FOX family members have been suggested as possible cancer therapeutic targets. FOX family members have been shown to be up- or downregulated in many cancers. FOXP1 has been suggested to act as a tumour suppressor or an oncogene, depending on the cell type, although these observations are primarily based on correlations between mRNA levels and clinical outcomes. 27 FOXA1 is a pioneer factor that is required for the expression of many oestrogen-responsive genes. Nakshatri and Badve 28 suggest that maintenance of FOXA1 expression may force ER-alpha-positive breast cancers to remain oestrogen dependent, increasing their responsiveness to anti-oestrogens. There is a correlation between retinoic acid (a FOXA1 inducer) and growth inhibition in cells, and between insulin (which inhibits FOXA1) and anti-oestrogen resistance. 14 FOXM1 is an oncogene that is highly expressed in most carcinomas, but expressed in low amounts in normal cells, and has recently been identified as a key target of both well-known and new classes of proteasome inhibitors. 29
The Online Mendelian Inheritance in Man (OMIM) database lists four FOX genes known to cause human diseases: FOXC1 mutations elicit dominantly inherited glaucoma phenotypes, FOXC2 mutations lead to lymphoedema-distichiasis syndrome, FOXP2 loss of function leads to language acquisition defects and FOXP3 mutations are associated with severe autoimmune disorders such as IPEX. 4

Conclusions
It is clear that the FOX family is an important and complex family of proteins whose members are tantalising therapeutic targets in many disease states, especially given their roles in the regulation of metabolism (metabolic syndromes), the immune system (autoimmunity) and proliferation (cancer). The family includes 50 genes in human and 44 in mouse. Each of these targets requires a protein- and tissue-specific approach. Thus, more basic research will be required to understand their regulation and activity by identifying upstream and downstream protein partners, mechanisms of action, spatiotemporal expression and post-translational modifications.