Organization, evolution and functions of the human and mouse Ly6/uPAR family genes

Members of the lymphocyte antigen-6 (Ly6)/urokinase-type plasminogen activator receptor (uPAR) superfamily of proteins are cysteine-rich proteins characterized by a distinct disulfide bridge pattern that creates the three-finger Ly6/uPAR (LU) domain. Although the Ly6/uPAR family proteins share a common structure, their expression patterns and functions vary. To date, 35 human and 61 mouse Ly6/uPAR family members have been identified. Based on their subcellular localization, these proteins are further classified as GPI-anchored on the cell membrane, or secreted. The genes encoding Ly6/uPAR family proteins are conserved across different species and are clustered in syntenic regions on human chromosomes 8, 19, 6 and 11, and mouse Chromosomes 15, 7, 17, and 9, respectively. Here, we review the human and mouse Ly6/uPAR family gene and protein structure and genomic organization, expression, functions, and evolution, and introduce new names for novel family members.


Introduction
The lymphocyte antigen-6 (Ly6)/urokinase-type plasminogen activator receptor (uPAR) superfamily of structurally related proteins is characterized by the LU domain, an 80 amino acid domain containing ten cysteines arranged in a specific spacing pattern that allows distinct disulfide bridges which create the three-fingered (3F) structural motif [1,2]. Ly6/uPAR proteins were first identified in the mouse over 35 years ago using antisera against lymphocytes [3]. Human homologs were subsequently isolated, leading to the recognition that they represent a well-conserved family with wide-ranging expression patterns and important functions. The fully annotated human and mouse genomes contain 35 and 61 Ly6/uPAR family members, respectively. Research over the last decade has begun to unravel the important functions of the encoded proteins. In this review, we provide an overview of the Ly6/uPAR gene family, its genomic organization, evolution, and functions, and provide a nomenclature system for the newly identified members of this family.

Inclusion and approved nomenclature for novel Ly6/uPAR family members
Although Ly6/uPAR family members are related by their structure, the absence of a uniform naming convention resulted in arbitrary nomenclature for these genes as they were discovered. As many of the currently approved gene symbols for Ly6/uPAR family members (e.g., CD59 and PLAUR) have been widely used in the scientific literature for many years, we have refrained from a family-wide attempt to standardize their well-established names, avoiding the potential for additional confusion. In compiling this update, we came across many novel members of the Ly6/uPAR gene family, especially in the mouse genome, that did not yet have a systematic name. We named these novel family members in line with the Ly6/uPAR genes to which they are most closely related, based on a phylogenetic analysis (see below), using either the established LY6/Ly6# root for those that fell within the LY6 clades, or the LYPD/Lypd# (LY6/PLAUR domain-containing) root for those outside the LY6 clades. The new symbols for these genes (1 human and 18 mouse), approved by the HGNC (HUGO Gene Nomenclature Committee) [4,5] and MGNC (Mouse Genomic Nomenclature Committee) [6], are listed in Tables 1 and 2, respectively. We use the newly approved names for these genes in the rest of this update. The HGNC has also created a gene family web page for the human Ly6/uPAR family members (http://www.genenames.org/cgi-bin/genefamilies/set/1226).

Typical Ly6/uPAR gene structure
Ly6/uPAR family members typically contain one LU domain, with the exception of LYPD3 [7] and CD177 [8], which contain two, and PLAUR [9], which contains three direct repeats of the LU domain (Tables 1 and 2). The mouse Cd177 differs from its human ortholog in that it contains four direct repeats of the LU domain. A typical Ly6/uPAR family gene consists of three exons and two introns (Fig. 1a), with the signal peptide being encoded in the first exon. The mature polypeptide is encoded by the last two exons, with the GPI-anchor domain encoded by the third exon.
Based on their subcellular localization, Ly6/uPAR family members are further subdivided into two groups: membrane-tethered (through a GPI-anchor domain) or secreted (lacking the GPI-anchor domain). GPI-anchored Ly6/uPAR family members tend to congregate on lipid rafts on the cell surface, which promotes their interactions with other proteins. A fraction of the GPI-anchored Ly6/uPAR family proteins such as PLAUR are secreted after their GPI-anchor domain is proteolytically cleaved [10–12]. Experimental evidence supports the presence of a GPI-anchoring signal peptide in a majority of Ly6/uPAR family members, while it is absent in a few (Table 3). For those with no experimental evidence, the GPI-anchor signal predictor program 'PredGPI' (http://gpcr.biocomp.unibo.it/predgpi/) [13] predicted the presence of a GPI-anchor signal within mouse and human LYPD8 and LY6L, and in mouse LYPD10, LYPD11, LYPD9, LY6F, and LY6M, while predicting its absence in mouse and human LYPD4, LY6G6F, and PINLYP, and in mouse GML2 and LY6G6 (Table 3).
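For readers who want to work with these predictions programmatically, the PredGPI calls listed in the preceding paragraph can be collected into a small lookup. The following is only an illustrative sketch: the names `PREDICTED_PRESENT`, `PREDICTED_ABSENT`, and `predicted_gpi_status` are hypothetical, the code does not call PredGPI itself, and it covers only the proteins named in the text (not the experimentally verified entries of Table 3).

```python
# Predictions as listed in the text for proteins lacking experimental evidence.
# These are transcribed calls, not output generated by running PredGPI here.
PREDICTED_PRESENT = {
    "mouse": {"LYPD8", "LY6L", "LYPD10", "LYPD11", "LYPD9", "LY6F", "LY6M"},
    "human": {"LYPD8", "LY6L"},
}
PREDICTED_ABSENT = {
    "mouse": {"LYPD4", "LY6G6F", "PINLYP", "GML2", "LY6G6"},
    "human": {"LYPD4", "LY6G6F", "PINLYP"},
}


def predicted_gpi_status(protein: str, species: str) -> str:
    """Return the predicted GPI-anchor status for a protein in a species."""
    name = protein.upper()
    sp = species.lower()
    if name in PREDICTED_PRESENT.get(sp, set()):
        return "predicted GPI-anchor signal"
    if name in PREDICTED_ABSENT.get(sp, set()):
        return "predicted no GPI-anchor signal"
    # Anything else is either experimentally characterized or not discussed.
    return "not among the predictions listed"


if __name__ == "__main__":
    for query in [("LYPD8", "human"), ("GML2", "mouse"), ("PLAUR", "human")]:
        print(query, "->", predicted_gpi_status(*query))
```

Keeping the two species separate reflects the fact that several of the predictions apply to the mouse protein only.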

Structure of the LU domain
The Ly6/uPAR family members have a well-conserved LU domain with a characteristic three-finger structure formed by disulfide bridges connecting the conserved cysteine residues in a specific pattern. LU domains are topologically similar to the three-finger structure of snake venom neurotoxins, which have three β-sheet loops fixed in space by virtue of their unique disulfide bridges. The structure of the extracellular region of CD59 was first solved by 2D NMR methods [14,15] and further refined by crystallography [16], revealing it to be a flat, disk-shaped molecule consisting of a two-stranded β-sheet finger loosely packed against a protein core formed by a three-stranded β-sheet and a short helix.
Alignment of LU domain amino acid sequences of selected human Ly6/uPAR proteins performed using ProbCons (http://toolkit.tuebingen.mpg.de) revealed the location of conserved cysteines (Fig. 1b). Five well-conserved disulfide bridges between cysteine pairs 3 and 26, 6 and 13, 19 and 39, 45 and 63, and 64 and 69 stabilize the hydrophobic core, from which three β-sheet-based fingers protrude (Fig. 1b). The sequence of the amino acids exposed at the tips of each finger as well as the length of each of the fingers is variable, providing the three-finger motif with the flexibility for a wide range of intermolecular interactions. In addition to the LU domain, Ly6/uPAR family proteins possess a well-conserved LeuXxxCysXxxXxxCys motif at the amino-terminus and a CysCysXxxXxxXxxXxxCysAsn motif at the carboxyl-terminus (Fig. 1b). The functional relevance of these motifs is not yet known. Most Ly6/uPAR family proteins maintain the ten cysteines characteristic of the LU domain, with some notable exceptions. In PLAUR, which consists of three LU domains (designated D1, D2 and D3), only domain D2 is fully intact with ten cysteines, while domains D1 and D3 have seven and eight cysteines, respectively. Isoforms of proteins such as human LY6G5C maintain conservation throughout the LU domain in almost every isoform. In contrast, different isoforms of human LYNX1 maintain the necessary cysteines, but little else is conserved (Fig. 1b).
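As a rough illustration of how the sequence features described above could be checked in a candidate sequence, the two motifs can be written as one-letter regular expressions (Xxx taken to mean any residue) and combined with a cysteine count. This is a minimal sketch under stated assumptions: the function name `scan_lu_features` and the toy sequence are hypothetical, the motifs are searched anywhere in the sequence rather than at a fixed offset, and a match is not evidence of a genuine LU domain.

```python
import re

# One-letter translations of the motifs named in the text (Xxx = any residue).
N_TERM_MOTIF = re.compile(r"L.C..C")    # LeuXxxCysXxxXxxCys
C_TERM_MOTIF = re.compile(r"CC....CN")  # CysCysXxxXxxXxxXxxCysAsn


def scan_lu_features(seq: str) -> dict:
    """Report cysteine count and first motif positions (0-based) in a sequence."""
    seq = seq.upper()
    n_hit = N_TERM_MOTIF.search(seq)
    c_hit = C_TERM_MOTIF.search(seq)
    return {
        "cysteines": seq.count("C"),
        "n_term_motif_start": n_hit.start() if n_hit else None,
        "c_term_motif_start": c_hit.start() if c_hit else None,
    }


# Hypothetical toy sequence built for illustration only; it is not a real protein.
TOY_SEQUENCE = "LKCYTC" + "AAAC" * 5 + "AA" + "CCAAAACN"

if __name__ == "__main__":
    print(scan_lu_features(TOY_SEQUENCE))
```

A sequence with fewer than ten cysteines is not necessarily excluded from the family, since PLAUR domains D1 and D3 carry only seven and eight.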

Expression of Ly6/uPAR family genes
The expression pattern, interacting factors, and cellular functions of the mouse and human Ly6/uPAR family members are summarized in Table 3. Expression of Ly6/uPAR proteins is (i) widespread and variable across diverse cell types and tissues, (ii) tightly regulated in a spatiotemporal manner, and (iii) often correlated with cellular differentiation. Although the Ly6/uPAR family protein structures are well-conserved across species, their expression patterns tend to vary, indicating divergence among their regulatory networks. Many Ly6/uPAR family members are expressed in hematopoietic precursors in a lineage-specific fashion, making them useful cell surface markers for leukocytes and facilitating identification of individual leukocyte subgroups [17–19]. For example, mouse myeloid differentiation marker LY6G (also called Gr-1) is expressed by the myeloid lineage cells in a developmentally regulated manner in the bone marrow. Anti-LY6G antibodies are routinely used to identify neutrophils in the mouse but not in humans, as there is no human ortholog for Ly6g. Ly6/uPAR family members are generally upregulated during inflammatory conditions or infections and in cancerous cells, with the notable exception of SLURP1, which is invariably downregulated in proinflammatory conditions [9,20–24].

Functions of Ly6/uPAR family proteins
Commensurate with their varied expression patterns, Ly6/uPAR proteins have a wide range of functions in cell proliferation, migration, cell-cell interaction, immune cell maturation, macrophage activation, and cytokine production. They typically exert their influence by targeting nicotinic acetylcholine receptors (nAChRs) (reviewed in [1]). GPI-anchored Ly6/uPAR proteins lacking a cytoplasmic tail are unable to directly participate in intracellular signaling but can initiate signaling by interacting with other transmembrane proteins. Such interactions of GPI-anchored proteins are further facilitated by their tendency to congregate in lipid rafts on the cell surface, where other signaling molecules also are enriched. While GPI-anchored Ly6/uPAR proteins control signaling through interaction with their ligand(s), secreted Ly6/uPAR proteins may serve as agonists for other receptors including nAChR and/or as competing scavengers of their ligands [1,20,21,25–27]. Many Ly6/uPAR family members have a prominent role in neutrophils (Table 3) [28]. Below, we summarize the functions of a few well-studied members.

Prostate and testis expressed genes
Human chromosome 11 contains 5 prostate and testis expressed (PATE) genes while the syntenic region on murine Chromosome 9 contains 15 genes [29]. Recent evidence demonstrates that PATE proteins are much more predominantly expressed in the epididymis, with significantly lower expression in the prostate and testis, suggesting that their names are misnomers [30]. PATE proteins secreted by epithelial cells into the epididymal lumen facilitate maturation of spermatozoa as they leave the testis and travel through the epididymis. PATE proteins localized in the sperm head assist in sperm-oolemma fusion and penetration [31]. Defects in PATE1 result in decreased sperm motility in aged men and young asthenozoospermia patients, revealing the molecular basis for the decline in sperm quality with age [32]. PATE4 is abundantly expressed in the mouse prostate, spermatozoa, and seminal vesicles. Pate4−/− mice remain fertile and do not display any histological abnormalities [33]. PATE proteins are also expressed in neuron-rich tissues, where they function by modulating nAChR activities [29].

Plasminogen activator, urokinase receptor
Also known as the urokinase-type plasminogen activator receptor (uPAR), plasminogen activator, urokinase receptor (PLAUR) is the most well-studied family member [9]. It is widely expressed in different cell types and plays a key regulatory role in cell surface plasminogen activation, influencing many normal and pathologic processes [9,23]. PLAUR consists of three direct repeats of the LU domain, which together bind urokinase-type plasminogen activator (PLAU/uPA) in both the proprotein and mature forms. PLAUR expression is regulated by KLF4 [34] and is upregulated in cancer cells [35,36] and in response to pro-inflammatory conditions [37]. PLAUR facilitates neutrophil recruitment in response to bacterial infection [38], facilitates clearance of Borrelia infection [39], and interacts with multiple partners including vitronectin and different integrins.
Although the bulk of PLAUR exists as GPI-anchored, some of it is known to be secreted as "soluble uPAR" (suPAR), the expression level of which is correlated with disease conditions [10–12,40]. PLAUR is a multi-functional protein with important roles in regulating cell-matrix interaction, motility, and immune response. PLAUR expression levels directly correlate with the invasive potential of endometrial carcinomas, suggesting that it is a valuable prognostic marker for aggressive endometrial tumors [35]. PLAUR expression is normally low in healthy glomeruli and is elevated in glomeruli from individuals with focal segmental glomerulosclerosis, consistent with its role in regulating renal permeability [41]. PLAUR is required for neutrophil recruitment into alveoli and lungs in response to S. pneumoniae infection [42]. Plaur−/− macrophages display an enhanced ability to engulf wild-type neutrophils, but Plaur−/− neutrophils do not, suggesting that PLAUR plays an essential role in recognition and clearance of neutrophils [43]. Plaur−/− mice exhibit abnormal interneuron migration from the ganglionic eminence, and reduced interneurons in the frontal and parietal cortex [44,45].

CD177
Expressed by neutrophils, neutrophilic metamyelocytes, and myelocytes, CD177 mediates neutrophil migration across the endothelium by binding PECAM1 (CD31). Anti-CD177 antibodies inhibit neutrophil transmigration across the endothelial monolayer, potentially by interfering with an interaction between CD177 and PECAM1 [46]. Mutations in CD177 or its dysregulated expression are associated with myeloproliferative diseases, secondary to a gain-of-function mutation in JAK2 [8]. Exposure of human neutrophils to pulmonary endotoxin results in strong upregulation of CD177 [47]. Expression of CD177 mRNA is highly upregulated following endotoxin exposure. Overexpression of CD177 is a biomarker for thrombocythemia patients with elevated risk of thromboembolic complications [8]. While human CD177 contains nine exons that encode a protein with two LU domain repeats, mouse Cd177 is substantially larger, with 17 exons that encode a larger protein with four LU domain repeats. Surprisingly, Cd177−/− mice displayed no discernible phenotype or any change in immune cells, other than decreased neutrophil counts in peripheral blood [47]. Absence of CD177 had no significant impact on CXCL1/KC- or fMLP-induced mouse neutrophil migration, but led to significant cell death [47].

Complement regulatory protein CD59
CD59 is an essential regulatory protein that protects hematopoietic and neuronal cells against complement-mediated osmolytic pore formation by binding C8 and/or C9 and inhibiting the incorporation of C9 into the membrane attack complex [17,48–51]. CD2-mediated CD59 stimulation results in secretion of IL1A (IL-1α), IL6, and CSF2 (GM-CSF) in keratinocytes [52]. Inadequate complement regulation is associated with age-related macular degeneration [53]. Mutations in CD59 cause uncontrolled complement activation, leading to hemolytic anemia, thrombosis, and cerebral infarction in paroxysmal nocturnal hemoglobinuria [54]. The mouse genome contains two homologs of CD59, termed Cd59a and Cd59b. Mouse CD59B has approximately a sixfold higher specific activity than CD59A and is considered a true ortholog of human CD59. Cd59a deficiency exacerbated the skin disease and lymphoproliferation characteristic of the MRL/lpr murine lupus model, suggesting that CD59A inhibits systemic autoimmunity in this model through a complement-independent mechanism [55]. Consistent with its higher specific activity, Cd59b−/− mice display a stronger phenotype including hemolytic anemia, anisopoikilocytosis, echinocytosis, schistocytosis, hemoglobinuria with hemosiderinuria, and platelet activation [56]. Cd59b−/− males suffer from progressive loss of fertility after 5 months of age [56].

Prostate stem cell antigen
... terminal GPI-anchoring sequence [57]. It was initially identified as a prostate-specific cell surface antigen in normal male tissues and found to be highly expressed in human prostate cancer [58,59]. Later studies have revealed it to be more widely expressed. A genome-wide association study of Japanese patients with gastric cancer revealed that genetic variation in PSCA is associated with susceptibility to diffuse-type gastric cancer [60]. Psca−/− mice are viable and fertile, with rates of spontaneous or radiation-induced primary epithelial tumor formation similar to those of wild-type mice. However, Psca−/− mice display an increased frequency of metastasis, suggesting that PSCA may play a role in limiting tumor progression, and that deletion of Psca promotes tumor migration and metastasis [61].

GPI-anchored high density lipoprotein-binding protein 1
GPI-anchored high density lipoprotein-binding protein 1 (GPIHBP1) is an endothelial cell protein expressed on the luminal face of capillaries in brown adipose tissue, heart, lung, and liver. GPIHBP1 binds high density lipoprotein and provides a platform for lipoprotein lipase (LPL)-mediated processing of chylomicron lipoprotein particles, which transport dietary lipids from the intestines to other locations in the body. GPIHBP1 mutations that affect its ability to bind LPL or chylomicrons are associated with chylomicronemia [62–64]. Gpihbp1−/− mice cannot transport lipoprotein lipase to the capillary lumen, resulting in mislocalization of lipoprotein lipase within tissues, defective lipolysis of triglyceride-rich lipoproteins, and chylomicronemia [62–64]. Defective lipolysis causes reciprocal metabolic perturbations in Gpihbp1−/− mouse adipose tissue and liver. The essential fatty acid content of triglycerides is decreased and lipid biosynthetic gene expression is increased in adipose tissue, while the opposite changes occur in the liver [65].

Ly6/neurotoxin-1
As an allosteric modulator of nAChR function, Ly6/neurotoxin-1 (LYNX1) serves as a cholinergic brake that limits neuronal plasticity, balancing neuronal activity and survival in the adult visual cortex [25,66–68]. LYNX1 also inhibits SRC activation, suppressing mucin expression in the airway epithelium [69]. The LYNX1 gene is positioned in close proximity to SLURP2, leading to the mistaken idea that they are alternatively spliced isoforms of the same gene, a theory which was disproved recently [70]. LYNX1 is one of the genes that has shown accelerated evolution in humans relative to other primates, correlating with the increased brain size and complexity [71]. The juvenile brain exhibits high plasticity, which is severely restricted in adulthood. Adult Lynx1−/− mice exhibited visual cortex plasticity similar to that of juveniles, suggesting that LYNX1 serves as a brake for cortical plasticity [68]. Using the mouse model, it was demonstrated that LYNX1 plays a modulatory role in the aging brain, and that soluble LYNX1 may be useful for adjusting cholinergic-dependent plasticity and learning mechanisms [72–74].

Secreted Ly6/urokinase-type plasminogen activator receptor-related protein 2 (SLURP2)
SLURP2 is expressed by human epidermal and oral keratinocytes, from where it is secreted into sweat and saliva, respectively [100]. SLURP2 expression is strongly induced in psoriatic skin lesions, possibly by IL22, and is blocked by IFNG [70,101]. SLURP2 blocks the effect of acetylcholine by binding CHRNA3 (α3nAChR), delays keratinocyte differentiation, and prevents apoptosis [100]. Although the close linkage of the SLURP2 and LYNX1 genes led to the mistaken idea that they are isoforms, it is now clear that they constitute separate transcription units that are differently regulated [70]. Slurp2−/− mice also develop signs of palmoplantar keratoderma and neuromuscular abnormality (hind-limb clasping) reminiscent of those seen in Slurp1−/− mice [99,102].

Ly6/Plaur domain containing 1
Ly6/Plaur domain containing 1 (LYPD1), also known as LYNX2, is a prototoxin gene that is expressed in postmitotic central and peripheral neurons including subpopulations of motor neurons, sensory neurons, interneurons, and neurons of the autonomous nervous system [103]. LYPD1 is expressed at high levels in anxiety associated brain areas and plays an important role in regulating anxiety by binding and modulating neuronal nicotinic receptors [104,105]. Ablation of Lypd1 alters the actions of nicotine on glutamatergic signaling in the prefrontal cortex, resulting in elevated anxiety-like behaviors [104].

Evolution of Ly6/uPAR family proteins
Ly6/uPAR family genes are conserved across species, suggesting that they are evolutionarily ancient. Organization of the genes in this family in clusters on multiple chromosomes suggests that both gene duplications and translocations have played a role in their evolution. Comparison of the mouse and human Ly6/uPAR family genes reveals that while there are many orthologs, some Ly6 genes are only present in the mouse. The Ly6 gene complexes on human chromosomes 8, 19, 11, and 6 are syntenic with their counterparts on mouse Chromosomes 15, 7, 9, and 17, respectively, suggesting that these gene complexes were already present in their common ancestor. There are no human orthologs for the subcluster of murine Ly6 genes Ly6i, Ly6a, Ly6c1, Ly6c2, Ly6a2, Ly6g, Ly6g2, and Ly6f on Chromosome 15, and Pate10, Pate7, Pate6, Pate5, Pate12, Pate11, Pate9, Pate8, and Pate14 on Chromosome 9, suggesting that these regions may have arisen in the mouse through gene duplication after evolutionary divergence of these two species. What their functions are in the murine neutrophils and epididymis, respectively, where they are abundantly expressed, and how they are compensated in the corresponding human tissues, remains to be determined. In order to evaluate the evolutionary relatedness of LU family proteins, we generated a phylogram by multiple sequence alignment of their amino acid sequences using web-based Clustal-Omega software, and visualized it with web-based software from Interactive Tree of Life (Fig. 2) [106–108]. Where multiple isoforms exist, we only used the sequence of the longest isoform. Analysis of the phylogenetic relationship among human and mouse Ly6/uPAR family proteins revealed that (i) human LY6K and mouse GML2 are the most ancestral Ly6 proteins with the longest unbranched streak in these two species, (ii) human and mouse LYPD6 are the most recent addition to the family, closely followed by mouse LY6C1 and LY6C2, (iii) most of the secreted family proteins (with the notable exception of SLURP1 and SLURP2) form a separate cluster distinct from the GPI-anchored proteins, and (iv) several mouse PATE proteins (PATE4, 5, 6, 7, 8, 9, 10, 13, and 14) have long unbranched streaks, suggesting that they have ancient origin and that the important function(s) that they serve have not changed much (Fig. 2).
Fig. 2 Phylogram revealing the evolutionary relationship among mouse (ms-) and human (hu-) Ly6/uPAR family proteins. The phylogram was generated using the amino acid sequences in Clustal-Omega web-based program [106,107] (http://www.ebi.ac.uk/Tools/msa/clustalo/). The display was generated using the methods described [108]. The length of each branch from the most recent branch point indicates the evolutionary distance, or the relative period of time the protein has been in its current state. Known GPI-anchored Ly6/uPAR family proteins are shown in red, and those secreted (without GPI-anchor) are shown in green. Those predicted to contain a GPI-anchor (but not yet experimentally proven) are in purple, and those predicted to not contain a GPI-anchor sequence (but not yet experimentally proven) are in black. Novel genes named in this study are indicated with an asterisk (*). Tree scale: 0.1.
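The longest-isoform rule used to prepare the alignment input can be illustrated with a short filter. This is a sketch only: the function name `longest_isoforms`, the record layout, and the toy sequences are hypothetical, the authors' actual preprocessing is not described in the text, and ties are resolved here by keeping the first isoform encountered (an arbitrary choice). The alignment itself was performed with the Clustal-Omega web tool and is not reproduced.

```python
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str, str]  # (gene symbol, isoform identifier, amino acid sequence)


def longest_isoforms(records: Iterable[Record]) -> Dict[str, Tuple[str, str]]:
    """Keep, for each gene, the isoform with the longest amino acid sequence.

    Returns a mapping of gene symbol -> (isoform identifier, sequence).
    When two isoforms have equal length, the one seen first is kept.
    """
    best: Dict[str, Tuple[str, str]] = {}
    for gene, isoform_id, sequence in records:
        current = best.get(gene)
        if current is None or len(sequence) > len(current[1]):
            best[gene] = (isoform_id, sequence)
    return best


# Hypothetical toy records for illustration; these are not real protein sequences.
TOY_RECORDS = [
    ("GENE_A", "iso1", "MKLLC"),
    ("GENE_A", "iso2", "MKLLCAAC"),
    ("GENE_B", "iso1", "MCC"),
    ("GENE_A", "iso3", "MKLLCAAT"),  # same length as iso2, so iso2 is kept
]

if __name__ == "__main__":
    for gene, (isoform_id, sequence) in longest_isoforms(TOY_RECORDS).items():
        print(f">{gene}|{isoform_id}\n{sequence}")
```

The printed output is in FASTA form, which is one of the input formats accepted by multiple sequence alignment tools.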

Concluding remarks
In this gene family update, we have summarized the current literature on the organization, expression patterns, functions, and evolution of human and mouse Ly6/uPAR family genes. In addition, we identified and named many novel Ly6/uPAR family members. Considering that Ly6/uPAR family members play critical roles in regulating immunological and physiological responses to infections and varying environmental conditions, it is imperative that we understand them in greater detail. Their involvement in regulating a wide range of functions such as progression of inflammation, complement activity, neuronal activity, angiogenesis, wound healing, and cancer growth indicates that Ly6/uPAR family members will be useful therapeutic targets. Additional insight into (i) the biological functions of individual family proteins, (ii) signaling cascades that regulate their expression and functions, and (iii) the identity of their interacting partners is expected to herald new modalities for diagnosis and treatment of diverse diseases.

Authors' contributions
CLL, EAB, MSM, EED, SS, and SKS were involved in drafting the manuscript and revising it critically for important intellectual content. Each author has participated sufficiently in the work to take public responsibility for appropriate portions of the content. All authors read and approved the final manuscript. Each author has agreed to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.