Cyclophilin nomenclature problems, or, 'a visit from the sequence police'
© Henry Stewart Publications 2004
Received: 4 May 2004
Accepted: 4 May 2004
Published: 1 August 2004
Why is agreement on one particular name for each gene important? As one genome after another is sequenced, it is imperative to consider the complexity of genes, genetic architecture, gene expression, gene-gene and gene-product interactions and evolutionary relatedness across species. To agree on a particular gene name not only makes one's own research easier, it also aids automated text-mining algorithms and search engines, which are increasingly employed to find relationships in the millions of abstracts in the medical research literature and sequence databases. A common nomenclature system will also be helpful to the present generation, as well as future generations, of graduate students and postdoctoral fellows who are about to enter genomics research. In this paper, the authors present some problems that arose when two separate research communities decided to choose the same root, CYP, for naming their gene families. They then offer a logical solution, by renaming the cyclophilin genes with a common root, such as cyn- in Caenorhabditis and CYN- in mammals (Cyn in mouse), and using evolutionary divergence to cluster genes of the highest level of relatedness.
Keywords: human genome; mouse genome; Caenorhabditis elegans genome; cytochrome P450 (CYP) gene superfamily; cyclophilin gene family; immunophilins; peptidylprolyl cis-trans isomerases; FK506-binding proteins; tacrolimus; parvulin
A previous paper in this series summarised the steps that one is strongly encouraged to follow in order to ensure proper nomenclature of any gene. Three examples were given to illustrate how and why one should strive for a standardised gene nomenclature system. In these examples, the focus of the paper was on using the gene names as search terms, rather than comparing a DNA or protein sequence that has just been determined by searching via BLAST. The three examples included: PTGS1 and PTGS2 as the correct gene names for prostaglandin G/H synthase-1 and -2, also known as cyclooxygenase-1 and -2 and commonly erroneously nicknamed 'COX-1' and 'COX-2' in many journals; the short- and long-chain fatty acid synthase gene families, for which there is currently no official agreed-upon nomenclature (although FASN on human chromosome 17q25 is the official symbol for the fatty acid synthase gene); and POR as the correct name for the NADPH-P450 oxidoreductase gene. Before deciding upon a new gene symbol, the reader is encouraged to visit the website describing this topic.
This theme is extended in the current paper, which shows how two completely separate research communities adopted the same gene root name, while not realising that the other group had done the same thing.
Cyclophilins as 'cyp-' in a Caenorhabditis elegans database
List of cyclophilin and P450 genes in C. elegans.
WormPep accession #
CYP for cytochrome P450 genes in all species
The mammalian cytochrome P450 (CYP) superfamily encodes enzymes involved in: the metabolism of pharmaceuticals, foreign chemicals and pollutants; arachidonic acid metabolism and eicosanoid biosynthesis; cholesterol, sterol and bile acid biosynthesis; steroid synthesis and catabolism; vitamin D3 synthesis and catabolism; retinoic acid hydroxylation; biogenic amine and neuroamine metabolism; and several orphan CYP genes still of unknown function. There are 102 and 57 putatively functional CYP genes in the mouse and human, respectively. To date, more than 3,400 P450 sequences have been named with the three-letter root of CYP. This nomenclature has been in place since 1987 [8, 9], and the list of named genes grows every day. The official root names for mouse and human P450s are Cyp and CYP, respectively. The Drosophila nomenclature also uses Cyp. There are now 727 genes in rice and Arabidopsis that have been named CYP. It is anticipated that the number of named P450 genes will exceed 4,000 by the end of 2004.
Whereas continuing to use the CYP root for cyclophilin genes will be a nightmare for cyclophilin researchers, P450 researchers might find this an annoyance but not really much of a problem. To prevent conflicts over nomenclature, it becomes increasingly urgent to rename the cyclophilin genes. What is the best root name for these genes?
Finding the best root for the cyclophilin genes
The three families of immunophilins, known as peptidylprolyl cis-trans isomerases (PPIases), include the cyclophilins, the FK506-binding proteins (FKBPs) and parvulin [11–13]. All three gene families are found in animals, plants and eubacteria. While two cyclophilins and two types of FKBPs exist in archaebacteria, no parvulin homologue has been found there. Parvulin is unique among the immunophilins. A search of the LocusLink, HUGO Gene Nomenclature Committee and National Center for Biotechnology Information (NCBI) UniGene websites using 'parvulin' shows a single gene; Pin4 and PIN4 are the approved mouse and human gene names, respectively. 'PIN' is an abbreviation for peptidylprolyl cis-trans isomerase NIMA-interacting-4. 'NIMA' stands for 'never-in-mitosis-gene-a', which was first identified in a series of conditional cell cycle mutants that failed to enter mitosis in Aspergillus nidulans [17, 18]. There are 11 genes (NEK1, NEK2, ... NEK11) in the human genome that encode NIMA-related mitotic kinases and are involved in DNA replication and genotoxic stress responses [19, 20]. Although parvulin has peptidylprolyl cis-trans isomerase activity, it shares no evolutionary homology with the FKBPs or cyclophilins.
Immunophilins are defined as receptors for immunosuppressive drugs including cyclosporin-A, FK506 and rapamycin. FK506, also called tacrolimus, is a macrolide produced by the bacterium Streptomyces tsukubaensis and has strong immunosuppressive actions. FK506- and rapamycin-binding proteins are abbreviated as FKBPs and share no evolutionary homology with the cyclophilins or parvulin. A search of the LocusLink, HUGO Gene Nomenclature Committee and NCBI UniGene websites using 'fkbp' shows more than 80 FKBP genes in the human and mouse (FKBP1, FKBP2, ... FKBP82). These gene products have many unique features, such as targeting BCL2 to the mitochondria and inhibiting apoptosis.
Cyclophilins, the third and last class of the PPIases, comprise cyclosporin-A-binding proteins ranging in size from 17 kDa to 324 kDa. This class of immunophilins carries out a wide range of functions, including acting as a chaperone to facilitate the nuclear transport of the somatolactogenic hormones, facilitating the calcium-regulated mitochondrial permeability transition pore which precedes apoptosis, and participating in the pre-mRNA splicing machinery. Cyclophilin-binding drugs are emerging as potential leads to novel targets for interference with interleukin-12 production and, therefore, to the possibility of treating conditions such as multiple sclerosis and rheumatoid arthritis. Cyclosporin-A also has activity against helminth and protozoan parasites.
List of putatively functional human cyclophilin genes.
Approved gene symbol
Approved gene name
Peptidylprolyl isomerase A (cyclophilin A)
Peptidylprolyl isomerase A (cyclophilin A)-like-3
Peptidylprolyl isomerase B (cyclophilin B)
Peptidylprolyl isomerase C (cyclophilin C)
Peptidylprolyl isomerase D (cyclophilin D)
Peptidylprolyl isomerase E (cyclophilin E)
Peptidylprolyl isomerase F (cyclophilin F)
Peptidylprolyl isomerase G (cyclophilin G)
Peptidylprolyl isomerase H (cyclophilin H)
Peptidylprolyl isomerase (cyclophilin)-like 1
Peptidylprolyl isomerase (cyclophilin)-like 2
Peptidylprolyl isomerase (cyclophilin)-like 3
Peptidylprolyl isomerase (cyclophilin)-like 4
Peptidylprolyl isomerase (cyclophilin)-like 5
Peptidylprolyl isomerase (cyclophilin)-like 6
PPID has the synonym 'CYP-40', but this is no longer the official name. Unfortunately, the mouse RIKEN full-length cDNAs that match this sequence are being called CYP40, not PPID, so the name is propagating itself in the literature and into the databases in an uncontrollable way. The cloning and naming of 11 cyclophilin genes from C. elegans (Cyp-1 to Cyp-11) was reported in 1996. A search of GenBank for CYP20 finds AY568517, an Arabidopsis thylakoid lumen cyclophilin, named CYP20-2. (CYP20A1 is a chordate cytochrome P450 of unknown function, possibly involved in development.) The date on this Arabidopsis CYP20 GenBank entry is 15th April, 2004, showing that the problem is not going away. In fact, the PubMed link from the GenBank entry leads to a publication in which a nomenclature system for the 29 cyclophilin genes in the Arabidopsis thaliana genome is presented using CYP as the root.
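The collision described above can be made concrete with a small pattern check. The following Python sketch is a heuristic written for illustration only: the two regular expressions and the function name classify_cyp_symbol are assumptions, based solely on the symbol shapes quoted in the text (CYP20A1 for a P450; Cyp-1, CYP-40, CYP40 and CYP20-2 for cyclophilins), and are not official nomenclature rules.

```python
import re

# Heuristic patterns (assumptions for illustration, not official rules).
# P450-style: CYP + family number + subfamily letter(s) + gene number, e.g. CYP20A1.
P450_STYLE = re.compile(r"^CYP\d+[A-Z]+\d+$", re.IGNORECASE)
# Cyclophilin-style shapes quoted in the text: Cyp-1, CYP-40, CYP40, CYP20-2.
CYCLOPHILIN_STYLE = re.compile(r"^CYP-?\d+(-\d+)?$", re.IGNORECASE)


def classify_cyp_symbol(symbol: str) -> str:
    """Guess which community's naming style a CYP-rooted symbol follows."""
    s = symbol.strip()
    if P450_STYLE.match(s):
        return "P450-style"
    if CYCLOPHILIN_STYLE.match(s):
        return "cyclophilin-style"
    return "unrecognised"


if __name__ == "__main__":
    for name in ["CYP20A1", "CYP20-2", "Cyp-1", "CYP40", "PPID"]:
        print(name, "->", classify_cyp_symbol(name))
```

Even with such a check, a text search for the bare string CYP20 would still retrieve both CYP20-2 and CYP20A1, which is the practical problem that the shared root creates.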
What is the solution?
Solutions -- like politics -- are local. We have contacted the C. elegans community and alerted them to this nomenclature conflict. They are responding and will select a new root for cyclophilins and change their P450 gene names to cyp-, from the current ccp- root. This will go into the official WormPep and WormBase nomenclature and will eventually prevent use of the cyp- root in C. elegans (and, hopefully, C. briggsae) for cyclophilins. Additional effort will be needed for the Arabidopsis community, as well as for the human and mouse gene databases.
What might be the best root for the cyclophilin gene family? Cyn has been used for cyclone, a mouse gene in LocusLink; CPN1 and CPN2 are being used for carboxypeptidase N-1 and -2; Cph was considered, but CPH1 has been used to refer to a cryptochrome or phytochrome (light-sensing protein). Because this paper was shared (while still being written) with Lois Maltais of Mouse Genome Informatics (MGI), she consulted with the authors of the mouse cyclone gene paper and they have now agreed to use Cycn, in order to free up Cyn and CYN for the mouse and human cyclophilin, respectively. After searching databases and search engines for conflicts, the present authors suggest that cyn- might be the most suitable root for C. elegans cyclophilins, but this needs to be decided among members of the worm community. It is unfortunate that some databases (eg worm, yeast and bacteria) are mandating that gene names be limited to three letters. The authors suspect that three-letter root names for the ~19,000 C. elegans genes may not be enough. For example, 10,000 families would require 10,000 roots, yet 26 cubed is only 17,576. More than half of all possible three-letter combinations would therefore be consumed, which will require the use of odd letter combinations that have no symbolic meaning, such as xyz1, cxq, rzx, etc. Also, the nature of language is to use some letters more often than others, which will put great pressure on naming the genes that begin with the most often-used letters. CYN has now been officially approved as the root to unify all mammalian cyclophilins.
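The arithmetic behind the three-letter concern can be checked directly. The Python sketch below is illustrative only; the function names are hypothetical, the figure of 10,000 families is the hypothetical example used in the text, and a 26-letter alphabet with no reserved or forbidden combinations is assumed.

```python
def root_capacity(length: int, alphabet_size: int = 26) -> int:
    """Number of distinct roots of a given length over the alphabet."""
    return alphabet_size ** length


def fraction_used(families: int, length: int) -> float:
    """Share of all possible roots consumed if every family needs its own root."""
    return families / root_capacity(length)


if __name__ == "__main__":
    families = 10_000  # hypothetical number of gene families, as in the text
    for length in (3, 4):
        print(
            f"{length}-letter roots: {root_capacity(length):,} possible, "
            f"{fraction_used(families, length):.1%} used by {families:,} families"
        )
```

Under these assumptions, three-letter roots leave little room for mnemonic choices, whereas allowing a fourth letter raises the number of possible roots to 456,976.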
The vertical lines in Figure 2 are suggested break-points for family and subfamily designations. Branches on the tree intersected by the lines would define family and subfamily clusters. The lines could be moved to modify the number of families and subfamilies. As drawn, there are six subfamilies in family 1, and one each in families 2 and 3. Moving the subfamily line to the left could reduce the number of subfamilies in family 1 from six to three. If cyn were used, CE28157 (at the top of Figure 2) would be named cyn3a1 and CE20374 (at the bottom of Figure 2) would be named cyn1a1, and so on.
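The naming pattern implied by cyn3a1 and cyn1a1 (root, family number, subfamily letter, gene number) can be written down as a small helper. This Python sketch is an assumption about how such names might be composed and parsed; the names ClusterName, format_name and parse_name are hypothetical, and it does not compute the family or subfamily break-points from the tree.

```python
import re
from typing import NamedTuple


class ClusterName(NamedTuple):
    root: str       # e.g. "cyn"
    family: int     # family number from the first break-point
    subfamily: str  # single subfamily letter from the second break-point
    member: int     # gene number within the subfamily


NAME_PATTERN = re.compile(r"^([a-z]+)(\d+)([a-z])(\d+)$", re.IGNORECASE)


def format_name(root: str, family: int, subfamily: str, member: int) -> str:
    """Compose a name such as cyn3a1 from its four parts."""
    return f"{root}{family}{subfamily}{member}"


def parse_name(name: str) -> ClusterName:
    """Split a name such as cyn3a1 back into its four parts."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a root/family/subfamily/gene name: {name!r}")
    root, family, subfamily, member = match.groups()
    return ClusterName(root, int(family), subfamily, int(member))


if __name__ == "__main__":
    print(format_name("cyn", 3, "a", 1))
    print(parse_name("cyn1a1"))
```

Moving a break-point line would change which family number or subfamily letter a gene receives, but not the shape of the name.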
A method for creating a network of 'gene co-occurrences' from the literature, and partitioning it into communities of related genes, has recently been presented. In that paper, a program is described (but not named) which searches all Medline titles and abstracts and OMIM entries for occurrences and co-occurrences of gene symbols, gene names and diseases; the databases contain more than 12 million abstracts. Relationships are identified by automated bioinformatics methods between genes, and between genes and diseases, that might not be detected by less computationally intense methods. Such methods must rely on consistent names, or they have to deal with a list of synonyms.
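Why co-occurrence counting depends on consistent names can be shown with a toy example. The Python sketch below is not the published method; it is a minimal illustration that assumes a hand-made synonym table (using alias pairs mentioned in this paper) and three invented sentences standing in for abstracts. All function and variable names are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Hand-made alias table (assumption): alias -> preferred symbol.
SYNONYMS = {"COX-1": "PTGS1", "COX-2": "PTGS2", "CYP-40": "PPID", "CYP40": "PPID"}

# A token is a letter followed by letters, digits or hyphens, e.g. "COX-2".
TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def normalise(symbol: str) -> str:
    """Map an alias to its preferred symbol; otherwise return it upper-cased."""
    upper = symbol.upper()
    return SYNONYMS.get(upper, upper)


def cooccurrences(abstracts, known_symbols):
    """Count, per abstract, each unordered pair of known genes mentioned together."""
    counts = Counter()
    for text in abstracts:
        found = {normalise(tok) for tok in TOKEN.findall(text)}
        found &= set(known_symbols)
        for pair in combinations(sorted(found), 2):
            counts[pair] += 1
    return counts


if __name__ == "__main__":
    toy_abstracts = [  # invented sentences, for illustration only
        "COX-2 expression was compared with PPID.",
        "PTGS2 and CYP-40 were both detected.",
        "PTGS1 was measured alone.",
    ]
    print(cooccurrences(toy_abstracts, {"PTGS1", "PTGS2", "PPID"}))
```

Without the synonym table, the first two sentences would be counted as two unrelated gene pairs instead of two observations of the same pair.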
The writing of this article was funded, in part, by NIH grant P30 ES06096 (D.W.N.).
- Nebert DW, Wain HM: 'Update on human genome completion and annotations: Gene nomenclature'. Hum Genomics. 2003, 1: 66-71.
- The C. elegans protein database. [http://www.sanger.ac.uk/Projects/C_elegans/wormpep/]
- Nebert DW, Russell DW: 'Clinical importance of the cytochromes P450'. Lancet. 2002, 360: 1155-1162. 10.1016/S0140-6736(02)11203-7.
- Nelson DR, Zeldin D, Hoffman S, et al: 'Comparison of cytochrome P450 (CYP) genes from the mouse and human genomes including nomenclature recommendations for genes, pseudogenes, and alternative-splice variants'. Pharmacogenetics. 2004, 14: 1-18. 10.1097/00008571-200401000-00001.
- Nebert DW, Adesnik M, Coon MJ, et al: 'The P450 gene superfamily. Recommended nomenclature'. DNA. 1987, 6: 1-11. 10.1089/dna.1987.6.1.
- Nelson DR, Koymans L, Kamataki T, et al: 'Cytochrome P450 superfamily: Update on new sequences, gene mapping, accession numbers, and nomenclature'. Pharmacogenetics. 1996, 6: 1-42. 10.1097/00008571-199602000-00002.
- Maruyama T, Furutani M: 'Archaeal peptidyl prolyl cis-trans isomerases (PPIases)'. Front Biosci. 2000, 5: D821-D836. 10.2741/maruyama.
- Galat A: 'Peptidylprolyl cis/trans isomerases (immunophilins): Biological diversity -- targets -- functions'. Curr Top Med Chem. 2003, 3: 1315-1347. 10.2174/1568026033451862.
- He Z, Li L, Luan S: 'Immunophilins and parvulins. Superfamily of peptidyl prolyl isomerases in Arabidopsis'. Plant Physiol. 2004, 134: 1248-1267. 10.1104/pp.103.031005.
- [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=unigene]
- Osmani SA, May GS, Morris NR: 'Regulation of the mRNA levels of nimA, a gene required for the G2-M transition in Aspergillus nidulans'. J Cell Biol. 1987, 104: 1495-1504. 10.1083/jcb.104.6.1495.
- Osmani SA, Pu RT, Morris NR: 'Mitotic induction and maintenance by over-expression of a G2-specific gene that encodes a potential protein kinase'. Cell. 1988, 53: 237-244. 10.1016/0092-8674(88)90385-6.
- Krien MJ, West RR, John UP, et al: 'The fission yeast NIMA kinase Fin1p is required for spindle function and nuclear envelope integrity'. EMBO J. 2002, 21: 1713-1722. 10.1093/emboj/21.7.1713.
- Noguchi K, Fukazawa H, Murakami Y, et al: 'Nek11, a new member of the NIMA family of kinases, involved in DNA replication and genotoxic stress responses'. J Biol Chem. 2002, 277: 39655-39665. 10.1074/jbc.M204599200.
- Shirane M, Nakayama KI: 'Immunophilin FKBP38, an inherent inhibitor of calcineurin, targets BCL2 to mitochondria and inhibits apoptosis'. Nippon Rinsho. 2004, 62: 405-412.
- Jin L, Harrison SC: 'Crystal structure of human calcineurin complexed with cyclosporine-A and human cyclophilin'. Proc Natl Acad Sci USA. 2002, 99: 13522-13526. 10.1073/pnas.212504399.
- Rycyzyn MA, Clevenger CV: 'Role of cyclophilins in somatolactogenic action'. Ann NY Acad Sci. 2000, 917: 514-521.
- Halestrap AP, McStay GP, Clarke SJ: 'The permeability transition pore complex: Another view'. Biochimie. 2002, 84: 153-166. 10.1016/S0300-9084(02)01375-5.
- Pemberton TJ, Rulten SL, Kay JE: 'Identification and characterization of Schizosaccharomyces pombe cyclophilin-3, a cyclosporin A-insensitive orthologue of human USA-CyP'. J Chromatogr B Analyt Technol Biomed Life Sci. 2003, 786: 81-91. 10.1016/S1570-0232(02)00738-9.
- Vandenbroeck K, Alloza I, Gadina M: 'Inhibiting cytokines of the interleukin-12 family: Recent advances and novel challenges'. J Pharm Pharmacol. 2004, 56: 145-160.
- Chappell LH, Wastling JM: 'Cyclosporin A: Antiparasite drug, modulator of the host-parasite relationship, and immunosuppressant'. Parasitology. 1992, 105: S25-S40. 10.1017/S0031182000075338.
- Page AP, MacNiven K, Hengartner MO: 'Cloning and biochemical characterization of the cyclophilin homologues from the free-living nematode Caenorhabditis elegans'. Biochem J. 1996, 317: 179-185.
- Romano PGN, Edvardsson A, Ruban AV, et al: 'Arabidopsis AtCYP20-2 is a light-regulated cyclophilin-type peptidyl-prolyl cis-trans isomerase associated with the photosynthetic membranes'. Plant Physiol. 2004, 134: 1244-1247. 10.1104/pp.104.041186.
- Romano PG, Horton P, Gray JE: 'The Arabidopsis cyclophilin gene family'. Plant Physiol. 2004, 134: 1268-1282. 10.1104/pp.103.022160.
- Reisdorph NA, Small GD: 'The CPH gene of Chlamydomonas reinhardtii encodes two forms of cryptochrome whose levels are controlled by light-induced proteolysis'. Plant Physiol. 2004, 134: 1546-1554. 10.1104/pp.103.031930.
- Wilkinson DM, Huberman BA: 'A method for finding communities of related genes'. Proc Natl Acad Sci USA. 2004, 101: S5241-S5248. 10.1073/pnas.0307740100.