Mutational analysis of COL1A1 and COL1A2 genes among Estonian osteogenesis imperfecta patients

Background Osteogenesis imperfecta (OI) is a rare bone disorder. In 90% of cases, OI is caused by mutations in the COL1A1/2 genes, which code procollagen α1 and α2 chains. The main aim of the current research was to identify the mutational spectrum of COL1A1/2 genes in Estonian patients. The small population size of Estonia provides a unique chance to explore the collagen I mutational profile of 100% of OI families in the country. Methods We performed mutational analysis of peripheral blood gDNA of 30 unrelated Estonian OI patients using Sanger sequencing of COL1A1 and COL1A2 genes, including all intron-exon junctions and 5′UTR and 3′UTR regions, to identify causative OI mutations. Results We identified COL1A1/2 mutations in 86.67% of patients (26/30). 76.92% of discovered mutations were located in the COL1A1 (n = 20) and 23.08% in the COL1A2 (n = 6) gene. Half of the COL1A1/2 mutations appeared to be novel. The percentage of quantitative COL1A1/2 mutations was 69.23%. Glycine substitution with serine was the most prevalent among missense mutations. All qualitative mutations were situated in the chain domain of pro-α1/2 chains. Conclusion Our study shows that among the Estonian OI population, the range of collagen I mutations is quite high, which agrees with other described OI cohorts of Northern Europe. The Estonian OI cohort differs due to the high number of quantitative variants and simple missense variants, which are mostly Gly to Ser substitutions and do not extend the chain domain of COL1A1/2 products.

OI has many manifestations and is considered a group of disorders. Phenotypes range from mild osteopenia to severe deformities or even mortality. In 1979, Sillence described four OI types (I-IV) according to phenotype severity [5]. Recent updated classification distinguishes three additional types with specific histologies (V-VII) [4,7]. Genetic OI classification considers every OI gene as a separate OI type and so far includes OI types I-XVII [4,8,9].
The genetics of the disorder reflect the complexity of the OI phenotype range. Up to 21 different genes have been associated with occurrence of OI [10][11][12][13][14][15][16][17][18][19]. Previous studies have shown that the primary cause of OI are mutations in the COL1A1/2 genes, which code procollagen type I α1 and α2 chains, respectively [20]. Despite the approximately 1500 mutations already described in collagen type I genes, investigators continue to report novel mutations [21]. Moreover, there is still some controversy regarding the proportion of collagen mutations reported in different populations, which have ranged from 60 to 95% [9,10]. In this context, we believe that population-based studies of OI genetics might broaden current knowledge of collagen I mutations and OI.
Due to Estonia's small population (1.3 million) and centered treatment, follow-up, and research of all OI patients at the OI Center of the Traumatology and Orthopedics Clinic, Tartu University (TU) Hospital, it was possible to perform analysis of COL1A1/2 mutations among the whole Estonian OI population [22]. Herein, we describe for the first time the mutational spectrum of COL1A1/2 genes among 30 unrelated OI patients, from 30 Estonian OI families, which we estimate to constitutẽ 100% of OI cases in Estonia.

Subjects
The patients included in the study are treated and followed-up by the OI Center of the Traumatology and Orthopedics Clinic, TU Hospital. A total of 30 OI patients from 30 unrelated families were included in the study. Data regarding the OI types of the subjects were obtained from the medical records of TU Hospital [23]. All new OI cases across Estonia are registered by and treated at TU Hospital's OI Center. Thus, it can be estimated that as of May 2017, the current patient cohort represented~100% of the Estonian OI population.
No patient came from a consanguineous family. Mutational analysis of the COL1A1/2 genes was performed on a younger affected member of every OI family included in the study.
In accordance with the Declaration of Helsinki, all patients or their legal representatives signed an informed consent form prior to participation. The study was approved by the University of Tartu's Ethical Review Committee on Human Research (permit no. 221/M-34).

Genealogical description
Genealogical data of OI history in the family, consanguinity, and miscarriages was obtained from each patient or their representative. We constructed pedigree trees per kindred using the "Kinship2" package in R v3.3.2 [24].
Mutational analysis of the COL1A1/2 genes Genomic DNA (gDNA) was purified from 3 ml of ethylenediaminetetraacetic acid (EDTA) preserved whole blood samples-stored at −80°C-using a Gentra Puregene Blood Kit (Quiagen, Germany) following the manufacturer's protocol.
All statistical analyses were carried out with R v3.3.2. software (R Team, Austria) [32]. To assess the distribution of COL1A1/2 mutations and compare them to other studied OI populations, percentage differences were used.

Results
Mutational analysis of the COL1A1/2 genes of Estonian OI patients highlighted OI causative mutations in 26 of 30 patients (86.67%) (Fig. 1a). The number of patients harboring COL1A1 mutations was 20 (76.92%); COL1A2 mutations were found in 6 patients (23.08%) (Fig. 1b). A list of the mutations and their characteristics can be found in Table 1.
The number of novel mutations was 13/26 (50%) ( Table  1). Half of the COL1A1 and COL1A2 mutations appeared to be undescribed in the collagen type I mutation database. Patient EE26 had a heterozygous non-synonymous rs1800215 SNP (p.Ala1075Thr) in the COL1A1 gene, which was described before as a benign variant (data not shown). [33] Twenty-five mutations had an autosomal dominant inheritance pattern (Table 1). Of these, eight patients had no previous history of OI in the family. Thus, we assumed that their parents and relatives, who did not have any clinical features of OI, are not carriers of these mutations.
Patient EE07 had a recessive missense mutation. Mutational analysis showed that their parents are not carriers of the mutation, which confirmed the de novo nature of the mutation.
Missense mutations, associated with collagen I quality defects, were indicated in eight patients (30.77%), four in COL1A1 and four in COL1A2 genes. Of these, seven were Glycine substitutions (four of the COL1A2 and three of the COL1A1 missense mutations). In four cases, Glycine was substituted with Serine, two in the COL1A1 and COL1A2 genes, respectively (Fig. 2).
A c.3262G>T (COL1A1) mutation was detected in two patients (EE21 and EE22), who were thought to be unrelated ( Table 1). Investigation of the pedigree trees revealed a distant relationship between the families four generations back, of which the patients were not aware. Two identical splice site mutations at c.1821 + 1G>A (COL1A1) in intron 26 were identified in patients EE02 (type III OI) and EE05 (type I OI) ( Table 1). This mutation arose independently in the patients and caused phenotypes of different severity.

Discussion
Collagen I mutations were found in 26/30 (87%) studied OI patients. Previous findings have suggested collagen mutations ranging from 60 to 90% among different OI populations and study cohorts [25]. In a Finnish OI study, 90.7% of patients harbored collagen I mutations [34], which is higher than we found among Estonian OI patients. In Pollitt et al.'s study, collagen I mutations were revealed in 75% of OI patients [35], which is slightly lower than our Estonian cohort. Our data is in good agreement with research on the genetic epidemiology of the Swedish OI population, of which 87% had collagen I mutations [36]. The results of our study are also in concordance with Bardai et al.'s recent study of a large number (598) of OI individuals, where collagen type I mutations were found in 86% of OI patients of all types and different ethnic groups [19].
In some population studies, the amount of collagen I mutations were also lower. For example, in 51.4% of Taiwanese patients (N = 72), 52.2% of Korean patients (N = 67), and 59.4% of Vietnamese OI patients (N = 91) [25,37,38]. Due to the difficulties in arranging large cross-population studies of a rare disorder in populous countries, results can often be fragmented, which complicates population-wide estimates [39,40]. However, questions about the lower collagen type I mutational pattern of OI patients from Asian populations remain.
The proportions of COL1A1 and COL1A2 in Estonian, Finnish, and Swedish OI populations were surprisingly similar, 77 and 23%, 78 and 22%, and 79 and 21%, respectively [34,36]. Similar values were reported by Pollitt et al., where 77% of mutations occurred in the COL1A1 and 23% in the COL1A2 gene (N = 83) [35]. In a b Fig. 1 Diagram of collagen type I mutation distribution Estonian OI patients. a Percentage of patients with COL1A1/2 mutations vs patients negative for collagen I mutations. b Percentage of mutations in the COL1A1 and COL1A2 genes Bardai et al.'s 2016 study, 69% were COL1A1 and 31% COL1A2 mutations, which is similar to the beforementioned results [19].
The Estonian cohort also has a high proportion of quantitative mutations compared to qualitative collagen mutations, 69 and 31%, respectively. In the Finnish OI cohort, 67% of mutations were quantitative and 33% qualitative [34]. In the work of Pollitt et al., 35% of mutations were qualitative and 65% quantitative [35]. In the Swedish population, the proportions were almost equal (53 and 47%) [36]. Interestingly, we found only two quantitative mutations in the COL1A2 gene, which matches previous reports about comparatively lower numbers of quantitative mutations of this gene [34][35][36]. Due to the higher number of mutations leading to haploinsufficiency in the COL1A1 gene compared to the COL1A2 gene, patients harboring mutations in the COL1A1 gene had milder phenotypes (I, IV) compared to patients with COL1A2 mutations (type III, except EE20 who had a splice site mutation and OI type I).
Glycine substitutions composed the vast majority of missense mutations (7 of 8 cases), with serine being the most substituted amino acid (4 of 7 cases), which supports previous findings. Curiously, all missense mutations were situated in triple helical chain domains (aa residues 162-1218 α1; aa residues 80-1102 α2) of COL1A1/2 gene products. Only one mutation (patient EE07 with OI type II) altered the "lethal cluster" proposed by Marini et al. [41].
Half of the mutations (50%) we found appeared to be novel. Despite the numerous works on collagen I mutations and a growing list of identified mutations, the number of revealed novel variants was high, which underlines the individual nature of OI mutations [19,35,36]. Half of the glycine substitutions (4 of 7) were even absent from the collagen I mutational database. Despite sharing of the same mutation, patients may develop different phenotypes, as in the case of patients EE02 and EE05, who had type III and I OI, respectively. Genotype-phenotype correlations remain an unresolved issue in our understanding of OI. Cases of inter-and intra-familial OI diversity are not rare. Not only genetics, but additional factors, such as epigenetics and environment might contribute to the development of specific OI phenotypes. This leads to many questions and the need to further investigate potential OI factors.
Sanger sequencing is a powerful and accurate method of mutational analysis and allows the identification of frameshift, and missense and nonsense mutations in the coding regions of genes. Moreover, due to the special design of the primers distant from intron-exon junction regions, we could asses splice site mutations of the COL1A1/2 genes, which are the cause of quantitative collagen defects. However, the current study had some limitations. We could not identify whole gene or exon deletions and duplications, which could have slightly reduced the number of discovered COL1A1/2 mutations. In addition, due to the small population size of Estonia, our cohort was limited. We cannot exclude the possibility that the small sample size might be the cause of differences compared to the results of other studies.

Conclusion
This paper has described the mutational spectrum of COL1A1/2 genes among 30 Estonian OI patients, which were estimated to represent~100% of OI families in Estonia at the time. We identified collagen I mutations in 87% of Estonian OI families. The number of a b c Fig. 2 a Mutational spectrum of COL1A1/2 mutations in Estonian OI patients. Distribution of the COL1A1 (b) and COL1A2 (c) mutations according to mutation type quantitative mutations (69%) was high compared to other European OI cohorts. All missense mutations of our Estonian patients altered the triple helical chain domain of α1 and α2 procollagen chains. One mutation was situated in the lethal cluster. A normal distribution of novel collagen mutations (50%) among the COL1A1 (77%) and COL1A2 (23%) genes, and mostly glycine substitutions were observed, compared to other OI cohorts of Northern Europe. Four patients that showed no collagen type I mutations will be further studied using whole exome sequencing analysis to identify disease causing variants.

Availability of data and materials
The datasets used and analyzed during the current study are available from the corresponding author upon reasonable request.
Authors' contributions LZ conceived the study, carried out the genetic studies, performed the data analysis, participated in the design of the study, and drafted the manuscript. EP, SK, and ER carried out the genetic studies, performed the data analysis, and helped to draft the manuscript. KM participated in the design of the study, interacted with the patients, coordinated the blood sample collection, and helped to draft the manuscript and perform analysis. SK and AM participated in the designing of the study, coordinated the data interpretation and statistical analysis, and helped to draft the manuscript. All authors read and approved the final manuscript.

Ethics approval and consent to participate
The current study was conducted in accordance with the Helsinki Declaration and received approval from the Ethical Review Committee on Human Research of the University of Tartu (permit no. 221/M-34). Informed written consent from the patients or their legal representatives were obtained prior to inclusion to the study.

Consent for publication
Not applicable.