Table 2 Re-assignment characteristics of unknown samples to reference clusters defined in the hierarchical decomposition of a dataset by structure

From: Inference of ancestry: constructing hierarchical reference populations and assigning unknown individuals

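| Level of resolution | Raw re-assignment: No. clusters | Raw re-assignment: Median cluster size | Raw re-assignment: RASRᵃ | Raw re-assignment: No. clusters > 90% RASR | Criterion | R² | Post-clustering adjustment re-assignment: No. clusters | Post-clustering adjustment re-assignment: Median cluster size | Post-clustering adjustment re-assignment: RASR | Post-clustering adjustment re-assignment: No. clusters > 90% RASR |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1,056 | 1.000 | - | - | - | - | - | - | - |
| 2 | 7 | 86 | 0.995 | 6 | IMMᵇ | 0.900 | 7 | 86 | 0.999 | 6 |
|   |   |   |   |   | Cluster size | 0.427 |   |   |   |   |
| 3 | 30 | 19 | 0.939 | 21 | IMM | 0.931 | 29 | 18 | 0.960 | 23 |
|   |   |   |   |   | Cluster size | 0.216 |   |   |   |   |
| 4 | 78 | 4 | 0.818 | 40 | IMM | 0.943 | 72 | 3 | 0.823 | 39 |
|   |   |   |   |   | Cluster size | 0.296 |   |   |   |   |
| 5 | 113 | 3 | 0.633 | 47 | IMM | 0.898 | 87 | 3 | 0.700 | 42 |
|   |   |   |   |   | Cluster size | 0.662 |   |   |   |   |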
  1. Re-assignment success rates were high over the first several levels of resolution (0.995 at level 2 and 0.939 at level 3 before adjustment). The decrease in re-assignment success at deeper levels of the hierarchy was primarily attributed to an increase in the number of samples exhibiting mixed membership among multiple clusters. Despite the decline in overall success, a large proportion of the clusters defined at each tier had re-assignment success rates above 90%. These high-performance clusters have the potential to provide highly informative assignments for truly unknown individuals in ancestry testing procedures. Additionally, post-clustering adjustments, in which certain outlier samples were systematically relocated, increased success rates at every level of resolution where they were applied (levels 2 through 5).
  2. a RASR = Re-assignment success rate
  3. b IMM = Individual mixed membership (in multiple clusters)
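
The two summary quantities reported per level, the overall RASR and the number of clusters with RASR above 90%, can be illustrated with a minimal sketch. It assumes that RASR is the fraction of samples re-assigned to the cluster in which they were originally placed; the source does not spell out the calculation, so this definition, the function name `reassignment_summary`, and the toy labels are all hypothetical and are not the authors' procedure or data.

```python
from collections import Counter


def reassignment_summary(original, reassigned, threshold=0.90):
    """Return (overall RASR, per-cluster RASR, no. clusters above threshold).

    Assumption: a sample counts as a success when its re-assigned cluster
    equals its original cluster.
    """
    if not original or len(original) != len(reassigned):
        raise ValueError("label lists must be non-empty and of equal length")
    # Number of samples originally placed in each cluster.
    totals = Counter(original)
    # Number of samples per cluster that were re-assigned to the same cluster.
    hits = Counter(o for o, r in zip(original, reassigned) if o == r)
    per_cluster = {c: hits[c] / n for c, n in totals.items()}
    overall = sum(hits.values()) / len(original)
    n_high = sum(1 for rate in per_cluster.values() if rate > threshold)
    return overall, per_cluster, n_high


# Toy labels invented for illustration only (not data from the study).
original = ["A", "A", "A", "A", "B", "B", "C", "C", "C", "C"]
reassigned = ["A", "A", "A", "A", "B", "C", "C", "C", "C", "B"]
overall, per_cluster, n_high = reassignment_summary(original, reassigned)
print(overall, per_cluster, n_high)
```

In this toy case only cluster A exceeds the 90% threshold, so the overall rate can be moderate while an individual cluster still re-assigns perfectly, which is the pattern the table notes describe at deeper levels.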