
Table 2 Re-assignment characteristics of unknown samples to reference clusters defined in the hierarchical decomposition of a dataset by structure

From: Inference of ancestry: constructing hierarchical reference populations and assigning unknown individuals

  

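Column groups: Raw = raw re-assignment; Adjusted = post-clustering adjustment re-assignment.

| Level of resolution | Raw: No. clusters | Raw: Median cluster size | Raw: RASR (a) | Raw: No. clusters > 90% RASR | Criterion | R² | Adjusted: No. clusters | Adjusted: Median cluster size | Adjusted: RASR | Adjusted: No. clusters > 90% RASR |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 1,056 | 1.000 | - | - | - | - | - | - | - |
| 2 | 7 | 86 | 0.995 | 6 | IMM (b) | 0.900 | 7 | 86 | 0.999 | 6 |
| | | | | | Cluster size | 0.427 | | | | |
| 3 | 30 | 19 | 0.939 | 21 | IMM | 0.931 | 29 | 18 | 0.960 | 23 |
| | | | | | Cluster size | 0.216 | | | | |
| 4 | 78 | 4 | 0.818 | 40 | IMM | 0.943 | 72 | 3 | 0.823 | 39 |
| | | | | | Cluster size | 0.296 | | | | |
| 5 | 113 | 3 | 0.633 | 47 | IMM | 0.898 | 87 | 3 | 0.700 | 42 |
| | | | | | Cluster size | 0.662 | | | | |
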
  1. Re-assignment success rates were high over the first several levels of resolution. The decline in re-assignment success at deeper levels of the hierarchy was attributed primarily to an increase in the number of samples exhibiting mixed membership among multiple clusters. Despite the overall decline, a large proportion of the clusters defined at each tier re-assigned with high success. These high-performance clusters have the potential to provide highly informative assignments for truly unknown individuals in ancestry testing. Additionally, post-clustering adjustments, in which certain outlier samples were systematically relocated, increased success rates at every level of resolution where they were applied (levels 2 to 5).
  2. a RASR = Re-assignment success rate
  3. b IMM = Individual mixed membership (in multiple clusters)
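
The two summary quantities reported per level, RASR and the number of clusters with RASR above 90%, can be illustrated with a minimal sketch. It assumes that a sample counts as successfully re-assigned when it returns to the cluster it was originally placed in. The source does not spell out this definition, and the function name `reassignment_summary` and the toy labels below are hypothetical, not taken from the article.

```python
from collections import defaultdict


def reassignment_summary(original, reassigned, threshold=0.90):
    """Summarise re-assignment success for one level of resolution.

    Assumption (not stated in the source): a sample is a success when its
    re-assigned cluster label equals its original cluster label.
    Returns (overall RASR, per-cluster RASR dict, no. clusters above threshold).
    """
    if len(original) != len(reassigned):
        raise ValueError("label lists must have the same length")
    if not original:
        raise ValueError("at least one sample is required")

    totals = defaultdict(int)  # samples originally in each cluster
    hits = defaultdict(int)    # samples that returned to their own cluster
    for orig_label, new_label in zip(original, reassigned):
        totals[orig_label] += 1
        if orig_label == new_label:
            hits[orig_label] += 1

    per_cluster = {c: hits[c] / totals[c] for c in totals}
    overall = sum(hits.values()) / len(original)
    n_high = sum(1 for rate in per_cluster.values() if rate > threshold)
    return overall, per_cluster, n_high


# Toy labels for illustration only; these are not data from the study.
original = ["A", "A", "A", "A", "B", "B", "C", "C", "C", "C"]
reassigned = ["A", "A", "A", "A", "B", "C", "C", "C", "C", "A"]
overall, per_cluster, n_high = reassignment_summary(original, reassigned)
print(overall, per_cluster, n_high)
```

Under this reading, a level can show a modest overall RASR while still containing many individual clusters above the 90% threshold, which is the pattern the table reports at the deeper levels.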