A comprehensive literature review of haplotyping software and methods for use with unrelated individuals

Table 4 Description of programs designed for pooled samples.

Program name	Algorithm	Output^a	Missing data^b	Assumptions^c	Key features	Limitations	Pool size, max. no. of loci, marker type	Platform	Ref.^d
Pools2	Clark's/EM	HF/HA	N/A	None	Haplotype-tagging SNPs	Computationally slow	Pools of 2 individuals, practical limit, biallelic	PC	[117]
					Accommodates a large number of SNPs	Must be re-run several times to ensure consistent results
						EM issues
LDPooled	EM	HF/HA	No	HWE	Calculates LD	LD impacts performance	Based on pools of 4 individuals, practical limit, biallelic	*	[96]
					SNPs or microsatellites	EM issues
EHP.R	EM	HF	Yes	HWE	Tests haplotype-disease association	Variance increases with larger pool size, weaker LD and more loci	Pools of 4 individuals, practical limit, biallelic	PC/UNIX	[98]
					Assessment of haplotype frequency estimate accuracy	EM issues
					Handles different types of missing data	Requires knowledge of S-Plus 6.0 or R

^a Program output: individual haplotype assignment (HA), haplotype frequency estimates (HF) or both.
^b Ability of program to accept missing data.
^c Program assumptions.
^d List of references.
*Could not determine from available data.
EM: Expectation maximisation algorithm; EM issues: May be sensitive to HWE departures, may have long run times, and may converge to a local rather than the global maximum (requiring multiple restarts); HF: Haplotype frequency estimate; HA: Individual haplotype assignment; HWE: Hardy-Weinberg equilibrium; LD: Linkage disequilibrium; PC: IBM compatible personal computer; UNIX: Runs on Unix operating systems, including Linux, Solaris and others.
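
To make the "EM issues" entry concrete, the following is a minimal sketch of EM haplotype frequency estimation from pooled data, with random restarts. It is not the implementation used by Pools2, LDPooled or EHP.R. It assumes two biallelic loci, a known and equal number of chromosomes per pool, error-free allele counts per pool, and random sampling of haplotypes (HWE). All function names are hypothetical.

```python
import random
from math import factorial, log

# Haplotype order used throughout: index 0..3 = "00", "01", "10", "11"
# (first digit: allele at locus A, second digit: allele at locus B).

def compatible_configs(a, b, n_chrom):
    """Haplotype count tuples (n00, n01, n10, n11) matching a pool's allele counts.

    a, b: number of '1' alleles observed in the pool at locus A and locus B.
    """
    lo = max(0, a + b - n_chrom)
    hi = min(a, b)
    return [(n_chrom - a - b + n11, b - n11, a - n11, n11)
            for n11 in range(lo, hi + 1)]

def multinomial_prob(counts, freqs):
    """Multinomial probability of the haplotype counts given the frequencies."""
    coef = factorial(sum(counts))
    prob = 1.0
    for n, p in zip(counts, freqs):
        coef //= factorial(n)
        prob *= p ** n
    return coef * prob

def log_likelihood(pools, n_chrom, freqs):
    """Log-likelihood of the observed pool allele counts."""
    return sum(
        log(sum(multinomial_prob(c, freqs)
                for c in compatible_configs(a, b, n_chrom)))
        for a, b in pools
    )

def em_pooled(pools, n_chrom, start, n_iter=200):
    """Run a fixed number of EM iterations from one starting point."""
    freqs = list(start)
    for _ in range(n_iter):
        expected = [0.0] * 4
        for a, b in pools:
            # E-step: weight each compatible configuration by its posterior.
            configs = compatible_configs(a, b, n_chrom)
            weights = [multinomial_prob(c, freqs) for c in configs]
            total = sum(weights)
            for c, w in zip(configs, weights):
                for h in range(4):
                    expected[h] += c[h] * w / total
        # M-step: expected haplotype counts divided by total chromosomes.
        freqs = [e / (n_chrom * len(pools)) for e in expected]
    return freqs

def em_with_restarts(pools, n_chrom, n_restarts=5, seed=0):
    """Repeat EM from random starts and keep the highest-likelihood solution."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        raw = [rng.random() + 0.01 for _ in range(4)]  # strictly positive start
        start = [x / sum(raw) for x in raw]
        freqs = em_pooled(pools, n_chrom, start)
        ll = log_likelihood(pools, n_chrom, freqs)
        if best is None or ll > best[1]:
            best = (freqs, ll)
    return best
```

For pools of 2 individuals (4 chromosomes), each pool is passed as a pair of allele counts, for example `em_with_restarts([(2, 2), (1, 1), (3, 2)], n_chrom=4)`, which returns the estimated frequencies and their log-likelihood. The number of compatible configurations grows with pool size and number of loci, which is one reason the programs in the table list practical limits.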

ISSN: 1479-7364