Corander J, Waldmann P, Sillanpaa MJ: Bayesian analysis of genetic differentiation between populations. Genetics. 2003, 163: 367-374.

PubMed Central
CAS
PubMed
Google Scholar

Patterson N, Hattangadi N, Lane B, et al: Methods for high-density admixture mapping of disease genes. Am J Hum Genet. 2004, 74: 979-1000. 10.1086/420871.

Article
PubMed Central
CAS
PubMed
Google Scholar

Pritchard JK, Stephens M, Donnelly P: Inference of population structure using multilocus genotype data. Genetics. 2000, 155: 945-959.

PubMed Central
CAS
PubMed
Google Scholar

Rosenberg NA, Burke T, Elo K, et al: Empirical evaluation of genetic clustering methods using multilocus genotypes from 20 chicken breeds. Genetics. 2001, 159: 699-713.

PubMed Central
CAS
PubMed
Google Scholar

Rosenberg NA, Pritchard JK, Weber JL, et al: Genetic structure of human populations. Science. 2002, 298: 2381-2385. 10.1126/science.1078311.

Article
CAS
PubMed
Google Scholar

Lander ES, Schork NJ: Genetic dissection of complex traits. Science. 1994, 265: 2037-2048. 10.1126/science.8091226.

Article
CAS
PubMed
Google Scholar

Risch NJ: Searching for genetic determinants in the new millennium. Nature. 2000, 405: 847-856. 10.1038/35015718.

Article
CAS
PubMed
Google Scholar

Kim JJ, Verdu P, Pakstis AJ, et al: Use of autosomal loci for clustering individuals and populations of East Asian origin. Hum Genet. 2005, 117: 511-519. 10.1007/s00439-005-1334-8.

Article
PubMed
Google Scholar

Dawson KJ, Belkhir K: A Bayesian approach to the identification of panmictic populations and the assignment of individuals. Genet Res. 2001, 78: 59-77.

Article
CAS
PubMed
Google Scholar

Falush D, Stephens M, Pritchard JK: Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics. 2003, 164: 1567-1587.

PubMed Central
CAS
PubMed
Google Scholar

Satten GA, Flanders WD, Yang Q: Accounting for unmeasured population substructure in case-control studies of genetic association using a novel latent-class model. Am J Hum Genet. 2001, 68: 466-477. 10.1086/318195.

Article
PubMed Central
CAS
PubMed
Google Scholar

Tang H, Peng J, Wang P, Risch NJ: Estimation of individual admixture: Analytical and study design considerations. Genet Epidemiol. 2005, 28: 289-301. 10.1002/gepi.20064.

Article
PubMed
Google Scholar

Ding CH, X H, H Z, H S: Adaptive dimension reduction for clustering high dimensional data. Proc 2nd IEEE Intl Conf Data Mining. 2002, 147-154.

Google Scholar

Landauer TK, Foltz PW, Laham D: Introduction to latent semantic analysis. Discourse Process. 1998, 25: 259-284. 10.1080/01638539809545028.

Article
Google Scholar

Dempster AP, Laird NM, Rubin DB: Maximum likelihood for incomplete data via the EM algorithm (with discussion). J Roy Stat Soc Ser B. 1977, 39: 1-38.

Google Scholar

Moore AW: Very fast EM-based mixture model clustering using multiresolution kd-trees. Advances in Neural Information Processing Systems. Edited by: Kearns M, Solla S, Cohn D. 1999, MIT Press, Cambridge, MA, 543-549.

Google Scholar

Bartell BT, Cottrell GW, Belew RK: Latent semantic indexing is an optimal special case of multidimensional scaling. Proc SIGIR'92 Research and Development in Information Retrieval. 1992, 161-167.

Chapter
Google Scholar

Ding CH: A probabilistic model for dimensionality reduction in information retrieval and filtering. Proc 1st SIAM Computational Information Retrieval Workshop. 2000

Google Scholar

Dhillon ID, Modha DS: Concept decomposition for large sparse text data using clustering. Machine Learning. 2001, 42: 143-175. 10.1023/A:1007612920971.

Article
Google Scholar

Golub G, Van Loan C: Matrix Computations. 1996, Johns Hopkins University Press, Baltimore, MD

Google Scholar

Figueiredo M, Jain AK: Unsupervised learning of finite mixture models. IEEE Trans Pattern Anal Mach Intell. 2002, 24: 381-396.

Article
Google Scholar

Celeux G, Chrétien S, Forbes F, Mkhadri A: A component-wise EM algorithm for mixtures. J Comput Graph Stat. 2001, 10: 699-712.

Article
Google Scholar

Zhu X, Zhang S, Zhao H, Cooper RS: Association mapping, using a mixture model for complex traits. Genet Epidemiol. 2002, 23: 181-196. 10.1002/gepi.210.

Article
PubMed
Google Scholar

Alizadeh AA, Eisen MB, Davis RE, et al: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000, 403: 503-511. 10.1038/35000501.

Article
CAS
PubMed
Google Scholar

Alter O, Brown PO, Botstein D: Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA. 2000, 97: 10101-10106.

Article
PubMed Central
CAS
PubMed
Google Scholar

Bishop CM: Variational principle components. Proc 9th International Conference on Artificial Neural Networks. 1999, 1: 509-514.

Google Scholar

Brand ME: Incremental singular value decomposition of uncertain data with missing values. European Conference on Computer Vision (ECCV). 2002, 2350: 707-720.

Google Scholar

Chan KL, Lee TW, Sejnowski TJ: Handling missing data with variational learning of ICA. Advances in Neural Information Processing Systems. Edited by: Michael J, Kearns MN, Solla SA. 2003, MIT Press, Cambridge, MA, 415-420.

Google Scholar

Roweis S: EM algorithms for PCA, SPCA. Advances in Neural Information Processing Systems. Edited by: Michael J, Kearns MN, Solla SA. 1998, MIT Press, Cambridge, MA

Google Scholar

Butte AJ, Ye J, Haring HU, et al: Determining significant fold differences in gene expression analysis. Pac Symp Biocomput. 2001, 6-17.

Google Scholar

Sen S, Churchill GA: A statistical framework for quantitative trait mapping. Genetics. 2001, 159: 371-387.

PubMed Central
CAS
PubMed
Google Scholar

Broman KW, Wu H, Sen S, Churchill GA: R/qtl: QTL mapping in experimental crosses. Bioinformatics. 2003, 19: 889-890. 10.1093/bioinformatics/btg112.

Article
CAS
PubMed
Google Scholar

Ando RK: Latent semantic space: Iterative scaling improves precision of inter-document similarity measurement. Proc 23rd SIGIR. 2000, 216-223.

Google Scholar

Troyanskaya O, Cantor M, Sherlock G, et al: Missing value estimation methods for DNA microarrays. Bioinformatics. 2001, 17: 520-525. 10.1093/bioinformatics/17.6.520.

Article
CAS
PubMed
Google Scholar

The International HapMap Consortium: The international HapMap project. Nature. 2003, 426: 789-796. 10.1038/nature02168.

Article
Google Scholar

Rosenberg NA, Li LM, Ward R, Pritchard JK: Informativeness of genetic markers for inference of ancestry. Am J Hum Genet. 2003, 73: 1402-1422. 10.1086/380416.

Article
PubMed Central
CAS
PubMed
Google Scholar

Hudson RR: Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics. 2002, 18: 337-338. 10.1093/bioinformatics/18.2.337.

Article
CAS
PubMed
Google Scholar

Wall ME, Rechtsteiner A, Rocha LM: Singular value decomposition and principal component analysis. A Practical Approach to Microarray Data Analysis. Edited by: Berrar DP, Dubitzky W, Kluwer MG. 2003, Norwell, MA

Google Scholar

Nakov P, Popova A, Mateev P: Weight functions impact on LSA performance. Proc RANLP. 2001, 187-193.

Google Scholar

Liu N, Chen L, Wang S, et al: Comparison of single-nucleotide polymorphisms and microsatellites in inference of population structure. BMC Genet. 2005, 6: S26-10.1186/1471-2156-6-S1-S26.

Article
PubMed Central
PubMed
Google Scholar

Nascimento S, Mirkin B, Moura-Pires F: Modeling proportional membership in fuzzy clustering. IEEE Trans Fuzzy Syst. 2003, 11: 173-186. 10.1109/TFUZZ.2003.809889.

Article
Google Scholar

Carreira-Perpinan M: A review of dimension reduction techniques. Technical Report CS-96-09, Department of Computer Science, University of Sheffield, Sheffield, UK. 1997

Google Scholar

Collins M, Dasgupta S, Schapire RE: A generalization of principal component analysis to the exponential family. Proc NIPS. 2001, 617-624.

Google Scholar

Schein A, Saul L, Ungar L: A generalized linear model for principal component analysis of binary data. Proc of the Ninth International Workshops Artificial Intelligence and Statistics. 2003

Google Scholar

Tipping M: Probabilistic visualisation of high-dimensional binary data. Advances in Neural Information Processing Systems. Edited by: Michael J, Kearns MN, Solla SA. 1999, MIT Press, Cambridge, MA, 592-598.

Google Scholar

Jornsten R, Yu B: Simultaneous gene clustering and subset selection for sample classification via MDL. Bioinformatics. 2003, 19: 1100-1109. 10.1093/bioinformatics/btg039.

Article
PubMed
Google Scholar

Xu G, Zha H, Golub G, Kailath T: Fast algorithms for updating signal subspaces. IEEE Trans Circuits Syst II: Analog and Digital Signal Processing. 1994, 41: 537-549. 10.1109/82.318942.

Article
Google Scholar

Minka T: Automatic choice of dimensionality for PCA. Proc of NIPS. 2000, 598-604.

Google Scholar

Hastie T, Tibshirani R, Eisen MB, et al: "Gene shaving" as a method for identifying distinct sets of genes with similar expression patterns. Genome Biol. 2000, 1: 3.1-3.21.

Article
Google Scholar

MacQueen J: Some methods for classification and analysis of multivariate observations. Proc 5th Berkeley Symp. 1967, 1: 281-297.

Google Scholar

Calinski RB, Harabasz J: A dendrite method for cluster analysis. Communications in Statistics. 1974, 3: 1-27.

Article
Google Scholar

Pollard D: Quantization and the method of k-means. IEEE Trans Inform Theory. 1982, 28: 199-205. 10.1109/TIT.1982.1056481.

Article
Google Scholar

Fridlyand J, Dudoit S: Application of resampling methods to estimate the number of clusters and to improve the accuracy of a clustering method. Technical Report 600, Department of Statistics, University of California, Berkeley, CA. 2001

Google Scholar

Tibshirani R, Walther G, Hastie T: Estimating the number of clusters in a data set via the gap statistic. JR Stat Soc Ser B. 2001, 63: 411-423. 10.1111/1467-9868.00293.

Article
Google Scholar