Corander J, Waldmann P, Sillanpaa MJ: Bayesian analysis of genetic differentiation between populations. Genetics. 2003, 163: 367-374.
Patterson N, Hattangadi N, Lane B, et al: Methods for high-density admixture mapping of disease genes. Am J Hum Genet. 2004, 74: 979-1000. 10.1086/420871.
Pritchard JK, Stephens M, Donnelly P: Inference of population structure using multilocus genotype data. Genetics. 2000, 155: 945-959.
Rosenberg NA, Burke T, Elo K, et al: Empirical evaluation of genetic clustering methods using multilocus genotypes from 20 chicken breeds. Genetics. 2001, 159: 699-713.
Rosenberg NA, Pritchard JK, Weber JL, et al: Genetic structure of human populations. Science. 2002, 298: 2381-2385. 10.1126/science.1078311.
Lander ES, Schork NJ: Genetic dissection of complex traits. Science. 1994, 265: 2037-2048. 10.1126/science.8091226.
Risch NJ: Searching for genetic determinants in the new millennium. Nature. 2000, 405: 847-856. 10.1038/35015718.
Kim JJ, Verdu P, Pakstis AJ, et al: Use of autosomal loci for clustering individuals and populations of East Asian origin. Hum Genet. 2005, 117: 511-519. 10.1007/s00439-005-1334-8.
Dawson KJ, Belkhir K: A Bayesian approach to the identification of panmictic populations and the assignment of individuals. Genet Res. 2001, 78: 59-77.
Falush D, Stephens M, Pritchard JK: Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics. 2003, 164: 1567-1587.
Satten GA, Flanders WD, Yang Q: Accounting for unmeasured population substructure in case-control studies of genetic association using a novel latent-class model. Am J Hum Genet. 2001, 68: 466-477. 10.1086/318195.
Tang H, Peng J, Wang P, Risch NJ: Estimation of individual admixture: Analytical and study design considerations. Genet Epidemiol. 2005, 28: 289-301. 10.1002/gepi.20064.
Ding C, He X, Zha H, Simon HD: Adaptive dimension reduction for clustering high dimensional data. Proc 2nd IEEE Intl Conf Data Mining. 2002, 147-154.
Landauer TK, Foltz PW, Laham D: Introduction to latent semantic analysis. Discourse Process. 1998, 25: 259-284. 10.1080/01638539809545028.
Dempster AP, Laird NM, Rubin DB: Maximum likelihood from incomplete data via the EM algorithm (with discussion). J Roy Stat Soc Ser B. 1977, 39: 1-38.
Moore AW: Very fast EM-based mixture model clustering using multiresolution kd-trees. Advances in Neural Information Processing Systems. Edited by: Kearns M, Solla S, Cohn D. 1999, MIT Press, Cambridge, MA, 543-549.
Bartell BT, Cottrell GW, Belew RK: Latent semantic indexing is an optimal special case of multidimensional scaling. Proc SIGIR'92 Research and Development in Information Retrieval. 1992, 161-167.
Ding CH: A probabilistic model for dimensionality reduction in information retrieval and filtering. Proc 1st SIAM Computational Information Retrieval Workshop. 2000
Dhillon IS, Modha DS: Concept decompositions for large sparse text data using clustering. Machine Learning. 2001, 42: 143-175. 10.1023/A:1007612920971.
Golub G, Van Loan C: Matrix Computations. 1996, Johns Hopkins University Press, Baltimore, MD
Figueiredo M, Jain AK: Unsupervised learning of finite mixture models. IEEE Trans Pattern Anal Mach Intell. 2002, 24: 381-396.
Celeux G, Chrétien S, Forbes F, Mkhadri A: A component-wise EM algorithm for mixtures. J Comput Graph Stat. 2001, 10: 699-712.
Zhu X, Zhang S, Zhao H, Cooper RS: Association mapping, using a mixture model for complex traits. Genet Epidemiol. 2002, 23: 181-196. 10.1002/gepi.210.
Alizadeh AA, Eisen MB, Davis RE, et al: Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature. 2000, 403: 503-511. 10.1038/35000501.
Alter O, Brown PO, Botstein D: Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci USA. 2000, 97: 10101-10106.
Bishop CM: Variational principal components. Proc 9th International Conference on Artificial Neural Networks. 1999, 1: 509-514.
Brand ME: Incremental singular value decomposition of uncertain data with missing values. European Conference on Computer Vision (ECCV). 2002, 2350: 707-720.
Chan KL, Lee TW, Sejnowski TJ: Handling missing data with variational learning of ICA. Advances in Neural Information Processing Systems. Edited by: Michael J, Kearns MN, Solla SA. 2003, MIT Press, Cambridge, MA, 415-420.
Roweis S: EM algorithms for PCA, SPCA. Advances in Neural Information Processing Systems. Edited by: Michael J, Kearns MN, Solla SA. 1998, MIT Press, Cambridge, MA
Butte AJ, Ye J, Haring HU, et al: Determining significant fold differences in gene expression analysis. Pac Symp Biocomput. 2001, 6-17.
Sen S, Churchill GA: A statistical framework for quantitative trait mapping. Genetics. 2001, 159: 371-387.
Broman KW, Wu H, Sen S, Churchill GA: R/qtl: QTL mapping in experimental crosses. Bioinformatics. 2003, 19: 889-890. 10.1093/bioinformatics/btg112.
Ando RK: Latent semantic space: Iterative scaling improves precision of inter-document similarity measurement. Proc 23rd SIGIR. 2000, 216-223.
Troyanskaya O, Cantor M, Sherlock G, et al: Missing value estimation methods for DNA microarrays. Bioinformatics. 2001, 17: 520-525. 10.1093/bioinformatics/17.6.520.
The International HapMap Consortium: The international HapMap project. Nature. 2003, 426: 789-796. 10.1038/nature02168.
Rosenberg NA, Li LM, Ward R, Pritchard JK: Informativeness of genetic markers for inference of ancestry. Am J Hum Genet. 2003, 73: 1402-1422. 10.1086/380416.
Hudson RR: Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics. 2002, 18: 337-338. 10.1093/bioinformatics/18.2.337.
Wall ME, Rechtsteiner A, Rocha LM: Singular value decomposition and principal component analysis. A Practical Approach to Microarray Data Analysis. Edited by: Berrar DP, Dubitzky W, Granzow M. 2003, Kluwer, Norwell, MA
Nakov P, Popova A, Mateev P: Weight functions impact on LSA performance. Proc RANLP. 2001, 187-193.
Liu N, Chen L, Wang S, et al: Comparison of single-nucleotide polymorphisms and microsatellites in inference of population structure. BMC Genet. 2005, 6: S26. 10.1186/1471-2156-6-S1-S26.
Nascimento S, Mirkin B, Moura-Pires F: Modeling proportional membership in fuzzy clustering. IEEE Trans Fuzzy Syst. 2003, 11: 173-186. 10.1109/TFUZZ.2003.809889.
Carreira-Perpinan M: A review of dimension reduction techniques. Technical Report CS-96-09, Department of Computer Science, University of Sheffield, Sheffield, UK. 1997
Collins M, Dasgupta S, Schapire RE: A generalization of principal component analysis to the exponential family. Proc NIPS. 2001, 617-624.
Schein A, Saul L, Ungar L: A generalized linear model for principal component analysis of binary data. Proc of the Ninth International Workshop on Artificial Intelligence and Statistics. 2003
Tipping M: Probabilistic visualisation of high-dimensional binary data. Advances in Neural Information Processing Systems. Edited by: Michael J, Kearns MN, Solla SA. 1999, MIT Press, Cambridge, MA, 592-598.
Jornsten R, Yu B: Simultaneous gene clustering and subset selection for sample classification via MDL. Bioinformatics. 2003, 19: 1100-1109. 10.1093/bioinformatics/btg039.
Xu G, Zha H, Golub G, Kailath T: Fast algorithms for updating signal subspaces. IEEE Trans Circuits Syst II: Analog and Digital Signal Processing. 1994, 41: 537-549. 10.1109/82.318942.
Minka T: Automatic choice of dimensionality for PCA. Proc of NIPS. 2000, 598-604.
Hastie T, Tibshirani R, Eisen MB, et al: "Gene shaving" as a method for identifying distinct sets of genes with similar expression patterns. Genome Biol. 2000, 1: 3.1-3.21.
MacQueen J: Some methods for classification and analysis of multivariate observations. Proc 5th Berkeley Symp. 1967, 1: 281-297.
Calinski T, Harabasz J: A dendrite method for cluster analysis. Communications in Statistics. 1974, 3: 1-27.
Pollard D: Quantization and the method of k-means. IEEE Trans Inform Theory. 1982, 28: 199-205. 10.1109/TIT.1982.1056481.
Fridlyand J, Dudoit S: Application of resampling methods to estimate the number of clusters and to improve the accuracy of a clustering method. Technical Report 600, Department of Statistics, University of California, Berkeley, CA. 2001
Tibshirani R, Walther G, Hastie T: Estimating the number of clusters in a data set via the gap statistic. J Roy Stat Soc Ser B. 2001, 63: 411-423. 10.1111/1467-9868.00293.