Book Reviews

Cambridge University Press has done the authors proud. The book is superbly produced: beautiful print, high-grade paper, and seventy-six illustrations of outstanding quality which are an essential, an integral, part of the work as a whole. By today's standards £40 is a moderate price for an academic book of such high quality.

Here is a book that gives both more and less than its title suggests. This is neither a study of medieval plant names (the introduction, pp. 1-23, reprinted with slight changes from Ber. Physico-Medica 1981/83, is a sketch of the difficulties involved in identifying medieval herbs) nor a proper semantic and lexicographical investigation of the ways in which plants were named. Nor is it a complete listing of medieval plant names, with all their variants, an almost impossible task, although one now facilitated by Daems' comprehensive indexes of Latin and vernacular names that form part III of the book. Instead, Daems has chosen to use as his base two largely complementary manuscript lists of synonyms, Basel, Universitätsbibliothek D II 13, fols 2r-9r, 1402, and Kassel, Landesbibliothek 40 med. 10, fols 81r-83r, early fifteenth century, the first with 488 plant names, the second 270. Each plant name is followed by a list of the synonyms found in a variety of other manuscripts and editions, and concludes with modern plant identifications. An appendix of 61 pages adds a further series of synonyms for the plants listed earlier, a confusing procedure for which it is hard to see a convincing justification in an age of computer typesetting. Access to the material is helped by a good index of sources and of modern botanical names. Checking Daems' listings of Wellcome manuscripts 332, 625, 642, and 708 confirms the general accuracy of the transcriptions (p. 91 has a rare error: artemisia in Wellcome 642 is glossed as biboj3 vel buck), but throws up other problems. Not all the synonyms in these manuscripts are included by Daems, e.g., p. 101, s.v. aristologia, add Wellcome 708, 43r, and many of them are included in the Wellcome glossaries under different headings, e.g., p. 109, WMS 708 glosses the word "urtica", not "acantum" as the reader might expect; p. 113, WMS 708, 43v has a variety of synonyms but divided between ambrosia minor and maior. The editor's silence should thus not be taken to indicate that a particular plant is not included in a named manuscript …

The aim of 'Statistics for Microarrays' is to explain the statistical methods commonly used for microarray analysis. The book is divided into two parts: the first, 'Getting Good Data', focuses on experimental design, normalisation and quality control issues. The second, 'Getting Good Answers', deals with higher level statistical inferences about the biological questions of interest, such as clustering samples or genes; assessing differential expression; and classification and prediction. The book opens with a useful section describing a number of experiments and associated datasets that are used throughout the book to illustrate the different stages in an analysis of a microarray dataset. An additional special feature is the inclusion of descriptions of R-language functions. Some of these are existing functions in standard R libraries, others are implementations by the authors of new methods developed in the book. The new functions are included in an R-package 'smida', which, along with the datasets, is made available online at the website for the book.
The book represents one of the first attempts to present a coherent exposition of the field of microarray data analysis, and is written in a clear and readable style. Although not a reference work, it has been written in such a way that one should not have to read the earlier chapters in order to understand the later ones. This idea that chapters or sections should be self-contained has been somewhat overdone and results in some statistical methods being explained more than once. For example, the definition of the t-statistic is given in both the classical and the Bayesian hypothesis sections in Chapter 8. The style is also rather repetitive within sections: for example, 'confounding' is explained well on page 37, yet a separate paragraph with the heading 'confounding' appears on page 38. In general, the use of cross-references within the book would have been helpful (eg Sammon plots are used at an early stage but are not introduced until later, with no cross-reference).
Chapters 3 and 4 on experimental design and normalisation are very good. There is a detailed discussion of the number of replicate arrays needed to detect a given level of fold change between experimental conditions; a discussion of the variability of pooled samples; and an extensive section on finding optimal designs for two-colour arrays. Normalisation is explained well, with plots illustrating the various reasons for normalisation. The discussion of single-channel arrays, in particular those of Affymetrix, is partly dealt with in separate subsections of Chapters 3 and 4. The resulting presentation is rather messy and also slightly misleading: the probes representing a gene are not technical replicates (as claimed in Chapter 3), but represent different subsequences of the sequence encoding a gene (as correctly stated in Chapter 4). The short description of some methods for estimating gene expression measures from oligonucleotide arrays at the end of the normalisation chapter does not do justice to the large literature on this subject.
The quality assessment chapter contains some instructive examples and good illustrations of the qualities of some of the most commonly used pairwise distance measures: the Euclidean, Manhattan and correlation distance measures. The chapter describes some interesting methods, such as Sammon mapping for dimensional reduction and 'false array images' for assessing array handling, but on the former point is not very clear. Sammon mapping is used to illustrate several different possible reasons for poor-quality data; however, how the different possible reasons would be distinguished is not obvious.
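For readers unfamiliar with the three distance measures named above, the following minimal Python sketch shows how they can rank the same profiles differently. It is not taken from the book or from its 'smida' package; the function names and the three expression profiles are hypothetical and chosen purely for illustration.

```python
import math


def euclidean(x, y):
    """Straight-line distance between two equal-length profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan(x, y):
    """Sum of absolute coordinate-wise differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def correlation_distance(x, y):
    """One minus the Pearson correlation; ranges from 0 to 2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (sd_x * sd_y)


# Hypothetical expression profiles (illustrative values only).
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]  # same shape as gene_a, twice the amplitude
gene_c = [4.0, 3.0, 2.0, 1.0]  # gene_a in reverse order

for name, other in (("b", gene_b), ("c", gene_c)):
    print(
        name,
        round(euclidean(gene_a, other), 3),
        round(manhattan(gene_a, other), 3),
        round(correlation_distance(gene_a, other), 3),
    )
```

In this toy case the Euclidean and Manhattan measures place the reversed profile closer to gene_a than the rescaled one, whereas the correlation distance treats the rescaled profile as essentially identical and the reversed one as maximally distant.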
There are two chapters on clustering methods, one on unsupervised methods used to group samples and/or genes and one on supervised methods of classification. The chapter on unsupervised clustering starts with a good discussion of different possible measures for calculating distances between clusters. It focuses mainly on hierarchical and partitioning around medoids (PAM)-type algorithms, with a brief discussion of model-based clustering at the end. The authors rightly warn against putting too much faith in agglomerative hierarchical clustering of genes, although the point could have been made better with some illustrations of where this may go wrong, in line with the good illustrations in the book on the virtues of various distance measures.
The topic of gene-filtering is covered in the chapter on supervised methods. This chapter briefly introduces a number of important concepts in classification theory, predictor evaluation and cross-validation. Attention is restricted to simple methods, such as principal component analysis, linear discriminant analysis and k-nearest neighbour classification for class prediction and penalised and k-nearest neighbour regression for classifying and predicting continuous responses. Throughout the chapters on clustering, the authors present a number of new methods, particularly relating to the problem of selecting appropriate numbers of clusters or numbers of predictors, which are made available as R functions.
Differential gene expression is covered in a separate chapter. The standard varieties of t-test are described, along with guidance on which to use in various situations. Moreover, p-values and error rates (familywise and false discovery rates) are discussed and different methods of obtaining p-values (parametric, bootstrap and permutation) are given. This section provides a very good introduction to one of the most widely used methods for assessing differential expression. One drawback is that there is no discussion of methods (such as the significance analysis of microarrays [SAM] method) for stabilising gene variance estimates used in t-statistics, which are often used when very few replicate arrays are available.
In summary, this book provides a good introduction to the statistical analysis of microarray data. The focus is primarily on cDNA arrays, although the higher-level analysis in the second half of the book can mostly be applied to oligonucleotide arrays as well. The more mathematically inclined reader may wish to refer to original papers for sophisticated discussion. The book should be well suited to the biologist or computer scientist who wants an overview of the problems encountered in analysing microarray data and to gain some understanding of the different methods available.

© HENRY STEWART PUBLICATIONS 1479-7364. HUMAN GENOMICS. VOL 2. NO 1. 75-76 MARCH 2005

Anne-Mette Hein and Alex Lewin
Department of Epidemiology and Public Health
Imperial College London
London, UK

Consanguinity, inbreeding, and genetic drift in Italy
This book tells the history of studies of classical polymorphisms, surnames and church records of marriages (and particularly dispensations to marry relatives) that began in the small communities of the Parma Valley during the 1950s and 1960s, and were later extended to the Italian islands and, in more limited form, the whole of Italy.
Although aspects of these studies have previously been published in several formats, this is the first full account of the background to the studies, the methods used, the results and the conclusions of the principal investigators. The publication of this book is therefore important, because these studies have formed a bedrock for late 20th century population genetics. Human population genetics has been, and continues to be, marred by poorly designed sampling schemes. The careful design of these studies, together with statistical analyses and computer simulations that were often groundbreaking, remains an example to others.
Much of the book discusses consanguineous marriages, for example marriages between cousins. The discussion covers Roman, 'German' (Lombard) and Catholic Church law, as well as other social, economic and demographic factors that affect the prevalence of such marriages. The effects of both consanguinity and 'random inbreeding' (geographically restricted mate choice) on genetic drift are also studied. There is also a chapter discussing their effects on both normal and pathological phenotypes.
In these days of genome-wide genetic surveys and fast computational analyses, the painstaking effort required to collect and analyse these relatively sparse data seems unthinkable. Although we can now answer the questions with much greater precision, the basic issues of genetic variation on a fine geographical scale, and its relationship to demographic factors, drift, selection and, ultimately, to phenotypes of interest, are the same as they were 50 years ago, when the studies described here were just beginning.

David Balding
Imperial College London
London, UK