A computer simulation study on the number of loci and trees required to estimate genetic variability in cacao (Theobroma cacao L.)
Date
2006
Abstract
Current methods for measuring the genetic diversity of populations and germplasm collections are often based on statistics calculated from molecular markers. The objective of this study was to investigate the precision and accuracy of the most common estimators of genetic variability and population structure, as calculated from simple sequence repeat (SSR) marker data from cacao (Theobroma cacao L.). Computer-simulated genomes of replicate populations were generated from initial allele frequencies estimated using SSR data from cacao accessions in a collection. The simulated genomes consisted of ten linkage groups, each 100 cM in length. Heterozygosity, gene diversity and the F statistics were studied as a function of the number of loci and trees sampled. The results showed that relatively small random samples of trees were needed to achieve consistency in the observed estimates. In contrast, very large random samples of loci per linkage group were required to enable reliable inferences on the whole genome. Precision of estimates increased by more than 50% when the sample size rose from one to five loci per linkage group (50 loci per genome), and by up to 70% with ten loci per linkage group (100 loci per genome). The use of fewer, highly polymorphic loci to analyze genetic variability led to estimates with substantially smaller variance but with an upward bias. Nevertheless, the relative differences of estimates among populations were generally consistent for the different levels of polymorphism considered.
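The dependence of precision on the number of loci sampled can be illustrated with a minimal sketch. This is not the authors' simulation: it ignores linkage groups, tree sampling and the cacao SSR data. It assumes a hypothetical pool of 1000 independent loci with randomly drawn allele frequencies, computes gene diversity per locus as 1 minus the sum of squared allele frequencies, and measures how much a genome-wide average varies across repeated random samples of loci. All function names, the pool size and the seed are illustrative assumptions.

```python
import random
import statistics


def gene_diversity(allele_freqs):
    """Gene diversity (expected heterozygosity) at one locus: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in allele_freqs)


def random_allele_freqs(n_alleles, rng):
    """Hypothetical allele frequencies: random positive weights scaled to sum to 1."""
    weights = [rng.random() + 1e-9 for _ in range(n_alleles)]
    total = sum(weights)
    return [w / total for w in weights]


def spread_of_genome_estimate(locus_values, n_loci, n_reps, rng):
    """Standard deviation, across replicate samples of loci, of the mean gene diversity."""
    means = [
        statistics.mean(rng.sample(locus_values, n_loci))  # sample loci without replacement
        for _ in range(n_reps)
    ]
    return statistics.stdev(means)


rng = random.Random(2006)  # fixed seed so the sketch is reproducible

# Assumed genome: 1000 loci, each with 2 to 8 alleles (illustrative values only).
genome = [
    gene_diversity(random_allele_freqs(rng.randint(2, 8), rng))
    for _ in range(1000)
]

# Spread of the genome-wide estimate for 10, 50 and 100 sampled loci.
sd_by_loci = {k: spread_of_genome_estimate(genome, k, 500, rng) for k in (10, 50, 100)}

for k, sd in sd_by_loci.items():
    print(f"{k:>3} loci: SD of mean gene diversity = {sd:.4f}")
```

Under these assumptions the spread of the estimate shrinks as more loci are sampled, which is the qualitative pattern the abstract describes; the specific percentages reported in the study depend on its own simulated genomes and are not reproduced here.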
Citation
Tree Genetics & Genomes (2006) 2: 152–164