| Dataset | human | mouse | hd cattle | ld cattle |
|---|---|---|---|---|
| Physical length | 79 mbp | 95 mbp | 76 mbp | 76 mbp |
| #samples | 83 | 15 | 64 | 469 |
| #SNPs | 40,755 | 288,229 | 22,266 | 1,508 |
| Missing genotype rate | 0.268% | 11.1% | 2.224% | 0.078% |
| #complete SNPs | 34,071 | 144,820 | 5,487 | 1,271 |
- Summary of the four datasets used in the simulation studies. All four datasets are from chromosome 17, of human, mouse, and cattle. “Physical length” is the chromosome length in million base pairs (mbp). “#samples” is the number of individuals genotyped, “#SNPs” is the number of SNP markers in the original dataset, and “Missing genotype rate” is the percentage of missing genotype values in the original dataset. “#complete SNPs” is the number of SNP markers at which every sample has a genotype value. All other SNP markers were removed, leaving a complete sub-dataset for use in the simulation studies.
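
The filtering step described in the caption can be illustrated with a minimal sketch. It assumes a samples-by-SNPs genotype matrix in which a missing call is encoded as `None`; the encoding, the function names, and the toy data are all hypothetical and are not taken from the original study.

```python
from typing import List, Optional

# Assumed encoding: an integer genotype code, or None for a missing call.
Genotype = Optional[int]
Matrix = List[List[Genotype]]  # rows are samples, columns are SNP markers


def missing_genotype_rate(matrix: Matrix) -> float:
    """Fraction of all genotype entries that are missing."""
    total = sum(len(row) for row in matrix)
    missing = sum(1 for row in matrix for g in row if g is None)
    return missing / total if total else 0.0


def complete_snp_indices(matrix: Matrix) -> List[int]:
    """Column indices of SNPs at which every sample has a genotype value."""
    if not matrix:
        return []
    n_snps = len(matrix[0])
    return [j for j in range(n_snps)
            if all(row[j] is not None for row in matrix)]


def keep_complete_snps(matrix: Matrix) -> Matrix:
    """Drop every SNP column that has at least one missing genotype."""
    keep = complete_snp_indices(matrix)
    return [[row[j] for j in keep] for row in matrix]


# Toy example: 3 samples, 4 SNPs, 2 missing calls.
toy = [
    [0, 1, None, 2],
    [1, 1, 0, 2],
    [0, None, 1, 2],
]
toy_rate = missing_genotype_rate(toy)   # 2 missing out of 12 entries
toy_complete = keep_complete_snps(toy)  # only SNP columns 0 and 3 remain
```

In this toy matrix a single missing call is enough to remove an entire SNP column, which is why the number of complete SNPs can fall well below the number of SNPs even when the overall missing genotype rate is small.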