Singapore Genome Variation Project: A haplotype map of three Southeast Asian populations

  1. Yik-Ying Teo1,2,3,7,
  2. Xueling Sim1,7,
  3. Rick T.H. Ong1,4,7,
  4. Adrian K.S. Tan4,
  5. Jieming Chen4,
  6. Erwin Tantoso4,
  7. Kerrin S. Small3,
  8. Chee-Seng Ku1,
  9. Edmund J.D. Lee5,
  10. Mark Seielstad4,8 and
  11. Kee-Seng Chia1,6,8,9
  1. Centre for Molecular Epidemiology, National University of Singapore, Singapore 117597;
  2. Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546;
  3. Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, United Kingdom;
  4. Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore 138672;
  5. Department of Pharmacology, National University of Singapore, Singapore 117597;
  6. Department of Epidemiology and Public Health, National University of Singapore, Singapore 117597
  7. These authors contributed equally to this work.

    Abstract

    The Singapore Genome Variation Project (SGVP) provides a publicly available resource of 1.6 million single nucleotide polymorphisms (SNPs) genotyped in 268 individuals from the Chinese, Malay, and Indian population groups in Southeast Asia. This online database catalogs genotype and phased haplotype data together with summaries of them, including allele frequencies, assessments of linkage disequilibrium (LD), and recombination rates, in a format similar to that of the International HapMap Project. Here, we introduce this resource and describe an analysis of human genomic variation that combines the SGVP data with data from the HapMap and the Human Genome Diversity Project, providing insights into the population structure of the three major population groups in Asia. In addition, we surveyed the genome for variation in regional patterns of LD between the HapMap and SGVP populations, and for signatures of positive natural selection using two well-established metrics: iHS and XP-EHH. The raw and processed genetic data, together with all population genetic summaries, are publicly available for download and for browsing through a web interface modeled on the Generic Genome Browser.
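
As an illustration of two of the summaries named in the abstract, the following minimal sketch computes an allele frequency and the pairwise LD statistic r² from phased haplotypes. It assumes haplotypes are coded as 0/1 alleles per SNP; the function names and the toy data are hypothetical and do not reflect the SGVP file formats or the authors' actual pipeline.

```python
from typing import Sequence


def allele_frequency(alleles: Sequence[int]) -> float:
    """Frequency of the allele coded 1 across a set of haplotypes."""
    if not alleles:
        raise ValueError("no haplotypes supplied")
    return sum(alleles) / len(alleles)


def ld_r_squared(snp_a: Sequence[int], snp_b: Sequence[int]) -> float:
    """Pairwise LD (r^2) between two SNPs observed on the same phased haplotypes.

    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), where D = pAB - pA * pB.
    """
    if len(snp_a) != len(snp_b):
        raise ValueError("both SNPs must be typed on the same haplotypes")
    n = len(snp_a)
    p_a = allele_frequency(snp_a)
    p_b = allele_frequency(snp_b)
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Toy data (hypothetical): four phased haplotypes typed at three SNPs.
snp1 = [0, 0, 1, 1]
snp2 = [0, 0, 1, 1]  # always co-occurs with snp1
snp3 = [0, 1, 0, 1]  # independent of snp1

print(allele_frequency(snp1))
print(ld_r_squared(snp1, snp2))
print(ld_r_squared(snp1, snp3))
```

Real analyses typically work from genotype files covering many individuals and use dedicated tools for phasing and LD estimation; the sketch only shows the arithmetic behind the statistic.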
