Comparison of three methods to estimate genetic ancestry and control for stratification in genetic association studies among admixed populations

  • Original Investigation
Human Genetics

Abstract

Population stratification may confound the results of genetic association studies among unrelated individuals from admixed populations. Several methods have been proposed to estimate individual ancestry in admixed populations and to use those estimates to adjust for population stratification in genetic association tests. We evaluate the performance of three such methods (maximum likelihood estimation, ADMIXMAP and Structure) using a range of simulated data sets and real data from Latino subjects participating in a genetic study of asthma. All three methods estimate ancestry with similar accuracy and control the type I error rate to a similar degree. The most important factor in determining the accuracy of the ancestry estimate and in minimizing the type I error rate is the number of markers used to estimate ancestry. We demonstrate that approximately 100 ancestry informative markers (AIMs) are required to obtain ancestry estimates whose correlation with the true individual ancestral proportions exceeds 0.9. In addition, after accounting for ancestry in association tests, the type I error rate is controlled at the 5% level when 100 markers are used to estimate ancestry. However, because the effect of admixture on the type I error rate worsens with sample size, the accuracy of the ancestry estimates must also increase with sample size for the correction to remain adequate. Using data from the Latino subjects, we also apply these methods to an association study between body mass index and 44 AIMs. These simulations are meant to provide practical guidelines for investigators conducting association studies in admixed populations.
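The relationship between marker count and the accuracy of ancestry estimates can be illustrated with a small simulation. The sketch below is not the authors' code, and it does not use ADMIXMAP or Structure. It is a minimal grid-search maximum likelihood estimator written under several assumptions that do not come from the abstract: two ancestral populations, unlinked biallelic markers, ancestral allele frequencies known without error, and hypothetical choices for sample size, allele-frequency ranges and the distribution of true ancestry. The function names (`simulate_genotypes`, `mle_ancestry`, `ancestry_correlation`) are illustrative only.

```python
import numpy as np


def simulate_genotypes(q, p1, p2, rng):
    """Draw 0/1/2 genotypes at unlinked markers for individuals with ancestry q.

    q  : ancestry proportion from population 1, one value per individual
    p1 : allele frequencies in ancestral population 1 (assumed known)
    p2 : allele frequencies in ancestral population 2 (assumed known)
    """
    # Expected allele frequency for each individual at each marker.
    f = np.outer(q, p1) + np.outer(1.0 - q, p2)
    return rng.binomial(2, f)


def mle_ancestry(genotypes, p1, p2, grid=None):
    """Grid-search MLE of each individual's ancestry from population 1."""
    if grid is None:
        grid = np.linspace(0.0, 1.0, 101)
    # f[k, m]: allele frequency at marker m if ancestry equals grid[k].
    f = np.outer(grid, p1) + np.outer(1.0 - grid, p2)
    # Binomial log-likelihood with the constant term dropped;
    # result has shape (individuals, grid points).
    loglik = genotypes @ np.log(f).T + (2 - genotypes) @ np.log(1.0 - f).T
    return grid[np.argmax(loglik, axis=1)]


def ancestry_correlation(n_markers, n_individuals=500, seed=0):
    """Correlation between true and estimated ancestry for one simulated data set."""
    rng = np.random.default_rng(seed)
    # Hypothetical settings: uniform true ancestry, fairly informative markers.
    q_true = rng.uniform(0.0, 1.0, n_individuals)
    p1 = rng.uniform(0.6, 0.9, n_markers)
    p2 = rng.uniform(0.1, 0.4, n_markers)
    g = simulate_genotypes(q_true, p1, p2, rng)
    q_hat = mle_ancestry(g, p1, p2)
    return np.corrcoef(q_true, q_hat)[0, 1]


if __name__ == "__main__":
    for m in (5, 20, 100):
        print(m, "markers: r =", round(ancestry_correlation(m), 3))
```

Under these assumed settings the correlation rises as markers are added, which matches the qualitative trend described above. The specific values depend on the assumed allele-frequency differences and ancestry distribution, so they should not be read as a reproduction of the reported results.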

Acknowledgements

Financial support was received from HL07185, GM61390, American Lung Association of California, RWJ Amos Medical Faculty Development Award, NCMHD Health Disparities Scholar, Extramural Clinical Research Loan Repayment Program for Individuals from Disadvantaged Backgrounds, 2001–2003, to Esteban González Burchard; K22CA109351 from the NIH, CRTG 02-0841-CCE from the American Cancer Society, and BCRP030551 from the Department of Defense to Elad Ziv; U19AG23122 from NIH to Steven Cummings; HL51823, HL074204, 3M01RR000083-38S30488, HL56443 and HL51831 to the Asthma Clinical Research Network; U01-CA86117; SFGH General Clinical Research Center M01RR00083-41; U01-HL 65899; UCSF-Children’s Hospital of Oakland Pediatric Clinical Research Center (M01 RR01271), Oakland, CA; Sandler Center for Basic Research in Asthma; and the Sandler Family Supporting Foundation. The authors would like to acknowledge the families and the patients for their participation. The authors would also like to thank the numerous health care providers and community clinics for their support and participation in the GALA Study. In addition to the primary clinical centers of the investigators, participating community clinics and hospitals include: La Clinica de La Raza, Oakland, CA; UCSF-Children’s Hospital of Oakland Pediatric Clinical Research Center, Oakland, CA; General Clinical Research Center, SFGH, San Francisco, CA; Alliance Medical Center, Healdsburg, CA; Santa Clara Valley Medical Center, San José, CA; Fair Oaks Family Health Center, Redwood City, CA; Clinica de Salud del Valle de Salinas, Salinas, CA; Natividad Medical Center, Salinas, CA; Asthma Education and Management Program, Community Medical Centers, Fresno, CA; and Diagnostic Health Centers of Corozal, Naranjito, Catano, Orocovis, Barranquitas and San Antonio Hospital of Mayaguez.
The authors would also like to acknowledge Monica Toscano, MariaElena Alioto, Ivan Gomez, Henry Matallana, Carmen Jimenez, Yannett Marcano, Pedro Yapor, Alma Ortiz, Lisandra Perez and Sheila Gonzalez for their assistance with recruitment and study organization. The authors would like to especially thank Dr. Jeffrey M. Drazen, Dr. Ed Silverman, Dr. Homer A. Boushey, Dr. Dean Sheppard, Dr. Sylvette Nazario, Dr. Jesus Casal, Dr. Alfonso Torres, Dr. Jose Rodriguez-Santana, Dr. Rocio Chapela, Dr. Scott Weiss, and Dr. Jean G. Ford for all of their effort towards the creation of the GALA Study, and Dr. Mark D. Shriver for assistance in the development of the AIMs and for providing ancestral DNA.

Author information

Corresponding author

Correspondence to Hui-Ju Tsai.

Additional information

Esteban González Burchard and Elad Ziv contributed equally to this manuscript.

Cite this article

Tsai, HJ., Choudhry, S., Naqvi, M. et al. Comparison of three methods to estimate genetic ancestry and control for stratification in genetic association studies among admixed populations. Hum Genet 118, 424–433 (2005). https://doi.org/10.1007/s00439-005-0067-z

  • DOI: https://doi.org/10.1007/s00439-005-0067-z
