Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing
- Jamie K. Teer1,3,
- Lori L. Bonnycastle1,3,
- Peter S. Chines1,
- Nancy F. Hansen1,
- Natsuyo Aoyama2,
- Amy J. Swift1,
- Hatice Ozel Abaan1,
- Thomas J. Albert2,
- NISC Comparative Sequencing Program1,
- Elliott H. Margulies1,
- Eric D. Green1,
- Francis S. Collins1,4,
- James C. Mullikin1,4 and
- Leslie G. Biesecker1,4
- 1 National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland 20892, USA;
- 2 Roche NimbleGen Inc., Madison, Wisconsin 53719, USA
- 3 These authors contributed equally to this work.
Abstract
Massively parallel DNA sequencing technologies have greatly increased our ability to generate large amounts of sequencing data at a rapid pace. Several methods have been developed to enrich for genomic regions of interest for targeted sequencing. We have compared three of these methods: Molecular Inversion Probes (MIP), Solution Hybrid Selection (SHS), and Microarray-based Genomic Selection (MGS). Using HapMap DNA samples, we compared each of these methods with respect to their ability to capture an identical set of exons and evolutionarily conserved regions associated with 528 genes (2.61 Mb). For sequence analysis, we developed and used a novel Bayesian genotype-assigning algorithm, Most Probable Genotype (MPG). All three capture methods were effective, but sensitivities (percentage of targeted bases associated with high-quality genotypes) varied for an equivalent amount of pass-filtered sequence: for example, 70% (MIP), 84% (SHS), and 91% (MGS) for 400 Mb. In contrast, all methods yielded similar accuracies of >99.84% when compared to Infinium 1M SNP BeadChip-derived genotypes and >99.998% when compared to 30-fold coverage whole-genome shotgun sequencing data. We also observed a low false-positive rate with all three methods; of the heterozygous positions identified by each of the capture methods, >99.57% agreed with 1M SNP BeadChip, and >98.840% agreed with the whole-genome shotgun data. In addition, we successfully piloted the genomic enrichment of a set of 12 pooled samples via the MGS method using molecular bar codes. We find that these three genomic enrichment methods are highly accurate and practical, with sensitivities comparable to those of 30-fold coverage whole-genome shotgun data.
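The two summary metrics used above, sensitivity (the percentage of targeted bases with a high-quality genotype) and accuracy (agreement with an independent genotype set), can be illustrated with a minimal Python sketch. This is not the bam2mpg/MPG implementation: the function names, the position-to-genotype dictionaries, and the use of `None` for a position without a high-quality call are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical representation: position -> genotype string such as "AG",
# or None when no high-quality genotype was assigned at that position.
Calls = Dict[int, Optional[str]]


def normalize(genotype: str) -> str:
    """Sort alleles so that "GA" and "AG" compare as equal."""
    return "".join(sorted(genotype.upper()))


def sensitivity(calls: Calls, n_targeted: int) -> float:
    """Percentage of targeted bases that received a high-quality genotype."""
    if n_targeted <= 0:
        raise ValueError("n_targeted must be positive")
    called = sum(1 for gt in calls.values() if gt is not None)
    return 100.0 * called / n_targeted


def concordance(calls: Calls, reference: Dict[int, str]) -> float:
    """Percentage of called positions whose genotype matches the reference
    set (for example, array-derived genotypes), over shared positions only."""
    shared = [pos for pos, gt in calls.items()
              if gt is not None and pos in reference]
    if not shared:
        raise ValueError("no positions shared between calls and reference")
    agree = sum(1 for pos in shared
                if normalize(calls[pos]) == normalize(reference[pos]))
    return 100.0 * agree / len(shared)


# Toy data, invented for this example only.
calls = {100: "AA", 101: "GA", 102: None, 103: "CC"}
reference = {100: "AA", 101: "AG", 103: "CT", 200: "GG"}

sens = sensitivity(calls, n_targeted=4)
conc = concordance(calls, reference)
```

In this toy data set, three of the four targeted positions have a genotype, and two of the three positions shared with the reference agree. Restricting `concordance` to shared positions mirrors the idea that accuracy is judged only where both data sets report a genotype, which is why sensitivity and accuracy can diverge as they do among the three capture methods.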
Footnotes
- 4 Corresponding authors.
E-mail collinsf{at}mail.nih.gov; fax (301) 402-2218.
E-mail mullikin{at}mail.nih.gov; fax (301) 480-0634.
E-mail leslieb{at}helix.nih.gov; fax (301) 402-2170.
- [Supplemental material is available online at http://www.genome.org. The sequence data from this study have been submitted to the NCBI Sequence Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi) under accession no. SRA022076. Bam2mpg is freely available at http://research.nhgri.nih.gov/software/bam2mpg.]
- Article published online before print. Article and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.106716.110.
- Received March 13, 2010.
- Accepted July 29, 2010.