
Estimation of DNA Sequence Context-dependent Mutation Rates Using Primate Genomic Sequences

Journal of Molecular Evolution

Abstract

DNA and amino acid substitution rates are known to be highly sequence context-dependent: for example, C→T substitutions in vertebrates may occur much more frequently at CpG sites, and cysteine substitution rates may depend on whether the context supports participation in a disulfide bond. Furthermore, many applications rely on quantitative models of nucleotide or amino acid substitution, including phylogenetic inference and identification of amino acid sequence positions involved in functional specificity. We describe quantification of the context dependence of nucleotide substitution rates using baboon, chimpanzee, and human genomic sequence data generated by the NISC Comparative Sequencing Program. Relative mutation rates, based on maximum likelihood calculations, are reported for the 96 classes of mutations of the form 5′αβγ3′ → 5′αδγ3′, where α, β, γ, and δ are nucleotides and β ≠ δ. Our results confirm that C→T substitutions are enhanced at CpG sites compared with other transitions, relatively independent of the identity of the preceding nucleotide. While, as expected, transitions generally occur more frequently than transversions, we find that the most frequent transversions involve the C at CpG sites (CpG transversions) and that their rate is comparable to the rate of transitions at non-CpG sites. A four-class model of the rates of context-dependent evolution of primate DNA sequences, CpG transitions > non-CpG transitions ≈ CpG transversions > non-CpG transversions, captures qualitative features of the mutation spectrum. Although mutation rates are qualitatively similar among different genomic regions, we find statistically significant differences between them.
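The counting behind the 96 mutation classes and the four-class model can be illustrated with a short sketch. It is not the authors' code. It assumes that the 96 classes arise from treating each substitution 5′αβγ3′ → 5′αδγ3′ as equivalent to its reverse complement (4 × 4 × 4 × 3 = 192 strand-specific changes, halved), and that a CpG site means the mutated base is either the C or the G of a CpG dinucleotide. The function names `canonical` and `four_class` are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PURINES = {"A", "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA string, e.g. 'CGT' -> 'ACG'."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(triplet, new_base):
    """Strand-collapsed representative of a substitution (assumed convention).

    A change of the middle base of `triplet` to `new_base` on one strand is
    the same event as the complementary change on the other strand, so we
    keep whichever of the two descriptions sorts first.
    """
    other = (reverse_complement(triplet), new_base.translate(COMPLEMENT))
    return min((triplet, new_base), other)


def is_transition(old, new):
    """Transitions stay within purines (A, G) or within pyrimidines (C, T)."""
    return (old in PURINES) == (new in PURINES)


def four_class(triplet, new_base):
    """Assign a substitution to one of the four classes named in the abstract."""
    left, middle, right = triplet
    # The mutated base is in a CpG if it is a C followed by G,
    # or (seen from the other strand) a G preceded by C.
    at_cpg = (middle == "C" and right == "G") or (middle == "G" and left == "C")
    kind = "transition" if is_transition(middle, new_base) else "transversion"
    return ("CpG " if at_cpg else "non-CpG ") + kind


# Enumerate all 192 strand-specific substitutions and collapse by strand.
classes = {
    canonical("".join(t), d)
    for t in product(BASES, repeat=3)
    for d in BASES
    if d != t[1]
}
counts = Counter(four_class(t, d) for t, d in classes)

print(len(classes))
for name, n in sorted(counts.items()):
    print(name, n)
```

Under these assumptions the 96 classes split into 4 CpG transitions, 28 non-CpG transitions, 8 CpG transversions, and 56 non-CpG transversions. The sketch only labels the classes; estimating their relative rates requires the maximum likelihood analysis described in the article.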

Acknowledgments

Dr. Eric Green of NISC generously provided us with the completed sequence contigs of both targets. W. Zhang was supported by a Graduate Research Fellowship from a Vermont EPSCoR grant awarded by the U.S. Department of Energy. The computational studies used infrastructure provided by the Vermont Cancer Center, the Vermont Genetics Network, the DOE EPSCoR initiative in structural and computational biology, and the Vermont Advanced Computing Center.

Author information

Corresponding author

Correspondence to Jeffrey P. Bond.

Additional information

Reviewing Editor: Dr. Brian Morten

About this article

Cite this article

Zhang, W., Bouffard, G.G., Wallace, S.S. et al. Estimation of DNA Sequence Context-dependent Mutation Rates Using Primate Genomic Sequences. J Mol Evol 65, 207–214 (2007). https://doi.org/10.1007/s00239-007-9000-5

  • DOI: https://doi.org/10.1007/s00239-007-9000-5
