Mauve: Multiple Alignment of Conserved Genomic Sequence With Rearrangements

  Aaron C.E. Darling (1,2,6), Bob Mau (2,3), Frederick R. Blattner (4,5), and Nicole T. Perna (2,5)

  1. Department of Computer Science, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA
  2. Department of Animal Health and Biomedical Sciences, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA
  3. Department of Oncology, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA
  4. Department of Genetics, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA
  5. Genome Center of Wisconsin, University of Wisconsin–Madison, Madison, Wisconsin 53706, USA

Abstract

As genomes evolve, they undergo large-scale evolutionary processes that present a challenge to sequence comparison not posed by short sequences. Recombination causes frequent genome rearrangements, horizontal transfer introduces new sequences into bacterial chromosomes, and deletions remove segments of the genome. Consequently, each genome is a mosaic of unique lineage-specific segments, regions shared with a subset of other genomes and segments conserved among all the genomes under consideration. Furthermore, the linear order of these segments may be shuffled among genomes. We present methods for identification and alignment of conserved genomic DNA in the presence of rearrangements and horizontal transfer. Our methods have been implemented in a software package called Mauve. Mauve has been applied to align nine enterobacterial genomes and to determine global rearrangement structure in three mammalian genomes. We have evaluated the quality of Mauve alignments and drawn comparison to other methods through extensive simulations of genome evolution.

Footnotes

  • [Supplemental material is available online at www.genome.org. The source code and binaries are freely available for academic and nonprofit research. Commercial licenses are also available. See http://gel.ahabs.wisc.edu/mauve for more details.]

  • 7 Under this definition of an LCB, multi-MEMs on nontandem repetitive elements would break LCBs. Each multi-MEM would become its own independent LCB with identical weight, leaving the greedy breakpoint elimination algorithm with no means for discrimination.

  • Article and publication are at http://www.genome.org/cgi/doi/10.1101/gr.2289704.

  • 6 Corresponding author. E-MAIL darling{at}cs.wisc.edu; FAX (608) 262-7420.

  • Received December 19, 2003.

  • Accepted April 16, 2004.