Published in: Chinese Medicine 1/2018

Open Access 01.12.2018 | Research

Complete chloroplast genome of the medicinal plant Amomum compactum: gene organization, comparative analysis, and phylogenetic relationships within Zingiberales

Authors: Ming-li Wu, Qing Li, Jiang Xu, Xi-wen Li

Abstract

Background

Amomum compactum is one of the source species of the traditional herbal medicine amomi fructus rotundus and has notable pharmacological effects. The systematic position of A. compactum remains unclear, and introduction of the plant has been hindered by many plant diseases. However, relevant molecular studies are scarce.

Methods

The total chloroplast (cp) DNA was extracted according to previous studies and sequenced on the 454 GS FLX Titanium platform. Sequences were assembled with Newbler. The genome was annotated with CpGAVAS and tRNAscan-SE. General characteristics of the A. compactum cp genome were then analyzed, and the genome was compared with those of three other Zingiberaceae species using the corresponding software. Additionally, phylogenetic trees were reconstructed from the protein-coding gene sequences shared among 15 plant taxa using maximum parsimony (MP) and maximum likelihood (ML) methods.

Results

The A. compactum cp genome is 163,553 bp in total and has a classic quadripartite structure, consisting of a pair of inverted repeat regions (IRa/IRb) of 29,824 bp each, a large single-copy (LSC) region of 88,535 bp and a small single-copy (SSC) region of 15,370 bp. The total GC content of this cp genome is 36.0%. The genome contains 135 functional genes, of which 113 are unique: 80 protein-coding genes, 29 tRNA (transfer RNA) genes and four rRNA (ribosomal RNA) genes. Codon usage is biased toward codons ending with A/T. In total, 58 SSR loci and 24 large repeats were detected. Relative to three other Zingiberaceae cp genomes, the A. compactum cp genome exhibits an obvious expansion of the IR regions. In the A. compactum cp genome, the ycf1 pseudogene is 2969 bp away from the IRa/SSC border, whereas in the other Zingiberaceae species it is only 4–5 bp away. Comparative analysis of the cp genome sequences of A. compactum and other Zingiberaceae reveals that gene order and gene content differ slightly among Zingiberaceae species. The phylogenetic analysis based on 67 protein-coding gene sequences supports the phylogenetic position of A. compactum.

Conclusions

This study identifies unique features of the A. compactum cp genome that should help to clarify cp genome evolution and provide useful information for phylogenetics and further studies of this traditional medicinal plant.

Background

Chloroplasts are photosynthetic organelles that provide the energy required for plant growth and also participate in other major processes such as starch storage, sugar synthesis and many critical metabolic pathways. In angiosperms, cp genomes are circular DNA molecules that mostly range from 120 to 160 kb in size and have a typical quadripartite organization [1]. Two inverted copies of the inverted repeat (IR) region (20–28 kb) separate the cp genome into a large single-copy (LSC) region (80–90 kb) and a small single-copy (SSC) region (16–27 kb) [2]. Angiosperm cp genomes usually encode approximately 80 unique proteins, 30 tRNAs and four rRNAs. Previous studies have shown that cp gene order, gene content and genome organization are highly conserved in plants [3, 4]. Owing to their high conservation and uniparental inheritance, cp genomes are widely used in species identification, phylogenetic studies and genetic engineering. The availability of whole cp genomes has helped to resolve phylogenetic relationships among major clades of angiosperms with greater accuracy [5, 6]. Nevertheless, as the number of sequenced cp genomes has increased, gene losses, structural rearrangements and IR contractions/expansions have been reported, and these can also be exploited to reconstruct plant phylogenies [7–9].
Amomum compactum (genus Amomum, family Zingiberaceae) is one of the source species of the traditional Chinese medicine amomi fructus rotundus and has notable pharmacological effects. The species is mainly produced in Vietnam and Thailand and is cultivated as a medicinal plant in the Guangdong, Guangxi and Yunnan provinces of China. However, bacterial wilt, damping-off, leaf spot and other major plant diseases have become a severe obstacle to the introduction of this plant. Many members of the Zingiberaceae are important spices and medicinal plants, such as Zingiber officinale, Amomum villosum, Curcuma longa, Zingiber mioga, Elettaria cardamomum and Alpinia officinarum. In addition, previous studies have shown that efficacy, chemical composition and pharmacological effects are strongly correlated among these five genera of Zingiberaceae. Investigating the genetic relationships of traditional Chinese medicinal plants is therefore of great significance and broad interest for finding alternative medicinal plants. As the number of whole cp genomes available for Zingiberaceae increases, the cp genomes of other species in the family are becoming easier to assemble. However, studies of amomi fructus rotundus, especially molecular studies, remain scarce both inside and outside China.
This study reports the assembly, annotation and structural analysis of the A. compactum cp genome for the first time. To characterize its structure, we compare the organization (IR expansion/contraction and divergent regions) of the complete cp genomes of A. compactum and other Zingiberaceae species. We also present phylogenetic analyses based on 67 protein-coding gene sequences from A. compactum and 14 other monocot cp genomes.

Methods

DNA extraction and sequencing

Fresh A. compactum leaves were collected from cultivation bases in Guangdong Province, China. Total cp DNA was extracted from roughly 100 g of leaves using an improved method described by Li et al. [10]. The quality of the cp DNA was checked with a Nanodrop-2000 spectrometer (Nanodrop Technologies, Wilmington, DE, USA) and by agarose gel electrophoresis. Pure cp DNA was used for shotgun library construction and sequenced on the 454 GS FLX Titanium platform. The resulting SFF file was preprocessed by removing short (L < 50 bp) and low-quality (Q < 20) reads. The trimmed reads were assembled using Newbler V2.6 (GS FLX De Novo Assembler Software). To verify the assembly, the four junction regions were confirmed by Sanger sequencing.
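The read-filtering step can be illustrated with a minimal sketch. It is not the preprocessing actually used in the study: it assumes the reads have already been parsed into (sequence, Phred quality list) pairs, it interprets "Q < 20" as the mean quality of a read, and the function name `keep_read` is hypothetical.

```python
def keep_read(sequence, qualities, min_length=50, min_mean_quality=20):
    """Return True if a read passes the length and quality thresholds.

    Assumption: "low-quality" is interpreted as a mean Phred score below 20.
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and quality lengths differ")
    if len(sequence) < min_length:
        return False
    return sum(qualities) / len(qualities) >= min_mean_quality


# Toy reads, made up for illustration only.
reads = [
    ("ACGT" * 20, [30] * 80),  # long enough, high quality: kept
    ("ACGT" * 10, [30] * 40),  # shorter than 50 bp: dropped
    ("ACGT" * 20, [10] * 80),  # mean quality below 20: dropped
]
kept = [seq for seq, qual in reads if keep_read(seq, qual)]
print(len(kept))
```

A per-base or sliding-window quality criterion would be an equally plausible reading of the threshold; the source does not say which was applied.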

Genome assembly and annotation

Preliminary gene annotation of the cp genome was performed with CpGAVAS, a program available online (http://www.herbalgenomics.org/0506/cpgavas) [11]. The position of each gene was then manually corrected in Apollo [12] after alignment to the reference genomes with MEGA 5.0. Minor revisions were also made according to the start and stop codons. The tRNA genes were further confirmed with the online tool tRNAscan-SE using default settings (http://lowelab.ucsc.edu/tRNAscan-SE/) [13]. The circular map of the cp genome was then drawn with the OrganellarGenomeDRAW program (http://ogdraw.mpimp-golm.mpg.de/) [14]. Finally, the complete cp genome of A. compactum was submitted to the NCBI GenBank database (Accession Number: MG000589).

Sequence analyses

Relative synonymous codon usage (RSCU) values, which describe variation in synonymous codon usage independently of amino acid composition, were determined using MEGA 6.0 [15]. GC content and codon usage were also determined with MEGA 6.0. Simple sequence repeat (SSR) loci were detected with MISA (http://pgrc.ipk-gatersleben.de/misa/) using the following thresholds: ten, six, five, five, five and five repeat units for mono-, di-, tri-, tetra-, penta- and hexa-nucleotide SSRs, respectively. To analyze the repeat structure, REPuter [16] (http://bibiserv.techfak.uni-bielefeld.de/reputer/) was used to detect forward (direct) and palindromic (inverted) repeats in the cp genome. The minimum repeat size was set to 30 bp, the repeat identity to > 90% and the Hamming distance to three. All identified repeats were verified, and redundant repeats were removed manually.
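The SSR thresholds above can be illustrated with a short regular-expression sketch. This is not MISA itself and does not reproduce its handling of compound or interrupted repeats; the function names `is_primitive` and `find_ssrs` and the toy sequence are assumptions made for illustration.

```python
import re

# Minimum number of repeat units per motif length, as stated in the text.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(seq):
    """Return (start, end, motif, repeat_count) tuples with 1-based coordinates."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # A motif of the given size followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. "AA" runs reported as (A)n
                repeats = (match.end() - match.start()) // size
                hits.append((match.start() + 1, match.end(), motif, repeats))
    return sorted(hits)


# Toy sequence, made up for illustration only.
toy = "GC" + "A" * 10 + "GCGC" + "AT" * 6 + "CCG" + "ATA" * 5 + "G"
for hit in find_ssrs(toy):
    print(hit)
```

On the toy sequence this reports one mono-, one di- and one tri-nucleotide SSR, the same motif classes that appear in Table 4.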

Genome comparison

Pairwise alignments of the cp genome sequences were generated with MUMmer [17], and dot plots were drawn using a Perl script. The complete cp genomes of A. compactum and three other Zingiberaceae species (Additional file 1), Curcuma flaviflora (KR967361), Curcuma roscoeana (KF601574) and Zingiber spectabile (JX088661), were compared using the mVISTA program (http://genome.lbl.gov/vista/index.shtml) [18] in Shuffle-LAGAN mode, with A. compactum as the reference.

Phylogenomic analysis

To examine the phylogenetic position of A. compactum, 14 complete chloroplast genomes were downloaded from NCBI. The 67 shared protein-coding gene sequences were extracted using a Python script and aligned separately with ClustalW2. Phylogenetic trees were reconstructed from the 67 concatenated protein-coding gene sequences using MP and ML methods. The best-fitting model was selected with jModelTest 2.1.7 according to the Akaike information criterion (AIC) [19]. The MP tree was reconstructed in PAUP ver. 4.0b10 [20] with a heuristic search, and the ML analysis was run with RAxML-HPC 2.7.6.3 on XSEDE in the CIPRES Science Gateway (http://www.phylo.org/) with default parameters. Following APG III, Fritillaria cirrhosa was set as the outgroup. Both MP and ML analyses used 1000 bootstrap replicates.
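The concatenation of separately aligned genes into a single matrix can be sketched as follows. The authors' Python script is not available in the text, so this is only an assumed illustration: the function name `concatenate_alignments`, the data layout and the toy sequences are hypothetical.

```python
def concatenate_alignments(gene_alignments, taxa):
    """Join per-gene alignments into one concatenated sequence per taxon.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    Genes are concatenated in sorted name order so the result is reproducible.
    """
    parts = {taxon: [] for taxon in taxa}
    for gene in sorted(gene_alignments):
        alignment = gene_alignments[gene]
        missing = [taxon for taxon in taxa if taxon not in alignment]
        if missing:
            raise KeyError(f"{gene}: not shared by {missing}")
        if len({len(alignment[taxon]) for taxon in taxa}) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        for taxon in taxa:
            parts[taxon].append(alignment[taxon])
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy alignments, made up for illustration only.
toy_genes = {
    "rbcL": {"A_compactum": "ATG-CA", "C_flaviflora": "ATGGCA"},
    "psbA": {"A_compactum": "ATGACT", "C_flaviflora": "ATGACC"},
}
matrix = concatenate_alignments(toy_genes, ["A_compactum", "C_flaviflora"])
print(matrix["A_compactum"])
```

Checking that every gene is present in every taxon and that aligned lengths agree mirrors the requirement that only shared genes enter the concatenated matrix.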
The Minimum Standards of Reporting Checklist includes details of the experimental design, statistics, and resources used in this study.

Results and discussion

General characteristics of the A. compactum cp genome

The complete cp genome sequence of A. compactum is 163,553 bp in length and has a typical quadripartite structure (Fig. 1). A pair of inverted repeat (IR) regions, each 29,824 bp in length, separates the remaining sequence into an LSC region (88,535 bp) and an SSC region (15,370 bp) (Table 1). The overall GC content of the cp genome is 36.0%; GC content has been reported to play a significant role in the evolution of genomic structure. However, GC content is unevenly distributed across the cp genome: it is highest in the IR regions (41.1%), followed by the LSC region (33.7%), and lowest in the SSC region (29.8%).
Table 1
Base composition in the A. compactum cp genome

| Region | T(U)% | C% | A% | G% | Length (bp) |
|---|---|---|---|---|---|
| LSC | 33.8 | 17.2 | 32.5 | 16.5 | 88,535 |
| IR | 28.8 | 19.8 | 30.1 | 21.3 | 29,824 |
| SSC | 34.3 | 15.6 | 35.9 | 14.2 | 15,370 |
| Total | 32.3 | 18.3 | 31.7 | 17.8 | 163,553 |
| CDS | 31.6 | 17.2 | 31.5 | 19.8 | 79,701 |
| CDS, 1st position | 24 | 18.2 | 31.3 | 26.7 | 26,567 |
| CDS, 2nd position | 32 | 20.2 | 30.0 | 17.4 | 26,567 |
| CDS, 3rd position | 39 | 13.1 | 33.1 | 15.3 | 26,567 |

CDS: protein-coding regions
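As a quick consistency check on Table 1, the genome length and overall GC content can be recomputed from the per-region values. The sketch below only uses numbers from the table and assumes that both IR copies have the same composition.

```python
# Region lengths (bp) and GC percentages (C% + G%) taken from Table 1.
# Assumption: IRa and IRb share the composition listed for "IR".
regions = {
    "LSC": (88_535, 17.2 + 16.5),
    "IRa": (29_824, 19.8 + 21.3),
    "IRb": (29_824, 19.8 + 21.3),
    "SSC": (15_370, 15.6 + 14.2),
}

total_length = sum(length for length, _ in regions.values())
# Length-weighted mean of the regional GC percentages.
weighted_gc = sum(length * gc for length, gc in regions.values()) / total_length

print(total_length)           # 163553
print(round(weighted_gc, 1))  # 36.0
```

The four regions sum to the reported 163,553 bp, and the length-weighted GC content agrees with the reported 36.0%.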
As shown in Fig. 1, the A. compactum cp genome encodes 135 functional genes in total, of which 113 are unique: 80 protein-coding genes, 29 tRNAs and four rRNAs (Table 2). Among the functional genes, all rRNAs, eight tRNAs and seven protein-coding genes are duplicated in the IR regions. The LSC region includes 60 protein-coding genes and 21 tRNAs, whereas the SSC region includes 11 protein-coding genes and one tRNA gene. Among the protein-coding genes, 72 are single-copy and eight are duplicated. Among the tRNA genes, 20 are single-copy and nine are duplicated. Among the 113 unique genes, 13 contain one intron (eight protein-coding genes and five tRNAs) and three (ycf3, clpP and rps12) contain two introns (Table 2). Notably, rps12 is trans-spliced: its 5′ end is located in the LSC region, whereas the two copies of its 3′ end are located in the IRa and IRb regions, respectively. In addition, the ndhA gene contains the longest intron (1033 bp).
Table 2
Gene content of the A. compactum cp genome

| Gene category | Gene group | Gene name |
|---|---|---|
| Self-replication | rRNA genes | rrn16[c], rrn23[c], rrn5[c], rrn4.5[c] |
| | tRNA genes | trnH-GUG[c], trnK-UUU[a], trnQ-UUG, trnS-GCU, trnC-GCA, trnD-GUC, trnY-GUA, trnE-UUC, trnR-UCU, trnT-GGU, trnS-UGA, trnG-GCC[c], trnfM-CAU, trnS-GGA, trnT-UGU, trnL-UAA[a], trnF-GAA, trnV-UAC[a], trnW-CCA, trnP-UGG, trnI-CAU[c], trnL-CAA[c], trnV-GAC[c], trnI-GAU[a,c], trnA-UGC[a,c], trnR-ACG[c], trnN-GUU[c], trnL-UAG, trnM-CAU |
| | Small subunit of ribosome | rps4, rps14, rps18, rps2, rps12[b,c], rps11, rps8, rps3, rps19, rps7[c], rps15, rps16[a] |
| | Large subunit of ribosome | rpl33, rpl20, rpl36, rpl14, rpl16[a], rpl22, rpl2[a,c], rpl23[c], rpl32 |
| | DNA dependent RNA polymerase | rpoB, rpoC1[a], rpoC2, rpoA |
| | Translational initiation factor | infA |
| Genes for photosynthesis | Subunits of NADH dehydrogenase | ndhA[a], ndhB[a,c], ndhC, ndhD, ndhE, ndhF, ndhG, ndhH, ndhI, ndhJ, ndhK |
| | Subunits of photosystem I | psaA, psaB, psaC, psaI, psaJ, ycf3[b], ycf4 |
| | Subunits of photosystem II | psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbM, psbN, psbT, psbZ |
| | Subunits of cytochrome b/f complex | petN, petA, petL, petG, petB[a], petD |
| | Subunits of ATP synthase | atpI, atpH, atpF[a], atpA, atpE, atpB |
| | Large subunit of rubisco | rbcL |
| Genes of unknown function | Open reading frames (ORF, ycf) | ycf1, ycf15[c], ycf2[c] |
| Pseudogenes | | ycf1 |

[a] Gene with one intron
[b] Gene with two introns
[c] Gene with two copies
The protein-coding sequences (CDSs) are 79,701 bp in length and comprise 26,567 codons. Codon usage frequencies are shown in Table 3. In the CDSs, the AT content is 55.3% at the first codon position, 62.0% at the second and 72.1% at the third (Table 1). Most protein-coding genes in land plant cp genomes use the standard ATG as the initiation codon. In the A. compactum cp genome, however, two genes use alternative start codons: ATC for ndhD and ATA for rpl2. Relative synonymous codon usage (RSCU) measures how unevenly synonymous codons are used in coding sequences. An RSCU value < 1.00 indicates that a codon is used less frequently than expected, whereas a codon used more frequently than expected has an RSCU value > 1.00. A total of 96.7% (29/30) of the preferred synonymous codons (RSCU > 1) end with A/U, whereas 90.6% (29/32) of the non-preferred synonymous codons (RSCU < 1) end with G/C. This codon usage pattern is similar to that of other reported cp genomes [21, 22] and might be driven by the high proportion of A/T. The start codon AUG (Met) and UGG (Trp) show no bias (RSCU = 1), as each is the only codon for its amino acid.
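Table 3
Codon-anticodon recognition patterns and codon usage in the A. compactum cp genome

| Amino acid | Codon | Count | RSCU | tRNA | Amino acid | Codon | Count | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 971 | 1.31 | | Tyr | UAU | 811 | 1.57 | |
| Phe | UUC | 516 | 0.69 | trnF-GAA | Tyr | UAC | 221 | 0.43 | trnY-GUA |
| Leu | UUA | 892 | 1.96 | trnL-UAA | Stop | UAA | 48 | 1.66 | |
| Leu | UUG | 559 | 1.23 | trnL-CAA | Stop | UAG | 22 | 0.76 | |
| Leu | CUU | 567 | 1.25 | | His | CAU | 519 | 1.6 | |
| Leu | CUC | 181 | 0.4 | | His | CAC | 129 | 0.4 | trnH-GUG |
| Leu | CUA | 381 | 0.84 | trnL-UAG | Gln | CAA | 706 | 1.54 | trnQ-UUG |
| Leu | CUG | 151 | 0.33 | | Gln | CAG | 210 | 0.46 | |
| Ile | AUU | 1146 | 1.47 | | Asn | AAU | 989 | 1.55 | |
| Ile | AUC | 426 | 0.55 | trnI-GAU | Asn | AAC | 289 | 0.45 | trnN-GUU |
| Ile | AUA | 763 | 0.98 | trnI-CAU | Lys | AAA | 1114 | 1.49 | trnK-UUU |
| Met | AUG | 614 | 1 | trn(f)M-CAU | Lys | AAG | 383 | 0.51 | |
| Val | GUU | 521 | 1.45 | | Asp | GAU | 875 | 1.64 | |
| Val | GUC | 159 | 0.44 | trnV-GAC | Asp | GAC | 192 | 0.36 | trnD-GUC |
| Val | GUA | 559 | 1.56 | trnV-UAC | Glu | GAA | 1125 | 1.53 | trnE-UUC |
| Val | GUG | 194 | 0.54 | | Glu | GAG | 350 | 0.47 | |
| Ser | UCU | 598 | 1.74 | | Cys | UGU | 232 | 1.56 | |
| Ser | UCC | 337 | 0.98 | trnS-GGA | Cys | UGC | 66 | 0.44 | trnC-GCA |
| Ser | UCA | 412 | 1.2 | trnS-UGA | Stop | UGA | 17 | 0.59 | |
| Ser | UCG | 182 | 0.53 | | Trp | UGG | 452 | 1 | trnW-CCA |
| Pro | CCU | 442 | 1.62 | | Arg | CGU | 365 | 1.37 | trnR-ACG |
| Pro | CCC | 202 | 0.74 | | Arg | CGC | 86 | 0.32 | |
| Pro | CCA | 325 | 1.19 | trnP-UGG | Arg | CGA | 342 | 1.29 | |
| Pro | CCG | 120 | 0.44 | | Arg | CGG | 113 | 0.43 | |
| Thr | ACU | 537 | 1.57 | | Arg | AGA | 519 | 1.95 | trnR-UCU |
| Thr | ACC | 237 | 0.7 | trnT-GGU | Arg | AGG | 168 | 0.63 | |
| Thr | ACA | 433 | 1.27 | trnT-UGU | Ser | AGU | 430 | 1.25 | |
| Thr | ACG | 157 | 0.46 | | Ser | AGC | 102 | 0.3 | trnS-GCU |
| Ala | GCU | 626 | 1.82 | | Gly | GGU | 604 | 1.39 | |
| Ala | GCC | 203 | 0.59 | | Gly | GGC | 141 | 0.33 | trnG-GCC |
| Ala | GCA | 434 | 1.26 | trnA-UGC | Gly | GGA | 714 | 1.65 | |
| Ala | GCG | 112 | 0.33 | | Gly | GGG | 276 | 0.64 | |

RSCU: relative synonymous codon usage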
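The RSCU values in Table 3 can be reproduced from the codon counts. For one amino acid, the RSCU of a codon is its observed count divided by the mean count of all synonymous codons for that amino acid. The sketch below is an illustration only (the study used MEGA 6.0); the function name `rscu` is hypothetical.

```python
def rscu(codon_counts):
    """RSCU for the synonymous codons of one amino acid.

    codon_counts maps codon -> observed count; the expected count under
    equal usage is the mean of the counts.
    """
    mean_count = sum(codon_counts.values()) / len(codon_counts)
    return {codon: count / mean_count for codon, count in codon_counts.items()}


# Counts for Phe and Ile taken from Table 3.
phe = rscu({"UUU": 971, "UUC": 516})
ile = rscu({"AUU": 1146, "AUC": 426, "AUA": 763})

for codon, value in {**phe, **ile}.items():
    print(codon, round(value, 2))
```

Rounded to two decimals, these values match the Phe and Ile entries of Table 3. An amino acid with a single codon, such as Met or Trp, always yields an RSCU of exactly 1.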

Repeat and SSR analysis

SSRs are a class of tandemly repeated sequences that consist of 1–6 nucleotide repeat units. SSRs are important in plant typing and are widely developed as molecular genetic markers for species identification. In total, 58 SSR loci were found in the A. compactum cp genome (Table 4), of which 47 were composed only of A or T mononucleotide repeats. In addition, 10 SSRs were composed of di-nucleotide (AT/TA) repeats and one SSR of tri-nucleotide (ATA) repeats. The SSRs in the A. compactum cp genome are therefore rich in A/T, as has been reported in many plant families [23–25]. Among these SSRs, 17 were located in protein-coding genes (five in coding regions and 12 in introns) and one was located in a tRNA gene. No tetra-, penta- or hexa-nucleotide repeats longer than 15 bp were detected. REPuter identified 24 repeats, including 13 forward and 11 palindromic repeats (Table 5). Most repeats were located in intronic and intergenic regions, although a few were located in protein-coding regions [26]. As reported for other genomes, the gene richest in repeats was ycf2, carrying two direct and two palindromic repeats.
Table 4
Simple sequence repeats in the A. compactum cp genome

| cpSSR ID | Repeat motif | Length (bp) | Start | End | Region | Annotation |
|---|---|---|---|---|---|---|
| 1 | (T)10 | 10 | 3975 | 3984 | LSC | trnK-UUU |
| 2 | (A)10 | 10 | 4328 | 4337 | LSC | |
| 3 | (TA)6 | 12 | 4900 | 4911 | LSC | |
| 4 | (A)10 | 10 | 5287 | 5296 | LSC | rps16 intron |
| 5 | (A)11 | 11 | 6253 | 6263 | LSC | |
| 6 | (TA)6 | 12 | 6609 | 6620 | LSC | |
| 7 | (A)10 | 10 | 7204 | 7213 | LSC | |
| 8 | (AT)6 | 12 | 7521 | 7532 | LSC | |
| 9 | (A)10 | 10 | 7700 | 7709 | LSC | |
| 10 | (T)12 | 12 | 8633 | 8644 | LSC | |
| 11 | (A)13 | 13 | 14,885 | 14,897 | LSC | |
| 12 | (T)10 | 10 | 17,474 | 17,483 | LSC | |
| 13 | (A)10 | 10 | 19,831 | 19,840 | LSC | rpoC2 |
| 14 | (T)11 | 11 | 24,121 | 24,131 | LSC | rpoC1 intron |
| 15 | (A)10 | 10 | 28,802 | 28,811 | LSC | |
| 16 | (A)15 | 15 | 29,013 | 29,027 | LSC | |
| 17 | (A)11 | 11 | 30,868 | 30,878 | LSC | |
| 18 | (T)10 | 10 | 35,129 | 35,138 | LSC | |
| 19 | (TA)7 | 14 | 38,632 | 38,645 | LSC | |
| 20 | (A)12 | 12 | 39,292 | 39,303 | LSC | |
| 21 | (A)12 | 12 | 47,481 | 47,492 | LSC | |
| 22 | (T)10 | 10 | 48,986 | 48,995 | LSC | |
| 23 | (A)10 | 10 | 50,236 | 50,245 | LSC | |
| 24 | (AT)7 | 14 | 50,395 | 50,408 | LSC | |
| 25 | (T)10 | 10 | 51,829 | 51,838 | LSC | |
| 26 | (T)11 | 11 | 52,709 | 52,719 | LSC | |
| 27 | (ATA)5 | 15 | 54,345 | 54,359 | LSC | |
| 28 | (A)11 | 11 | 54,562 | 54,572 | LSC | |
| 29 | (T)10 | 10 | 58,778 | 58,787 | LSC | |
| 30 | (T)11 | 11 | 59,269 | 59,279 | LSC | |
| 31 | (A)12 | 12 | 60,919 | 60,930 | LSC | |
| 32 | (T)10 | 10 | 61,621 | 61,630 | LSC | |
| 33 | (AT)6 | 12 | 63,489 | 63,500 | LSC | |
| 34 | (A)12 | 12 | 68,715 | 68,726 | LSC | |
| 35 | (AT)10 | 20 | 69,266 | 69,285 | LSC | |
| 36 | (T)10 | 10 | 70,716 | 70,725 | LSC | |
| 37 | (A)10 | 10 | 72,600 | 72,609 | LSC | rps18 |
| 38 | (TA)7 | 14 | 74,094 | 74,107 | LSC | rps12 intron |
| 39 | (A)10 | 10 | 74,569 | 74,578 | LSC | clpP intron |
| 40 | (T)11 | 11 | 74,845 | 74,855 | LSC | clpP intron |
| 41 | (T)10 | 10 | 75,108 | 75,117 | LSC | clpP intron |
| 42 | (T)10 | 10 | 75,572 | 75,581 | LSC | clpP intron |
| 43 | (T)10 | 10 | 75,831 | 75,840 | LSC | clpP intron |
| 44 | (A)10 | 10 | 79,177 | 79,186 | LSC | |
| 45 | (AT)6 | 12 | 79,751 | 79,762 | LSC | petB intron |
| 46 | (T)10 | 10 | 86,407 | 86,416 | LSC | rpl16 intron |
| 47 | (T)11 | 11 | 88,970 | 88,980 | IRa | |
| 48 | (T)10 | 10 | 116,573 | 116,582 | IRa | ycf1 |
| 49 | (A)11 | 11 | 120,872 | 120,882 | SSC | |
| 50 | (T)11 | 11 | 121,055 | 121,065 | SSC | |
| 51 | (A)11 | 11 | 128,865 | 128,875 | SSC | ndhA intron |
| 52 | (T)10 | 10 | 129,188 | 129,197 | SSC | ndhA intron |
| 53 | (AT)6 | 12 | 131,778 | 131,789 | SSC | |
| 54 | (T)11 | 11 | 133,103 | 133,113 | SSC | |
| 55 | (T)12 | 12 | 133,236 | 133,247 | SSC | |
| 56 | (T)11 | 11 | 133,374 | 133,384 | SSC | ycf1 |
| 57 | (A)10 | 10 | 135,507 | 135,516 | IRb | ycf1 |
| 58 | (A)11 | 11 | 163,109 | 163,119 | IRb | |
Table 5
Long repeat sequences in the A. compactum cp genome

| ID | Repeat start 1 | Type | Size (bp) | Repeat start 2 | Mismatch (bp) | E value | Gene | Region |
|---|---|---|---|---|---|---|---|---|
| 1 | 3990 | P | 34 | 3996 | −3 | 4.12E−06 | trnK-UUU (intron) | LSC |
| 2 | 8768 | P | 31 | 48,057 | −3 | 1.98E−04 | IGS; trnS-GGA | LSC |
| 3 | 10,522 | F | 30 | 39,347 | −3 | 7.15E−04 | trnG-GCC (intron) | LSC |
| 4 | 31,322 | P | 32 | 31,352 | −3 | 5.46E−05 | IGS | LSC |
| 5 | 32,991 | F | 30 | 33,020 | −3 | 7.15E−04 | IGS | LSC |
| 6 | 39,660 | P | 32 | 39,701 | 0 | 4.08E−10 | IGS | LSC |
| 7 | 41,551 | F | 58 | 43,775 | −3 | 7.54E−20 | psaB; psaA | LSC |
| 8 | 41,595 | F | 37 | 43,819 | −2 | 2.39E−09 | psaB; psaA | LSC |
| 9 | 63,481 | P | 31 | 126,101 | −3 | 1.98E−04 | IGS | LSC; SSC |
| 10 | 63,481 | F | 31 | 126,106 | −3 | 1.98E−04 | IGS | LSC; SSC |
| 11 | 63,487 | F | 32 | 69,264 | −3 | 5.46E−05 | IGS | LSC |
| 12 | 67,809 | P | 31 | 67,864 | −2 | 6.83E−06 | IGS | LSC |
| 13 | 71,632 | F | 30 | 71,659 | 0 | 6.53E−09 | IGS | LSC |
| 14 | 72,281 | F | 42 | 72,302 | −3 | 1.21E−10 | rps18 | LSC |
| 15 | 91,249 | F | 46 | 91,299 | −1 | 2.10E−16 | trnI-CAU; IGS | IRa |
| 16 | 91,249 | P | 46 | 160,743 | −1 | 2.10E−16 | trnI-CAU; IGS | IRa; IRb |
| 17 | 91,299 | P | 46 | 160,793 | −1 | 2.10E−16 | IGS | IRa; IRb |
| 18 | 93,917 | F | 30 | 93,938 | −3 | 7.15E−04 | ycf2 | IRa |
| 19 | 93,917 | P | 30 | 158,120 | −3 | 7.15E−04 | ycf2 | IRa; IRb |
| 20 | 93,938 | P | 30 | 158,141 | −3 | 7.15E−04 | ycf2 | IRa; IRb |
| 21 | 121,695 | P | 30 | 121,723 | −3 | 7.15E−04 | IGS | SSC |
| 22 | 158,122 | F | 30 | 158,143 | −3 | 7.15E−04 | ycf2 | IRb |
| 23 | 160,743 | F | 46 | 160,793 | −1 | 2.10E−16 | IGS | IRb |
| 24 | 160,762 | F | 30 | 160,812 | −3 | 7.15E−04 | IGS | IRb |

F: forward; P: palindromic; IGS: intergenic spacer
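The distinction between forward (F) and palindromic (P) repeats in Table 5 can be illustrated with a small check on a pair of equal-length sequences. This sketch is not REPuter and does not search a genome; it only classifies a given pair under the Hamming-distance threshold of three stated in the Methods. The function names and toy sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def classify_repeat(copy1, copy2, max_mismatch=3, min_length=30):
    """Return 'F' (forward), 'P' (palindromic) or None for a pair of copies."""
    copy1, copy2 = copy1.upper(), copy2.upper()
    if len(copy1) < min_length:
        return None
    if hamming(copy1, copy2) <= max_mismatch:
        return "F"
    if hamming(copy1, reverse_complement(copy2)) <= max_mismatch:
        return "P"
    return None


# Toy sequences, made up for illustration only.
unit = "A" * 15 + "C" * 15
print(classify_repeat(unit, unit))                      # F
print(classify_repeat(unit, reverse_complement(unit)))  # P
print(classify_repeat("A" * 30, "C" * 30))              # None
```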

IR expansion/contraction in the A. compactum cp genome

Variation in the length of angiosperm cp genomes is mainly due to contraction and expansion at the boundaries between the IR regions and the single-copy (SC) regions. A detailed comparison of the IR/SC junctions among A. compactum, C. flaviflora, C. roscoeana and Z. spectabile is presented in Fig. 2. In addition, a size comparison of the cp genomes of the four Zingiberaceae species is shown in Additional file 2: Table S1. Although the IR regions are of similar length in the four species (25,618 to 29,824 bp), some IR contractions and expansions were detected. The rpl22, ycf1 and rps19 pseudogenes, of various lengths, were located at the IRb/LSC or IRb/SSC boundaries. The IRb/LSC junction was located to the left of the rps19 gene in the examined cp genomes, except in Z. spectabile, owing to contraction of the IRa region in the Z. spectabile cp genome. By contrast, the ycf1 pseudogene was located to the left of the IRa/SSC border and was 4–5 bp away from it, except in the A. compactum cp genome. The ycf1 pseudogene was 918 bp long in A. compactum, 1068 bp in C. flaviflora and C. roscoeana, and 924 bp in Z. spectabile. In the A. compactum cp genome, the ycf1 pseudogene was 2969 bp away from the IRa/SSC border, which indicates expansion of the IR region. The trnH gene was located in the LSC region, except in the Z. spectabile cp genome, where it was located in the SSC region, 136 bp away from the IRb/LSC border.
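The coordinates of the junctions in the A. compactum cp genome follow directly from the region lengths in Table 1. The sketch below derives them and assigns a position to its region; the region order LSC, IRa, SSC, IRb starting at position 1 is an assumption, chosen because it reproduces the Region column of Table 4. The function name `region_of` is hypothetical.

```python
from itertools import accumulate

# Region lengths (bp) from Table 1. Assumed order: LSC, IRa, SSC, IRb,
# with the LSC starting at position 1.
REGIONS = [("LSC", 88_535), ("IRa", 29_824), ("SSC", 15_370), ("IRb", 29_824)]
# Last position of each region: cumulative sums of the lengths.
ENDS = list(accumulate(length for _, length in REGIONS))


def region_of(position):
    """Return the name of the region containing a 1-based genome position."""
    if position < 1 or position > ENDS[-1]:
        raise ValueError("position outside the genome")
    for (name, _), end in zip(REGIONS, ENDS):
        if position <= end:
            return name


print(ENDS)  # [88535, 118359, 133729, 163553]
# Start positions of cpSSRs 46, 47, 49 and 57 from Table 4.
for ssr_id, start in [(46, 86_407), (47, 88_970), (49, 120_872), (57, 135_507)]:
    print(ssr_id, region_of(start))
```

Under this assumption the LSC/IRa, IRa/SSC and SSC/IRb junctions fall after positions 88,535, 118,359 and 133,729, and the four example SSRs are assigned to LSC, IRa, SSC and IRb, as in Table 4.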

Comparison with other Zingiberaceae cp genomes

Three sequences representing the Zingiberaceae (C. flaviflora, C. roscoeana and Z. spectabile) were selected for comparison with A. compactum. Pairwise cp genome alignments between A. compactum and the three other cp genomes revealed a high degree of synteny (Additional file 3: Figure S1, Additional file 4: Figure S1 and Additional file 5: Figure S3). To detect divergent regions in the cp genome, sequence identities among the four Zingiberaceae cp genomes were compared with mVISTA, using the annotation of A. compactum as the reference. The multiple sequence alignment showed that the coding regions are highly conserved, whereas the non-coding regions are divergent (Fig. 3). For example, the intergenic sequences trnT-GGU–psbD, rps16–trnQ-UUG, atpH–atpI, trnE-UUC–trnT-GGU, trnT-UGU–trnL-UAA, petA–psbL and psaC–ndhE were highly divergent, and some of them have also been reported as divergent sequences in other plants. The LSC and SSC regions were clearly more divergent than the IR regions.

Phylogenetic analysis

Cp genomes are widely employed in phylogenetic studies of evolution. To examine the phylogenetic position of A. compactum and its relationships within Zingiberales, MP and ML phylogenetic analyses were performed on 67 protein-coding gene sequences from 15 plant taxa, including the A. compactum genome sequenced in this study. The total alignment was 51,452 bp in length. The results are presented in Figs. 4 and 5. The basic topologies of the MP and ML trees were similar, with a few differences. Bootstrap support was generally high: nine of the 12 nodes in the MP tree had bootstrap values of ≥ 90%, and eight of the 12 nodes in the ML tree had bootstrap values of 100%. The Zingiberaceae species A. compactum, C. flaviflora, C. roscoeana and Z. spectabile were grouped together in both the MP and ML trees with 100% bootstrap values. In the MP tree, the four Zingiberaceae species formed a distinct clade that was separated from the rest of Zingiberales with high bootstrap values at every node. By contrast, the ML tree was mainly divided into two clades, one comprising the Strelitziaceae, Heliconiaceae, Musaceae and Lowiaceae species and the other comprising the Zingiberaceae, Costaceae, Cannaceae and Marantaceae species. However, the Zingiberaceae and Costaceae species were grouped with a very low bootstrap value (15%) in the ML tree. These phylogenetic results strongly support the position of A. compactum within Zingiberaceae and provide some helpful hints about relationships within the order Zingiberales.

Conclusion

This study assembled, annotated and analyzed the whole cp genome of A. compactum. The genome has a quadripartite structure, and its gene order, GC content and codon usage are similar to those of other land plant cp genomes. Comparison with three available Zingiberaceae cp genomes showed that genome structure and composition are similar. In addition, the phylogenetic analysis provides new insight into the phylogenetic evolution of this genus. Our results will contribute to species identification and to the study of evolutionary mechanisms needed for further research on A. compactum.

Authors’ contributions

XL and JX initiated the research; MW and QL drafted the paper and processed the data; XL supervised the work; these authors analysed the results and reviewed the manuscript. All authors read and approved the final manuscript.

Acknowledgements

All authors thank Dr. Shi-Lin Chen for his help with group discussions and manuscript polishing.

Competing interests

The authors declare that they have no competing interests.

Availability of data and materials

All data generated or analyzed during this study are included in this article or are available from the appropriate author(s) on reasonable request.

Funding

This work was supported by grants from the National Key Technology Support Program (2015BAI05B02) and from the China Academy of Chinese Medical Sciences Special Fund for Thirteen-five key research (ZZ10-007).

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Literatur
1.
Zurück zum Zitat Bendich AJ. Circular chloroplast chromosomes: the grand illusion. Plant Cell Online. 2004;16:1661–6.CrossRef Bendich AJ. Circular chloroplast chromosomes: the grand illusion. Plant Cell Online. 2004;16:1661–6.CrossRef
2.
Zurück zum Zitat Chumley TW, Palmer JD, Mower JP, Fourcade HM, Calie PJ, et al. The complete chloroplast genome sequence of Pelargonium hortorum: organization and evolution of the largest and most highly rearranged chloroplast genome of land plants. Mol Biol Evol. 2006;23:2175–90.CrossRefPubMed Chumley TW, Palmer JD, Mower JP, Fourcade HM, Calie PJ, et al. The complete chloroplast genome sequence of Pelargonium hortorum: organization and evolution of the largest and most highly rearranged chloroplast genome of land plants. Mol Biol Evol. 2006;23:2175–90.CrossRefPubMed
3.
Zurück zum Zitat Palmer JD. Plastid chromosomes: structure and evolution. In: Bogorad L, Vasil I, editors. Cell culture and somatic cell genetics of plants. San Diego: Academic Press; 1991. p. 5–53. Palmer JD. Plastid chromosomes: structure and evolution. In: Bogorad L, Vasil I, editors. Cell culture and somatic cell genetics of plants. San Diego: Academic Press; 1991. p. 5–53.
4.
Zurück zum Zitat Raubeson LA, Jansen RK. Chloroplast genomes of plants. In: Henry RJ, editor. Plant diversity and evolution: genotypic and phenotypic variation in higher plants. Cambridge: CAB International; 2005. p. 45–68.CrossRef Raubeson LA, Jansen RK. Chloroplast genomes of plants. In: Henry RJ, editor. Plant diversity and evolution: genotypic and phenotypic variation in higher plants. Cambridge: CAB International; 2005. p. 45–68.CrossRef
5.
Zurück zum Zitat Jansen RK, Cai Z, Raubeson LA, Daniell H, dePamphilis CW, et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc Natl Acad Sci. 2007;104:19369–74.CrossRefPubMedPubMedCentral Jansen RK, Cai Z, Raubeson LA, Daniell H, dePamphilis CW, et al. Analysis of 81 genes from 64 plastid genomes resolves relationships in angiosperms and identifies genome-scale evolutionary patterns. Proc Natl Acad Sci. 2007;104:19369–74.CrossRefPubMedPubMedCentral
6.
Zurück zum Zitat Givnish TJ, Ames M, McNeal JR, McKain MR, Steele PR, et al. Assembling the tree of the monocotyledons: plastome sequence phylogeny and evolution of Poales. Ann Mo Bot Gard. 2010;97:584–616.CrossRef Givnish TJ, Ames M, McNeal JR, McKain MR, Steele PR, et al. Assembling the tree of the monocotyledons: plastome sequence phylogeny and evolution of Poales. Ann Mo Bot Gard. 2010;97:584–616.CrossRef
7.
Zurück zum Zitat Millen RS, Olmstead RG, Adams KL, Palmer JD, Lao NT, et al. Many parallel losses of infA from chloroplast DNA during angiosperm evolution with multiple independent transfers to the nucleus. Plant Cell Online. 2001;13:645–58.CrossRef Millen RS, Olmstead RG, Adams KL, Palmer JD, Lao NT, et al. Many parallel losses of infA from chloroplast DNA during angiosperm evolution with multiple independent transfers to the nucleus. Plant Cell Online. 2001;13:645–58.CrossRef
8.
Zurück zum Zitat Guisinger M, Chumley T, Kuehl J, Boore J, Jansen R. Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae. J Mol Evol. 2010;70:149–66.CrossRefPubMedPubMedCentral Guisinger M, Chumley T, Kuehl J, Boore J, Jansen R. Implications of the plastid genome sequence of Typha (Typhaceae, Poales) for understanding genome evolution in Poaceae. J Mol Evol. 2010;70:149–66.CrossRefPubMedPubMedCentral
9.
Zurück zum Zitat Downie SR, Palmer JD. Use of chloroplast DNA rearrangements in reconstructing plant phylogeny. In: Soltis PS, Soltis DE, Doyle JJ, editors. Molecular systematics of plants. New York: Chapman and Hall; 1992. p. 14–35.CrossRef Downie SR, Palmer JD. Use of chloroplast DNA rearrangements in reconstructing plant phylogeny. In: Soltis PS, Soltis DE, Doyle JJ, editors. Molecular systematics of plants. New York: Chapman and Hall; 1992. p. 14–35.CrossRef
10.
Zurück zum Zitat Li X-W, Hu Z-G, Lin X-H, Li Q, Gao H-H, Luo G-A, et al. High-throughput pyrosequencing of the complete chloroplast genome of Magnolia officinalis and its application in species identification. Acta Pharm Sin. 2012;47:124–30. Li X-W, Hu Z-G, Lin X-H, Li Q, Gao H-H, Luo G-A, et al. High-throughput pyrosequencing of the complete chloroplast genome of Magnolia officinalis and its application in species identification. Acta Pharm Sin. 2012;47:124–30.
11.
Zurück zum Zitat Liu C, Shi LC, Zhu YJ, Chen HM, Zhang JH, Lin XH, et al. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences. BMC Genom. 2012;13:715.CrossRef Liu C, Shi LC, Zhu YJ, Chen HM, Zhang JH, Lin XH, et al. CpGAVAS, an integrated web server for the annotation, visualization, analysis, and GenBank submission of completely sequenced chloroplast genome sequences. BMC Genom. 2012;13:715.CrossRef
13.
Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–64.
14.
Lohse M, Drechsel O, Kahlau S, Bock R. OrganellarGenomeDRAW—a suite of tools for generating physical maps of plastid and mitochondrial genomes and visualizing expression data sets. Nucl Acids Res. 2013;41(Web Server issue):W575.
15.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9.
16.
Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001;29:4633–42.
17.
Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5:1–9.
18.
Mayor C, Brudno M, Schwartz JR, Poliakov A, Rubin EM, Frazer KA, et al. VISTA: visualizing global DNA sequence alignments of arbitrary length. Bioinformatics. 2000;16:1046–7.
19.
20.
Swofford DL. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4.0b10. Sunderland: Sinauer Associates; 2003.
21.
Tangphatsornruang S, Sangsrakru D, Chanprasert J, Uthaipaisanwong P, Yoocha T, Jomchai N, et al. The chloroplast genome sequence of mungbean (Vigna radiata) determined by high-throughput pyro-sequencing: structural organization and phylogenetic relationships. DNA Res. 2010;17:1–22.
22.
Qian J, Song J, Gao H, Zhu Y, Xu J, Pang X, et al. The complete chloroplast genome sequence of the medicinal plant Salvia miltiorrhiza. PLoS ONE. 2013;8:e57607.
24.
Melotto-Passarin D, Tambarussi E, Dressano K, De Martin V, Carrer H. Characterization of chloroplast DNA microsatellites from Saccharum spp. and related species. Genet Mol Res. 2011;10:2024–33.
25.
Martin G, Baurens FC, Cardi C, Aury JM, D’Hont A. The complete chloroplast genome of banana (Musa acuminata, Zingiberales): insight into plastid monocotyledon evolution. PLoS ONE. 2013;8:e67350.
26.
Yang Y, Dang YY, Li Q, Lu JJ, Li XW, Wang YT. Complete chloroplast genome sequence of poisonous and medicinal plant Datura stramonium: organizations and implications for genetic engineering. PLoS ONE. 2014;9(11):e110656.
Metadata
Title
Complete chloroplast genome of the medicinal plant Amomum compactum: gene organization, comparative analysis, and phylogenetic relationships within Zingiberales
Authors
Ming-li Wu
Qing Li
Jiang Xu
Xi-wen Li
Publication date
1 December 2018
Publisher
BioMed Central
Published in
Chinese Medicine / Issue 1/2018
Electronic ISSN: 1749-8546
DOI
https://doi.org/10.1186/s13020-018-0164-2
