Introduction
Lonicera japonica Thunb., also known as Japanese honeysuckle, ‘Jin Yin Hua’, and ‘Ren Dong’, belongs to the Caprifoliaceae family and is often used in traditional Chinese and Japanese medicine [1].
L. japonica is native to eastern Asia and is cultivated worldwide, particularly in China, Japan, and Korea, both for its medicinal properties and as an ornamental plant valued for its pleasant-smelling flowers and attractive evergreen foliage [2]. However, because it is highly invasive in some regions, such as New Zealand and North America, it is considered a major nuisance and is restricted there [3].
L. japonica has been used in traditional medicine in China for thousands of years. It is listed as top grade in ‘Ming Yi Bie Lu’ and ‘Shen Nong Ben Cao Jing’, and was described in ‘Ben Cao Gang Mu’, the famous classical book of Chinese Materia Medica, as early as the seventeenth century, for applications such as clearing away heat-evil and reducing swelling [
1]. Different parts of
L. japonica have been reported to possess unique medicinal properties, with flowers and floral buds being widely used in traditional Chinese medicine, while the leaves and stems are used in Japan [
2,
4,
5]. The commercial value of
L. japonica in the herbal medicine market has increased several hundred-fold in recent years, and >30 % of current traditional Chinese medicine prescriptions contain extracts from different parts of
L. japonica [
6]. Since 1995,
L. japonica has been included in the Chinese Pharmacopoeia, with >500 prescriptions containing
L. japonica being used for the treatment of various diseases [
1].
The whole plant or aerial parts of L. japonica, particularly the leaves and floral buds, are used to derive bioactive metabolites for various preparations and medicinal uses [
7]. Modern pharmacological studies have shown that extracts from
L. japonica possess a wide range of bioactive properties, including anti-bacterial, anti-inflammatory, anti-viral, anti-pyretic, anti-oxidant, anti-hyperlipidemic, and anti-nociceptive activities, among others [
2,
5,
8‐
15]. Extracts from
L. japonica have been used to prevent and treat severe acute respiratory syndrome, H1N1 influenza, and hand, foot, and mouth disease, and were reported to be effective against SARS coronavirus [
2]. Apart from its application in traditional medicine,
L. japonica has also been used in health beverages such as ‘Jin Yin Hua’ tea or ‘Jin Yin Hua’ wine, in cosmetics such as ‘Jin Yin Hua’ floral water, and even as an active ingredient of toothpaste to prevent oral cavity diseases [
1].
The major chemical constituents of
L. japonica extracts include phenolic acids [
16,
17], flavonoids [
18,
19], volatile oils [
20,
21], and saponins [
22‐
27], which predominantly account for its wide range of pharmacological properties. Chlorogenic acid (CGA) is a potent phenolic acid derived from phenylalanine and is considered to have several important biological activities. CGAs, a group of esters formed between certain trans-cinnamic acids, such as caffeic acid and ferulic acid, and quinic acid, are primary phenylpropanoids generated through the shikimic acid pathway; they have high anti-oxidant activity and are therefore often used in medicines and foods. Studies have shown strong anti-bacterial, anti-oxidant, and anti-diabetic activities attributed to CGA [
1,
28,
29]. Luteolin and its sugar-conjugated derivatives, luteolosides, are also derived from the phenylpropanoid metabolic pathway and are major constituents of
L. japonica extracts. Studies have shown luteolin and luteolosides to possess anti-oxidative, anti-inflammatory, anti-tumor, and anti-5-lipoxygenase activity [
30]. CGA and luteolosides are the major constituents of
L. japonica and are used as standard compounds for assessing its quality [
28]. Besides CGA and luteolosides, secoiridoids such as loganin, secologanin, sweroside, and kingiside, among others, have been identified in extracts of
L. japonica. In the past decades, >30 iridoids from
L. japonica have been identified and reported [
1,
27,
31,
32]. Iridoids and secoiridoids are pharmaceutically active metabolites, and are known to possess anti-tumor, anti-inflammatory, and anti-oxidant activities and hepatoprotective effects [
32‐
37]. In the Japanese Pharmacopoeia, loganin and CGA are recommended as markers for evaluating the quality of
L. japonica. Several studies on chemical constituents across different tissues have shown a higher content of CGA and luteolosides in floral buds, leaves and stems of
L. japonica [
1,
5,
7]. The CGA content in
L. macranthoides, a species closely related to
L. japonica, was reported to be higher in young leaves and young stems compared to flowers [
38]. The content of CGA, luteolosides, and other bioactive constituents of
L. japonica varies with tissue, extraction period or season, and habitat.
Recent advances in next-generation sequencing and in the computational resources needed for de novo transcriptome assembly and analysis have revolutionized the field of phytochemistry, especially for non-model plants [
39‐
41]. RNA-seq-based transcriptome profiling provides a broad overview of active metabolic processes and their localization, and applying different statistical approaches to these data helps identify candidate genes involved in a pathway of interest. Previous transcriptome-based studies on
L. japonica described transcripts across leaves and different floral developmental stages, and were focused on CGA, luteolosides, and flavonoid biosynthesis [
2,
6]. However, the number of tissues used for de novo transcriptome assembly in those studies was limited, and the resulting assemblies therefore do not represent a complete transcriptome for
L. japonica. Furthermore, genes involved in the metabolic pathways of secoiridoids, a major class of chemical constituents with important pharmaceutical properties, have not been studied in
L. japonica. Our study attempts to bridge this gap. We performed deep RNA sequencing for nine different tissues of
L. japonica, yielding over 24 Gbp of reads; de novo transcriptome assembly combining three popular assemblers resulted in 243,185 unigenes. The transcriptome assembly thus obtained is a more complete representation of the transcripts and ongoing metabolic processes of
L. japonica. By using multiple transcriptome assemblers and integrating their assemblies into the final de novo transcriptome assembly of
L. japonica, we captured diverse transcripts and improved both the N50 value and the number of contigs assembled. Homologs for all enzymes from the CGA, luteolin, and secoiridoid metabolic pathways were identified. Transcriptome abundance estimation across all nine tissues of
L. japonica showed that unigenes associated with key metabolic pathways were highly expressed in the young leaf and shoot apex. We also identified cytochrome P450s and UDP-glycosyltransferases, two major enzyme families involved in secondary metabolic pathways, which will serve as a basis for future validation and characterization. This study therefore presents comprehensive transcriptome profiling and analysis for
L. japonica, and will be useful as a resource for future functional characterization of enzymes of interest.
Materials and methods
Plant material preparation, RNA extraction, and library preparation
All nine tissues of L. japonica, namely shoot apex, stem, leaf-1 (youngest leaf near shoot apex), leaf-2 (second leaf), leaf-3 (mature leaf), green floral bud, white floral bud, white flower, and yellow flower, were harvested in June 2014. The plants were cultivated in the natural environment of the Chiba University pharmaceutical garden, Chiba (35°36′17.7″N; 140°08′06.9″E). All tissues were harvested on ice, cut into small pieces, snap-frozen in liquid N2, and stored at −80 °C until RNA extraction.
The frozen tissues from L. japonica were powdered using a multi-bead shocker (Yasui Kikai, Japan), and total RNA was then extracted using the RNeasy Plant Mini Kit (Qiagen, USA) according to the manufacturer’s instructions. RNA quality was assessed using an Agilent Bioanalyzer 2100 (Agilent Technology, USA), and RNA samples with an RNA integrity number (RIN) above 8 were used for cDNA library preparation.
mRNA for each sample was isolated from the total RNA using oligo(dT) beads and sheared into short fragments by adding fragmentation buffer. These fragments were then used as templates for the synthesis of first-strand cDNA using random hexamer primers. The cDNA library for Illumina sequencing was prepared using the SureSelect Strand Specific RNA library kit (Agilent Technology, USA) according to the manufacturer’s instructions.
Illumina sequencing and pre-processing of raw reads
The cDNA libraries were sequenced using an Illumina HiSeq™ 2000 sequencer (Illumina Inc., USA) to obtain paired-end reads with an average length of 101 bp. cDNA library preparation and sequencing were performed at Kazusa DNA Research Institute, Chiba, Japan. The raw read sequences, transcriptome assembly, and RSEM-based transcript abundance data for the nine tissues of L. japonica discussed in this study have been deposited in NCBI’s Gene Expression Omnibus (GEO) and are accessible through GEO Series accession number GSE81949.
Raw reads obtained through Illumina sequencing were pre-processed using the Trimmomatic program [42] to remove adaptor sequences, empty reads, reads with >5 % ambiguous ‘N’ bases, low-quality reads (Phred score <20), and reads shorter than 50 bp. The resulting clean reads, both paired and unpaired (forward and reverse), were all used for de novo transcriptome assembly.
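The three read-level thresholds named above can be illustrated with a minimal Python sketch. This is not the Trimmomatic implementation, and adaptor removal is not shown. The function names are hypothetical, the quality string is assumed to use Phred+33 encoding, and the Phred cut-off is assumed to apply to the mean score of a read, which the text does not specify.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def passes_filters(seq, quality_string, max_n_fraction=0.05,
                   min_mean_phred=20, min_length=50):
    """Return True if a read survives the length, 'N' and quality thresholds.

    Assumption: the Phred threshold is applied to the mean score of the read.
    """
    # Empty and short reads are rejected first (this also avoids dividing by zero).
    if len(seq) < min_length:
        return False
    # Reject reads in which more than 5 % of the bases are ambiguous.
    n_fraction = seq.upper().count("N") / len(seq)
    if n_fraction > max_n_fraction:
        return False
    # Reject reads whose mean Phred score is below the threshold.
    scores = phred33_scores(quality_string)
    return sum(scores) / len(scores) >= min_mean_phred
```

For example, a 60 bp read containing ten ‘N’ bases has an ambiguous fraction of about 17 % and would be discarded regardless of its quality scores.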
De novo transcriptome assembly and transcriptome expression analysis
De novo transcriptome assembly for
L. japonica was obtained by merging the assemblies from three popular assemblers, namely SOAPdenovo-Trans, Trinity v2.0.6, and CLC Genomics Workbench v8.0.3 (
https://www.qiagenbioinformatics.com/) (Qiagen, USA). For SOAPdenovo-Trans, we performed six independent de novo transcriptome assemblies using k-mer sizes of 31, 41, 51, 63, 71, and 91, and the resulting assemblies were analyzed using a Perl script from assemblathon_2 to obtain N50 values and other assembly statistics [
66]. De novo transcriptome assemblies using Trinity and CLC Genomics Workbench were performed with the default k-mer size and default parameters. The SOAPdenovo-Trans assembly with a k-mer size of 31 emerged as the best on the basis of different assembly parameters and was pooled with the assemblies from Trinity [50] and CLC Genomics Workbench into one merged assembly, which was processed by CD-HIT-EST v4.6 (built on Mar 5, 2015) [51, 52] with the parameters ‘-c 0.95 -n 8’ to remove sequence redundancy. Sequences shorter than 200 bp were dropped, and the resulting de novo transcriptome assembly was used for further characterization. For transcriptome expression analysis, clean paired reads for each tissue were aligned to the
L. japonica transcriptome assembly using the Bowtie 2.0 program [
62], and the RSEM program [
63] was used for abundance estimation. Unigene expression was calculated as fragments per kilobase of transcript per million mapped fragments (FPKM). Unsupervised principal component analysis for all nine tissues was performed with the DESeq2 program [
64] using count data for unigenes obtained from the RSEM program. GC content and basic statistics for unigenes were calculated as described previously [
67].
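Three quantities used in this section, namely the N50 value, the 200 bp length cut-off, and FPKM, can be sketched in Python as follows. This is a minimal illustration with hypothetical function names, not the assemblathon_2 script or the RSEM implementation. In particular, whether the transcript length passed to `fpkm` is the actual or the effective length is left to the caller and is an assumption here.

```python
def n50(lengths):
    """Return the length L such that contigs of length >= L contain
    at least half of all assembled bases (0 for an empty assembly)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


def drop_short(sequences, min_length=200):
    """Keep only sequences of at least min_length bases.

    `sequences` is a dict mapping a sequence name to its nucleotide string.
    """
    return {name: seq for name, seq in sequences.items()
            if len(seq) >= min_length}


def fpkm(fragment_count, length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragment_count * 1e9 / (length_bp * total_mapped_fragments)
```

As a worked case, contigs of 2, 3, 4, 5, and 6 bp total 20 bp; the two longest (6 + 5 = 11 bp) already cover half of that, so the N50 is 5.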
Functional annotation and classification of de novo transcriptome assembly
We performed a homology search with the Blastx program, using the L. japonica transcriptome assembly as a query against the NCBI non-redundant (nr) protein database (http://www.ncbi.nlm.nih.gov; formatted in October 2015), with a cut-off E value of <10⁻⁵ and a maximum of 20 allowed hits. The top hit for each unigene was used to annotate the transcriptome. For further characterization of
L. japonica transcriptome assembly, we used the Blast2GO v 3.0 program [
55] to assign GO terms, EC numbers, and KEGG pathway information to the unigenes using the associated Blastx results. GO level distribution and visualization of the top 20 GO terms from three broad categories (biological process, molecular function, and cellular component) at level 3 for
L. japonica transcriptome assembly were performed using Blast2GO.
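The selection of one top hit per unigene under the E value cut-off can be sketched as below. The sketch assumes that the Blastx results are available in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E value, bit score), which the text does not state, and it assumes that the "top hit" is the one with the highest bit score. The function name is hypothetical.

```python
import csv


def top_hits(tabular_lines, max_evalue=1e-5):
    """Return {query: (subject, evalue, bitscore)} with the best hit per query.

    Assumptions: 12-column tabular BLAST output; the best hit is the one
    with the highest bit score among hits with E value < max_evalue.
    """
    best = {}
    for row in csv.reader(tabular_lines, delimiter="\t"):
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```

The function accepts any iterable of lines, so an open file handle can be passed directly, for example `with open(path) as handle: hits = top_hits(handle)`, where `path` points to the tabular result file.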
Simple sequence repeat (SSR) detection
The transcriptome assembly for
L. japonica was searched to identify the composition, frequency, and distribution of SSRs using the microsatellite identification tool (MISA) (
http://pgrc.ipk-gatersleben.de/misa/) [
61]. The search parameters were set to recognize motifs up to hexamers, with each motif-length category required to have at least ten repeats.