Introduction

Glycosaminoglycans (GAGs) are long, linear, negatively charged hetero-polysaccharides consisting of repeating disaccharide units, usually of a hexuronic acid linked to a hexosamine. Glycosaminoglycans are found ubiquitously throughout the animal kingdom, where they are involved in a wide variety of biological processes. The most common glycosaminoglycans are chondroitin sulfate, dermatan sulfate, heparan sulfate, heparin, hyaluronan and keratan sulfate. An overview is given in Table 1.

Table 1 Overview of the most common glycosaminoglycans. GlcA, glucuronic acid; GalNAc, N-acetyl-galactosamine; GlcNAc, N-acetyl-glucosamine; IdoA, iduronic acid; GlcNS, N-sulfate-glucosamine; Gal, galactose; C#, carbon number; ECM, extracellular matrix

The glycosaminoglycan hexuronic acid residue is either D-glucuronic acid or its C5 epimer L-iduronic acid. The latter is a rather unique hexuronic acid that is typically found in glycosaminoglycans, while D-glucuronic acid is a very common compound in nature. The epimerization of D-glucuronic acid to L-iduronic acid is mediated by a D-glucuronyl C5-epimerase acting at the polysaccharide level after D-glucuronic acid incorporation [1, 2]. The C5-epimerization of D-glucuronic acid is essential for the specific binding properties of versatile glycosaminoglycans like heparin and heparan sulfate. Another D-glucuronyl C5-epimerase is responsible for dermatan sulfate biosynthesis, and a further C5-epimerase is involved in alginate biosynthesis. These epimerases do not show sequence homology to the heparin D-glucuronyl C5-epimerase [3].

Glycosaminoglycans are traditionally isolated from animal tissue. A major drawback is the great polydispersity of animal-derived material, both in chain length and in degree of epimerization and sulfation. Large-scale chemical synthesis of heparin is not feasible, with the C5-epimerization of D-glucuronic acid being one of the major bottlenecks. L-iduronic acid can be synthesized chemically; however, C5-epimerization of D-glucuronic acid as part of a polymer can only be done enzymatically. In the past decades the C5-epimerase in animal heparin biosynthesis has been identified and characterized [4], but it has some major limitations for use in large-scale biotechnological production processes. Novel D-glucuronyl C5-epimerases that have fewer restrictions in substrate acceptance, better stability and easier production methods would have great potential in controlled chemo-enzymatic synthesis of L-iduronic acid containing polymers like heparin and heparin-analogs. In this review we discuss putative novel C5-epimerases that might convert D-glucuronic acid to L-iduronic acid with fewer restrictions.

Glycosaminoglycan-like structures in bacterial capsules

Traditionally, GAGs were considered to be polymeric structures found only in vertebrate and invertebrate animals. However, in the past decades, some polyanionic bacterial cell wall polysaccharides have been described whose structure resembles that of some well-characterized animal glycosaminoglycans. Microorganisms possessing GAGs are generally pathogenic bacteria in which the surface-exposed capsular polysaccharides are likely to serve as a virulence factor. The resemblance between the bacterial capsule and the animal GAG results in very limited or no response of the host's immune system. GAG-containing pathogens, including serotypes of both Escherichia coli and Pasteurella multocida, are discussed below.

E. coli GAGs

Many different serotypes of E. coli have been described. Discrimination between these strains is generally based on antigenic studies. Most differences are found in a specific part of the bacterial lipopolysaccharide (LPS), the so-called O-antigen. To a lesser extent, differences are seen in the antigenic properties of the flagella (H-antigen) and the bacterial capsule (K-antigen). The latter is a protective layer of polysaccharides that can generally be observed easily by light microscopy. Over 70 different K-antigens have been described [5]. As discussed below, two of them are analogous to glycosaminoglycans as found in animals.

A chondroitin-like glycosaminoglycan has been isolated from E. coli O5:K4:H4 [6]. Like chondroitin, this “K4 capsular polysaccharide” consists of equimolar amounts of β-D-glucuronic acid and β-N-acetyl-galactosamine. Unlike chondroitin, an additional fructose is β(1-3) linked as a substituent to each GlcA residue. However, this modification can easily be removed by mild acidification, resulting in a chondroitin backbone. Upon removal of the fructose residue, the immune response against the K4 polysaccharide decreases considerably.

Another glycosaminoglycan-like structure has been described for the K5 antigen of E. coli O10:K5:H4. This capsular polysaccharide has a structure identical to heparosan, the unsulfated and non-epimerized backbone structure of heparan sulfate and heparin. The K5 capsule is a linear polysaccharide containing α-N-acetyl-glucosamine and β-glucuronic acid in equimolar amounts, linked by a (1-4) glycosidic bond [7]. In contrast to animals, no post-polymerization modifications occur on the heparosan molecule. This makes the K5 polysaccharide a useful substrate to study the enzymes in heparin biosynthesis [8, 9], and a potential precursor for chemo-enzymatic synthesis of heparin.

P. multocida GAGs

Similar glycosaminoglycans have been isolated from several serotypes of another pathogenic gamma-proteobacterium, namely Pasteurella multocida. The capsules of P. multocida types A, D and F could be removed upon treatment with different glycosaminoglycan hydrolases [10]. A more detailed analysis of the capsular polysaccharides of P. multocida types D and F has revealed similarity with the K-antigens of E. coli K5 and K4, respectively. The P. multocida type D polymer is identical to heparosan, and the type F polymer is unmodified chondroitin [11].

Additionally, another P. multocida capsular polysaccharide has been described to be analogous to a vertebrate glycosaminoglycan. The extracellular capsule of P. multocida type A is chemically identical to the animal GAG hyaluronan [12]. In addition, multiple species of Streptococci have been described to have such a hyaluronan capsule. All these strains are pathogenic to humans or other mammals, with the hyaluronan capsule playing an important role in preventing an immune response. Since these molecules are identical to mammalian hyaluronan, bacterially produced hyaluronan has substantial commercial value. Alongside animal-derived hyaluronan, it is nowadays widely commercially available for numerous applications.

Although multiple examples of bacterial GAGs exist, none of them has been observed to carry modifications similar to those in the GAG biosynthesis pathways in animals. A key modification step in these pathways is the C5-epimerization of D-glucuronic acid to L-iduronic acid, catalyzed by a D-glucuronyl C5-epimerase. No bacterial counterpart of this enzyme has been experimentally characterized to date. However, the presence of L-iduronic acid in extracellular polysaccharides of several microorganisms suggests that D-glucuronyl C5-epimerases do exist in prokaryotes.

Identification of iduronic acid in microbes

While L-iduronic acid is a well-known component of (animal) glycosaminoglycans, its presence in prokaryotes is rather uncommon. For some time it was believed that L-iduronic acid could only be found in multicellular eukaryotes. However, as discussed below, in the last decades multiple examples of microbial L-iduronic acid have been published.

Bacteria

The first case of L-iduronic acid being present in a prokaryote was reported in a study of the gram-positive bacterium Clostridium perfringens NCTC 10578 [13]. L-iduronic acid was identified in a purified “type-specific” polysaccharide from Clostridium perfringens strain Hobbs 10. The exact polysaccharide structure is still unknown but the L-iduronic acid level in the isolated polysaccharide is estimated to be 7%. Most likely this “type-specific” polysaccharide is part of a bacterial capsular polysaccharide [14].

Another report on the presence of L-iduronic acid in a prokaryote concerns the analysis of a specific extracellular polysaccharide (EPS) of Butyrivibrio fibrisolvens strain X6C61 [15]. As many as 37 strains of B. fibrisolvens were screened in total, but only a single strain appeared to contain L-iduronic acid. This indicates that L-iduronic acid is part of a type-specific EPS. The exact composition of the EPS remains to be characterized, although it has been proposed that L-iduronic acid is associated with a galactosamine residue.

In addition, several reports exist in which L-iduronic acid is identified as a component of an O-specific polysaccharide. An O-antigen is the highly variable part of a lipopolysaccharide (LPS), which is present in the outer membrane of gram-negative bacteria. The first report of L-iduronic acid in an O-antigen followed the structure elucidation of the O-antigen of the marine bacterium Pseudoalteromonas haloplanktis strain KMM 223 (44-1) [16]. The L-iduronic acid residue is part of a pentasaccharide (Fig. 1) that additionally contains two D-glucuronic acid residues and two residues of the uncommon QuiN4N (2,4-diamino-2,4,6-trideoxyglucose). The high proportion of hexuronic acids results in a highly acidic O-antigen. The uncommon QuiN4N residues and GlcA residues have also been found in other serotypes of Pseudoalteromonas [17]; however, strain KMM 223 remains the only example that has an L-iduronic acid-containing O-antigen.

Fig. 1

Known prokaryotic structures containing L-iduronic acid; bacterial O-antigens and the Halobacterium halobium glycoconjugate. GlcA, glucuronic acid; QuiNHb4N, 2,4-diamino-2,4,6-trideoxy-D-glucose (Hb, S-3-hydroxybutyryl; Ac, acetyl); IdoA, iduronic acid; GlcNAc, N-acetyl-glucosamine; GalNAc, N-acetyl-galactosamine

More recently, two additional O-antigens have been identified in which L-iduronic acid is one of the building blocks (Fig. 1). Escherichia coli type 112ab and Shigella boydii B15 have an identical pentasaccharide structure [18, 19]. Many more O-antigen structures of various serotypes of both E. coli and S. boydii have been resolved to date; however, L-iduronic acid seems to be restricted to these two reported strains. As in the previous reports of L-iduronic acid identification in bacteria, the occurrence of this structure is highly type-specific rather than a general feature.

Archaea

L-iduronic acid has been reported in archaea only once. There is evidence of the presence of iduronic acid in a cell surface glycoprotein of Halobacterium halobium [20]. The cell wall of this archaeon is a glycoprotein-based S-layer. The glycoprotein has two specific forms of N-glycosylation. First, each polypeptide carries a single glycosaminoglycan-like polysaccharide with a [-4)GalNAc(1-4)GalA(1-3)-GalNAc(1-]n backbone (n = 10–15). Second, there are 12 potential glycosylation sites where an Asn-Glc (asparaginylglucose) linkage unit is extended by two or three β(1-4) bound glucuronic acid residues. About one third of these glucuronic acid moieties are replaced by iduronic acid (Fig. 1). An identical glycoconjugate can be found on the organism's flagellin [21]. Although archaeal flagellins often undergo posttranslational modification [22], until now this is the only report of iduronic acid in such a structure.

Non-canonical L-iduronic acid containing polymers in eukaryotes

L-iduronic acid traditionally is considered to be a component in many common animal glycosaminoglycans. Apart from these well-characterized GAGs, recently L-iduronic acid also has been identified in some atypical polymers. The eukaryotic organisms having these non-canonical polymers usually do not possess the traditional GAGs as found in animals. Possibly the formation of L-iduronic acid is the result of another C5-epimerase than the heparosan D-glucuronyl C5-epimerase. An overview of some of these rare L-iduronic acid containing structures is provided below.

Algae

Pleurochrysis haptonemofera is a unicellular coccolithophorid marine alga. It produces coccolith, a calcified scale. Apart from carbonate crystals, this scale contains a small amount of polysaccharide called “coccolith matrix acidic polysaccharide” (CMAP). The structure of CMAP has been determined to be composed of a repeating disaccharide structure, of which L-iduronic acid is one of the sugars [23]. In addition there are reports of the existence of L-iduronic acid in specific polysaccharides in multicellular algae. The cell wall of sea lettuce (genus Ulva) includes four types of polysaccharides, of which the water-soluble ulvan is exclusively found in members of the Ulvales. This polysaccharide has a repetitive disaccharide of L-iduronic acid that is α(1-4) linked to a C3-sulfated rhamnose [24].

Fungi

Phallic acids are specific glycuronans that can be found in the fruiting bodies of members of the taxon Phallales. Tsuchihashi and colleagues have described the isolation of phallic acid from at least ten species, all containing L-iduronic acid [25]. The exact structural composition is still unknown, but it has been reported that these polysaccharides are composed of β-glucuronic acid and α-iduronic acid residues joined by (1-4) linkages. The ratio of these two hexuronic acids varies from about 2:1 to 3:1. The polysaccharide is called protuberic acid when the ratio of glucuronic acid to iduronic acid is 2:1 [26].

Sponges

Citronamides A and B are unique products that have been isolated from the Australian sponge Citronia astra. Both are non-canonical tetrapeptides with a linked 3- or 4-O-(aminocarbonyl)-α-iduronic acid residue. These compounds were accidentally co-isolated with Dysinosin A, a potential serine protease (thrombin) inhibitor. Citronamides A and B are not structurally related to Dysinosin A, and the biological function of these products still needs to be clarified [27].

Identification of C5-epimerases in prokaryotes

The above described polymeric structures are examples of L-iduronic acid containing polysaccharides and glycosaminoglycan-like structures in several microorganisms. The existence of L-iduronic acid does suggest D-glucuronyl C5-epimerase activity to occur in these organisms. The wide diversity of these GAG-like structures suggests the presence of candidate C5-epimerases with a different or broader substrate specificity. To date no such candidate enzyme has been identified.

In an attempt to identify candidate C5-epimerases, we screened all available prokaryotic genomes for sequences homologous to human D-glucuronyl C5-epimerase by BLAST analysis (http://blast.ncbi.nlm.nih.gov/Blast.cgi) [28]. Multiple prokaryotic sequences were identified that have homology to the human sequence. BLAST searches on metagenome data also reveal multiple putative prokaryotic C5-epimerases. All candidate C5-epimerases found have a well-conserved domain that makes them members of the pfam06662 superfamily [29], which contains the consensus of the C-terminus of D-glucuronyl C5-epimerases. An overview of all prokaryotes having one or more candidate D-glucuronyl C5-epimerase gene(s) is provided in Table 2.
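The screening step above can be illustrated with a minimal sketch that filters BLAST hits from the standard 12-column tabular output. The E-value and alignment-length cutoffs, the function name and the example rows are assumptions made for illustration only; the cutoffs actually used in the screen are not stated in the text.

```python
import csv
import io

# Column names of the standard 12-column tabular BLAST output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(tabular_text, max_evalue=1e-5, min_length=100):
    """Return (subject, % identity, E-value) for hits passing both cutoffs.

    The default cutoffs are hypothetical and not taken from the review.
    """
    reader = csv.DictReader(io.StringIO(tabular_text), fieldnames=COLUMNS,
                            delimiter="\t")
    hits = []
    for row in reader:
        if float(row["evalue"]) <= max_evalue and int(row["length"]) >= min_length:
            hits.append((row["sseqid"], float(row["pident"]), float(row["evalue"])))
    # Most significant hits (lowest E-value) first.
    return sorted(hits, key=lambda hit: hit[2])


# Made-up example rows; the identifiers are placeholders, not real accessions.
EXAMPLE = (
    "human_C5epi\tsubject_A\t24.5\t310\t200\t12\t300\t610\t150\t455\t2e-18\t88.2\n"
    "human_C5epi\tsubject_B\t31.0\t45\t30\t1\t500\t545\t10\t54\t0.5\t30.1\n"
    "human_C5epi\tsubject_C\t22.1\t280\t190\t15\t320\t600\t90\t365\t4e-09\t60.5\n"
)

if __name__ == "__main__":
    for sseqid, pident, evalue in filter_hits(EXAMPLE):
        print(sseqid, pident, evalue)
```

In this toy input the short, weak hit (subject_B) is discarded, while the two longer hits are kept and ordered by significance.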

Table 2 Overview of candidate C5-epimerases in prokaryotes

The occurrence of these candidate C5-epimerases seems to be type-specific rather than species-specific. This is in line with the reports discussed earlier on the identification of L-iduronic acid containing polymers in prokaryotes, which also appear to be type-specific in various species. The identified candidate C5-epimerases differ in size; however, this is mainly a result of variation in the N-terminal domain of the protein. Most prokaryotic sequences show an organization similar to that of animal D-glucuronyl C5-epimerases: an N-terminal signal peptide is predicted for many sequences, and the conserved pfam06662 domain is found at the C-terminus of the protein. A multiple sequence alignment of the C-terminal domain of the prokaryotic candidate C5-epimerases and a selection of animal C5-epimerases is included (Fig. 2). It is tempting to speculate on the role of those amino acid residues that are completely conserved. Residues possibly involved in catalysis are the conserved tyrosines and histidines. Structural data of two other types of C5-epimerases, not homologous to the heparin C5-epimerase [3] or the prokaryotic candidate epimerases, reveal a role of conserved histidines and tyrosines in catalysis for both functionally distinct C5-epimerases [30, 31]. Although there is no homology at the amino acid level, a catalytic mechanism of the heparin-acting C5-epimerase similar to that of these distinct C5-epimerases cannot be ruled out.
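How completely conserved histidines and tyrosines might be picked out of a multiple sequence alignment can be sketched as follows. This is only an illustration: the function name is hypothetical and the toy alignment consists of invented sequences, not the sequences of Fig. 2.

```python
def conserved_columns(alignment, residues_of_interest="HY"):
    """Return (column_index, residue) for columns identical in every sequence.

    Column indices are 0-based. Only residues listed in residues_of_interest
    are reported, and gap characters ('-') never count as conserved.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    conserved = []
    for index, column in enumerate(zip(*alignment.values())):
        residue = column[0]
        if residue != "-" and len(set(column)) == 1 and residue in residues_of_interest:
            conserved.append((index, residue))
    return conserved


# Toy alignment with invented sequences, for illustration only.
TOY_ALIGNMENT = {
    "animal_1":    "GHW-YDLKHA",
    "bacterium_1": "AHWQYELRHS",
    "archaeon_1":  "GHF-YDMKTA",
}

if __name__ == "__main__":
    print(conserved_columns(TOY_ALIGNMENT))
```

Here only the histidine in column 1 and the tyrosine in column 4 are reported; the histidine in column 8 is not, because one sequence carries a different residue at that position.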

Fig. 2

Multiple sequence alignment of C-terminal part of the candidate D-glucuronyl C5-epimerases. The positions of the first and last residues of the aligned region of the corresponding candidate C5-epimerase are indicated for each sequence. Names are abbreviated and in the same order as in Fig. 3

We constructed a phylogenetic tree [32] containing several eukaryotic D-glucuronyl C5-epimerases, as well as a selection of prokaryotic homologs. The multiple sequence alignment [33, 34] is mostly based on the C-terminus of the sequences (Fig. 2). No remarkable differences were observed between trees constructed from the full-length sequences and from the C-terminus only. Phylogenetically, the candidate C5-epimerase sequences cluster in a domain-specific way (Fig. 3). Most deviation is observed among the bacterial sequences, while the archaeal and eukaryotic sequences are more alike. Obvious inter-domain substitutions cannot be observed, and are not expected to have occurred recently. A few bacterial sequences do cluster with eukaryotes and archaea, but these are close to the root and their bootstrap values are too low to draw any conclusions.

Fig. 3

Phylogenetic analysis of candidate C5-epimerases. The coloring of branches is domain-specific: eukarya in red, bacteria in blue and archaea in green. The number of bootstrap replicates is 1000

On an intra-domain level one could speculate on the clustering. Considering a confidence level of 70% or higher, the archaeal genes cluster in two clades. Surprisingly, the two Methanothermobacter species do not cluster with the Methano(caldo)cocci. The subset of bacterial sequences gives rise to several clades, which are not necessarily clustered in a class-specific way. For example, the Thermoanaerobacter, Bacillus and Ruminococcus genes cluster with Bacteroides and not with the clostridial genes, even though Thermoanaerobacter, Bacillus, Ruminococcus and the clostridia all belong to the Firmicutes. Instead, the clostridial genes cluster with an Acidobacterium. It is tempting to speculate that these deviations reflect the occurrence of variant enzymes (paralogs) with a different substrate specificity. This is to be expected given the wide diversity that exists in bacterial cell wall polysaccharides. Most likely the candidate C5-epimerases are involved in the biosynthesis of type-specific polysaccharides.
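The 70% confidence level can be made concrete with a small sketch that converts bootstrap replicate counts into percentage support and keeps only the clades at or above the cutoff. It assumes 1000 replicates, as given for Fig. 3; the clade names and counts are invented for illustration.

```python
def supported_clades(replicate_counts, n_replicates=1000, min_support=70.0):
    """Return {clade: support in %} for clades at or above min_support.

    replicate_counts maps a clade label to the number of bootstrap
    replicate trees in which that clade was recovered.
    """
    support = {clade: 100.0 * count / n_replicates
               for clade, count in replicate_counts.items()}
    return {clade: pct for clade, pct in support.items() if pct >= min_support}


# Hypothetical counts out of 1000 replicate trees; not data from the review.
EXAMPLE_COUNTS = {
    "clade_A": 985,
    "clade_B": 702,
    "clade_C": 455,
}

if __name__ == "__main__":
    for clade, pct in supported_clades(EXAMPLE_COUNTS).items():
        print(f"{clade}: {pct:.1f}%")
```

With these made-up counts, clade_A and clade_B pass the 70% cutoff and clade_C does not, which is the kind of branch on which no conclusions would be drawn.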

The association of the candidate C5-epimerases with other (predicted) sugar-modifying enzymes (e.g. glycosyltransferases) is clearly revealed by neighborhood analysis of the prokaryotic genomes involved (Fig. 4). Genes in these gene clusters are likely to be involved in strain-specific O-antigen production [35], since some of the sequences have homology to the wbb operon [36], which is known to be involved in O-antigen biosynthesis [37]. As LPS does not occur in gram-positive bacteria, these gene clusters may alternatively be involved in the biosynthesis of a capsular polysaccharide. This cell wall structure can occur in gram-positive bacteria like T. tengcongensis. The exact gene function is hard to determine, since genes involved in polysaccharide capsule biosynthesis are sometimes embedded in other cell wall biosynthesis related gene regions (e.g. LPS) and vice versa [38].
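A gene neighborhood analysis of this kind amounts to listing the genes that flank a gene of interest on the chromosome. The sketch below shows one simple way to do this; the Gene record, the function name, the coordinates and the annotations are all hypothetical and do not describe any of the genomes discussed here.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int
    end: int
    annotation: str


def flanking_genes(genes, target_name, n_flank=3):
    """Return (before, after): up to n_flank genes on each side of the target.

    Genes are ordered by start coordinate, so 'before' and 'after' refer to
    genome position and not to the transcriptional direction of the target.
    """
    ordered = sorted(genes, key=lambda gene: gene.start)
    names = [gene.name for gene in ordered]
    if target_name not in names:
        raise KeyError(target_name)
    i = names.index(target_name)
    before = ordered[max(0, i - n_flank):i]
    after = ordered[i + 1:i + 1 + n_flank]
    return before, after


# Invented example locus; names, coordinates and annotations are placeholders.
GENES = [
    Gene("g1", 100, 900, "glycosyltransferase"),
    Gene("g2", 950, 2000, "candidate C5-epimerase"),
    Gene("g3", 2100, 3000, "sugar dehydrogenase"),
    Gene("g0", 10, 90, "hypothetical protein"),
    Gene("g4", 3100, 4000, "acetyltransferase"),
]

if __name__ == "__main__":
    before, after = flanking_genes(GENES, "g2", n_flank=2)
    for gene in before + after:
        print(gene.name, gene.annotation)
```

Comparing the annotations returned for different genomes is one way to see whether candidate C5-epimerase genes sit among similar sets of sugar-modifying enzymes.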

Fig. 4

Neighborhood analysis of candidate C5-epimerases

A remarkable similarity in genomic organization is seen for some of the putative C5-epimerases. Figure 4 shows the genes flanking the candidate C5-epimerases, upstream and downstream, of two Vibrio cholerae strains (albensis VL426 and CO845), the thermophilic bacterium Thermoanaerobacter tengcongensis MB4 and two Photorhabdus species. Both Photorhabdus luminescens TT01 and Photorhabdus asymbiotica are symbiotic pathogens of insects, although P. asymbiotica is occasionally found as an opportunistic pathogen of humans as well. V. cholerae is a well-known human pathogen, causing cholera. No virulence activity is reported for T. tengcongensis [39].

For most of these prokaryotic candidate C5-epimerases, the gene is in close proximity to genes for various sugar-modifying enzymes. The gene neighborhood organization is remarkably similar for the five bacteria mentioned above. Other bacterial candidate C5-epimerases have a different organization of their gene neighborhood, despite the fact that they are more closely related to one of these five species with respect to taxonomy or candidate C5-epimerase sequence identity. A clear example is the candidate C5-epimerase of Bacillus cereus R309803. This sequence is the best hit for the T. tengcongensis sequence (54% identity, 70% similarity). The homology of the T. tengcongensis sequence to the putative C5-epimerases from Photorhabdus (wblE) is rather low (15% identity, 31% similarity). Homology with V. cholerae is better (23% identity, 44% similarity), but still weak compared to the best hit.
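Percent identity and similarity values such as those quoted above are derived from a pairwise alignment. The sketch below shows one way to compute them, under two stated assumptions: similarity is judged from a small, hypothetical set of conservative substitution groups (BLAST instead counts "positives" from a substitution matrix such as BLOSUM62), and percentages are taken over the full alignment length including gap columns. The two aligned sequences are invented.

```python
# Hypothetical groups of residues treated as similar to one another.
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "NQ", "ST", "AG"]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (% identity, % similarity) over all aligned columns.

    Gap columns count toward the alignment length but are never identical
    or similar. Identical residues are also counted as similar.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal in length")
    identical = similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    length = len(aligned_a)
    return 100.0 * identical / length, 100.0 * similar / length


if __name__ == "__main__":
    # Invented aligned pair, for illustration only.
    identity, similarity = identity_and_similarity("HYDLK-AGWE", "HYELRQSGFD")
    print(f"{identity:.0f}% identity, {similarity:.0f}% similarity")
```

Because every identical position is also a similar one, similarity is always at least as high as identity, as in the figures reported for the candidate C5-epimerases.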

Naturally, the addition of more sequences would give a better understanding of the exact sequence distribution and a better view of the number of different clades. However, it is clear that within this subset of sequences a clustering in different clades exists, which is not necessarily class-specific. An explanation of this phylogenetic distribution might be the existence of several C5-epimerase paralogs. Given the enormous variety in type-specific cell wall polysaccharides, it is expected that the enzymes involved in their biosynthesis have an optimized substrate specificity. Variations in substrate specificity of the D-glucuronyl C5-epimerases involved would certainly be feasible.

Conclusions

The C5-epimerization of D-glucuronic acid to its C5-epimer L-iduronic acid has long been considered typical for animal-derived glycosaminoglycans. However, an increasing number of L-iduronic acid containing structures have been identified in microorganisms, including prokaryotes. Moreover, we found multiple candidate D-glucuronyl C5-epimerases in a wide variety of microbes by in silico analysis of available prokaryotic genome data. Gene neighborhood analysis of these sequences suggests a role in sugar modification, most likely in type-specific polysaccharides (e.g. capsule polysaccharides or O-antigens). Phylogenetic analysis indicates sub-clustering of the set of candidate D-glucuronyl C5-epimerases into several clades. This possibly correlates with the existence of different C5-epimerase paralogs, each having a distinct substrate specificity. The exact physiological function and substrate specificity remain to be established by biochemical analysis. However, this subset of sequences and BLAST analysis of metagenomes reveal the existence of multiple candidate C5-epimerase genes in prokaryotes, supporting the conclusion that L-iduronic acid is most likely less rare in prokaryotes than expected. These putative C5-epimerases may become important tools in controlled chemo-enzymatic synthesis of L-iduronic acid containing polymers like heparin and heparin-analogs.