Background
The Duchenne Muscular Dystrophy (DMD) gene, located on the short arm of the X chromosome (at Xp21.2), is the largest known gene in humans. It has an open reading frame of ~11.055 kb, containing 79 exons (Mendelian Inheritance in Man [MIM: 300377]) [1], and transcription from seven tissue-specific promoters leads to the synthesis of 16 isoforms of the dystrophin protein. In humans, dystrophin diseases are caused by mutations in the DMD gene and include the allelic phenotypes of Duchenne muscular dystrophy (DMD) [OMIM:310200], Becker muscular dystrophy (BMD) [OMIM:300376] and X-linked dilated cardiomyopathy (XLDCM) [OMIM:302045] [1-3].
Dystrophin is present at the internal face of the plasma membrane in many tissues, including skeletal, cardiac and smooth muscle, and in various central nervous system cells. Dystrophin is highly conserved in vertebrates, including mouse, chicken and dog, and in invertebrates, such as Drosophila [4] and Caenorhabditis elegans [5].
The full-length 3685-residue isoform of dystrophin, dp427m, has a molecular weight of 427 kDa and is expressed in skeletal and cardiac muscle, where it plays a key role during muscle contraction-relaxation cycles. Dystrophin has four main regions: (i) the N-terminal actin-binding domain (ABD) comprises the first 246 residues; (ii) the central rod domain spans residues 247 to 3045 (accounting for about 76% of the molecule [
6,
7]), is formed by 24 spectrin-like repeats and four hinges and binds to various partners (filamentous actin, membrane lipids and nitric oxide synthase); (iii) the cysteine-rich domain, from residues 3080 to 3360, binds to the intrinsic membrane protein β-dystroglycan and (iv) the carboxy-terminal domain, comprising the last 325 residues, binds to dystrobrevin and syntrophins (for reviews, see [
8,
9]). Dystrophin is associated with a large number of proteins, to which it either binds directly or with which it interacts indirectly through intracellular or extracellular proteins. The binding of dystrophin to β-dystroglycan brings it into contact with membrane and extracellular proteins to form the dystrophin-glycoprotein complex (DGC) [
10,
11]. Dystrophin therefore forms a link between the extracellular matrix and cytoskeletal actin. The function of dystrophin is not completely understood, but its main role is to protect the sarcolemma from rupture during the stresses of muscle contraction [
12,
13].
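The domain boundaries quoted above can be checked for internal consistency with a few lines of arithmetic. The following is only a minimal sketch: the dictionary layout and the names `DOMAINS`, `domain_length`, `rod_fraction` and `coding_nt` are illustrative choices, not part of eDystrophin, and the C-terminal start position is derived here from "the last 325 residues" rather than stated in the text.

```python
# Domain boundaries (1-based, inclusive residue numbers) as quoted in the text.
TOTAL_RESIDUES = 3685  # full-length dp427m isoform

DOMAINS = {
    "ABD": (1, 246),
    "central rod": (247, 3045),
    "Cys-rich": (3080, 3360),
    # "last 325 residues": start position derived, not quoted in the text
    "C-terminal": (TOTAL_RESIDUES - 325 + 1, TOTAL_RESIDUES),
}


def domain_length(start, end):
    """Return the number of residues in an inclusive [start, end] range."""
    return end - start + 1


# Share of the molecule occupied by the central rod domain.
rod_fraction = domain_length(*DOMAINS["central rod"]) / TOTAL_RESIDUES

# Coding nucleotides for 3685 residues (stop codon not counted).
coding_nt = TOTAL_RESIDUES * 3

if __name__ == "__main__":
    for name, (start, end) in DOMAINS.items():
        print(f"{name}: residues {start}-{end} ({domain_length(start, end)} residues)")
    print(f"central rod fraction: {rod_fraction:.1%}")
    print(f"coding nucleotides: {coding_nt}")
```

Under these assumptions, the rod domain covers 2799 of 3685 residues, which matches the figure of about 76%, and 3685 codons correspond to 11,055 coding nucleotides, in line with the ~11.055 kb open reading frame.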
Patients with muscular dystrophy have little or no dystrophin, and it has been suggested that this results in the disruption of muscle membranes, which alters calcium-channel activity, thereby strongly increasing intracellular calcium concentration. This ultimately leads to muscle-cell necrosis [
13,
14], followed by regeneration. The continual cycles of necrosis and regeneration lead to the skeletal muscles being gradually replaced by adipose tissue and becoming unable to sustain any mechanical activity.
The incidence of Duchenne muscular dystrophy (DMD) is about 1 in 3,500 male births. Affected patients have a massively reduced life expectancy and a poor functional prognosis. In most cases, DMD is due to frame-shift mutations in the
DMD gene, leading to a complete absence or low levels of dystrophin protein (no more than 3% of normal levels). This accounts for the severity of the phenotype in all patients, although some variation in disease expression is observed between patients, in terms of motor, respiratory, cardiac and mental functions [
15,
16]. The expression of the various dystrophin isoforms may depend on the mutation site, potentially accounting, at least in part, for the correlation with the motor and mental status of patients [
15]. Becker muscular dystrophy (BMD) is less frequent than DMD, and is usually milder, with slower disease progression. BMD is caused by in-frame deletions or duplications of one or several exons or by splice-site and missense mutations. These mutations lead to the production of various amounts of internally truncated, lengthened, or slightly modified dystrophin molecules. This results in a broad spectrum of clinical severity, ranging from a complete absence of symptoms, through mild disease, to severe clinical conditions similar to DMD [
15‐
25]. According to the reading-frame rule, frame-shifting mutations lead to the severe DMD phenotype, whereas in-frame mutations lead to the less severe BMD phenotype [
16,
26]. However, there are exceptions to this rule, with certain in-frame mutations resulting in the severe DMD phenotype [
27‐
31]. These mutations are frequently located at the 5’ end of the gene, which encodes the N-terminus of dystrophin (including ABD1), or at the 3’ end of the gene, which encodes the C-terminal region, usually within the Cys-rich domain, thereby disrupting the DGC.
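The reading-frame rule described above reduces to a divisibility test: a deletion (or duplication) of whole exons preserves the reading frame only if the total number of nucleotides removed (or added) is a multiple of three. The sketch below illustrates this on a toy transcript. The exon lengths in `TOY_EXON_LENGTHS` are hypothetical placeholders, not real DMD exon sizes, and the function names are illustrative; a real analysis would use the exon lengths of the reference transcript and, as noted above, the rule has exceptions.

```python
# Toy exon lengths in base pairs.
# Hypothetical values for illustration only, NOT real DMD exon sizes.
TOY_EXON_LENGTHS = {1: 120, 2: 95, 3: 88, 4: 150, 5: 109, 6: 130}


def deleted_length(first_exon, last_exon, exon_lengths):
    """Total length (bp) of a contiguous block of exons, both ends included."""
    return sum(exon_lengths[e] for e in range(first_exon, last_exon + 1))


def is_in_frame(first_exon, last_exon, exon_lengths):
    """True if removing (or duplicating) the block keeps the reading frame."""
    return deleted_length(first_exon, last_exon, exon_lengths) % 3 == 0


def expected_phenotype(first_exon, last_exon, exon_lengths):
    """Phenotype expected under the reading-frame rule (exceptions exist)."""
    if is_in_frame(first_exon, last_exon, exon_lengths):
        return "BMD expected (in-frame)"
    return "DMD expected (frame-shift)"


if __name__ == "__main__":
    for first, last in [(4, 4), (2, 3), (3, 3)]:
        print(f"deletion of exons {first}-{last}: "
              f"{expected_phenotype(first, last, TOY_EXON_LENGTHS)}")
```

With these toy values, deleting exon 4 (150 bp) or exons 2 to 3 (183 bp) is in-frame, whereas deleting exon 3 alone (88 bp) shifts the frame.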
The structure and function of dystrophin are poorly resolved at the biological and physiological levels, and it is therefore difficult to establish a detailed phenotype-genotype correlation in BMD patients. Phenotypic differences between patients are thought to depend on the site of the deletion or duplication and the conservation of the reading frame, and such differences have recently been shown to be correlated with the residual amount of dystrophin [
32]. Such knowledge is essential to anticipate the effects of current exon-skipping treatments on phenotype restoration in treated DMD patients [
33‐
36].
Two databases of
DMD human mutations are already freely available online: the Leiden Muscular Dystrophy database [
37,
38] and the UMD-DMD French database [
39,
40]. The Leiden Muscular Dystrophy database lists the
DMD mutations in patients reported in publications or submitted by contributors from around the world and includes some biochemical and phenotypic details. The UMD-DMD database provides molecular and clinical data for patients from France carrying a mutation of the
DMD gene. Both databases include in-frame and frame-shifting mutations and focus on gene-level information. However, as dystrophin acts at the protein level, a more detailed and comprehensive characterization of the protein produced from genes with in-frame mutations is required. Such a characterization is particularly important for comparisons of the structural features and molecular interactions of the mutated protein with those of the wild-type protein. For example, the total absence of dystrophin, or its presence in very small amounts in DMD patients, leads to the breakdown of the DGC, a histological marker of the disease [
41]. However, the site of the mutation determines whether these interactions are abolished in BMD patients, resulting in diverse phenotypes. For both basic research and clinical/therapeutic purposes, it is therefore of interest to establish a correlation between the genotype and the molecular and structural consequences of in-frame mutations for the encoded protein.
To this end, we have developed a new database called eDystrophin, specifically dedicated to providing information about the in-frame mutations of the
DMD gene and their consequences for dystrophin molecules. The eDystrophin database includes both in-frame
DMD mutations identified in a French laboratory performing routine diagnosis of these mutations and published mutations. In addition to the genetic and clinical details provided by the other two available databases, the eDystrophin database provides: (1) a synthetic view of the properties of the mutated dystrophin and (2) a map of modifications to binding sites for interacting protein partners. For deletions involving the central rod domain, it also provides (3) a structural model of the mutation site and (4) a specific comment indicating whether a correct filamentous 3D structure is reconstituted around the mutation site. Finally, this new database focuses on the protein rather than the gene, providing a new vantage point regarding in-frame mutations of the
DMD gene, with the finding that the gene exons and protein domains are “in phase” for the specific central rod domain of dystrophin. This phasing controls the ability of the internally truncated dystrophin molecules to reconstitute a hybrid repeat unit able to fold into a triple coiled-coil, resembling the native repeats present in full-length dystrophin. This database is freely available from
http://edystrophin.genouest.org/, and all the information provided can be downloaded.
Discussion
The eDystrophin database is a new biomedical resource for clinicians and researchers working on human dystrophin diseases. This dedicated database for the dystrophin protein specifically aims to provide information about in-frame
DMD mutations and their consequences for the dystrophin protein. It provides a framework for the analysis of such mutations, by presenting a large body of information for both wild-type and mutated dystrophin proteins, including findings relating to the structure of these proteins and their interactions with known partners. Although eDystrophin is a locus-specific database, it was not constructed with an existing database system, such as LOVD [
66,
67] or UMD [
68,
69]. Indeed, such systems are more useful for DNA variant databases and are not suitable for the construction of a protein-based database like eDystrophin.
In human dystrophin diseases, DMD accounts for approximately two-thirds of cases and BMD for approximately one-third [
40,
70]. Most cases of DMD are caused by frame-shift mutations, whereas BMD is generally caused by in-frame mutations, although exceptions have been reported [
26]. Documented cases of in-frame mutations are largely underrepresented in existing databases, and the primary aim of the eDystrophin database project was to redress the balance, by developing a dedicated information source for in-frame mutations. Unlike frame-shift mutations, in-frame mutations lead to the production of proteins with various degrees of functionality. The secondary goal of eDystrophin was therefore to determine and show the predicted consequences of these mutations for the composition and structure of the encoded proteins and their clinical consequences. In this first version of eDystrophin, patients and in-frame mutations were obtained from one of the major French contributors to the UMD-DMD database and from published studies. Evidently, the database could be expanded in the near future by including mutations and patients from around the world, which would probably yield more accurate phenotype-genotype correlations.
However, dystrophin is a large protein, and it is a challenge to investigate the consequences of mutations of its gene. The dystrophin protein has two principal roles: as a scaffolding protein for several interacting partners and as a filamentous protein with a mechanical and structural function, providing resistance to the stress of muscle contraction [
8,
13]. Any mutation altering the structure of dystrophin may therefore affect both these functions (and potentially other minor functions of the protein as well) simultaneously. Our database provides an overview of the effects of mutations on protein function. In particular, it provides the user with information about changes to interactions and about the maintenance or disruption of the filamentous structure of the mutated dystrophin protein.
Several binding partners of dystrophin have been identified, and the eDystrophin database infers changes to their binding to a mutated dystrophin variant by considering whether the interacting domains remain intact and unmodified. Based on these inferences and previous observations, deletions affecting the Cys-rich or ABD1 domains appear to be much more deleterious than those affecting the central domain [
23,
40]. However, we detected several mutations affecting the central rod domain and causing a DMD phenotype in a substantial number of patients. In these patients, mRNA levels may have been low and/or unstable, accounting for the presence of little or no protein. Careful re-examination of the boundaries of the mutation is also necessary for these patients. Indeed, Taylor
et al. (2007, PhD thesis) re-examined a large cohort of DMD patients with in-frame deletions affecting the central rod domain and found that most were frame-shift mutations, consistent with the Monaco reading frame rule. Furthermore, we cannot entirely exclude the possibility of two mutations occurring in the same gene. For the other DMD patients carrying in-frame mutations, uncertainties remain concerning the levels or stability of the corresponding mRNA.
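The kind of inference described at the start of this paragraph, in which a binding interaction is considered at risk when the corresponding domain is no longer intact, can be pictured as a simple interval-overlap test. The code below is a coarse sketch under stated assumptions and is not the algorithm used by eDystrophin: domain boundaries and partners are taken from the Background section at whole-domain resolution, the C-terminal start position is derived from "the last 325 residues", and the names `DOMAINS`, `PARTNERS`, `affected_domains` and `partners_at_risk` are hypothetical.

```python
# Domain boundaries (1-based, inclusive) and partners at whole-domain
# resolution, as listed in the Background section. This coarse mapping is an
# assumption made for illustration; finer binding-site coordinates would be
# needed for a real analysis.
DOMAINS = {
    "ABD": (1, 246),
    "central rod": (247, 3045),
    "Cys-rich": (3080, 3360),
    "C-terminal": (3361, 3685),
}

PARTNERS = {
    "ABD": ["actin"],
    "central rod": ["filamentous actin", "membrane lipids", "nitric oxide synthase"],
    "Cys-rich": ["β-dystroglycan"],
    "C-terminal": ["dystrobrevin", "syntrophins"],
}


def affected_domains(del_start, del_end):
    """Return the domains overlapping the deleted residue range (inclusive)."""
    return [
        name
        for name, (start, end) in DOMAINS.items()
        if del_start <= end and del_end >= start
    ]


def partners_at_risk(del_start, del_end):
    """Return partners whose binding domain is no longer intact."""
    partners = []
    for name in affected_domains(del_start, del_end):
        partners.extend(PARTNERS[name])
    return partners


if __name__ == "__main__":
    # Hypothetical deletion of residues 3100-3200, inside the Cys-rich domain.
    print(affected_domains(3100, 3200))
    print(partners_at_risk(3100, 3200))
```

A deletion confined to the rod domain would be flagged only for the rod partners in this sketch, which is consistent with the observation that deletions affecting the Cys-rich or ABD1 domains appear more deleterious, but it says nothing about mRNA levels or protein stability.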
We obtained models of the three-dimensional structure of the new junctions created between the sequences on either side of the deletions in the central rod domain, as previously described [
53]. The database provides a computational model for each in-frame deletion collected. An analysis of the structural features of these new junctions showed that two outcomes were possible: the reconstitution of a hybrid repeat and the formation of a fractional repeat in situations in which it was not possible to form a hybrid repeat. The likelihood of hybrid repeat formation depends on the phasing of the exon boundaries with the center of the B helix of the repeats. The reconstitution of a hybrid repeat can be assumed to occur because the major factor controlling this folding pattern is the presence of a heptad pattern. As this pattern is respected in cases in which the deletion creates a new junction between the first half of one B helix and the second half of the next, from two truncated repeats, coiled-coil folding similar to that in native repeats would be expected [
64,
65]. By contrast, in fractional repeats, the α-helices can fold correctly, but the heptad pattern is not respected and a three-dimensional coiled-coil structure therefore cannot be obtained. This may result in a less stable deletion site than for native and hybrid repeats. The hypothesis that repeat phasing in truncated dystrophins is essential to ensure a high level of protein function has already been tested. Transgenic
mdx mice were produced with several types of truncated dystrophin, some with correct and others with incorrect phasing of the repeats. However, in this previous study, native repeats were either entirely conserved or entirely lost [
71]. These findings led to the “mini-dystrophin” concept for DeltaH2-R19, in which the rod domain was decreased in size by a deletion encompassing the amino acids from hinge 2 to repeat 19. By contrast, the “micro-dystrophin” DeltaR4-R23 had a deletion extending from repeat 4 to repeat 23. Constructs encoding these proteins proved to be among the best therapeutic constructs for
mdx mouse rescue. In BMD patients, phasing is not as described in these previous experiments and only hybrid repeats may be reconstituted. However, the demonstration of beneficial effects of phasing in the
mdx mouse suggests that the presence of hybrid repeats may be associated with a milder phenotype than the presence of fractional repeats [
62,
63]. Such a correlation between the structural features of mutated dystrophin and clinical severity in a cohort of BMD patients has been reported for cardiomyopathy [
24]. The authors constructed models of the mutated dystrophin for deletions involving exons 45 to 49 and investigated the phasing of spectrin repeats. They concluded that the absence of hinge 3 delayed the onset of dilated cardiomyopathy.
However, it should be stressed that the presence of a hybrid repeat does not in itself imply a better conservation of dystrophin function than the presence of a fractional repeat. Indeed, mRNA instability or changes to protein-protein interactions may also affect the function of the mutated dystrophin, and it is not currently possible to predict these effects. Investigations of the correlation between the presence of a hybrid repeat and the severity of clinical symptoms are now required. Nevertheless, the eDystrophin database can be used as a predictive tool for exon-skipping-based therapy. The choice of the exon to be deleted to restore the reading frame could be based on careful consideration of the likelihood of reconstituting a hybrid repeat.
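The first step in such a choice, listing which flanking exon could be skipped to restore the reading frame, can be sketched as follows. This is an illustration only: the exon lengths in `TOY_EXON_LENGTHS` are hypothetical placeholders rather than real DMD exon sizes, the function name `frame_restoring_skips` is an assumption, and the sketch does not model whether a hybrid or fractional repeat would be formed, which would require the structural information held in the database.

```python
# Toy exon lengths in base pairs.
# Hypothetical values for illustration only, NOT real DMD exon sizes.
TOY_EXON_LENGTHS = {1: 120, 2: 95, 3: 88, 4: 150, 5: 109, 6: 130}


def frame_restoring_skips(first_deleted, last_deleted, exon_lengths):
    """List single flanking exons whose skipping restores the reading frame.

    Returns an empty list if the deletion is already in-frame or if neither
    flanking exon restores the frame.
    """
    deleted = sum(exon_lengths[e] for e in range(first_deleted, last_deleted + 1))
    if deleted % 3 == 0:
        return []  # already in-frame, nothing to restore

    candidates = []
    for exon in (first_deleted - 1, last_deleted + 1):
        # Only consider flanking exons that exist in the transcript.
        if exon in exon_lengths and (deleted + exon_lengths[exon]) % 3 == 0:
            candidates.append(exon)
    return candidates


if __name__ == "__main__":
    # Deletion of toy exon 3 (88 bp) shifts the frame.
    print(frame_restoring_skips(3, 3, TOY_EXON_LENGTHS))
```

With these toy values, skipping exon 2 after a deletion of exon 3 removes 183 bp in total and restores the frame, whereas skipping exon 4 does not. When several candidates exist, each could then be examined for the likelihood of hybrid repeat reconstitution.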
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
ELR, FBH, RBY and JC initiated and supervised the project. AN, CLM and FBH created the database. AN, RBY, FL and ELR monitored data collection. AN and ELR created and analyzed the structural models. All authors participated in the writing of the manuscript and approved its submission.