Introduction
Language is one of the most important and specific cognitive abilities of human beings. According to De Saussure (
1975), language is a universal structure encompassing the abstract, systematic rules and conventions of a unifying system, which is independent of individual users, while speech is the personal use of language, thus presenting many different variations such as style, grammar, syntax, intonation, rhythm, and pronunciation. Though neuroimaging studies of language at the word/phonological level have demonstrated bilateral activation during language tasks, calculation of asymmetries provides results that are consistent with the neuropsychological evidence that language is implemented in large areas located along the left sylvian fissure (Vigneau et al.
2006,
2011). More specifically, word processing is underpinned by cortical areas involved in the auditory, visual, and motor areas spreading over the left hemisphere depending on the type of language modality (Price
2010,
2012). However, the question of the existence of core areas independent of the modality of the language task is still open. Regarding the left hemisphere arrangement of language areas, two divergent theories explain the relations of speech perception and speech production to language. The former, called the horizontal view, proposes that the elements of speech are sounds that rely on two separate processes (one for speech perception, the other for speech production) that are not specialized for language until a cognitive process connects them to each other and then to language (Fodor
1983). The latter, called the vertical view (or motor theory of speech perception), posits that speech elements are articulatory gestures serving both speech perception and production processes that are immediately linguistic, thus requiring no cognitive process (Liberman and Whalen
2000). More generally, the existence of a bilateral dorsal–ventral model of speech processing with preferential leftward involvement has been widely accepted (Binder et al.
1996; Hickok and Poeppel
2004; Rauschecker and Tian
2000). This model posits the coexistence of (1) a dorsal pathway, i.e. the “where stream,” in which an acoustic–phonetic–articulatory transformation linking auditory representations to motor representations is reported to occur in superior temporal/parietal areas and ultimately in frontal areas (Buchsbaum et al.
2001); and (2) a ventral pathway, i.e. the “what stream”, in which speech-derived representations interface with lexical semantic representations, reported to involve the superior, middle, and inferior temporal gyri (Binder et al.
2000; Hickok and Poeppel
2000). Interestingly, concerning the dorsal pathway, the postulated existence of an auditory–motor system (Hickok and Poeppel
2000) has been supported by studies examining the role of motor areas in speech perception. For instance, an fMRI study revealed that listening to syllables and producing the same syllables recruited a common bilateral network encompassing a superior part of the ventral premotor cortex, suggesting the existence of a common phonetic code between speech perception and production (Wilson et al.
2004). Furthermore, another study has not only suggested that the cortical motor system is organized in a somatotopic way along the precentral cortex with the lip area being superior to the tongue area, but also revealed that these precentral regions are consistently activated by syllable articulation and syllable perception, hence demonstrating a shared speech-sound-specific neural substrate of these sensory and motor processes (Pulvermüller et al.
2006). These findings were supported by a meta-analysis revealing that in right-handers, activations of the posterior part of the frontal lobe distributed along the precentral gyrus were strongly left lateralized during both production and auditory tasks at the word or syllable level, together with the involvement of the supramarginal gyrus (SMG) (Vigneau et al.
2006). Moreover, a recent MEG study reported a synchronization between the anterior motor regions involved in syllable articulation and the posterior regions involved in their auditory perception during perception of these syllables (Assaneo and Poeppel
2018). In addition, studies on split-brain patients have demonstrated a strict leftward lateralization concerning phonological processing, with split-brain patients’ right hemisphere lacking categorical perception of phonemes (Gazzaniga
2000; Sidtis et al.
1981). Such a leftward lateralization was confirmed by studies using the Wada test procedure (Dym et al.
2011), and the leftward asymmetry of the audio–motor loop measured with functional imaging actually supports the left hemisphere specialization for the phonological processing of speech (Vigneau et al.
2006; Zago et al.
2008).
Human beings have also developed ways of using language through other sensory modalities, mastered later than speech, such as the visual system in the case of reading. Accurate perception and production of speech sounds are essential for learning the relationship between sounds and letters. Phonological awareness, i.e. the ability to detect and manipulate speech sounds, or phonemes, is the best predictor of reading ability. Reading relies on the ability to hear words and segment them into phonemes, and then to associate these phonemes with graphemes; the mapping of orthographic to phonological representations during reading is intrinsically cross-modal (McNorgan et al.
2014). Research has revealed that a phonological processing deficit underlies reading difficulties in dyslexic children, establishing a link between perception and reading abilities (Gillon
2004). In the case of disorders of oral language development, specific language impairment (SLI) is the most frequently studied developmental disorder. Children with specific language impairment have been reported to present impairments in phonological processing, whether in phonological awareness or in phonological memory, which is evidence of a link between production and reading abilities; the neural support of this link still needs to be clarified (Catts et al.
2005). Different studies examining the word processing cerebral networks common to the auditory and visual modalities have revealed the supramodal involvement of anterior regions [supplementary motor area (SMA) and prefrontal, premotor and inferior frontal gyri], whereas variations have been observed in the temporal lobe depending on the language task (Booth et al.
2002a,
b; Buckner et al.
2000; Chee et al.
1999), making it difficult to establish the existence of a common antero-posterior network for plurimodal word processing. Regarding semantic processing, one study addressing production and reading in four languages revealed a common bilateral network involved in these two tasks (Rueckl et al.
2015). Moreover, since the complete development of speech in literate individuals leads to the mastering of the written language, we expected that the core word areas developing conjointly in the three modalities would include some visual areas, which would be part of a large-scale plurimodal network underpinning word processing. It is worth emphasizing that, even if less investigated, the first phase of speech acquisition in newborn babies is perceptual, as the infant hears others’ vocalizations, highlighting the importance of prosody in speech processing. Speech prosody, i.e. the musical aspects of speech, is an early-developing component of speech, which could be compared to a musical stave upon which phonemes would be placed (Locke and Pearson
1990). This perceptual phase is crucial, as shown by the inability of infants born deaf to learn spoken language or even to babble normally (Oller and MacNeilage
1983) and by the case of feral children (Curtiss
1977). In other words, children have to listen to the prosody of their mother tongue to be able to reproduce it. Lesion studies have revealed that the tonal prosodic brain areas are located in the right hemisphere along the superior temporal sulcus (STS), which includes the posterior human voice area (pHVA), highlighting the potential role of these right hemispheric regions during development. The second phase of speech acquisition is production. In fact, children master the prosodic dimension before producing their first words (Bever et al.
1971). Production develops through the process of imitation, highlighting that prosodic processing is one element of the construction of a strong dependency between perception and production throughout development. This is illustrated, for example, by persisting difficulties in speech production encountered by infants who were tracheotomized at a time when they should have normally babbled (Locke and Pearson
1990). Interestingly, metre in speech, whose acoustic correlate is stress, has been revealed to be important for both speech perception (Jusczyk et al.
1993) and production (Gerken et al.
1990). Once the metrical rules (which provide important cues for speech segmentation within the continuous speech stream) have been acquired, speech metre contributes to phonological (Pitt and Samuel
1990), semantic (Schwartze et al.
2011) and syntactic (Roncaglia-Denissen et al.
2013) processing. Musical rhythmic priming, using metres, has been shown to enhance phonological production in hearing-impaired children owing to enhanced perception of sentences (Cason et al.
2015). Furthermore, in the context of speech rehabilitation therapies, musical rhythm has been shown to be a fluency-enhancing tool (Thaut
2013). More generally, the prosodic dimension of speech has been used to restore the speech of Broca’s aphasic patients, and the term Melodic Intonation Therapy was coined to refer to this technique based on the use of melody and singing, which are thought to be core musical elements predominantly engaging the right hemisphere (Thaut and McIntosh
2014). The right STS specialization for tonal processing was evidenced by a neuroimaging study as a rightward asymmetry of activation (Zatorre and Belin
2001). Other studies have highlighted the role of the right hemisphere, particularly the right STS, in the prosodic dimension of speech (Beaucousin et al.
2007; Belin et al.
2004; Sammler et al.
2015). Given the importance of prosody in language development, we hypothesized that in addition to left hemisphere participation, right hemispheric regions hosting the tonal dimension of speech prosody may be involved in all three tasks, i.e. production, perception and reading tasks.
In summary, previous studies on phonological/word processing have, at best, dealt with two different language modalities (either production and listening or production and reading), focusing on discrete cortical areas selected a priori (articulatory motor areas and SMG). In the present work, we assessed the three main language modalities: listening, production and reading. Furthermore, considering the importance of lateralization reported above, we took into consideration the contributions of both the right and left hemispheres to task completion at the word or syllable level. Finally, we integrated the connectivity data provided by a resting-state acquisition to propose a comprehensive view of the plurimodal large-scale networks for phonological/word processing and their potential roles.
To identify plurimodal large-scale networks for word-list processing, we selected from the BIL&GIN database a large sample of 144 right-handers who completed word-list processing in the three modalities (production, reading and listening) during task-induced fMRI acquisition (Mazoyer et al.
2016). In this sample of healthy right-handers, we (1) identified left brain regions showing both leftward joint activation and leftward joint asymmetry and right brain regions showing both rightward joint activation and rightward joint asymmetry during the three word-list tasks; (2) identified the network organization at play within the areas previously identified, based on the hierarchical clustering of the BOLD temporal correlation measured during a resting-state acquisition completed in the same individuals; and finally, (3) conducted a comprehensive investigation of how these areas were modulated according to the task and integrated the present results into the literature to elucidate the function/role of the identified supramodal word-list network.
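The selection logic of step (1) can be illustrated with a minimal sketch for the left hemisphere. It is not the statistical procedure used in the study, which is not specified in this section. It assumes that a region is retained only if, in every task, both its left-hemisphere activation and its left-minus-right asymmetry are significantly positive across participants. The function names, the data layout and the t threshold are hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def t_stat(values):
    """One-sample t statistic of `values` against zero."""
    return mean(values) / (stdev(values) / sqrt(len(values)))


def select_left_regions(left, right, t_threshold=2.0):
    """Return regions jointly activated and jointly leftward asymmetrical.

    left[task][region] and right[task][region] each hold one activation
    value per participant, in the same participant order.
    The threshold is an arbitrary illustrative value, not the study's.
    """
    tasks = list(left)
    regions = list(left[tasks[0]])
    selected = []
    for region in regions:
        keep = True
        for task in tasks:
            l_vals = left[task][region]
            r_vals = right[task][region]
            # Asymmetry is the per-participant left minus right difference.
            asym = [l - r for l, r in zip(l_vals, r_vals)]
            # The conjunction fails as soon as one task misses either test.
            if t_stat(l_vals) <= t_threshold or t_stat(asym) <= t_threshold:
                keep = False
                break
        if keep:
            selected.append(region)
    return selected
```

The right-hemisphere selection would follow the same pattern with the roles of the two hemispheres swapped. Requiring the criterion in all three tasks is what makes the selected regions candidates for a supramodal network rather than modality-specific areas.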