Gaze direction is extremely important for social interaction between human beings. Compared with other animals, the human eye region yields a vast amount of information, thanks to the high contrast between the bright sclera and the darker iris (Kobayashi & Kohshima, 1997). The social information conveyed by gaze enables the understanding of others’ states of mind, and decoding and interpreting another’s gaze is an essential component of the theory of mind (ToM)—that is, the ability to attribute mental states to other people (Baron-Cohen, 1997). A dedicated gaze perception system is therefore required to gather socially relevant information regarding one’s surroundings (George & Conty, 2008). Indeed, gaze direction constitutes a valuable cue to determine an interlocutor’s focus of attention (Langton, 2000) and to guide one’s attention (Driver et al., 1999).

A gaze oriented toward an observer, or direct gaze, is a socially significant signal that carries affective value, and communicates information such as desirability (Mason, Tatkow, & Macrae, 2005) or threat (Emery, 2000; Kleinke, 1986). Additionally, direct gaze captures visuospatial attention (Senju & Hasegawa, 2005) and prioritizes visual processing, which in turn influences face detection and increases speed of detection (Conty, Tijus, Hugueville, Coelho, & George, 2006; Doi & Ueda, 2007; Doi, Ueda, & Shinohara, 2009; Palanica & Itier, 2012; Senju & Hasegawa, 2005; von Grünau & Anston, 1995). Altogether, this evidence has been taken as an indication of the privileged status of direct compared with averted gaze on stimulus processing.

Attentional capture by direct gaze can also be revealed by an increase in speed and performance in face processing. As a typical example, classification of gender is facilitated for faces gazing directly at the observer, compared with faces looking in other directions (Macrae et al., 2002). In a seminal paper, Macrae et al. (2002) asked observers to categorize the gender of faces displaying either a direct gaze, an averted gaze, or closed eyes. They observed that direct gaze facilitated the identification of gender compared with other conditions (but see Vuilleumier, George, Lister, Armony, & Driver, 2005). These findings were taken as evidence of the social-cognitive relevance of direct gaze on the efficiency of the person-construal process.

However, in these tasks, it is unclear whether this effect occurs because low-level perceptual features of a direct gaze enhance visual processing or whether it arises at a later, high-level categorisation stage. Behavioural response time investigations are ill-equipped to answer this question, and electroencephalographic (EEG)/event-related potential (ERP) studies are essential to determine the temporal sequence of events associated with the improved performance under conditions of direct gaze, in situations such as gender categorization. Although several studies have explored the neural correlates of gaze processing using explicit gaze discrimination (e.g., Conty, N’Diaye, Tijus, & George, 2007; Itier, Alain, Kovacevic, & McIntosh, 2007a), few have examined the temporal dynamics of implicit gaze processing using gender categorisation.

Researchers studying face processing with EEG have focused their investigations mostly on the well-known N170 component. This face-sensitive component, which arises on occipitotemporal electrodes at about 170 ms, has been assumed to reflect the initial stage of structural encoding of the face (e.g., Bentin, Allison, Puce, Perez, & McCarthy, 1996). Interestingly, the N170 is also highly sensitive to the eye region. For instance, eyes trigger a larger N170 when presented in isolation than in the context of the entire face (Bentin et al., 1996; Itier, Alain, Sedore, & McIntosh, 2007b; Itier, Latinus, & Taylor, 2006). Consequently, some authors have stated that the N170 might be an early marker of eye processing (Itier et al., 2006; Nemrodov & Itier, 2011; Taylor, Edmonds, McCarthy, & Allison, 2001a; Taylor, Itier, Allison, & Edmonds, 2001c).

Along these lines, one would assume that gaze direction might impact the N170, with direct gaze eliciting an earlier and stronger face-encoding process compared with averted gaze. However, experiments focused on the N170 and gaze direction have produced contradictory results. Indeed, whereas some researchers failed to find any N170 difference between direct and averted gaze (Latinus et al., 2015; Taylor, Itier, et al., 2001a), others reported a larger N170, or M170, its magnetoencephalographic (MEG) counterpart, for averted compared to direct gaze for heads oriented frontally (Sato, Kochiyama, Uono, & Yoshikawa, 2008; Watanabe, Miki, & Kakigi, 2002) or for averted head orientations (Burra, Baker, & George, 2017). Overall, though, EEG and MEG studies in human adults have so far failed to confirm in a consistent manner that gaze direction modulates face encoding. However, the N170 is not the only ERP that may be sensitive to direction of gaze, and differences could be sought in other time windows.

On a theoretical level, the mechanism underlying the eye contact effect is hypothesized to be fast and automatic, occurring before a complete and detailed cortical analysis of gaze direction (Senju & Johnson, 2009), and most likely subserved by a subcortical pathway involving the amygdala (Burra et al., 2013; George, Driver, & Dolan, 2001; Kawashima et al., 1999). Indeed, some studies have revealed that cortical brain regions may process faces (Halgren, Raij, Marinkovic, Jousmaki, & Hari, 2000) as well as eyes (Rousselet, Ince, van Rijsbergen, & Schyns, 2014; Schyns, Petro, & Smith, 2007) at a period preceding the N170. For instance, arising around 100 ms after stimulus onset, the P1 has been associated with an enhanced processing of faces as compared with other categories (Herrmann, Ehlis, Muehlberger, & Fallgatter, 2005; Itier & Taylor, 2002, 2004; Pegna, Khateb, Michel, & Landis, 2004; Taylor, George, & Ducorps, 2001a; Thierry, Martin, Downing, & Pegna, 2007). Such early modulations may be driven mainly by eye detection, which plays a crucial role in the coarse detection of faces (Doi & Ueda, 2007; Fichtenholtz, Hopfinger, Graham, Detwiler, & LaBar, 2009; Itier, Alain, Kovacevic, et al., 2007a; Itier, Villate, & Ryan, 2007c; Kloth, Schweinberger, & Kovacs, 2010). Indeed, it seems plausible that gaze direction may be coded at the P1 level (Akechi et al., 2010; Schmitz, Scheel, Rigon, Gross, & Blechert, 2012), and possibly at an even earlier stage (Burra, Kerzel, & George, 2016). However, although there are indications that the P1 could be associated with an early stage of eye/gaze processing, this effect could also reflect low-level differences and/or attentional enhancements (Rossion & Jacques, 2008). The functional significance of the P1 in response to gaze direction therefore requires further clarification.

Beyond the N170 period, a sensitivity to direct gaze has also been measured, notably on the P300 ERP complex appearing between 300 ms and 600 ms. This ERP component is in fact composed of two subcomponents differing in their scalp topography and their functions. The first of these, the frontal-central P3a, is widely thought to reflect a preattentional, stimulus-driven orienting reflex to novelty (Soltani & Knight, 2000), and is highly sensitive to habituation (Friedman & Simpson, 1994). The second, the parietal P3b, is sensitive to the degree of attention required to process a stimulus (Johnson, 1988; Lammers & Badia, 1989; Polich & McIsaac, 1994) and is also known to respond to the emotional relevance of a stimulus (Cuthbert, Schupp, Bradley, Birbaumer, & Lang, 2000; Delplanque, Silvert, Hot, Rigoulot, & Sequeira, 2006; Delplanque, Silvert, Hot, & Sequeira, 2005; Eimer & Holmes, 2002). Accordingly, the P3b component is usually observed after the detection of a rare target stimulus (Pritchard, 1981). Interestingly, the P3a and P3b were shown to be sensitive to direct gaze regardless of head direction (Conty et al., 2007), while the P3b was seen to be enhanced by direct gaze irrespective of the nature of the task (Itier, Alain, Kovacevic, et al., 2007a). Social context, such as extracting social meaning from a gaze (Carrick, Thompson, Epling, & Puce, 2007), processing the eye region (Naples, Wu, Mayes, & McPartland, 2017), or judging mental state based on the eyes (Sabbagh, Moulson, & Harkness, 2004), affects the P3b as well. Moreover, the frontal P3a in response to direct compared with averted gaze is modulated only when the participants are aware that a perceived real face was able to see them (Myllyneva & Hietanen, 2015), a phenomenon that does not appear at the P3b level. It therefore appears that fluctuations in P300 amplitude, especially the P3b, could reflect cognitive steps linked to gaze discrimination of static faces at a higher level of social cognition.

In summary, the processing of eyes and gaze seems to be mediated at different stages of visual processing (Sabbagh et al., 2004). At an early stage (before 300 ms), the visual system appears to detect the presence of the eye, including most likely direct gaze, while a later stage (after 300 ms) would lead to higher cognitive integration, including access to social significance.

Within this framework, the aim of the current study was to determine the timing and the precedence of direct gaze over averted gaze and closed eyes, while manipulating the direction of gaze. Because evidence of gaze direction on the N170 is rather inconsistent, we investigated additional components, such as the P1 and P300. Moreover, three important and often overlooked methodological aspects were included in our experiment. First, we ensured that gaze direction was incidental to the manual response, as in the experiment of Macrae et al. (2002). At a methodological level, the independence of the variable under examination (gaze) and the response set (gender) is important as it precludes any influence due to a response bias. Indeed, a predisposition is known to exist whereby gaze is expected to be directed toward the viewer (Mareschal, Calder, & Clifford, 2013). This effect is critical in investigations of gaze direction, and when the task requires this to be detected, as it could potentially bias the behavioural responses (see also Framorando, George, Kerzel, & Burra, 2017) and could thus bias the electrophysiological data. Consequently, by using a gender categorisation task, this eventuality was circumvented such that any effects should only be explainable by differences in gaze direction. Secondly, we included a baseline condition, namely the closed-eyes condition. In previous reports (but see Doi & Shinohara, 2012), direct gaze was generally compared with averted gaze, with the comparison revealing differences at the P1, N170, or P3 levels. However, a baseline condition is often missing, leaving unclear whether the reported effects are due to an enhancement of neuronal activity to direct-gaze or a decrease in the averted-gaze conditions. This issue can be addressed by including a closed-eyes condition as a baseline. Finally, stimuli were highly controlled for low-level features in two ways. 
First, we displayed only the internal facial features (eyes, nose, and mouth) of our stimuli in order to avoid a categorisation based on external features, such as hairstyle or paraphernalia. Second, we equated our stimuli for luminance, in order to avoid any low-level influence of the stimuli on the early components.

In the current study, we first sought to replicate the findings of Macrae et al. (2002) and confirm that gender categorisation was enhanced when our stimuli displayed a direct gaze as compared to averted gaze and closed eyes (Experiment 1). Subsequently, in order to establish the electrophysiological correlates of this effect, we measured the ERPs of participants performing the same task (Experiment 2). Finally, in order to determine whether the effects measured in Experiment 2 were affected by task demands or were stimulus driven, we conducted an additional experiment with the same material, but using a procedure where the faces were irrelevant (Experiment 3).

Experiment 1

Method

Participants

Twenty-three participants performed a speeded behavioural experiment (nine male/14 female), age 22.4 ± 5.03 years. Participants were naïve to the purpose of the investigation. The local ethics committee had approved the study, and written informed consent was obtained from participants prior to the experiment.

Materials and procedure

We used the Cogent toolbox (www.vislab.ucl.ac.uk/Cogent2000) for MATLAB (MathWorks, Inc, Natick, MA) to display the stimuli. Stimuli consisted of faces that were taken from a database used in prior experiments of gaze perception (e.g., George et al., 2001). For the purpose of the current study, these stimuli were normalized with respect to their facial features, and all the external features were removed using an oval mask, as illustrated in Fig. 1. We used three male and three female faces, carefully matching the positions of their facial features (eyes and mouth). Additionally, the mean and median luminance of the stimuli were carefully equalized (RGB = 75, 75, 75). Only the gaze area was modified, displaying either a gaze oriented toward the viewer (direct gaze), a gaze oriented to the side (averted gaze: 50% left, 50% right), or closed eyes (control condition). Participants were seated 85 cm from a 17-in. LCD screen. The stimuli were 3.5° × 4° in size and were displayed at the centre of the screen (similar to Macrae et al., 2002). The sequence of each trial was as follows: A fixation cross appeared for around 600–1,200 ms, followed by the stimulus for 600 ms. Following the stimulus offset, a black screen was displayed for 2,000 ms before the next fixation cross. Participants were required to categorize the gender of the stimulus with a button press, using the index and middle fingers. The response keys were counterbalanced across participants. In Experiment 1, participants were required to answer as quickly and accurately as possible, in line with Macrae et al.’s (2002) experiment. Experiment 1 was divided into four blocks of 75 trials, yielding a total of 300 trials. Stimuli were mixed randomly inside each block.
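
As a worked illustration of the stimulus geometry described above, the following minimal sketch converts the reported visual angles (3.5° × 4°) into physical extents on the screen at the reported viewing distance of 85 cm. The function name is hypothetical, and reading the two angles as width × height is an assumption, since the text does not state which dimension is which.

```python
import math

def visual_angle_to_cm(angle_deg, distance_cm):
    """Physical extent (cm) of an object subtending `angle_deg`
    when viewed head-on from `distance_cm`."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)

VIEWING_DISTANCE_CM = 85.0  # reported in the text

# Assumption: 3.5 deg is the stimulus width and 4 deg its height.
width_cm = visual_angle_to_cm(3.5, VIEWING_DISTANCE_CM)
height_cm = visual_angle_to_cm(4.0, VIEWING_DISTANCE_CM)

print(f"{width_cm:.2f} cm x {height_cm:.2f} cm")
```

Under these assumptions the faces occupied roughly 5 cm × 6 cm on the screen. Converting further to pixels would require the screen resolution, which is not reported.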

Fig. 1

Stimuli: Example of a face stimulus in the three experimental conditions (direct gaze, averted gaze, eyes closed). We used cropped faces of three men and three women, 3.5° × 4°. Each stimulus appeared for 600 ms, positioned at the centre of the screen. External features were masked.

Results

Behavioural results

Mean response times (RTs) were computed for each condition; all values fell within ±3 standard deviations of the mean (similar to Macrae et al., 2002). A repeated-measures analysis of variance (ANOVA) was performed on the mean RT of the direct, averted, and closed-eyes conditions. This ANOVA revealed an effect of gaze direction on RT, F(2, 42) = 5.34, p = .009. Participants were significantly faster in categorizing the gender of faces with direct gaze (M = 563 ms) than with averted gaze (M = 571 ms) and closed eyes (M = 573 ms), ts(22) > 2.2, ps < .034. The effect of gaze direction on gender identification accuracy did not reach significance, F(2, 42) = 2.63, p = .084 (see Table 1).
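
For readers unfamiliar with the analysis, the following minimal sketch shows how the F ratio of a one-way repeated-measures ANOVA is obtained from a participants × conditions table of mean RTs. It is not the software used in the study; the function name and the toy numbers are hypothetical and chosen only so that the arithmetic is easy to follow. No p value is computed, because the Python standard library has no F distribution.

```python
def rm_anova_oneway(data):
    """One-way repeated-measures ANOVA.

    `data` is a list of rows, one per participant, each holding one
    value per condition. Returns (F, df_condition, df_error).
    """
    n = len(data)       # participants
    k = len(data[0])    # conditions
    grand = sum(sum(row) for row in data) / (n * k)
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    subj_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    # Between-participant variance is removed from the error term.
    ss_error = ss_total - ss_cond - ss_subj

    df_cond = k - 1
    df_error = (k - 1) * (n - 1)
    f_ratio = (ss_cond / df_cond) / (ss_error / df_error)
    return f_ratio, df_cond, df_error

# Toy table (NOT data from the study): 3 participants x 3 conditions.
toy = [[1.0, 2.0, 3.0],
       [2.0, 3.0, 5.0],
       [3.0, 5.0, 6.0]]
f_ratio, df_cond, df_error = rm_anova_oneway(toy)
print(f"F({df_cond}, {df_error}) = {f_ratio:.2f}")
```

The key design point is that each participant serves as their own control, so stable individual differences in overall speed do not inflate the error term.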

Table 1 Experiment 1: Behavioural result. Gender categorisation response times (RT; in ms) and accuracy (ACC; mean and standard errors), as a function of eye-gaze direction in Experiment 1

Discussion 1

In this first experiment, we replicated the behavioural results previously reported by Macrae et al. (2002). These results confirmed the validity of our stimuli and justified the subsequent investigation using electrophysiological measures. Moreover, these results corroborate different theoretical models (Baron-Cohen, 1994; Perrett & Emery, 1994; Senju & Johnson, 2009) which suggest that direct gaze is beneficial for face categorization when the observer is required to respond as quickly and effectively as possible.

As noted above, the timing remains unknown for this enhanced person-construal process when eye contact is established between perceiver and target. As direct gaze may also prompt the emergence of an attentional enhancement, as well as important social-cognitive effects pertaining to the efficiency of the person-construal process, we predict that, compared with other conditions, direct gaze should enhance components in the early (P1 or N170) and late time range (P3).

Experiment 2: Gender categorisation task

Method

Participants

Twenty-one participants performed this electrophysiological experiment (11 male), age 21.7 ± 3.7 years. One participant was discarded due to a bad EEG signal, which necessitated the removal of more than 50% of the data. Participants were naïve to the purpose of the experiment. The local ethics committee approved the study, and written informed consent was obtained from participants prior to the experiment.

Material and procedure

Apparatus and procedures were similar to those of Experiment 1. In order to avoid any contamination of the EEG data by muscular artefacts, mainly at P300 level, we instructed our participants only to respond after the stimulus had disappeared, hence to provide a “delayed response.”

EEG recording and analysis

A Biosemi (Amsterdam, The Netherlands) ActiveTwo amplifier system AD-Box with 64 active Ag/AgCl electrodes sampled at 1024 Hz was used. Moreover, we used the voltage difference of two horizontal electrooculogram (HEOG) electrodes, fixed at the outer canthi of both eyes, to detect horizontal eye movements. We used an earlobe reference as the online reference and the classic average reference offline (Joyce & Rossion, 2005). The data were filtered online with a 0.01 Hz high-pass filter and a 100 Hz low-pass filter. Using BrainVision Analyzer 2.1 (Brain Products), we filtered our data with a low-pass Butterworth zero-phase filter (30 Hz with 24 dB/oct.). In order to remove eye-blink artefacts, we used independent component analysis (ICA). Trials corresponding to incorrect behavioural performance were eliminated from the analysis, and signals were divided into epochs of 800 ms (−200 ms to 600 ms). Then, after an artefact exclusion, a baseline correction (−200 ms to stimulus onset) was performed. Automatic trial exclusion occurred for sweeps with a voltage step larger than 50 μV per ms, a difference between the maximum and the minimum signal of 200 μV for an interval length of 200 ms, a minimum and maximum allowed amplitude of ±200 μV, and an activity lower than 0.5 μV at intervals of 100 ms. Moreover, we removed trials with saccades larger than 40 μV (HEOG), associated with eye movements during stimulus presentation. When the electric signal of an electrode was noisy during the entire recording, we used a spline interpolation technique (Order 4, Degree 10) in order to replace the electrode signal. No interpolation correction was used for the occipital-temporal electrodes, and no more than 2% of the electrodes were corrected. On average, 14% of the trials were removed from the data. We corrected for nonsphericity using a Greenhouse–Geisser correction of the degrees of freedom when required.
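
The four automatic exclusion criteria listed above can be sketched for a single-channel epoch as follows. This is an illustrative reading of the thresholds, not the implementation used by BrainVision Analyzer: the function name, the order of the checks, and the use of sliding windows for the 200 ms and 100 ms intervals are all assumptions.

```python
import math

def reject_epoch(samples_uv, fs_hz=1024.0):
    """Return the name of the first violated criterion, or None.

    `samples_uv` holds one channel of one epoch, in microvolts.
    """
    dt_ms = 1000.0 / fs_hz
    # 1) Voltage step larger than 50 uV per ms between adjacent samples.
    for a, b in zip(samples_uv, samples_uv[1:]):
        if abs(b - a) / dt_ms > 50.0:
            return "voltage step"
    # 2) Absolute amplitude outside +/-200 uV.
    if any(abs(v) > 200.0 for v in samples_uv):
        return "amplitude"
    # 3) Max-min difference above 200 uV within any 200 ms window.
    w200 = max(2, round(0.200 * fs_hz))
    for start in range(len(samples_uv) - w200 + 1):
        window = samples_uv[start:start + w200]
        if max(window) - min(window) > 200.0:
            return "max-min difference"
    # 4) Activity (max-min) below 0.5 uV within any 100 ms window.
    w100 = max(2, round(0.100 * fs_hz))
    for start in range(len(samples_uv) - w100 + 1):
        window = samples_uv[start:start + w100]
        if max(window) - min(window) < 0.5:
            return "low activity"
    return None

# Toy epochs (NOT recorded data): about 800 ms at 1024 Hz.
N = 820
clean = [10.0 * math.sin(2 * math.pi * 10.0 * i / 1024.0) for i in range(N)]
flat = [0.0] * N          # dead channel
spike = list(clean)
spike[400] += 100.0       # single-sample jump

for name, epoch in (("clean", clean), ("flat", flat), ("spike", spike)):
    print(name, reject_epoch(epoch))
```

In a multichannel recording, an epoch would typically be discarded if any channel of interest violated any criterion.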

Analyses were performed on the P1, N170, P3a, and P3b. P1 and N170 were measured in the time windows of their maximal occurrence, that is, 100–140 ms and 145–185 ms, respectively (i.e., ±20 ms around the peak of the overall component), and P3a and P3b in the 250–450 ms and 300–500 ms time windows, respectively, which are the time windows typically used to investigate these ERPs in studies involving gaze (Carrick et al., 2007; Conty et al., 2007; Myllyneva & Hietanen, 2015; Naples et al., 2017; Senju, Tojo, Yaguchi, & Hasegawa, 2005). The sites were based on the electrodes of maximal activity during these intervals, and were again consistent with those typically used to investigate these components in other studies (O1/Oz/O2 for the P1; PO7/PO8, P7/P8, and P9/P10 for the N170; C1/C2/Cz/FC1/FC2/FCz for P3a; PO7/PO3/POZ/PO4/PO8 for P3b; as depicted in Fig. 2). We used the factors hemisphere (left, right), electrodes (P7/P8, P9/P10, PO7/PO8) and gaze (direct/averted/closed) for the statistical analysis of the N170, and the factors electrode (O1/Oz/O2) and gaze (direct/averted/closed) for the P1. Similar to a prior study (Conty et al., 2007), electrodes were pooled for P3a and P3b using only gaze (direct/averted/closed) as a factor. In the Results section, we will emphasize the main effect of gaze as well as the relevant Gaze × Factors interaction.
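
The mean-amplitude measure described above can be sketched as follows for one averaged waveform: subtract the prestimulus baseline, then average the samples falling inside a component's time window. The helper names are hypothetical, the toy waveform is not data from the study, and only three of the reported windows are shown (the N170 window can be added in the same way).

```python
def sample_times_ms(n_samples, fs_hz=1024.0, epoch_start_ms=-200.0):
    """Time stamp (ms, relative to stimulus onset) of each sample."""
    dt_ms = 1000.0 / fs_hz
    return [epoch_start_ms + i * dt_ms for i in range(n_samples)]

def baseline_correct(epoch_uv, times_ms, baseline=(-200.0, 0.0)):
    """Subtract the mean of the prestimulus interval from every sample."""
    base = [v for v, t in zip(epoch_uv, times_ms)
            if baseline[0] <= t < baseline[1]]
    offset = sum(base) / len(base)
    return [v - offset for v in epoch_uv]

def mean_amplitude(epoch_uv, times_ms, window):
    """Mean voltage over the samples inside `window` (inclusive, in ms)."""
    vals = [v for v, t in zip(epoch_uv, times_ms)
            if window[0] <= t <= window[1]]
    return sum(vals) / len(vals)

# Windows reported in the text (ms after stimulus onset).
WINDOWS_MS = {"P1": (100, 140), "P3a": (250, 450), "P3b": (300, 500)}

# Toy waveform: 5 uV offset plus a 3 uV step during the P1 window.
times = sample_times_ms(820)
toy = [5.0 + (3.0 if 100 <= t <= 140 else 0.0) for t in times]
corrected = baseline_correct(toy, times)

p1_amp = mean_amplitude(corrected, times, WINDOWS_MS["P1"])
p3b_amp = mean_amplitude(corrected, times, WINDOWS_MS["P3b"])
print(p1_amp, p3b_amp)
```

Mean amplitude over a fixed window is less sensitive to high-frequency noise than a single peak value, which is one common reason for preferring it.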

Fig. 2

P1/N170 and P3b in Experiment 2: a Main activity at parietooccipital region. ERPs are pooled over electrodes where N170 was maximally negative (selected channels are highlighted in orange on inset on right, showing scalp topography for P1 and N170 component). During the 100–140-ms period (blue highlighted box), we measured an early posterior P1 preference for direct gaze as compared with other conditions. During the 145–185-ms period (red highlighted box), N170 was larger for closed-eyes condition as compared with direct and averted gaze. b P3b measured over areas of heightened positive activity. Five electrodes of analysis are highlighted on inset on right. Inset represents scalp topography during the 300–500-ms period (period highlighted with a green box on right). Larger P3b is observed for direct gaze as compared with averted and closed-eyes conditions. (Colour figure online)

Results

Behavioural results

As participants were instructed to answer after the stimulus offset, speed was irrelevant and was thus not considered for analysis in this experiment.

Occipital P1

No effect of gaze was found during the P1 time window for the electrodes O1/Oz/O2, F(2, 38) = 1.07, p = .35. No effect of electrode or interaction of Gaze × Electrodes reached the level of significance, F(2, 38) = 0.85, p = .43, and F(4, 76) = 1.61, p = .17.

N170

As depicted in Fig. 2a, using the mean amplitude of the electrodes P9/P10, PO7/PO8, and P7/P8, we found a main effect of gaze on the N170, F(2, 38) = 5.62, p = .007, with a larger N170 for closed eyes (−2.25 μV) than for averted and direct gaze (−1.92 μV, −1.77 μV, respectively); ts(19) > 2.17, ps = .04. Moreover, we found an interaction effect of hemisphere and gaze, F(2, 38) = 4.57, p < .017. This effect was explained by the fact that the gaze effect did not reach the level of significance in the left hemisphere, p = .16, but was highly significant in the right hemisphere, F(2, 38) = 9.8, p < .001, with a larger N170 for closed eyes (−2.74 μV) than for averted and direct gaze (−2.08 μV, −2.06 μV, respectively); ts(19) > 3.54, ps < .002. Additionally, we found an effect of electrode, F(1.35, 25.65) = 17.28, p < .001, because electrodes were more negative at P9/P10 sites (−3.09 μV) as compared with P7/P8 (−1.9 μV) and PO7/PO8 (−0.87 μV). No other main effects or interaction effects reached the level of significance.

Posterior P1

Visual inspection appeared to reveal an early effect of gaze at the P1 level on the electrodes used to compute the N170 (see Fig. 2a). As a post hoc analysis, we therefore explored the amplitude over these electrodes in the 100–140-ms period. The effect of gaze was indeed significant within this time range, F(2, 38) = 6.09, p = .005, with a greater amplitude for direct gaze (3.04 μV) than for averted and closed eyes (2.63 μV, 2.64 μV, respectively); ts(19) > 2.6, ps < .017. Additionally, we found an effect of electrode, F(1.16, 22.17) = 29.78, p < .001, because electrodes were more positive at PO7/PO8 sites (4.31 μV) as compared with P7/P8 (1.9 μV) and P9/P10 (2.1 μV). No other main effects or interaction effects reached the level of significance.

P3a

Using the mean amplitude pooled over electrodes C1/C2/Cz/FC1/FC2/FCz, there was no significant effect of gaze on the P3a, F(2, 38) = 1.24, p = .29.

P3b

Using the mean amplitude pooled over electrodes PO7/PO3/POZ/PO4/PO8, we found an effect of gaze on P3b, F(2, 38) = 5.86, p = .006, due to a greater P3b for direct gaze (2.33 μV) than for averted and closed eyes (1.97 μV and 1.78 μV, respectively); ts(19) > 2.16, ps < .05 (see Fig. 2b).

Discussion

In Experiment 2, we investigated whether P1, N170, and P300 components were sensitive to the direction of gaze in a gender categorization task. In order to compare with a baseline condition which was similar in terms of head direction, but which did not convey any gaze direction, we introduced a condition with closed eyes. Data revealed that the preference for direct gaze was initiated in the P1 time range and reiterated at P3b, but was absent at the N170 level. These modulations echo the behavioural effect of Macrae et al. (2002), replicated in Experiment 1, and further suggest that direct gaze enhances visual encoding at an early stage of visual processing, likely reflecting preferential orienting toward this stimulus (see Burra et al., 2016), which potentially produces an attentional enhancement of the P1. Additionally, at a later stage, we found a similar effect at the P3b levels, as reported in previous studies (Conty et al., 2007; Doi & Shinohara, 2012; Itier, Alain, Kovacevic, et al., 2007a; Naples et al., 2017).

These results highlight the presence of both early and late processes for direct gaze, which therefore involve both perceptual and categorisation stages. However, it remains possible that this effect is stimulus driven and not task dependent, an idea that is central to Macrae et al.’s (2002) conclusions. Indeed, it should be noted that the P1 component, located over occipital sites, is sensitive to differences in low-level features, such as stimulus luminance, as well as to attention (Johannes, Munte, Heinze, & Mangun, 1995), including both selective (Hillyard, Vogel, & Luck, 1998; Luck & Hillyard, 1995) and nonselective attention (Rugg, Milner, Lines, & Phalp, 1987; Taylor, 2002). Typically, paying more attention to a stimulus increases its associated occipital P1 amplitude (Luck et al., 1994). In order to address this possibility, we conducted an additional EEG experiment in which the paradigm was changed from a gender categorisation task to an oddball task. In this case, participants were asked to detect infrequent houses among faces. Our rationale was that in an oddball task, gender categorisation is not mandatory, even though the visual information presented to the observers is similar. Although this has not been systematically investigated, the P1 over posterior sites has been suggested to be sensitive to early configural processing of faces and facial features (Halit, de Haan, & Johnson, 2000), at least when attentional resources are sufficient (Wang, Guo, & Fu, 2016; Wang, Sun, Ip, Zhao, & Fu, 2015). As gaze direction is sensitive to configural processing (Jenkins & Langton, 2003), a task that does not require such processing is likely to entail a reduction or an abolition of the P1 effect.

Another interpretation of the dissociation in the ERP response to gaze direction involves whether the task contains a social or a nonsocial component (Latinus et al., 2015). In this sense, our P1 and P3 enhancements for direct gaze (Experiment 2) may in fact be task dependent, and could therefore disappear in a task that does not require a social judgement/configural processing.

Additionally, this experiment would extend the understanding of the N170 modulation observed for closed eyes. If this result were stimulus driven (i.e., related to the closed-eyes condition per se), the larger N170 amplitude for closed eyes relative to direct and averted gaze should remain present in this oddball experiment.

Experiment 3: Oddball experiment

Method

Participants

Twenty students (10 male), age 22.2 years (± 2.9), participated in this experiment. Participants were naïve to the purpose of the experiment. The study was approved by the local ethics committee, and written informed consent was obtained from participants prior to the experiment.

Apparatus and procedure

The apparatus and stimuli used in this experiment were the same as in Experiment 2, with the exception of three additional pictures of houses that were included as targets (1/7 of total trials), and which were equated for mean luminance with the other stimuli (RGB: 74, 74, 74). Participants were instructed to detect the presence of a house and to indicate, by pressing a button with the index or middle finger, whether a house or a face had been presented. By analogy with Experiment 2, a response was required in each trial (face or house) and had to be delivered after stimulus offset (600 ms). The experiment was divided into four blocks. Stimuli were mixed randomly inside each block, which consisted of 84 trials, yielding a total of 336 trials.
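
The trial composition implied by these figures can be worked out as follows. The totals follow directly from the text; the even split of face trials across the three gaze conditions and of house trials across blocks is an assumption made for illustration, as the text only states that stimuli were mixed randomly inside each block.

```python
from fractions import Fraction

N_BLOCKS = 4
TRIALS_PER_BLOCK = 84
HOUSE_PROPORTION = Fraction(1, 7)   # targets, as reported
GAZE_CONDITIONS = ("direct", "averted", "closed")

total_trials = N_BLOCKS * TRIALS_PER_BLOCK
house_trials = int(total_trials * HOUSE_PROPORTION)
face_trials = total_trials - house_trials

# Assumption: faces are split evenly across gaze conditions,
# and houses evenly across blocks.
faces_per_gaze = face_trials // len(GAZE_CONDITIONS)
houses_per_block = house_trials // N_BLOCKS

print(total_trials, house_trials, face_trials,
      faces_per_gaze, houses_per_block)
```

Under these assumptions, each gaze condition contributes twice as many trials as the house condition before artefact rejection, which is worth bearing in mind when comparing the signal-to-noise ratio of the corresponding averages.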

EEG recording and analysis

We used the same setup and analyses as in Experiment 2. However, due to the slightly later appearance of the N170, we measured its mean amplitude at 155–195 ms instead of 145–185 ms. On average, 19% of the trials were removed. For consistency, statistical analyses in Experiment 3 were similar to those in Experiment 2. However, in order to measure the difference between the house and face conditions, an additional ANOVA was included in which the stimulus factor comprised four levels (direct, averted, closed, and house).

Results

Behavioural results

As participants were instructed to respond after the stimulus offset (i.e., after 600 ms), reaction times were irrelevant and therefore not analysed.

Occipital P1

No effect was found during the P1 time window for electrodes O1/Oz/O2. We found no difference across face conditions, F(2, 38) = 2.15, p = .13, no effect of electrode, F(1.24, 23.6) = 0.19, p = .82, and no Faces × Electrodes interaction, F(2.61, 49.65) = 1.03, p = .39. Likewise, when the house condition was included, there was no effect of stimulus condition, F(3, 57) = 0.93, p = .43, no effect of electrode, F(1.43, 27.28) = 0.82, p = .44, and no Faces × Electrodes interaction, F(2.49, 47.3) = 0.78, p = .58.

N170

We did not observe any significant difference across face conditions, F(2, 38) = 1.15, p = .32. We measured a main effect of electrodes, F(2, 38) = 25.7, p < .001, since electrodes were more negative at P9/P10 sites (−0.72 μV) as compared with P7/P8 (−0.19 μV) and PO7/PO8 (1.68 μV). However, when including the house condition, we measured a strong effect on the N170, F(3, 57) = 14.88, p = .001, with a less negative N170 for the house condition (1.94 μV) as compared with the direct-gaze, averted-gaze, and eyes-closed conditions (0.22 μV, 0.45 μV, 0.10 μV, respectively), ts(19) > 4.1, ps < .001 (see Fig. 3a). Additionally, we found a main effect of electrodes, F(2, 38) = 20.00, p < .001, similar to the effect on faces, and an interaction effect of Faces × Electrodes, F(2.88, 50.03) = 9.83, p < .001.

Posterior P1

In contrast to Experiment 2, no effect of gaze was found during the P1 time window. We did not observe any significant difference across our face conditions, F(1.3, 25.21) = 1.58, p = .23, even when the house condition was included, F(2.23, 42.44) = 1.68, p = .18. Both across our face stimuli alone and with the house condition included, we found an effect of hemisphere, Fs(1, 19) > 9.62, ps < .006 (right hemisphere: 3.5 μV; left hemisphere: 5.17 μV), and a main effect of electrodes, Fs(2, 38) = 31.47, ps < .001 (P7/P8 = 3.19 μV; P9/P10 = 4.06 μV; PO7/PO8 = 5.8 μV).

P3a

The P3a computed on the mean amplitude of the pooled electrodes C1/C2/Cz/FC1/FC2/FCz was not significantly influenced by gaze, F(2, 38) = 1.62, p = .21, even when the house condition was included, F(1.67, 31.87) = 0.53, p = .58.

P3b

The P3b computed on the mean amplitude of the pooled electrodes PO7/PO3/POZ/PO4/PO8 was not affected by gaze direction (p = .44). However, when including the house condition, a significant difference was measured, F(3, 57) = 10.48, p = .001. The P3b amplitude was significantly enhanced for houses (3.93 μV) as compared with direct gaze (2.59 μV), averted gaze (2.26 μV), and eyes closed (2.4 μV), ts(19) > 3.52, ps < .002 (see Fig. 3b).

Fig. 3

P1/N170 and P3b in Experiment 3. a Main activity at the parieto-occipital region. ERPs are pooled over leads where the N170 was maximally negative (selected electrodes are highlighted in orange in the inset on the right, which also illustrates the P1/N170 scalp topography). No difference between our face conditions was observed during the posterior P1 time window (100–140-ms period, highlighted with a blue box) or the N170 time window (155–195-ms period, highlighted with a red box), but the N170 was smaller for the house condition than for the face conditions. b P3b measured over areas of heightened positive activity. Electrodes used for analysis are highlighted in orange in the inset on the right. The inset represents the scalp topography during the 300–500-ms period (highlighted with a green box). During this time window, the P3b was not significantly larger for direct gaze than for the averted and closed-eyes conditions. Nevertheless, the P3b was larger for the oddball (house) condition than for the other conditions. (Colour figure online)

Interexperiment effect on P1 and P3b

An interexperiment comparison of gaze direction was performed between Experiment 2 and Experiment 3 for the parietal P1 and the P3b. The analysis revealed a Gaze × Experiment interaction for the P1, F(2, 76) = 3.6, p = .034. However, the same interexperiment interaction did not reach significance for the P3b, F(2, 76) = 0.87, p = .42. The main effect of gaze remained significant, F(2, 76) = 3.58, p = .033.
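As a rough cross-check of the reported values, the upper-tail p value of an F test with 2 numerator degrees of freedom has a closed form, so it can be recomputed without a statistics library. The sketch below assumes uncorrected degrees of freedom (2, 76); the function name and labels are illustrative. Small departures from the reported p values are to be expected, because the F statistics are rounded in the text.

```python
def p_from_f_df1_2(f_value, df_error):
    """Upper-tail p value for an F statistic with 2 numerator degrees of freedom.

    For df1 = 2 the survival function of the F distribution reduces to
    (1 + 2 * F / df2) ** (-df2 / 2).
    """
    if f_value < 0 or df_error <= 0:
        raise ValueError("F must be non-negative and df_error positive")
    return (1.0 + 2.0 * f_value / df_error) ** (-df_error / 2.0)


# F values as reported for the interexperiment analysis, all with df = (2, 76).
reported = [
    ("P1, Gaze x Experiment", 3.6),
    ("P3b, Gaze x Experiment", 0.87),
    ("Main effect of gaze", 3.58),
]

for label, f_value in reported:
    p = p_from_f_df1_2(f_value, 76)
    print(f"{label}: F(2, 76) = {f_value}, p = {p:.3f}")
```

The closed form holds only when the numerator has exactly 2 degrees of freedom; tests with other or sphericity-corrected degrees of freedom need the general F distribution.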

Discussion

In Experiment 3, we sought to clarify whether the components sensitive to gaze perception during gender categorisation were also modulated in an oddball task. Using a house versus face categorisation task, we found that direct gaze no longer enhanced the P1 or the P3b, nor did closed eyes enhance the N170. Significant modulations were observed only on the N170 and the P3b. The N170 effect was characterised by a smaller amplitude for houses compared with faces, reflecting the typical sensitivity of the N170 to faces (Itier & Taylor, 2004). The P3b effect consisted of a greater amplitude for houses compared with faces and reflected in this case the well-known sensitivity of this component to oddball stimuli (Pritchard, 1981), although this may also have been due to a perceptual priming effect linked to the small number of exemplars in the house condition. In Experiment 3, across gaze conditions, the endogenous P3b was not clearly seen, and its topography was rather different (i.e., more occipital; see Fig. 3b, left) from that of the expected P3b component (more parietal; see Fig. 2b, left). This slight disparity between the components precludes an unequivocal comparison of Experiments 2 and 3 and could explain to some extent why we failed to find an interaction between the two experiments during the 300–500-ms time window. Critically, however, the exogenous P1 revealed an interaction effect across experiments, indicating that the P1 is sensitive to gaze direction and that this effect is task dependent.

Taken together, our data indicate that the gaze effects observed in Experiment 2 were not stimulus driven but related to the categorisation task, a point that is central to Macrae et al.’s (2002) conclusions. This effect also echoes the results established by Framorando et al. (2017), who demonstrated that the “stare-in-the-crowd” effect is task dependent and not stimulus driven. As suggested by Halit et al. (2000), Wang et al. (2016), and Wang et al. (2015), the P1 might be seen as a marker of early configural processing of faces and facial features, which would be mandatory when gender categorisation is required (Jenkins & Langton, 2003), but not necessarily in other tasks. Critically, it would thus appear that the preferential processing of direct gaze, although likely driven by a rapid subcortical pathway (Burra et al., 2013; George et al., 2001; Kawashima et al., 1999), may be moderated by task demands and/or by context (Senju & Johnson, 2009).

General discussion

In the present study, we investigated the electrophysiological correlates of visual processing for direct compared with averted gaze in a gender categorisation task. In order to maintain a baseline with the same face direction (frontal face), we introduced a condition with closed eyes in which faces were present but did not convey any direction of gaze. Enhanced processing manifested itself behaviourally as faster gender categorisation for faces displaying a direct gaze, compared with those with an averted gaze or with closed eyes (Experiment 1), replicating the seminal study by Macrae et al. (2002). This effect was further investigated using EEG. We examined the effect of gaze on the N170 ERP component, as well as on the early P1 and the later P300. The data revealed that direct gaze increased the P1 response, an effect that was reiterated at the P3b, while the N170 was unaffected (Experiment 2). However, when participants performed an oddball procedure using the same stimuli, no ERP modulation was found for direct gaze (Experiment 3), showing that the effect of direct gaze was task dependent.

The P1 as an early marker of task-relevant direct gaze

Our results support the view that perception of gaze direction relies on the processing of information from the eye region early in the course of face processing, as suggested by some theoretical models (Senju & Johnson, 2009). Our results reveal that, at posterior sites, the P100 amplitude was significantly more positive for direct gaze not only compared with averted gaze but also compared with faces with closed eyes. Conversely, at occipital sites where the P1 is typically measured, no difference was found between conditions. This latter point, in our view, suggests the absence of global low-level differences (such as luminance) across stimuli in our experimental conditions. Finally, in Experiment 3, using the same stimuli as Experiments 1 and 2, but with an oddball paradigm, the differences at posterior sites disappeared.

Our results appear to support recent data suggesting that eye direction may be coded before the N170 peak. Some studies have observed that the whole face and the eyes are processed between 100 and 150 ms (Rousselet et al., 2014; Schyns et al., 2007). In addition, as noted above, the parieto-occipital P1 has been posited to be sensitive to configural face processing (Halit et al., 2000; Wang et al., 2015). The modulation of the P100 amplitude found in the present study corroborates these reports, suggesting an early pictorial categorisation stage (Desjardins & Segalowitz, 2013), during which a coarse signal is processed, indicating the presence and direction of the eyes (Burra et al., 2016; Doi & Ueda, 2007), possibly based on the strong difference in local contrast between the sclera and the iris, as well as the inherent symmetry associated with gaze direction (Doi & Ueda, 2007; Kobayashi & Kohshima, 1997). Critically, the fact that the P1 effect is abolished in Experiment 3 argues unequivocally against the possibility that it may be purely stimulus driven.

Some investigations have reported that the P1 is modulated not only by spatial attention (Hillyard et al., 1998; Luck & Hillyard, 1995) but also by nonspatial attention (Rugg et al., 1987; Taylor, 2002). In our study, we would argue that attention to particular attributes of the stimulus might have enabled preferential access of this input to cognitive processes. The current findings suggest that nonspatial attention, such as that for particular attributes of the face, could modulate activity in the human occipitotemporal or ventral stream, allowing input from the attended regions to be processed at higher stages of configural analysis. Thus, depending on the task, attention could orient face processing towards either more configural or more featural aspects, according to whether the stimuli are socially relevant or irrelevant (Wang et al., 2016; Wang et al., 2015).

Such an influence of the task on direct-gaze processing, revealed through this early nonspatial attention enhancement, is implemented in the fast-track modulator model, impacting at subcortical and cortical stages of gaze processing (Johnson, Senju, & Tomalski, 2015; Senju & Johnson, 2009). To our knowledge, a contextual modulation driven by top-down effects of task demands and social context has never been measured at such an early stage. In fact, enhanced processing of the eye region via configural processing may facilitate gender categorisation. Therefore, in Experiment 2, our participants may have been prepared to extract information from the eyes to perform the task, while this was not the case in Experiment 3. It is likely that the task set increased the prioritization of the most relevant stimulus displayed (i.e., direct-gaze faces), which might explain why the posterior P1 was sensitive to gaze contact in Experiment 2 and not in Experiment 3. At the same time, the influence of task set on the early processing stage of gaze direction is challenging for the fast-track modulator model (Johnson et al., 2015; Senju & Johnson, 2009), which proposes a rapid and automatic processing of direct gaze driven mainly by the amygdala. Our findings are a clear example of how contextual modulation, driven by task demands and social context, can influence the early cortical response to gaze contact.

In sum, our data reveal the existence of an early P100 enhancement for directly gazing faces, which is observable when the task requires gender categorisation, but which is abolished when detailed processing of the face is not mandatory for the correct execution of the task, or at least when only basic object categorisation is required.

The P3b as an indicator of the social relevance of direct gaze

As described above, the early effect of direct gaze on visual processing is reiterated at a later stage, around 300 to 400 ms. This sensitivity of the P300 for direct gaze is consistent with prior results (Conty et al., 2007; Itier, Alain, Kovacevic, et al., 2007a) and has been proposed to reflect the activation of a later “gaze direction module,” suggested by the fast-track modulator model (Senju & Johnson, 2009), albeit at a higher level of cognitive processing, and probably independently of head direction (Burra et al., 2017).

Interestingly, the P3a subcomponent of the P300 did not reveal any modulation by gaze direction, which implies that, in our study, it is unlikely that direct gaze produced a preattentional stimulus-driven orienting reflex (Soltani & Knight, 2000), despite previous claims in the literature (Conty et al., 2007; Senju et al., 2005). The lack of any P3a modulation in our study suggests that direct gaze did not give rise to an automatic response to novelty. However, the P3b and the P3a differ critically in the sense that the P3a habituates with repeated presentation, while the P3b does not (Friedman & Simpson, 1994). It is likely that the high number of repetitions per condition and per identity (16 repetitions per stimulus) dramatically reduced the overall novelty effect, although this remains an open question for future experimentation. Future investigations should use different approaches (such as principal component analyses; e.g., Delplanque et al., 2005) to determine the role of P3a components in the processing of gaze direction within the P3 complex.

As mentioned above, the P3b is not modulated by habituation. Instead, it reflects a mechanism involved in event categorisation (Kok, 2001). The P3b is strongest when a mental template matches the perceived stimulus (see also Chao, Nielsen-Bohlman, & Knight, 1995; Ford, 1978; Squires, Hillyard, & Lindsay, 1973). Importantly, the template of a stimulus can be related to the meaning or social salience of the stimulus (Cuthbert et al., 2000; Eimer & Holmes, 2002), with respect to the task at hand. Highly informative stimuli would thus elicit a larger P3b than less informative stimuli (Johnson, 1988; Picton, 1992; Pritchard, 1981). However, previous results also reported that the P3b response to direct gaze was task independent (Itier, Alain, Kovacevic, et al., 2007a; Itier, Villate, et al., 2007c). Specifically, in these latter studies, the authors measured an enhanced P3b for gaze contact over averted gaze, irrespective of the task (in their case, discrimination of gaze direction or head orientation). This was taken as evidence of a stimulus-driven effect of direct gaze over averted gaze. It should be emphasised, though, that in these studies, faces were always relevant to the task and discrimination of facial features was thus always mandatory. In our oddball experiment (Experiment 3), the task required a basic discrimination of objects versus faces. This lack of active processing of facial features might explain why the direct-gaze effect on the P3b did not appear.

Overall, in the gender categorisation task, our results revealed an enhanced P3b, in addition to the early P1, for direct gaze compared with averted gaze and closed eyes, which was associated with faster responses for directly gazing faces. Altogether, the effects on the P1 and P3b provide strong evidence of an early neural response to direct gaze but underscore an important role of the task.

N170 modulated by the absence of eyes during gender categorisation

Contrasting with the unequivocal P1 and P3b results, no significant difference emerged between direct and averted gaze at the N170 level. This result is convergent with the literature, which has so far not been able to demonstrate any consistent N170 effect for gaze direction (for instance, Burra et al., 2017; Burra et al., 2016; Conty et al., 2007; Itier, Alain, Kovacevic, et al., 2007a; Latinus et al., 2015; Sato et al., 2008; Taylor, George, et al., 2001b; Watanabe et al., 2002).

One unexpected result was a larger N170 amplitude for closed eyes compared with open eyes (whether direct or averted) in the gender categorisation task. This finding was surprising and is in opposition with the literature (Eimer, 1998; Itier, Alain, Kovacevic, et al., 2007a; Kloth, Itier, & Schweinberger, 2013; Taylor, George, et al., 2001b). One tentative explanation could reside in the fixation point used in the study. In our experiments, stimuli were placed in the middle of the screen, which led to a fixation point at the level of the nose (i.e., in a central location), consistent with the procedure used by Macrae et al. (2002). Evidence suggests that the coding of facial features by dedicated neurons is inhibited when the target features are situated outside the fovea (Nemrodov, Anderson, Preston, & Itier, 2014). This led Nemrodov et al. (2014) to suggest that the inhibition of the foveated features by perifoveal features may ensure holistic processing, which is the type of processing underpinning the N170. Cortical magnification produces a stronger input from fixated features than from nonfixated features to the higher face-sensitive visual areas. Therefore, when the fixation point is at the level of the nose, nose-coding neurons should be inhibited by neurons coding for the eyes, mouth, and portions close to the outline of the face. In the closed-eyes condition of our study, the degree to which neurons coding for the eyes inhibited nose-coding neurons might therefore have been reduced, thereby possibly producing a larger N170 for closed eyes (or eyeless faces in the experiments by Nemrodov et al., 2014). The results of the current study therefore potentially support this Lateral Inhibition Face Template and Eye Detector (LIFTED) model, which provides a neuronal account of holistic and featural processing. However, the disappearance of this effect in Experiment 3 also suggests a top-down influence on the LIFTED model, a point that would require further experimentation.

Conclusion

Early and late ERP enhancements to direct gaze are task dependent

Our study highlights the critical role of the participants’ task in the processing of direct gaze. Recent work has posited the existence of two modes of visual information processing during the perception of social stimuli such as gaze direction. One mode, a “default mode,” would focus on the spatial information contained in the direction of gaze, while the second, a “social aware mode,” may be activated when an observer is required to make a social judgment (Latinus et al., 2015). Focusing on the N170 component during the perception of dynamic gaze shifts, Latinus et al. (2015) found that this component was dependent on the task carried out by the participants. Indeed, the N170 was differentially modulated according to whether the task was “nonsocial” (i.e., determining whether a gaze was oriented to the left or to the right) or “social” (i.e., determining whether a gaze was oriented towards or away from the participant). Based on these findings, we would hypothesize that our Experiment 2 may have tapped into the “social aware mode,” while Experiment 3 may have activated the nonsocial or “default mode.” This could yield a partial explanation for the different ERP results in the two experiments and would further corroborate the proposed existence of these two modes of visual processing.

An alternative interpretation could be that Experiment 2 required greater attentional resources than Experiment 3 did. Indeed, Experiment 2 (gender categorisation), which is seen here as a “social” task, may have necessitated more in-depth processing of the face, while Experiment 3 (the oddball task), seen as a “nonsocial” task, did not. Participants would thus have been more attentive in the social than in the default mode, simply because the former required greater attention in order to extract gender than is required to discriminate categories. However, these interpretations are not mutually exclusive. Indeed, the distinction between the “social mode” and simple resource demand is challenging, as a social task may in essence be more demanding than a nonsocial one. This issue cannot be resolved here and clearly necessitates further investigation.

Our findings appear to challenge the fast-track modulator model, as our scalp EEG data did not reveal any early modulation by direct gaze when this was task irrelevant, which would have been suggestive of amygdala activation. No processing of direct gaze appears to have occurred when this feature was not attended, questioning the notion of an involuntary, automatic response by the amygdala. However, the findings do not necessarily invalidate the fast-track modulator model. Indeed, task demands in Experiment 2 might have actively placed participants in a “social” mode associated with configural processing, which would have heightened the response to the most relevant social stimuli displayed during the task (i.e., direct gaze). By contrast, in the “nonsocial” task (Experiment 3), where the processing of social cues was not mandatory, direct gaze could have been actively suppressed by top-down processes, despite its high level of relevance to the amygdala. In this context, task demands may have acted as a “gatekeeper” that enables, or not, the configural processing of relevant social cues. This interpretation would account for the apparent contradiction emerging from our findings. However, further empirical studies are needed to explore these questions.

The current data underscore the importance of top-down processes in the early and late ERP response to direct gaze. Although top-down modulations based on instructions and/or task demands were included in the fast-track modulator model of gaze perception (Senju & Johnson, 2009), to our knowledge, our findings are the first to highlight their involvement at such early stages of visual processing and illustrate the necessity of taking task demands into consideration in future investigations of gaze direction.