
Effects on Automatic Attention Due to Exposure to Pictures of Emotional Faces while Performing Chinese Word Judgment Tasks

  • Huang Junhong,

    Affiliations Beijing Key Laboratory of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, China, Competitive Sport Research Center, China Institute of Sport Science, Beijing, China

  • Zhou Renlai,

    rlzhou@bnu.edu.cn

    Affiliations Beijing Key Laboratory of Applied Experimental Psychology, School of Psychology, Beijing Normal University, Beijing, China, National Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing, China

  • Hu Senqi

    Affiliation Department of Psychology, California State University, Los Angeles, Los Angeles, California, United States of America

Abstract

Two experiments were conducted to investigate the automatic processing of emotional facial expressions while performing low or high demand cognitive tasks under unattended conditions. In Experiment 1, 35 subjects performed low (judging the structure of Chinese words) and high (judging the tone of Chinese words) cognitive load tasks while exposed to unattended pictures of fearful, neutral, or happy faces. The results revealed that the reaction time was shorter and the performance accuracy was higher while performing the low cognitive load task than while performing the high cognitive load task. Exposure to fearful faces resulted in significantly longer reaction times and lower accuracy than exposure to neutral faces on the low cognitive load task. In Experiment 2, 26 subjects performed the same word judgment tasks and their brain event-related potentials (ERPs) were measured for a period of 800 ms after the onset of the task stimulus. The amplitudes of the early ERP component around 176 ms (P2) elicited by unattended fearful faces over frontal-central-parietal recording sites were significantly larger than those elicited by unattended neutral faces while performing the word structure judgment task. Together, the findings of the two experiments indicated that unattended fearful faces captured significantly more attention resources than unattended neutral faces during a low cognitive load task, but not during a high cognitive load task. It was concluded that fearful faces could automatically capture attention if residual attention resources were available under the unattended condition.

Introduction

It has been recognized that the ability to automatically perceive threatening stimuli provides an adaptive advantage for the survival of an organism [1]. Previous studies have found that an observer's nervous system automatically orients attention to novel or threatening stimuli such as the negative facial expressions of fear and anger (e.g., [2], [3], [4], [5], [6], [7], [8]). The orienting response involves automatic attentional mechanisms that are unconscious and stimulus driven in what is often defined as a bottom-up process [4]. Recent behavioral studies have provided empirical evidence of the automatic processing of negative emotional stimuli [9], [10], [11], [12], [13], [14].

Several recent studies using functional magnetic resonance imaging (fMRI) and event-related potential (ERP) techniques have also provided neuroscientific evidence that attention to negative facial expressions such as fear, anger, or disgust is processed automatically or unconsciously. The fMRI studies have shown that amygdala activity increases in response to fearful faces in both attended and unattended conditions. Researchers found that differential amygdala responses to fearful versus happy facial expressions were influenced by mechanisms of attention and that amygdala activity increased significantly more when participants were exposed to potentially threatening stimuli (fearful faces) than to non-threatening stimuli (happy faces) under conditions of inattention [15], [16].

Consistent with the fMRI findings, a recent ERP study [17] revealed that the amplitude of the P2a component of the ERP increased during exposure to negative emotional faces but not to neutral faces under conditions of inattention, indicating that automatic attention was triggered by negative emotional faces. Another recent study [18] also found an enhanced P1 component to negative faces. The source localization of this P1 effect indicated increased activity within the anterior cingulate cortex (ACC), suggesting a mechanism underlying automatic attention orienting towards negative faces. Other ERP studies [19], [20] also showed that exposure to facial expressions elicited the visual mismatch negativity (MMN) at bilateral occipito-temporal regions in the 170–360 ms time window when no attention was paid. The MMN elicited by fearful faces started as early as about 70–120 ms after stimulus onset. In summary, these previous studies have found that emotional facial stimuli, especially threatening stimuli, could capture automatic attention, as marked by either the increased amplitude of the ERP P1, P2, and visual MMN components or the increased neural activity of the amygdala observed in fMRI scans, and that this processing demonstrated a bottom-up process.

Another source of the perception of emotional faces is top-down control by the parietal and frontal cortex (for review, see [21]). fMRI and ERP studies have shown that frontal cortical regions modulate the sensory cortex during the processing of unattended emotional stimuli. On this view, emotional stimuli, including negative emotional facial stimuli, are not processed under unattended conditions; their processing is modulated by active attention and reflects a top-down process. For instance, amygdala responses to fearful faces occurred only when sufficient attention resources were available, in attended but not in unattended trials [22]. Therefore, threatening stimuli could not capture automatic attention, and the processing of facial expression appeared to be under top-down control. Two ERP studies found that a greater frontal positivity was generated about 100 ms after stimulus onset in response to fearful faces when the faces were attended. However, when the faces were located outside the attention focus, this emotional expression effect was completely eliminated [15], [23]. In short, the results of these studies imply that negative emotional stimuli could not be automatically processed in the unattended condition and that emotional stimuli were modulated by active attention.

How can these conflicting results be explained? According to Pessoa et al. [22], the absence of attentional modulation by emotional stimuli in the unattended condition in some studies (e.g., [22], [23], [24]) was the result of a competing task fully absorbing subjects' attentional resources. Williams et al. [25] proposed that focusing on houses rather than on faces in Vuilleumier and colleagues' study [15] was relatively easy and might have left residual attentional capacity for face processing in the unattended condition. Thus, these explanations for the conflicting results of previous studies rest on the absence or presence of residual attentional resources under the unattended condition. However, this explanation raises a practical question: how can one determine that all attentional resources have been consumed, leaving no residual attentional resources to process the emotional stimuli in the unattended condition?

To date, no appropriate method has been developed to accurately measure the distribution of attentional resources to emotional stimuli in the unattended condition. The typical research paradigm of previous studies requires participants to undergo two conditions: an attended condition, in which participants are required to pay attention to the emotional stimuli directly, and an unattended condition, in which participants are required to pay attention to non-emotional stimuli with emotional stimuli superimposed over them. The effects of emotional stimuli on attentional processes in the attended and unattended conditions are then compared, indexed by behavioral measures such as reaction time and accuracy, by increased neural activity of the amygdala in fMRI scans (e.g., [15]), or by the amplitude of the early ERP components P1 and P2 (e.g., [24]). Because the level of attentional demand differed between attended and unattended conditions and the emotional stimuli varied across previous studies (e.g., [22]), this kind of comparison likely produced the conflicting results.

In order to accurately measure the levels of residual attention resources that are distributed across emotional stimuli in the unattended conditions, we reasoned that the processing of these emotional stimuli in different unattended conditions should be directly compared. Therefore, the present study was designed to have all subjects perform two types of attention tasks while the same emotional facial stimuli were superimposed over the non-emotional stimuli in the unattended condition. According to Lavie’s perceptual load theory [26], which defines perceptual load as the attention resources required for attended tasks, the magnitude of attention modulation upon unattended emotional stimuli is dependent on the demands of attended tasks. This theory further states that fewer residual attentional resources would be distributed to distracters (emotional stimuli) if attention is entirely consumed by performing a highly demanding task. Conversely, on a low demand task, residual attentional resources could be available to be distributed among distracters (emotional stimuli). Based on Lavie’s perceptual load theory, our first prediction is that more attention resources will be available for the emotional stimuli under the unattended condition when the perceptual load of the attention task is low. Secondly, we predict that the interfering effect of facial expressions on the current task will decrease or disappear entirely when the perceptual load of the attention task is increased and demands full attention.

ERP recordings have better temporal resolution than fMRI scans for exploring the time course of attention modulation by emotional stimuli under unattended conditions [27]. Several recent studies have shown that ERP recordings reveal the time course of emotional processing effectively. The early ERP components associated with attention were elicited by emotional stimuli in unattended conditions [24], [27]. These findings suggest that emotional facial expressions can rapidly trigger cortical circuits that are responsive in the detection of emotionally significant events. Because the processing of emotional stimuli, especially negative emotional stimuli, generally occurs within a very short time period, and because the attention level changes quickly over the time course of emotional stimulus perception after stimulus onset, we reasoned that ERP recording would be the most appropriate method to temporally measure transient brain activity in the process of automatic attention.

The purpose of the present study was to investigate the effects of exposure to emotional face pictures on attention while performing Chinese word judgment tasks. Two experiments were conducted. In Experiment 1, subjects performed low (judging the structure of Chinese characters) and high (judging the tone) cognitive load tasks while exposed to unattended pictures of fearful, neutral, or happy faces. Their reaction time and accuracy on the word judgment tasks were measured. We hypothesized that the reaction time for word structure judgment would be significantly shorter than for word tone judgment and that the accuracy of word structure judgment would be significantly higher than that of word tone judgment. Since more residual attention resources would be available to automatically process emotional facial stimuli while performing word structure judgment tasks than while performing word tone judgment tasks under the unattended condition, we further hypothesized that exposure to fearful faces would produce a significantly longer reaction time and lower accuracy than exposure to neutral and happy faces while performing the word structure judgment tasks. We expected that no significant differences in reaction time and accuracy would be found among fearful, happy, and neutral faces while performing the word tone judgment tasks. In Experiment 2, subjects performed the same word judgment tasks as in Experiment 1 and their brain event-related potentials (ERPs) were measured for a period of 800 ms after the onset of the task stimulus. Similarly, we hypothesized that the amplitude of the early ERP component of the P2 wave elicited by unattended fearful faces over frontal-central-parietal recording sites would be significantly larger than that elicited by unattended neutral or happy faces while performing word structure judgment tasks and that no significant differences in P2 amplitudes would be found among fearful, happy, and neutral faces while performing word tone judgment tasks.

Experiment 1

Methods

Participants.

Thirty-five healthy undergraduate and graduate students (11 males and 24 females) were recruited from Beijing Normal University. Their ages ranged from 17 to 26 years, with a mean of 20.5 ± 2.13 years. All subjects were right-handed as assessed with Chapman and Chapman's scale [28]. Participants had no history of psychiatric or neurological diseases and had normal or corrected-to-normal vision. Each subject provided written informed consent before the experiment. The experimental procedures were approved by the Institutional Review Board of the State Key Laboratory of Cognitive Neuroscience and Learning. Twenty Yuan (RMB) was paid to each subject for participating in the experiment.

Materials

Chinese words.

Forty-two high-frequency Chinese words were used in the experiment. The characters were selected from the Modern Chinese Frequency Dictionary [29]. Each word was composed of two structural parts: half of the words had a left-right structure and the other half had an up-down structure. The Chinese pronunciation system comprises four tones, and there is a strong contrast in pronunciation between tone two and tone four: tone two is a rising tone and tone four is a falling tone. Half of the Chinese words carried tone two and the other half carried tone four. Participants viewed the same 42 words while performing the word structure and the word tone judgments. The words were presented on a monitor screen, one word per trial. Participants were seated approximately 70 cm from the screen (no chinrest was used), so that the visual angle of each word was 2.5°.

Human faces.

Twenty-one grayscale pictures of human faces were used. The faces were obtained from Ekman's series [30] and depicted the facial expressions of seven individuals, each displaying fearful, neutral, and happy expressions. Four identities were female (labeled C, MF, SW, and M) and three were male (labeled JJ, JB, and EM). The visual angle of each face was 3.3°, and face and word were separated by 0.3°. Two identical faces were presented to the left and right of the Chinese word on every trial. The equipment used in the study was a Founder PC with a 17-inch monitor.
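For readers reconstructing the stimulus geometry, the physical extent s of a stimulus subtending visual angle θ at viewing distance d follows from s = 2d·tan(θ/2). A minimal Python sketch using only the distances and angles reported above; the resulting sizes in cm are our computation, not values given in the paper:

```python
import math

def size_on_screen(angle_deg: float, distance_cm: float) -> float:
    """Physical extent (cm) of a stimulus subtending angle_deg at distance_cm."""
    return 2 * distance_cm * math.tan(math.radians(angle_deg) / 2)

d = 70.0  # viewing distance (cm) reported for the stimulus setup
print(f"word (2.5 deg): {size_on_screen(2.5, d):.2f} cm")  # ~3.05 cm
print(f"face (3.3 deg): {size_on_screen(3.3, d):.2f} cm")  # ~4.03 cm
print(f"gap  (0.3 deg): {size_on_screen(0.3, d):.2f} cm")  # ~0.37 cm
```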

Experimental Design

A 2 (task: word structure vs. word tone) × 3 (emotional face: fearful, neutral, and happy) within-subjects design was adopted. Each of the six conditions consisted of 42 trials. All subjects completed eight practice trials to become familiar with the word structure and tone judgment tasks.

Tasks.

Subjects were asked to perform Chinese word structure and tone judgment tasks that were irrelevant to the perception of facial expression; the faces were thus unattended while the word structure and tone judgments were performed. For the word structure judgment task, subjects were instructed to indicate whether each word had a left-right or an up-down structure by pressing 'z' or '/' on the keyboard as quickly as possible. For the word tone task, subjects indicated whether each word carried tone two or tone four by pressing the corresponding keys as quickly as possible. The key assignments were counterbalanced across subjects, as was the presentation order of the two tasks.

Procedures

The experimental procedures were programmed with E-Prime 1.1 software (Psychology Software Tools Inc.: www.pstnet.com/eprime). Participants were tested individually, seated on a chair at a distance of 80 cm from the computer screen in a well-lit room. All stimuli were presented in white against a black background at the center of the screen. A fixation mark was first presented at the center of the screen for 500 ms. Then a Chinese word was displayed at the center of the screen for 100 ms, with two identical human faces shown on either side of the word. The subject was asked to pay attention to the Chinese word and ignore the faces. Fearful, neutral, and happy faces were randomly distributed across trials. The word display was followed by a 1900 ms blank interval during which the subject judged the word structure or word tone by pressing the corresponding key as described earlier. The interval between two trials varied from 250 to 850 ms. The subject's reaction time and correct judgment rate were recorded automatically.
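To make the trial structure concrete, the plain-Python sketch below reconstructs the randomization and timing constants described above. The study itself was run in E-Prime; the function names, the counterbalancing-by-parity rule, and the block layout here are illustrative assumptions, not the authors' code:

```python
import random

# Timing constants taken from the procedure described above.
FIXATION_MS = 500          # central fixation mark
STIMULUS_MS = 100          # word flanked by two identical faces
RESPONSE_WINDOW_MS = 1900  # blank interval for the judgment

WORDS_PER_CONDITION = 42   # the same 42 words are reused in both tasks
EMOTIONS = ["fearful", "neutral", "happy"]
TASKS = ["word_structure", "word_tone"]

def make_block(task):
    """Build one task block: every word appears once per emotion,
    with the emotion of the flanking faces randomly ordered."""
    trials = [
        {"task": task, "word_id": w, "emotion": e,
         "iti_ms": random.randint(250, 850)}  # inter-trial interval
        for w in range(WORDS_PER_CONDITION)
        for e in EMOTIONS
    ]
    random.shuffle(trials)
    return trials

def make_session(subject_id):
    """Task order counterbalanced between subjects (here: by parity)."""
    order = TASKS if subject_id % 2 == 0 else TASKS[::-1]
    return [make_block(t) for t in order]

if __name__ == "__main__":
    session = make_session(subject_id=1)
    print(len(session[0]), "trials in the first block")  # 126 = 42 x 3
```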

Results

A total of 35 subjects’ experimental data were collected and analyzed. Table 1 presents the means and standard deviations of the reaction time to complete the word structure and word tone judgment tasks when fearful, neutral and happy faces were unattended respectively. As can be seen in Table 1, subjects had a much shorter reaction time while performing word structure judgment tasks than while performing word tone judgment tasks.

Table 1. Means and standard deviations of the reaction time (ms) for the word structure (WS) and word tone judgment tasks (WT) when fearful, neutral and happy faces were unattended respectively.

https://doi.org/10.1371/journal.pone.0075386.t001

A 2 (task: word structure vs. word tone) × 3 (emotional face: fearful, neutral, and happy) within-subjects repeated-measures analysis of variance (ANOVA) on reaction time was performed. The main effect of task was significant (F(1, 34) = 135.223, P<0.0001): the mean reaction time for word structure judgment was significantly shorter than that for word tone judgment, indicating that more time was needed to perform word tone judgment. No significant main effect of emotional face was found (F(2, 68) = 2.178, P = 0.126, ε = 0.915). There was a significant interaction between task and emotional face (F(2, 68) = 3.961, P = 0.031, ε = 0.836). Simple effect analysis showed significant differences in reaction time among the conditions of exposure to fearful, neutral, and happy faces while performing the word structure judgment task (F = 5.24, P = 0.008). Paired comparisons showed that the mean reaction time with exposure to fearful faces was significantly longer than with exposure to neutral faces (t = 2.599, P = 0.014, LSD = 15.18 ms) and that the mean reaction time with exposure to happy faces was significantly longer than with exposure to neutral faces (t = 2.569, P = 0.015, LSD = 10.38 ms). There was no significant difference in mean reaction time between exposure to fearful and happy faces. Further analysis showed no significant differences in mean reaction time among the conditions of exposure to fearful, neutral, and happy faces while performing the word tone judgment task.
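As an illustration of this analysis, the sketch below runs the same 2 × 3 repeated-measures ANOVA, the follow-up simple effect, and the pairwise tests with the pingouin package. The data-frame layout and file name are assumptions; the paper does not specify the analysis software used:

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one mean RT per subject x task x emotion
# cell, with columns 'subject', 'task' (WS/WT), 'emotion', and 'rt'.
df = pd.read_csv("exp1_rt.csv")  # placeholder file name

# 2 (task) x 3 (emotion) within-subjects repeated-measures ANOVA.
aov = pg.rm_anova(data=df, dv="rt", within=["task", "emotion"],
                  subject="subject", detailed=True)
print(aov)

# Simple effect of emotion within the word-structure task, then pairwise
# comparisons (pg.pairwise_ttests in older pingouin versions).
ws = df[df["task"] == "WS"]
print(pg.rm_anova(data=ws, dv="rt", within="emotion", subject="subject"))
print(pg.pairwise_tests(data=ws, dv="rt", within="emotion",
                        subject="subject"))
```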

Table 2 presents the means and standard deviations of accuracy in the word structure and word tone judgment tasks when fearful, neutral, and happy faces were unattended, respectively. A 2 (task: word structure vs. word tone) × 3 (emotional face: fearful, neutral, and happy) within-subjects repeated-measures ANOVA on correct percentage was performed. The main effect of task was significant (F(1, 34) = 38.796, P<0.0001): the mean correct percentage for the word tone judgment task was significantly lower than that for the word structure judgment task, indicating that the word tone judgment task was more difficult to complete. No significant main effect of emotional face was found (F(2, 68) = 0.657, P = 0.517, ε = 0.966). There was a significant interaction between task and emotional face on correct percentage (F(2, 68) = 3.934, P = 0.028, ε = 0.908). Further analysis showed significant differences in correct percentage among the conditions of exposure to fearful, neutral, and happy faces while performing the word structure judgment task (F = 3.43, P = 0.038). Paired comparisons showed that the mean correct percentage with exposure to fearful faces was significantly lower than with exposure to happy faces (t = 2.662, P = 0.012, LSD = 0.0177) and that the mean correct percentage with exposure to happy faces was marginally higher than with exposure to neutral faces (t = 2.026, P = 0.051, LSD = 0.014). No significant difference in mean correct percentage was found between neutral and fearful faces. There were no significant differences in mean correct percentage among the conditions of exposure to fearful, neutral, and happy faces while performing the word tone judgment task.

Table 2. Means and standard deviations of the correct percentage (1 = 100%) for the word structure (WS) and word tone judgment tasks (WT) when fearful, neutral and happy faces were unattended respectively.

https://doi.org/10.1371/journal.pone.0075386.t002

Experiment 2

Methods

Participants.

Twenty-six healthy undergraduate and graduate students (14 males, 12 females) were recruited from Beijing Normal University. The data from two participants were not included in the analysis because of EMG artifacts. Ages ranged from 17 to 25 years, with a mean of 20.8. None of the participants had taken part in Experiment 1.

All subjects were right-handed as assessed with Chapman and Chapman's scale [28]. All subjects were healthy, with no history of psychiatric or neurological diseases, and had normal or corrected-to-normal vision. Each subject provided written informed consent before the experiment. The experimental procedures were approved by the Institutional Review Board of the State Key Laboratory of Cognitive Neuroscience and Learning. Fifty Yuan (RMB) was paid to each subject for participating in the experiment.

Materials

The same materials were used as in Experiment 1.

Apparatus

A Founder PC and a 17-inch monitor were used. In addition, a NeuroSCAN system with a 64-channel Quick-cap with Ag-AgCl electrodes (Compumedics, Melbourne, Australia) was used.

Experimental Design

The same experimental design as in Experiment 1 was employed, except that there were 84 trials per condition.

Tasks.

The same tasks as in Experiment 1 were used.

Procedures

The same procedures as in Experiment 1 were used, except that ERPs were recorded while the subjects performed the word structure or word tone judgment tasks. Scalp voltages were recorded with a NeuroSCAN system using a 64-channel Quick-cap arranged according to the international 10–10 system (Compumedics, Melbourne, Australia). Eye movements were recorded using horizontal and vertical electrooculograms (HEOG and VEOG). Sixty-two channels were recorded with a linked-mastoids reference. Electrode impedance was kept below 5 kΩ. The amplifier band-pass range was 0.05–100 Hz, and EEG and EOG signals were sampled at 500 Hz.

Following electrode placement, the subject was individually seated in a comfortable chair in a dimly lit testing room and informed of the experimental tasks. To reduce muscle artifacts in the EEG recordings, the subject was instructed to assume a comfortable position and avoid body movement and unnecessary eye blinks. The presence of an adequate EEG signal was determined by visual inspection of the raw signal on the computer display.

ERP Offline Analysis

ERP analyses were conducted offline. Ocular artifacts were corrected for each participant's epochs with a correction algorithm (Semlitsch, 1986). The EEG data were divided into epochs of 900 ms, starting 100 ms before stimulus onset and ending 800 ms after it. Trials with incorrect responses were excluded. The pre-stimulus baseline was the mean activity from 100 to 0 ms before stimulus onset. After baseline correction, the EEG data were low-pass filtered below 45 Hz (24 dB/oct). Epochs exceeding the range of −75 to 75 µV were rejected as artifacts. The remaining trials were averaged for each task (word structure vs. word tone) and each facial expression (fearful, neutral, and happy) separately for each subject. The number of valid trials used for averaging ranged from 50 to 84 (mean = 76.3). After inspection of the grand average waveforms and of the difference between the fearful-face and neutral-face conditions, we analyzed the frontocentral P2 component (mean amplitude between 154 and 190 ms after stimulus onset).
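This pipeline can be expressed compactly in modern tooling. The sketch below reproduces the epoching, baseline, filtering, rejection, and P2-window steps in MNE-Python under stated assumptions: the file name, event codes, and channel labels are placeholders, the original analysis used NeuroSCAN software, and MNE's rejection criterion is peak-to-peak rather than an absolute ±75 µV bound:

```python
import mne

# Placeholder file name and event codes; the original analysis used NeuroSCAN.
raw = mne.io.read_raw_cnt("subject01.cnt", preload=True)
raw.set_eeg_reference(["M1", "M2"])  # linked-mastoids reference (assumed labels)

events, _ = mne.events_from_annotations(raw)
event_id = {"WS/fear": 1, "WS/neutral": 2, "WS/happy": 3,
            "WT/fear": 4, "WT/neutral": 5, "WT/happy": 6}

# 900 ms epochs: -100 to +800 ms, baseline-corrected to the 100 ms
# pre-stimulus interval. MNE rejects on peak-to-peak amplitude, so
# 150 microvolts approximates the paper's -75 to +75 microvolt range.
epochs = mne.Epochs(raw, events, event_id, tmin=-0.1, tmax=0.8,
                    baseline=(-0.1, 0.0), reject=dict(eeg=150e-6),
                    preload=True)
epochs.filter(l_freq=None, h_freq=45.0)  # low-pass below 45 Hz

# Condition average, then the P2 mean amplitude (154-190 ms) at FZ.
evoked = epochs["WS/fear"].average()
p2 = evoked.copy().pick(["FZ"]).crop(tmin=0.154, tmax=0.190)
print(p2.data.mean() * 1e6, "uV")  # convert volts to microvolts
```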

Behavioral Results

Two subjects' ERP data were excluded because of excessively large recording artifacts, leaving a total of 26 subjects whose experimental data were collected and statistically analyzed. Table 3 presents the means and standard deviations of the reaction time on the word structure and word tone judgment tasks when fearful, neutral, and happy faces were unattended, respectively.

Table 3. Means and standard deviations of the reaction time (ms) in Experiment 2 for the word structure and word tone judgment tasks when fearful, neutral and happy faces were unattended respectively.

https://doi.org/10.1371/journal.pone.0075386.t003

A 2 (task: word structure vs. word tone) × 3 (emotional face: fearful, neutral, and happy) within-subjects repeated-measures analysis of variance (ANOVA) on reaction time was performed. The main effect of task was significant (F(1, 25) = 19.397, P<0.001): the mean reaction time for word structure judgment was significantly shorter than that for word tone judgment, indicating that more time was needed to perform word tone judgment. No significant main effect of emotional face was found (F(2, 50) = 0.227, P = 0.762, ε = 0.848). The interaction between task and emotion was significant (F(2, 50) = 6.409, P = 0.004, ε = 0.942). Simple effect analysis showed significant differences in reaction time among the three types of emotional faces while performing the word structure judgment task (F = 3.29, P = 0.045). Paired comparisons showed that the mean reaction time with exposure to fearful faces (t = 2.575, P = 0.016, LSD = 8.339 ms) or happy faces (t = 2.190, P = 0.038, LSD = 7.469 ms) was significantly longer than with exposure to neutral faces.

Table 4 presents the means and standard deviations of correct percentage on the word structure and word tone judgment tasks when fearful, neutral, and happy faces were unattended, respectively. A 2 (task: word structure vs. word tone) × 3 (emotional face: fearful, neutral, and happy) within-subjects repeated-measures ANOVA on correct percentage was performed. The main effect of task was not significant (F(1, 25) = 2.177, P = 0.153). No significant main effect of emotional face was found (F(2, 50) = 0.684, P = 0.483, ε = 0.822). There was also no significant interaction between task and emotional face on correct percentage (F(2, 50) = 0.323, P = 0.724, ε = 0.993).

Table 4. Means and standard deviations of the correct percentage (1 = 100%) in Experiment 2 for the word structure and word tone judgment tasks when fearful, neutral and happy faces were unattended respectively.

https://doi.org/10.1371/journal.pone.0075386.t004

ERP Results

As noted above, after exclusions a total of 26 subjects' ERP data were statistically analyzed. Figures 1 and 2 present the grand-averaged event-related potential waveforms elicited by fearful, neutral, and happy faces during performance of the word structure and word tone judgment tasks. While performing the word structure judgment task, the augmented P2 amplitudes over frontal-central-parietal electrodes in response to fearful faces, compared with happy and neutral faces, developed around 154 ms after stimulus onset and were most pronounced around 176 ms post-stimulus. However, no such augmentation of the P2 amplitude in response to fearful faces was observed while performing the word tone judgment task. Figure 3 presents scalp potential maps showing the topography of the interference effects of fearful faces on the Chinese word judgment tasks in the 154–190 ms time window. As can be seen in Figure 3, the increase in scalp potentials over frontal-central-parietal electrodes elicited by exposure to fearful faces while performing the Chinese word judgment tasks was most pronounced in the 154–190 ms time window.

Figure 1. Grand-average ERPs by six conditions at FZ.

Grand-averaged event-related potential waveforms elicited by fearful, neutral, and happy faces respectively during performance of word structure tasks (WS) and word tone tasks (WT) at the FZ electrode.

https://doi.org/10.1371/journal.pone.0075386.g001

Figure 2. Grand-average ERPs by six conditions at FZ, FCZ, CZ, CPZ electrodes.

Grand-averaged event-related potential waveforms elicited by fearful, neutral, and happy faces respectively during performance of word structure tasks (WS) and word tone tasks (WT) at the FZ, FCZ, CZ, and CPZ electrodes.

https://doi.org/10.1371/journal.pone.0075386.g002

Figure 3. Topography of fearful faces at time window of 154–190 ms.

Scalp potential maps reveal the topography of the interference effects of fearful faces on the Chinese word judgment tasks in the time window of 154–190 ms.

https://doi.org/10.1371/journal.pone.0075386.g003

Table 5 presents the mean P2 amplitudes at electrode FZ elicited by the three categories of facial expressions in the word structure and word tone judgment tasks, respectively. Statistical analysis was conducted on the mean amplitudes in the post-stimulus time interval of the grand-averaged P2 waveform (154–190 ms). A 2 (task: word structure vs. word tone) × 3 (emotional face: fearful, neutral, and happy) × 20 (electrode: FZ, F1, F2, F3, F4, FCZ, FC1, FC2, FC3, FC4, CZ, C1, C2, C3, C4, CPZ, CP1, CP2, CP3, CP4) within-subjects repeated-measures ANOVA was performed on the P2 component. The main effect of emotional face on P2 amplitude was significant (F(2, 50) = 4.026, P = 0.028, ε = 0.915), and the main effect of electrode site was also significant (F(19, 475) = 17.869, P<0.001, ε = 0.103). The largest P2 amplitude was generated at the FZ electrode (M = 6.713 µV) with exposure to fearful faces in the word structure judgment task. The main effect of task on P2 amplitude was not significant (F(1, 25) = 0.0002, P = 0.988, ε = 0.915). For the main effect of emotional face, no significant differences in P2 amplitude were found between fearful and happy faces or between happy and neutral faces. In addition, a significant interaction was found between task and emotional face (F(2, 50) = 3.367, P = 0.048, ε = 0.907). Further comparisons indicated significant differences in P2 amplitude among the conditions of exposure to fearful, neutral, and happy faces while performing the word structure judgment task (F(2, 50) = 7.397, P = 0.002, ε = 0.949). Paired comparisons indicated that the P2 amplitude elicited by fearful faces was significantly larger than that elicited by neutral faces (F = 12.677, P = 0.002, LSD = 0.746) or happy faces (F = 12.427, P = 0.002, LSD = 0.74). There was no significant difference between the P2 amplitudes elicited by neutral and happy faces. In contrast, there were no significant differences in P2 amplitude among the conditions of exposure to fearful, neutral, and happy faces in the word tone judgment task. It is worth noting that exposure to fearful faces elicited the highest P2 amplitude while performing the word structure judgment task (see Table 5).
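For reference, a three-factor within-subjects ANOVA of this shape can be sketched with statsmodels' AnovaRM, assuming a long-format table of per-cell P2 means (the column names and file are hypothetical; note also that AnovaRM reports uncorrected degrees of freedom, not the Greenhouse-Geisser ε-adjusted values given above):

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format table: one mean P2 amplitude (154-190 ms) per
# subject x task x emotion x electrode cell, e.g. exported from MNE evokeds.
# Columns: 'subject', 'task', 'emotion', 'electrode', 'p2_uv'.
df = pd.read_csv("exp2_p2_amplitudes.csv")  # placeholder file name

# 2 (task) x 3 (emotion) x 20 (electrode) within-subjects repeated-measures
# ANOVA; no sphericity (epsilon) correction is applied by AnovaRM.
res = AnovaRM(df, depvar="p2_uv", subject="subject",
              within=["task", "emotion", "electrode"]).fit()
print(res.anova_table)
```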

Table 5. Means and standard deviations of P2 mean amplitudes (µV) in the time window of 154–190 ms at electrode FZ in Experiment 2 for the word structure and word tone judgment tasks when fearful, neutral and happy faces were unattended respectively.

https://doi.org/10.1371/journal.pone.0075386.t005

Discussion

The aim of the present two experiments was to investigate the automatic processing of emotional facial expressions while performing low (word structure judgment) or high (word tone judgment) demand cognitive tasks under unattended conditions. In Experiment 1, we investigated the effects of exposure to unattended fearful, neutral, or happy faces on behavioral measures such as reaction time and accuracy while performing word structure and tone judgment tasks. Automatic processing was indexed by the interfering effects of exposure to emotional faces, and the results indicate that the automatic processing of emotional faces was triggered while performing the word structure judgment tasks, but not while performing the word tone judgment tasks, under the unattended conditions. The results support our hypothesis that the reaction time for word structure judgments would be significantly shorter than for word tone judgments and that accuracy would be significantly higher, indicating that the word structure judgment task was easier than the word tone judgment task.

Chinese characters carry independent graphic and phonological information. Chen et al. [31], [32] found that for high-frequency Chinese characters, the graphic component is activated first, then the semantics, with the phonology taking the longest to process; for low-frequency Chinese characters, the graphic component is still activated first, but the semantics and phonology are then processed simultaneously. Our experimental findings are consistent with Chen et al.'s studies: the reaction time was shorter for the word structure judgment task (low cognitive load) than for the word tone judgment task (high cognitive load). The hypothesis that exposure to fearful faces would produce a significantly longer reaction time and lower correct percentage than exposure to neutral and happy faces while performing the word structure judgment tasks was partially supported. Subjects had a significantly slower reaction time in the fearful-face condition than in the neutral-face condition, and a significantly lower correct percentage in the fearful-face condition than in the happy-face condition. These results demonstrate that exposure to fearful faces interfered with the performance of the low demand cognitive task but not the high demand cognitive task under unattended conditions. The finding that negative emotional faces were more likely to interfere with task performance is consistent with several previous studies (e.g., [4], [10], [11], [13]).

The results of Experiment 1 reveal that the priority of processing unattended fearful faces depends on the level of perceptual load demanded by the target task. We believe that an attention bias toward the fearful faces occurred because they attracted more attention resources than neutral faces while the low-perceptual-load task (word structure judgment) was performed, whereas this phenomenon was not observed during the high-perceptual-load task (word tone judgment). These results verify that perceptual load influences the priority of processing for unattended emotional stimuli. Our results also support Harris and Pashler's [33] speculation that the perceptual analysis of high-priority stimuli is subject to the same capacity limitations as that of other stimuli: when enough capacity is available for a high-priority stimulus to be perceived, a transient surprise reaction may interrupt ongoing processing.

In Experiment 2, we tested whether ERP recordings would display a characteristic pattern of the automatic processing of unattended emotional faces while the word structure and tone judgment tasks were performed. The ERP results support our hypothesis that the amplitude of the early P2 component elicited by unattended fearful faces would be significantly larger than that elicited by unattended neutral or happy faces while performing word structure judgment tasks. An enhanced P2 amplitude was elicited by fearful faces in the 154–190 ms time window after stimulus onset while performing the low-perceptual-load task but not while performing the high-perceptual-load task. Previous studies found that the ERP P2 component marks the neurophysiological process of automatic attention and that the P2, peaking at approximately 150 ms, is associated with attention orienting [34], [35], [36], [37]. Recent research further indicates that the P2 component displays its highest amplitude when a deviant stimulus is presented. Carretié et al. [27] found that the P2 component showed greater amplitudes in response to emotionally deviant stimuli (negative and positive) than in response to neutral deviant stimuli. Huang and Luo [38] also found that the amplitude of the P2 component evoked by negative pictures was significantly larger than that evoked by neutral pictures. Thomas, Johnstone and Gonsalvez [39] reported that, compared to neutral words, words with threatening content were associated with larger P2 amplitudes in the right hemisphere than in the left hemisphere.

In the present study, we observed that fearful faces enhanced a P2 that peaked at about 160 ms and was distributed over frontal-central electrodes. This finding is consistent with previous reports. Eimer, Holmes and McGlone [40] found that emotional faces elicited a broadly distributed positivity compared with neutral faces, appearing from 180 ms to 220 ms after stimulus onset at frontal and central sites. Similar findings were replicated by Holmes, Kiss and Eimer [41], with the P2 enhanced at 220 ms after stimulus onset. Ashley, Vuilleumier, and Swick [42] provided further evidence that upright fearful expressions enhanced the frontocentral P200 component. These findings indicate that the enhanced frontocentral P2 might be a neurological index of the automatic detection of emotions and that the early stage of the fearful expression effect reflects the pre-attentive automatic assessment of emotional faces [43].

The characteristic ERP patterns elicited by exposure to fearful faces in Experiment 2 matched the behavioral findings of Experiment 1 well. Our postulation is that when residual attention resources were available, as while performing a low cognitive demand task, exposure to unattended fearful faces automatically captured attention resources, as demonstrated by a longer reaction time and lower correct percentage of performance. This ERP pattern was not observed with exposure to happy faces in unattended conditions during either the low or the high demand cognitive task. These findings demonstrate that negative emotions such as fearful faces affect automatic attention processing more strongly than positive or neutral facial expressions, and they provide biological evidence for the previously proposed evolutionary explanation that the ability to perceive threatening stimuli automatically confers an evolutionary advantage by warning creatures of potential danger and therefore plays an important survival role [1]. Furthermore, they show that this kind of orienting response is realized through the mechanism of automatic attention, an unconscious, stimulus-driven, bottom-up process [4].

However, our finding of automatic processing in response to unattended fearful faces while performing the low-perceptual-load task (word structure judgment) is not consistent with a previous report [23] in which researchers did not find an automatic processing effect of negative facial expressions in the unattended condition. One possible explanation for this inconsistency is that the task used in their study was more difficult than our word structure judgment and closer to our word tone judgment, so that no residual attention resources were available in their study to process unattended fearful stimuli automatically.

The behavioral measurements in the present study also revealed that, compared to exposure to neutral faces, exposure to happy faces elicited a longer reaction time when subjects were performing the word structure judgment task, and this attention bias disappeared when they performed the word tone judgment task. Our finding is consistent with a recent study in which a happy face, as a distractor, slowed down the target detection process [44]. Another recent study reported that happy faces had an advantage in capturing attention, even outside of overt visual attention [45]. Although happy faces mirrored fearful faces in the behavioral measurements of the present study, they might trigger a mechanism different from that of fearful faces in eliciting behavioral effects on attentional performance. Exposure to happy faces might enhance positive mood, which in turn induces a positive attention bias. A recent study using the flanker task found that positive attention bias impaired the inhibition of flanker distraction by increasing the scope of spatial attention [46]. Positive affect might enhance attention to the flanker by enlarging the breadth of attention allocation and thereby absorbing more unattended information [47].

However, the ERP P2 changes in the present study were not associated with the behavioral pattern of attention bias from exposure to happy faces. It has been found that fearful faces or threatening pictures consistently activate the amygdala whereas happy faces do not [1], [16], [21]. Kesler-West et al. [48] reported that angry or happy faces activated the anterior cingulate, orbitofrontal, and inferior prefrontal cortices. In the present study, the P2 did not differ between exposure to happy faces and exposure to neutral faces under either the low- or the high-perceptual-load task. It might be that happy faces capture attention resources in different ways, via different neuronal networks, and require different attention resources.

In conclusion, the present study found that exposure to fearful faces while performing low demand cognitive tasks under unattended conditions interfered with task performance, as demonstrated by a longer reaction time and lower accuracy. This phenomenon was corroborated by the characteristic increase in P2 amplitude in ERP recordings in response to unattended fearful faces while low demand cognitive tasks were performed. Our findings provide empirical and biological evidence for the theoretical speculation that unattended threatening stimuli are automatically processed while performing low (word structure judgment) but not high (word tone judgment) perceptual load tasks.

Author Contributions

Conceived and designed the experiments: HJ ZR. Performed the experiments: HJ. Analyzed the data: HJ ZR. Contributed reagents/materials/analysis tools: HJ. Wrote the paper: HJ HS ZR.

References

  1. Adolphs R (2002) Neural systems for recognizing emotion. Curr Opin Neurobiol 12: 169–177.
  2. Armony JL, LeDoux JE (2000) How danger is encoded: Towards a systems, cellular, and computational understanding of cognitive-emotional interactions in fear. In: Gazzaniga MS, editor. The New Cognitive Neurosciences. 1067–1080.
  3. Graham FK, Hackley SA (1991) Passive and active attention to input. In: Jennings JR, Coles MGH, editors. Handbook of cognitive psychology: Central and autonomic nervous system approaches. 251–356.
  4. Öhman A, Flykt A, Esteves F (2001) Emotion drives attention: Detecting the snake in the grass. J Exp Psychol Gen 130: 466–478.
  5. Siddle D, Stephenson D, Spinks JA (1983) Elicitation and habituation of the orienting response. In: Siddle D, editor. Orienting and habituation: Perspectives in human research. 109–182.
  6. Sokolov EN (1963) Perception and the conditioned reflex. Oxford: Pergamon Press.
  7. Fox E (2002) Processing emotional facial expressions: The role of anxiety and awareness. Cogn Affect Behav Neurosci 2: 52–63.
  8. Carlson JM, Reinke KS (2008) Masked fearful faces modulate the orienting of covert spatial attention. Emotion 8: 522–529.
  9. Horstmann G (2008) Attentional effects of negative faces: Top-down contingent or involuntary? Percept Psychophys 70: 1416–1434.
  10. Anderson AK, Phelps EA (2001) Lesions of the human amygdala impair enhanced perception of emotionally salient events. Nature 411: 305–309.
  11. Eastwood JD, Smilek D, Merikle PM (2003) Negative facial expression captures attention and disrupts performance. Percept Psychophys 65: 352–358.
  12. Mogg K, Bradley BP (1999) Some methodological issues in assessing attentional biases for threatening faces in anxiety: A replication study using a modified version of the probe detection task. Behav Res Ther 37: 595–604.
  13. Ogawa T, Suzuki N (2004) On the saliency of negative stimuli: Evidence from attentional blink. Jpn Psychol Res 46: 20–30.
  14. Raymond JE, Shapiro KL, Arnell KM (1992) Temporary suppression of visual processing in an RSVP task: An attentional blink? J Exp Psychol Hum Percept Perform 18: 849–860.
  15. Vuilleumier P, Armony JL, Driver J, Dolan RJ (2001) Effects of attention and emotion on face processing in the human brain: An event-related fMRI study. Neuron 30: 829–841.
  16. Anderson AK, Christoff K, Panitz D, De Rosa E, Gabrieli JDE (2003) Neural correlates of the automatic processing of threat facial signals. J Neurosci 23: 5627–5633.
  17. Carretié L, Kessel D, Carboni A, López-Martín S, Albert J, et al. (in press) Exogenous attention to facial versus non-facial emotional visual stimuli. Soc Cogn Affect Neurosci.
  18. Santesso DL, Meuret AE, Hofmann SG, Mueller EM, Ratner KG, et al. (2008) Electrophysiological correlates of spatial orienting towards angry faces: A source localization study. Neuropsychologia 46: 1338–1348.
  19. Stefanics G, Csukly G, Komlósi S, Czobor P, Czigler I (2012) Processing of unattended facial emotions: A visual mismatch negativity study. NeuroImage 59: 3042–3049.
  20. Li X, Lu Y, Sun G, Gao L, Zhao L (2012) Visual mismatch negativity elicited by facial expressions: New evidence from the equiprobable paradigm. Behav Brain Funct 8: 7.
  21. Vuilleumier P (2005) How brains beware: Neural mechanisms of emotional attention. Trends Cogn Sci 9: 585–594.
  22. Pessoa L, McKenna M, Gutierrez E, Ungerleider LG (2002) Neural processing of emotional faces requires attention. Proc Natl Acad Sci U S A 99: 11458–11463.
  23. Holmes A, Vuilleumier P, Eimer M (2003) The processing of emotional facial expression is gated by spatial attention: Evidence from event-related brain potentials. Brain Res Cogn Brain Res 16: 174–184.
  24. Eimer M, Holmes A (2002) An ERP study on the time course of emotional face processing. Neuroreport 13: 427–431.
  25. Williams MA, McGlone F, Abbott DF, Mattingley JB (2005) Differential amygdala responses to happy and fearful facial expressions depend on selective attention. NeuroImage 24: 417–425.
  26. Lavie N (2005) Distracted and confused?: Selective attention under load. Trends Cogn Sci 9: 75–82.
  27. Carretié L, Hinojosa JA, Martin-Loeches M, Mercado F, Tapia M (2004) Automatic attention to emotional stimuli: Neural correlates. Hum Brain Mapp 22: 290–299.
  28. Chapman LJ, Chapman JP (1987) The measurement of handedness. Brain Cogn 6: 175–183.
  29. Beijing Institute of Linguistic Studies (1986) Modern Chinese Frequency Dictionary. Beijing: Beijing Language and Culture University Press.
  30. Ekman P, Friesen WV (1976) Pictures of facial affect. Palo Alto, CA: Consulting Psychologists Press.
  31. Chen BG, Peng DL (2001) The time course of graphic, phonological and semantic information processing in Chinese character recognition (I). Xin Li Xue Bao 33: 1–6.
  32. Chen BG, Wang LX, Peng DL (2003) The time course of graphic, phonological and semantic information processing in Chinese character recognition (II). Xin Li Xue Bao 35: 576–581.
  33. Harris CR, Pashler H (2004) Attention and the processing of emotional words and names: Not so special after all. Psychol Sci 15: 171–178.
  34. Daffner KR, Mesulam MM, Scinto LFM, Calvo V, Faust R, et al. (2000) An electrophysiological index of stimulus unfamiliarity. Psychophysiology 37: 737–747.
  35. Kenemans JL, Verbaten MN, Roelofs JW, Slangen JL (1989) "Initial-" and "change-ORs": An analysis based on visual single-trial event-related potentials. Biol Psychol 28: 199–226.
  36. Kenemans JL, Verbaten MN, Melis CJ, Slangen JL (1992) Visual stimulus change and the orienting reaction: Event-related potential evidence for a two-stage process. Biol Psychol 33: 97–114.
  37. Tales A, Newton P, Troscianko T, Butler S (1999) Mismatch negativity in the visual modality. Neuroreport 10: 3363–3367.
  38. Huang Y-X, Luo Y-J (2006) Temporal course of emotional negativity bias: An ERP study. Neurosci Lett 398: 91–96.
  39. Thomas SJ, Johnstone SJ, Gonsalvez CJ (2007) Event-related potentials during an emotional Stroop task. Int J Psychophysiol 63: 221–231.
  40. Eimer M, Holmes A, McGlone F (2003) The role of spatial attention in the processing of facial expression: An ERP study of rapid brain responses to six basic emotions. Cogn Affect Behav Neurosci 3: 97–110.
  41. Holmes A, Kiss M, Eimer M (2006) Attention modulates the processing of emotional expression triggered by foveal faces. Neurosci Lett 394: 48–52.
  42. Ashley V, Vuilleumier P, Swick D (2004) Time course and specificity of event-related potentials to emotional expressions. Neuroreport 15: 211–216.
  43. Eimer M, Holmes A (2007) Event-related brain potential correlates of emotional face processing. Neuropsychologia 45: 15–31.
  44. Hodsoll S, Viding E, Lavie N (2011) Attentional capture by irrelevant emotional distractor faces. Emotion 11: 346–353.
  45. Calvo MG, Nummenmaa L, Avero P (2010) Recognition advantage of happy faces in extrafoveal vision: Featural and affective processing. Vis Cogn 18: 1274–1297.
  46. Rowe G, Hirsh JB, Anderson AK (2007) Positive affect increases the breadth of attentional selection. Proc Natl Acad Sci U S A 104: 383–388.
  47. Fredrickson BL (2001) The role of positive emotions in positive psychology: The broaden-and-build theory of positive emotions. Am Psychol 56: 218–226.
  48. Kesler-West ML, Andersen AH, Smith CD, Avison MJ, Davis CE, et al. (2001) Neural substrates of facial emotion processing using fMRI. Brain Res Cogn Brain Res 11: 213–226.