52
Title: Selective Neural Synchrony Suppression as a Forward Gatekeeper to Piecemeal Conscious Perception Authors: Jonathan Levy, 1,2,3,4,5 * Juan R Vidal, 6,7 Pascal Fries 2,8, Jean-François Démonet, 3,4,9, and Abraham Goldstein 1,10 1 The Gonda Multidisciplinary Brain Research Center, Bar Ilan University, Ramat Gan 52900, Israel, 2 Donders Institute for Brain, Cognition and Behaviour; Radboud University Nijmegen, Nijmegen 6500 GL, the Netherlands, 3 Inserm UMR825, Imagerie cerebrale et handicaps neurologiques, Toulouse 31024, France, 4 Université de Toulouse, UPS, Toulouse 31062, France, 5 Centre Hospitalier Universitaire de Toulouse, Pôle Neurosciences, CHU Purpan, Toulouse 31024, France, 6 University Grenoble Alpes, LPNC, F-38040 Grenoble, France, 7 CNRS, LPNC, UMR 5105, F-38040 Grenoble, France, 8 Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max-Planck-Society, Frankfurt 60528, Germany. 9 Leenaards Memory Center, Department of Clinical Neurosciences, CHUV and University of Lausanne, Lausanne 1011, Switzerlan 10 Department of Psychology, Bar-Ilan University, Ramat Gan 52900, Israel. * Correspondence: Jonathan Levy, The Gonda Multidisciplinary Brain Research Center, Bar University, Ramat Gan 52900, Israel. Email: [email protected]

Selective Neural Synchrony Suppression as a Forward Gatekeeper to Piecemeal Conscious Perception

Embed Size (px)

Citation preview

Title:

Selective Neural Synchrony Suppression as a

Forward Gatekeeper to Piecemeal Conscious Perception

Authors:

Jonathan Levy,1,2,3,4,5 * Juan R Vidal,6,7 Pascal Fries 2,8, Jean-François Démonet, 3,4,9, and

Abraham Goldstein 1,10

1 The Gonda Multidisciplinary Brain Research Center, Bar Ilan University, Ramat Gan 52900, Israel,

2 Donders Institute for Brain, Cognition and Behaviour; Radboud University Nijmegen, Nijmegen 6500 GL, the Netherlands,

3 Inserm UMR825, Imagerie cerebrale et handicaps neurologiques, Toulouse 31024, France,

4 Université de Toulouse, UPS, Toulouse 31062, France,

5 Centre Hospitalier Universitaire de Toulouse, Pôle Neurosciences, CHU Purpan, Toulouse 31024, France,

6 University Grenoble Alpes, LPNC, F-38040 Grenoble, France,

7 CNRS, LPNC, UMR 5105, F-38040 Grenoble, France,

8 Ernst Strüngmann Institute (ESI) for Neuroscience in Cooperation with Max-Planck-Society, Frankfurt 60528, Germany.

9 Leenaards Memory Center, Department of Clinical Neurosciences, CHUV and University of Lausanne, Lausanne 1011, Switzerland,

10 Department of Psychology, Bar-Ilan University, Ramat Gan 52900, Israel.

* Correspondence: Jonathan Levy, The Gonda Multidisciplinary Brain Research Center, Bar Ilan

University,

Ramat Gan 52900, Israel. Email: [email protected]

No conflict of interest is declared.

Acknowledgements: The present work was supported by the following funding: I-CORE Program of

the Planning and Budgeting Committee, the Israel Science Foundation (grant No. 51/11), the

Returning Scientists' Program from the Israeli Ministry of Absorption, the Gonda Centre's postdoctoral

allowance and the French Ministry's doctoral research allowance. We would like to thank Yuval

Harpaz both for his technical support and advice in Israel, as well as Bram Daams and Sander

Berends for their technical support in the Netherlands. We would also like to thank Jan-Mathijs

Schoffelen, Robert Oostenveld and Stan van Pelt for their advice on the analysis scheme, and Michal

Lavidor and Ian FitzPatrick for their linguistic support in Hebrew and Dutch, respectively.

Abstract

The emergence of conscious visual perception is assumed to ignite late (~250 ms) gamma-band

oscillations shortly after an initial (~100 ms) forward sweep of neural sensory (non-conscious)

information. However, this neural evidence is not utterly congruent with rich behavioral data which

rather point to piecemeal (i.e. graded) perceptual processing. To address the unexplored neural

mechanisms of piecemeal ignition of conscious perception, hierarchical script sensitivity of the

putative visual word form area (VWFA) was exploited to signal null (i.e. sensory), partial (i.e. letter-

level) and full (i.e. word-level) conscious perception. Two magnetoencephalography experiments

were conducted in which healthy human participants viewed masked words (Experiment I: active

task, Dutch words; Experiment II: passive task, Hebrew words) while high-frequency (broadband

gamma) brain activity was measured. Findings revealed that piecemeal conscious perception did not

ignite a linear piecemeal increase in oscillations. Instead, whereas late (~250 ms) gamma-band

oscillations signaled full conscious perception (i.e. word-level), partial conscious perception (i.e. letter-

level) was signaled via the inhibition of the early (~100 ms) forward sweep. This inhibition regulates

the downstream broadcast to filter-out irrelevant (i.e. masks) information. The findings thus highlight a

local (VWFA) gatekeeping mechanism for conscious perception, operating by filtering out and in

selective percepts.

Keywords: Conscious perception, VWFA, gamma-band oscillations, synchrony, MEG, partial

conscious perception.

1

Introduction

The question whether the boundary between unconscious and conscious perception is gradual or not

is still debated. Neuroimaging studies investigating the neural mechanisms sustaining conscious

subjective report provided sound evidence supporting a sharp nonlinear “all-or-none” transition in

brain activity (for a review, see Dehaene and Changeux, 2011). However, this neural account is partly

in conflict with various behavioral data, which point to a rather graded experience of conscious

perception, that is, an intermediate/partial level separating null and full consciousness (Mangan,

2001; Kouider and Dupoux, 2004; Overgaard et al., 2006; de Gardelle and Kouider, 2009), nor with

the notion of hierarchy in representational levels during visual perception (Nakayama, He and

Shimojo, 1995; Driver and Baylis, 1996; Greene and Oliva, 2009; Wu et al., 2014; Overgaard and

Mogensen, 2014). According to this hierarchical view, lower and higher levels of perception are

accessed gradually and independently (Kouider et al., 2010), thereby defining piecemeal conscious

perception. What remains unknown is the neural temporal chain of ignition of the different

representational levels.

Few studies have explicitly probed the putative underlying neural mechanisms during piecemeal

consciousness. For instance, several fMRI studies revealed a linear signal change in several brain

regions, correlating with graded subjective reports of perception (Bar et al., 2001; Haynes, Driver and

Rees, 2005; Imamoglu et al., 2014). Although these findings are informative, in order to fully address

the question of the neural mechanisms during piecemeal consciousness, it is additionally required to

(i) capture the early and fast neural processes sustaining partial access to consciousness (with a

2

millisecond temporal sampling), and (ii) to characterize the discrete independent levels of perception

(and not solely the visibility of one single level). As has been previously illustrated by Kouider and

colleagues (Kouider et al, 2010), the graded organization of words (Selfridge, 1959; McClelland and

Rumelhart, 1981) is particularly suitable for testing partial consciousness. That is, words can be

perceived at several levels (e.g. visual components, letters and whole-word), and these levels can be

separately represented as neural processes (Levy et al., 2008; Mainy et al., 2008; Vidal et al., 2014).

Much previous work has suggested that an area located in the left fusiform gyrus [coined the Visual

Word Form Area (VWFA)] houses the neural code for written scripts (for a review, see Dehaene and

Cohen, 2011). Recently, we conducted a magnetoencephalography (MEG) experiment which aimed

at dissociating partial from full conscious perception of masked words (Levy et al., 2013); this

dissociation was mirrored as modulation of alpha oscillations in the VWFA. Hence, neural events in

the VWFA could be indicative of conscious and selective access of written scripts.

Previous studies revealed a neural ignition of full consciousness after ~250 ms post stimulus onset

(Sergent et al., 2005; Del Cul et al., 2007; Gaillard et al., 2009; Fisch et al., 2009). However, the early

neuronal activity (before 250 ms) may be precisely indicative of the fast and early neural processes,

which may underlie partial perception. Although we successfully dissociated between independent

levels of perception at late (after 500 ms) processing stages (Levy et al., 2013), the relatively slow

alpha rhythm which we monitored is not well-suited to capture the early and short lived neural

transitions during the chain of ignition of piecemeal conscious perception. Hence, we reanalyzed the

data (Experiment I) in the high-frequency gamma (power and phase-synchrony) band range (40 Hz to

140 Hz), a rhythm that reliably reflects active states of cortical processing and neuronal

communication (Buzsaki and Draguhn, 2004; Fries, 2005; 2009), and constitutes a sensitive

3

processing marker during single-word recognition (Mainy et al., 2008; Gaillard et al., 2009; Vidal et

al., 2014).

Previous studies highlighted late (after ~250 ms) gamma oscillatory enhancement as a marker of (full)

conscious perception (Fisch et al., 2009; Gaillard et al., 2009). Accordingly, to explore the neural

mechanisms of piecemeal (null to partial to full) conscious perception, one possible hypothesis would

be that piecemeal conscious perception should trigger a piecemeal (i.e. graded) increase in gamma-

band oscillations. By contrast, partial conscious perception may operate a different neural mechanism

as letter and word processing cortically differ (Levy et al., 2008) and may thereby ignite earlier activity

(Tarkiainen et al., 2002). However, early (~100 ms) gamma activity (power enhancement and

stimulus phase-locking) has not been reported as a marker of conscious perception but rather as a

marker of bottom-up sensory (Aru et al., 2012; Bosman et al., 2012; Bastos et al., 2014; Vidal et al.,

2015; Bastos et al., 2015) and non-conscious (Gaillard et al., 2009) processes. Hence, if early

gamma-band activity is related to sensory and non-selective processing, an alternative hypothesis

would be that gradually inhibiting this early activity could act as a selective filter by regulating the

efficiency of its downstream broadcast. This early local broadcast control would then act as a gate-

keeper towards subsequent full conscious processing. Importantly, VWFA sensitivity to written scripts

could be exploited as neural marker to conscious perception (i.e. script processing), compared to null

perception (i.e. sensory activity). We therefore assumed that an area tuned to written-script

processing (i.e. VWFA) should not only process conscious script percepts via late gamma

oscillations, but may prior to that, inhibit sensory and non-conscious contents (i.e. visual mask

processing). This neural pattern is not expected to arise in neighboring occipital areas. Hence, if

piecemeal consciousness is achieved, it may be signaled via two consecutive stages in the VWFA:

4

an initial inhibition of sensory and non-conscious activity (thereby achieving partial perception),

followed by enhancement of selective and conscious activity (thereby reaching full perception).

Our analyses revealed early (~100 ms) inhibition of stimulus-locked gamma-band phase-synchrony

and power in the VWFA occurring prior to late (originating at ~250 ms) enhancement in gamma. A

complementary recording session (Experiment II) was conducted to capture a broader spectrum of

piecemeal perception (Experiment I: Partial and Full, Experiment II: Null, Partial and Full), and at the

same time tested the sensitivity of this neural mechanism to top-down task engagement (Experiment

I: active design, Experiment II: passive design) and to bottom-up input (i.e. orthographic script)

(Experiment I: Dutch, Experiment II: Hebrew). The two experiments showed that early (~100 ms)

gamma-band synchrony and power increase in the VWFA was triggered by null conscious perception

of scripts (i.e. general visual processing); then, inhibition of this activity reflected partial conscious

script perception followed by later (originating at ~250 ms) enhancement of gamma oscillations for

full-blown conscious perception. Hence, fast neural events in the VWFA selectively "gate-keep" the

access to conscious script perception in a piecemeal mechanism.

Materials and Method

Experiment I:

2.1 Participants

Thirteen healthy right-handed and native-Dutch human subjects (eight males and five females,

average age 23.92 ± 4.54 years) with normal or corrected-to-normal vision participated in the

experiment. None of the participants had a history of neurological or psychiatric disorders. The study

5

was approved by the local ethics committee, and a written informed consent was obtained from the

subjects before the experiment according to the Declaration of Helsinki, and received monetary

compensation.

2.2 Stimuli

Stimuli were generated using Presentation software (Neurobehavioral systems, Albany, USA) in a

dimly lit room and subtended a horizontal visual angle of ∼2.5°. They were projected on an LCD

monitor placed at a viewing distance of 70 cm. Responses were delivered by response pads. Stimuli

were Dutch nouns (300 animate, 300 inanimate) between 3 and 9 letters in length with a lemma

frequency between 100 and 300 occurrences per million (Baayen et al., 1993). Animate and

inanimate words were balanced with respect to their word frequency of occurrence and the number of

letters they contained. The animacy character of the selected words was unequivocal. Words were

presented only once to prevent automatic stimulus–response learning or repetition-suppression

effect.

2.3 Experimental Procedure

Each trial began with a fixation cross presented for randomly chosen latency within the interval from

600ms to 933 ms, followed by the presentation of a forward mask (67 ms), a word (33 ms) and a

backward mask (67 ms). The random variation in fixation controlled for plausible anticipation of

stimulus onset timing. In 16.67% of the totality of trials, we replaced the words with blank screens to

avoid subject’s habituation or anticipation and to control for the orthographic detection task

(letters/no-letters). All stimuli were presented randomly. After stimuli presentation, a larger fixation

crosshair prompted the subject to respond. Responses indicated trials' termination and lasted no

6

longer than 2100 ms. A smaller subsequent crosshair spanned 1000ms and signaled the interval

linking the next trial. Participants were allowed to make eye movements or blinks during that inter-trial

interval. A conservative response criterion was adopted so as to mitigate perceptual illusions

(reconstruction) and at the same time possible subliminal (unconscious) perception which both may

occur under thresholded stimulation (Kouider et al., 2010). This approach thereby aimed to maximize

the dissociation between the two levels of perception, and focus only on conscious perception (not

subliminal). To this end, subjects were discouraged from guessing by (i) explicitly asking them to

categorize the word only when they had clearly recognized it; (ii) training on the task with 90 training

trials (different words than during the experiment) before measurement; (iii) receiving visual feedback

on the correctness of their response. Furthermore, in the unlikely event in which the semantic

category (animate vs. inanimate) of a given word appeared equivocal, participants were instructed to

avoid response, and hence exclude such trials from the analysis. After blocks of 30 trials, participants

were allowed to pause. They initiated the start of each new trial block by a button press. In total, each

subject performed 720 trials in a total measurement time of approximately 55 min. The reports were

through button press on a response pad; they were counterbalanced across subjects between

right/left hands and middle/index fingers (four response buttons in total, and hence eight possible

combinations of response press, in the aim of controlling for differences in the motor representations

of each finger.

To maximize and to balance attentional engagement, the subjects’ task was to report (i) full or (ii)

partial word-form perception by semantically categorizing the word (animate/inanimate) or if failed to

perceive the word, by orthographically categorizing the percept (letters/none), respectively.

Correctness in word detection was probed using the semantic categorization task (Animate or

7

Inanimate), whereas correctness in letter detection was probed based on the ability to report the

presence of letters during word trials (hit) or during blank trials (false alarm). To achieve maximal

sensitivity of the neural signal and the statistical tests, the procedure aimed at collecting trials with a

50% distribution for each of the two reports. A modified “stair-case” version of the threshold

estimation procedure described by Levitt (Levitt, 1971) was determined individually in real-time. The

staircase effect was obtained by varying the contrast of a mask of scrambled characters (see Figure

1A) every six trials based on the given subjective response. Noteworthy, in order to obtain optimal

parameters resulting in a 50/50 distribution of the two perceptual tasks, prior to this study we

conducted pre-study piloting in which we obtained optimal stimulation/masking (i) latency and (ii)

illumination, as well as (iii) optimal amplitude (masking intensity) of stair-case steps.

2.4 MEG recordings and data preprocessing

Ongoing brain activity was recorded (sampling rate, 1200 Hz) using a whole-head CTF MEG system

with 275 DC SQUID axial gradiometers (VSM MedTech Ltd., Coquitlam, British Columbia, Canada).

Head position was monitored using three coils that were placed at the subject’s left ear, right ear, and

nasion. Bipolar EEG channels were used to record horizontal and vertical eye-movements as well as

the cardiac rhythm for the subsequent artifact rejection. Only correct trials with an identical contrast

level were kept for spectral analysis. They consisted in the largest fraction of report trials and were

equally distributed. Trials containing eye blinks, saccades, muscle artifacts and signal jumps were

rejected from further analysis using an automatized procedure.

8

2.5 Stratification

To minimize effects of fluctuations in response time on our analyses, we performed a post hoc

stratification of the data based on the response time (RT) values. The goal of procedure was to

assure that RT variance would not bias the tested contrast (i.e. full versus partial). We therefore

obtained from each of the two perceptual reports (partial and full), a subset of trials with an identical

distribution of RT values across the subsets in each of the two report conditions. This approach has

been adjusted from Roelfsema et al. (Roelfsema et al., 1998) and a similar strategy has been

successfully applied before to control for EMG fluctuations (Schoffelen et al., 2005). For every

participant, we binned the observations in each condition, while the bin centers were obtained by

dividing the range of all RT values into four equally spaced bins with equal bin centers. In this way,

each of the observations fell within one of the bins, for which we selected a subset of observations

such that across the two conditions, the number of observations was identical. From the condition

with the lowest number of observations in a given bin, all N observations constituting that particular

bin were selected for the stratified sample. From the other condition, a subset of N observations was

randomly drawn from the observations constituting that particular bin.

Experiment II:

2.1 Participants

Twenty two healthy right-handed and native-Hebrew speaking human subjects (thirteen males and

nine females, average age 27.22 ± 6.19 years) with normal or corrected-to-normal vision participated

in the experiment and received monetary compensation. Participants were screened before the MEG

experiment for piecemeal conscious perception (i.e. dissociating null, partial and full conscious

perception). None of the participants had a history of neurological or psychiatric disorders. The study

9

was approved by the local ethics committee, and a written informed consent was obtained from the

subjects before the experiment according to the Institution's Review Board.

2.2 Stimuli

Stimuli were generated using the E-prime software (Psychology Software Tools Inc.) in a dimly lit

room, and presented in the center of the screen in Courier New, size 12 so that the average stimulus

subtended a horizontal visual angle of ∼2.7°. Letters were gray on a black background projected

through a mirror on an LCD monitor placed at a viewing distance of 50 cm. Responses were

delivered by a response pad. A photosensitive diode on the screen recorded the onset time of visual

stimuli. Stimuli were 364 Hebrew nouns or gerunds with a mean length of 4.5 letters ± 0.5 and a

mean word frequency of occurrence of 28.39 ± 38.37 per million (Frost and Plaut, 2005). The words

were randomly counterbalanced across conditions and participants, with respect to their word

frequency of occurrence and the number of letters they contained. There was no difference in the

frequency of occurrence (F(3, 360)=1.17, p=0.31) nor in length (F(3, 360)=0.01, p=0.99) across

conditions. A particular emphasis was put upon selecting words with a minimal length variance (only

4-5 letter words were used) as this variable was found to significantly correlate with conscious visual

perception (Levy et al., 2013). Words were presented only once to prevent automatic stimulus–

response learning or repetition-suppression effect.

2.3 Experimental Procedure

Prior to MEG acquisition, subjects were familiarized with an orthographic detection (OD) task and with

a semantic decision (SD) task. The subjects then began the behavioral part of the experiment and

performed four sessions, two of each task, in order to predetermine the four perceptual levels of

10

interest: null (OD), partial (OD), prelexical (SD) and semantic (SD) perception. In the OD, the

subjects’ task was to orthographically categorize the percept (letters/non-letters). In the SD, the

subjects’ task was to semantically categorize words (animate/inanimate). A modified “stair-case”

version of the threshold estimation procedure described by Levitt (Levitt, 1971) was determined

individually by varying the contrast of a mask of scrambled characters (see Figure 1B) every six trials

based on the given subjective response. The procedure aimed at collecting trials with a correctness

threshold of 10% or less for the null and sublexical levels, and of 90% or more for the partial and full

levels.

The participants then re-performed the four sessions, this time at the predetermined level, in order to

compute the objective discriminability d' value separately for each one of the masking levels. We then

assessed d' values at the group level by means of a one-way ANOVA across the four levels of

perception. The d' can be computed only for hit-rates and false alarm rates which are different than 1

or 0, namely, when signal and noise are not perfectly discriminated. However, several approaches

exist for dealing with possible extreme values (1 or 0) (see Stanislaw and Todorov, 1999), out of

which, the most commonly used is an adjustment consisting in replacing rates of 0 with 0.5 / n , and

rates of 1 with (n - 0.5) / n, where n is the number of signal or noise trials (Macmillan and Kaplan,

1985).

After registration of the head position, the participants began the MEG measurement by performing a

simple word detection task (Figure 1B): They were instructed to passively watch the visual stimuli

presented on the screen, and to count the occurrence of two filler words. During the pauses, they

reported the occurrence by pressing the response pad. All words (except for the filler words) were

presented once and randomly distributed in one of the four perceptual levels by varying masking

11

luminance. Additionally to the four levels, two supplementary conditions were computed with identical

occurrence frequency, in which the words were replaced by blank screens. This was done in order to

simulate the neuronal response to masks only. The masking luminance of the first corresponded to

that of the null perception condition, whereas that of the second corresponded to the full perception

condition. The participants were asked to refrain, as much as possible, from moving their head and

from blinking during the experiment

Each trial began with a fixation cross (1133-1613 ms), followed by the presentation of a forward mask

(67 ms), a word (17 ms) a backward mask (67 ms) and a blank screen (704-896 ms). The random

time controlled for plausible anticipation of stimulus onset timing. In ca. 5% of all trials, we presented

one out of two target words (which were not included in the experimental word database); during

random pauses, participants were required to report the occurrences of those names since the last

pause. These names were not analyzed, but instead were used as filler trials aimed to avoid subject’s

habituation or anticipation and to maintain a steady alertness level throughout the experiment. All

stimuli were presented randomly. Participants were allowed to make eye movements or blinks during

that inter-trial interval. They initiated the start of each new trial block by a button press. In total, each

subject performed 546 trials in a total measurement time of approximately 25 min. At the end of the

experiment, head position was recorded again, and the perceptual levels (d') were again computed

using the same semantic and orthographic tasks as at the beginning of the experiment. A within-

subject repeated-measures ANOVA across the four levels of perception before and after the online

experiment tested for a possible adaptation effect to the different levels.

12

2.4 MEG recordings and data preprocessing

Ongoing brain activity was recorded (sampling rate, 1017 Hz, online 1–400 Hz band-pass filter) using

a whole-head 248-channel magnetometer array (4-D Neuroimaging, Magnes 3600 WH) in a supine

position inside a magnetically shielded room. Reference coils located approximately 30 cm above the

head oriented by the x, y and z axis were used to remove environmental noise. Five coils were

attached to the participant's scalp for recording the head position relative to the sensor. External

noise (e.g, power-line, mechanical vibrations) and heartbeat artifacts were removed from the data

using a predesigned algorithm for that purpose (Tal & Abeles, 2013). The analysis was performed

using Matlab 7 (MathWorks, Natick, MA, USA) and the FieldTrip toolbox (Oostenveld et al., 2011).

The data were segmented into 1900 ms epochs including baseline period of 700 ms, and trials

containing muscle artifacts and signal jumps were rejected from further analysis by visual inspection.

One sensor was excluded from the analysis due to malfunction. The data were then filtered in the 1-

200 Hz range with 10 s padding and were then resampled to 400 Hz. Finally, spatial component

analysis (ICA) was applied in order to clean eye-blinks, eye movements or any other potential noisy

artifacts.

2.4 Spectral analysis (Experiments I and II)

For each subject, a single shell brain model was built based on a template brain (Montreal

Neurological Institute), which was modified to fit each subject's digitized head shape using SPM8

(Wellcome Department of Imaging Neuroscience University College London,www.fil.ion.ucl.ac.uk). In

Experiment I, the head shape was computed using the inside of the skull (Nolte, 2003) based on each

individual’s anatomical MRI, which was spatially aligned to the MEG sensors. In Experiment II, the

head-shape was manually digitized (Polhemus Fastrak digitizer). The subject's brain volume was then

13

divided into a regular grid. The grid positions were obtained by a linear transformation of the grid

positions in a canonical 1cm grid. This procedure facilitates the group analysis, because no spatial

interpolation of the volumes of reconstructed activity is required. It should be noted that the spatial

precision in Experiment I was superior as subject-specific dipole grids were obtained based on the

subject's anatomical MRI. For each grid position, spatial filters were reconstructed in the aim of

optimally passing activity from the location of interest, while suppressing activity which was not of

interest.

Time series were extracted from the VWFA by applying a linear constrained minimum variance

beamformer. This analysis was followed up by confirming that the expected local gamma-band

response emanates from this region, at the whole brain level. To this end, an adaptive spatial filtering

(Gross et al., 2001) was applied, relying on partial canonical correlations: the CSD matrix was

computed between all MEG sensor pairs from the Fourier transforms of the tapered data epochs at

the two frequency-bands (high gamma, or HG, and middle gamma, or MG) at the early (0-250ms)

and late (250-500ms) phases of the response, respectively. Spatial filters were constructed for each

grid location, based on the identified frequency bin, and the Fourier transforms of the tapered data

epochs were projected through the spatial filters. In the aim of computing Time-Frequency

Representations (TFRs) of power for each trial, tapers were applied to each time window and in order

to calculate the Fast Fourier Transform (FFT) for short sliding time windows. Data were analyzed in

alignment to stimulus onset and power estimates were then averaged across tapers. To probe

gamma-frequency power (40–150 Hz), five Slepian multitapers (Percival and Walden, 1993) were

applied using a fixed window length of 0.2 s, resulting in a frequency smoothing of 15 Hz. Finally, in

order to probe a measure of behavioral acuity for full versus partial perception, we subtracted the d'

14

values during null perception of words from the d' values during the complete perception of words.

This measure would therefore compute how well the participants were able to recognize words during

full perception and at the same time not recognize words during partial perception – thereby reflecting

the behavioral acuity for full versus partial perception.

2.6 Statistical Analysis (Experiments I and II)

Statistical significance of the power values was assessed similarly at the sensor and at the source

levels. It was assessed using a randomization procedure (Maris and Oostenveld, 2007). This

nonparametric permutation approach does take the cross-subject variance into account, because this

variance is the basis for the width of the randomization distribution. This approach was chosen as it

does not make any assumptions on the underlying distribution, and it is unaffected by partial

dependence between neighboring time-frequency pixels. Specifically, the procedure was as follows: t-

values representing the contrast between the conditions were computed per subject, channel,

frequency and time. Subsequently, we defined the test statistic by pooling the t-values over all

subjects. Here, we searched time-frequency clusters with effects that were significant at the random

effects level after correcting for multiple comparisons along the time and the frequency dimensions.

Testing the probability of this pooled t-value against the standard normal distribution would

correspond to a fixed effect statistic. However, to be able to make statistical inference corresponding

to a random effect statistic, we tested the significance of this group-level statistic by means of a

randomization procedure: We randomly multiplied each individual t-value by 1 or by -1 and summed it

over subjects. Multiplying the individual t-value with 1 or -1 corresponds to permuting the original

conditions in that subject.

15

This random procedure was reiterated 2000 times to obtain the randomization distribution for the

group-level statistic. For each randomization, only the maximal and the minimal cluster-level test

statistic across all clusters were retained and placed into two histograms, which we address as

maximum (or minimum, respectively) cluster-level test statistic histograms. We then determined, for

each cluster from the observed data, the fraction of the maximum (minimum) cluster-level test statistic

histogram that was greater (smaller) than the cluster-level test statistic from the observed cluster. The

smaller of the two fractions was retained and divided by 2000, giving the multiple comparisons

corrected significance thresholds for a two-sided test. The proportion of values in the randomization

distribution exceeding the test statistic defines the Monte Carlo significance probability, which is also

called a p-value (Nichols and Holmes, 2002; Maris and Oostenveld, 2007). This cluster-based

procedure allowed us to obtain a correction for multiple comparisons at all sensor and source

analyses.

Results

Experiment I:

The online adjustment of masking luminance as a function of participant reports yielded one critical

mask contrast (per participant) under which physical stimulation was constant but perceptual

distribution was equally distributed between partial and full conscious perception (Figure 1A; for more

detail on the behavioral findings, see Levy et al., 2013). We proceeded to probe the time-frequency

(in the broad band gamma spectrum) representations in the VWFA (Dehaene and Cohen, 2011)

defining the gradual transition from Null, through Partial towards Full perception. The contrast Full vs

Partial was characterized particularly by middle gamma (MG) increase (overall at 55 - 100 Hz, with a

peak in power at 60 – 80 Hz) at approximately 450-650 ms (p<0.05 corrected) (Figure 2A, left panel).

16

Whole-brain source localization on this MG enhancement (Full perception compared to baseline

activity) determined its source in the left occipito-temporal junction (Figure 2B, left panel) thereby

confirming that during Full perception MG activity emanated from this region, including the VWFA.

This neural effect could be explained by perceptual change and was independent of stimulation level

which was constant in this experiment. We then examined the temporal dynamics within the VWFA

for each of the two word percepts. At 500 – 700 ms, MG in Full perception was significantly more

robust than Partial perception (p < 0.001 corrected, Figure 2C, left panel). Moreover, to check

whether piecemeal word perception could influence gamma-band synchrony in the VWFA, Rayleigh

z-values were computed to reflect uniformity of synchrony across trials. The contrast Full versus

Partial yielded a decrease in synchrony from 90 to 130 ms at 41 - 53 Hz (p < 0.001 corrected), as well

as from 30 to 70 ms at 50 - 66 Hz (p < 0.001 corrected) (Figure 3, right panel). Looking at each of the

two conditions separately (see Figure S4, bottom right panels) illustrated that the two inhibitory effects

related to two bursts of local synchrony during partial perception which are inhibited during full

perception. Hence, Experiment I conveys two signatures, in power and phase, within the gamma-

band, that convey a transition from partial to full conscious perception. We then proceeded with a

complementary recording session (Experiment II) so as to pinpoint plausible neural signatures for the

earlier transition step during conscious perception, namely from null to partial, altogether probing the

spectrum of piecemeal perception (Null, Partial and Full). Moreover, the net perceptual dynamics

were investigated in this supplementary session.

Experiment II:

First, participants were screened before the MEG experiment for piecemeal conscious perception (i.e.

dissociating null, partial and full conscious perception). The four perceptual levels were determined by

17

means of a staircase procedure, followed by an estimation of the perception was assessed as d'

values after the participants completed two SD and then two OD tasks with the four predetermined

levels. Out of the twenty two participants, fifteen were screened for piecemeal conscious perception

(Null: no letters, no words; Partial: letters, no words; Full: letters and words) and the remaining seven

were not included in the experiment. Out of the rejected subjects, six did not yield the Partial level and

instead both their perception of letters (d' = 1.08 ± 0.51) and that of words (d' = 0.88 ± 1.01; p = 0.68)

were constrained, and a seventh subject did not yield the Null level [the participant's detection of

letters was unimpaired (d' = 2.11) even after applying the highest masking possible]. Thus, the fifteen

participants yielding piecemeal conscious perception had a significant statistical difference between

the d' values across the four levels of perception (p = 1.77 x 10-15). Importantly, the SD tasks revealed

that complete perception of words (d' = 2.87 ± 0.42) was statistically higher (p = 1.15 x 10-10) than

purportedly null perception of words (d' = 0.51 ± 0.77), whereas the OD tasks revealed that complete

perception of letters (d' = 2.45 ± 0.68) was statistically higher (p = 8.30 x 10-8) than null perception of

letters (d' = 0.28 ± 0.77). Bridging the gap between word perception and letter perception is not a

straightforward issue as it is far from unequivocal whether the two can be dissociated. Here, we

chose a parametric approach in order to attempt the dissociation of the two: given that the perceptual

level ostensibly reflecting null perception of words was more weakly masked than that reflecting

complete perception of letters (except for one participant for whom the two levels were behaviorally

dissociable, despite corresponding to an identical masking level), one could assume with relatively

high confidence, that complete perception of letters conjointly mirrored close-to-null perception of

words. The "null words" level was used in the present experiment in order to better define the

"complete letters" level, although at present, "null words" assumes no clear perceptual functioning.

Thus, in the following analyses we will relate to three levels of perception: Null ("null letters"), Partial

18

("complete letters") and Full ("complete words") (Figure 1B). Furthermore, in order to control for

possible modulation of individual perception level throughout the experiment, the perceptual levels

were measured offline also at the end of the MEG experiment: complete perception of words (d' =

2.58 ± 0.33) was statistically higher (p = 9.03 x 10-7) than purportedly null perception of words (d' =

0.65 ± 0.65), whereas complete perception of letters (d' = 2.44 ± 0.72) was statistically higher (p =

1.22 x 10-6) than null perception of letters (d' = 0.24 ± 0.42). There was an insignificant statistical

difference between the d' values before and after the experiment (p = 0.43), hence one may rule out a

possible adaptation effect throughout the online experiment, and may thereby assume a reliable

estimation of the individual perceptual experience. Finally, during the online experiment itself, the

participants performed the name detection task successfully (90.58 ± 6.79 %) thereby pointing to a

reliable level of attention throughout the experiment.

To investigate the power modulation across the three conditions, we computed an ANOVA revealing

a significant difference (p = 0.02 corrected) across all time and frequency samples. Post-hoc tests

revealed that the contrast Full vs Partial was characterized by MG (overall at 55 - 80 Hz; note

however a later reduced effect at approx. 40-50 Hz and 550-650 ms) increase at approximately 150-

500 ms (p < 0.01 corrected) (Figure 2A, middle panel); power enhancement in that window could not

be explained simply by mask luminance decrease (p = 0.14). Whole-brain source localization on this

MG enhancement (Full perception compared to baseline activity) determined its source in the left

occipito-temporal junction (Figure 2B, right panel) thereby confirming that during Full perception MG

activity emanated from this region, including the VWFA. Interestingly and contrary to our expectation,

the contrast Partial vs Null was not accompanied by such power increase but rather by an early HG

(at approx. 80 – 100 Hz) decrease (at approximately 0-150 ms) (p = 0.03 corrected) (Figure 2A, right

19

panel); power suppression in that window could not be explained simply by mask luminance decrease

(p = 0.12). Whole-brain source localization revealed that brain activity (Full perception compared to

baseline activity) at the early and HG spectrum emanated rather from the bilateral occipital cortex (yet

including the VWFA in Experiment II) (Figure S1) thereby suggesting a rather general visual

processing at the early and HG oscillations during word perception. This was further strengthened by

the relative orthogonality of the effects measured in the VWFA with mask luminance (p > 0.11

uncorrected), compared to those measured in other key areas of the occipito-temporal cortex which

were rather driven by mask luminance (p < 0.0005 corrected) (See Results section in Supplementary

Information). Another measurement which pointed out to the relative orthogonality of the effects

measured in the VWFA with mask luminance was through Experiment I, in which no masking

modulation was applied. Importantly, the whole-brain analyses further suggest that whereas the early

HG effect was more related to visual processing (bilateral occipital activity), the later effect (MG) was

rather constrained to the left occipito-temporal junction, thereby pointing out to a preferential

response to words at that later phase of gamma oscillations. This conjecture was further supported by

additional analyses at three more coordinates of the occipito-temporal cortex: the right VWFA

homolog, and the bilateral occipital cortex (see Supplementary Material Results section). In those

other coordinates, there were hardly any differences which may reflect conscious perception. Instead,

the effects reflected (bottom-up) mask luminance modulation (Figure S2).

Moreover, to test for exclusivity of the two reversed power effects, the two non-overlapping time-

frequency windows were tested separately for each contrast: the late MG increase did not correlate

with the contrast Partial vs Null (p = 0.31), and conversely, the early HG suppression did not correlate

with the contrast Full vs Partial (p = 0.12). Furthermore, we tested whether the two steps take place

20

during the contrast Full vs Null, thereby confirming that piecemeal conscious perception (i.e. from Null

through Partial to Full) is also characterized by an early HG suppression (p = 0.004), followed by a

MG enhancement (p < 10-4). Looking at each condition separately (see Figure S3) sharpens the

interpretation: During conscious perception (particularly conspicuous at the first step, namely partial),

the brain inhibits early HG which is present under no conscious perception (either null perception, or

masks only, regardless of their luminance level). In summary, these results reveal in both

experiments a pattern of late MG power increase from Partial to Full word perception. Interestingly,

this mechanism is not the sole at hand, as another mechanism revealed an early high gamma (i.e.

HG) suppression, approximately between 80-100 Hz, resulting from the contrast Partial vs Null, yet

not from Partial to Full. These findings suggest a two-staged mechanism of piecemeal conscious

perception: through an initial suppression of HG, followed by an enhancement of MG.

The temporal dynamics of MG within the VWFA was then probed: An ANOVA was computed across

the three conditions (p<0.005 corrected), and post-hoc tests revealed, that MG in Full perception was

significantly more robust than Partial perception (p < 0.001 corrected) at 150 – 600 ms, and also more

robust than Null perception (p < 0.01 corrected) at 250-500 ms (Figure 2C, middle panel). This effect

could not be explained by mask luminance modulation, as revealed by testing the masked blank

conditions (p = 0.26). Furthermore, MG in the Partial perception was significantly weaker than Null

perception in the 50-200 ms window (p < 0.05 corrected), suggesting an enhancement of early visual

processing during Null Perception. To test for hemispheric laterality selectivity which is often assumed

for this region, we also tested for differences in the right homolog of the VWFA (Figure S2A): there

were no significant effects across conditions in the two experiments (p > 0.2). The analyses above

revealed that in both experiments MG power in the VWFA correlated with the contrast Full vs Partial.

21

Moreover, to check whether piecemeal word perception could influence gamma-band synchrony in

the VWFA, we combined all three conditions (in Experiment II) and computed their Rayleigh z-values

reflecting uniformity of synchrony across trials. The test found a significant (p < 0.05 corrected) early

synchrony in the low-gamma range (LG) from 0 until 180 ms at 40 – 70Hz (Figure S4, left upper

panel). A linear regression was then computed across the three conditions, then revealing a

significant linear decrease of synchrony from Null through Partial to Full (r = - 0.41, p < 0.05). This

decrease in synchrony could not be explained by bottom-up masking modulation (p = 0.43). Post-hoc

tests on this time-frequency window revealed: (a) The contrast Partial vs Null was characterized by a

burst of gamma synchrony suppression from 85 to 120 ms at 52 - 70 Hz (p < 0.05 corrected) (Figure

3, left panel), and this decrease in synchrony could not be explained by bottom-up masking

modulation (p = 0.12 uncorrected). Similarly, the contrast Full vs Partial was characterized by a burst

of gamma synchrony suppression from 120 to 150 ms at 42 - 52 Hz (p < 0.05 corrected) (Figure 3,

middle panel). Once again, this decrease in synchrony could not be explained by bottom-up masking

modulation (p = 0.12 uncorrected). Furthermore, this early burst did not occur during the contrast

Partial vs Null (p = 0.41 uncorrected), yet it did occur during the contrast Full vs Null (p < 0.001

uncorrected), hence confirming that this burst suppression occurs during word perception at the

second phase, from Partial to Full. Another, earlier inhibitory effect was found (30 to 60 ms at 56 - 66

Hz (Figure 3, middle panel), yet with lower statistical power (p = 0.005 uncorrected) and could not be

explained by masking strength modulation (p = 0.18 uncorrected). Despite having controlled for

masking strength modulation, a second control could be reflected through Experiment I, in which no

masking modulation was applied. Remarkably, the two inhibitory effects (Full vs Partial) were almost

identical in the two experiments (Figure 3, middle and right panels), thereby pointing to early effects

22

which are largely unchanged by task demands or by bottom-up stimulation. Importantly, these effects

were pinpointed in the VWFA, yet not in the right VWFA homolog, nor in the bilateral occipital cortex

(see Supplementary Material Results section). Looking at each condition separately (see Figure S4)

allowed for a more global outlook: two bursts of local synchrony arise and possibly reflect a bottom-up

sweep triggered by the two masks (and the unidentified word in the null condition). Three inhibitory

phases around 50-150 ms reflect piecemeal conscious perception. Hence, early gamma-band

synchrony is triggered in the VWFA during null conscious perception of scripts (i.e. by non-

representational visual processing). It is through the modulation of this activity that the VWFA

selectively gate-keeps the access to conscious script perception.

We then followed by testing whether the observed power and synchrony mechanisms could reflect an

integrated mechanism. Hence, we measured the correlation between the two frequency measures

during the two piecemeal perceptual phases (i.e. partial and full); power and synchrony were anti-

correlated during Full conscious perception in Experiment I (r = - 0.65, p < 0.05) (Figure 4, left panel)

as well as in Experiment II (r = - 0.55, p < 0.05) (Figure 4, middle panel), and positively correlated

during Partial conscious perception in Experiment II (r = 0.56, p < 0.05) (Figure 4, right panel). Hence,

the more local synchrony decreased in partial and full perception, the more power decreased or

increased, respectively. Hence, the power and phase were functionally coupled in the gamma-band

to yield piecemeal conscious perception.

Furthermore, the two distinct stages expressed in the MG and HG (Figure 2A) were functionally

related and could thereby co-evolve and represent together an integrated process. Hence, the

MG/HG ratio was computed at the trial level for each one of the perceptual conditions. In Experiment

23

I, the MG/HG ratio was significantly higher (p = 0.01) for Full compared to Partial perception (Figure

5A, left panel). In Experiment II, an ANOVA revealed a significant overall effect (p = 0.01) of the

MG/HG ratio across Full, Partial and Null; specifically, the Full compared to Partial perception was

also significantly higher (p = 0.008), and additionally, Partial compared to Null perception was

significantly higher (p = 0.03) (Figure 5A, middle panel). A linear regression across subjects revealed

that the ratio increased linearly from Null to Full, which could be seen both at the group (r = 0.78, p =

2.62* 10-12) (Figure 5A, middle panel) and single-subject levels (r = 0.10, p = 1.08* 10-6) (Figure 5A,

right panel illustrating an example of regression for a single subject). To rule out any plausible bias of

masking luminance modulation, the MG/HG ratio was also computed for the two additional control

conditions, which both contained blanks instead of words, and were masked at the Null and the Full

masking levels. There was no significant statistical difference (p = 0.28) between the ratio of the

lowest masking contrast (equal to that in the Full condition) and that of the highest masking contrast

(equal to that in the Null condition). Hence, a parametric increase in the MG/HG ratio could be

observed in both experiments and across all hierarchical levels, irrespective of bottom up modulation,

and thereby mirroring a cortical mechanism for piecemeal word perception. Altogether, the three

neuronal mechanisms (power and synchrony) across the broad band gamma (LG, MG and HG)

define piecemeal conscious perception and are summarized in Figure 5B. Finally, we probed whether

the extent of synchrony (we selected the strongest effect, i.e. during the latest second inhibitory burst)

during full perception could reflect the behavioral acuity of full versus partial perception (see Materials

and Methods). Conforming to our expectation, synchrony and full versus partial acuity were anti-

correlated (r = - 0.55, p < 0.05) thereby suggesting that the less the LG of participants was

synchronized during full perception, the more were they sharp in distinguishing full from partial

perception (Figure 5C).

24

Discussion

In this study, VWFA sensitivity to written scripts was exploited as neural marker to (script) conscious

perception, thereby revealing piecemeal ignition of conscious perception. Accordingly, the two MEG

experiments highlight: (i) that induced gamma-band oscillations (MG) originating around ~250 ms in

the VWFA reflect access to full conscious perception of words; (ii) that early (~100 ms) inhibition of

sensory driven activity in this area "gate-keeps" the access to conscious script perception via a partial

step in the chain-of-ignition of conscious perception; (iii) that this mechanism is largely unaffected by

sensory bottom-up stimulation, unlike the neighboring areas in the bilateral occipital cortex which

strongly mirror sensory stimulation; (iv) that these neural events operate at relatively distinct

frequency bands, yet they co-evolve and therefore possibly represent a coordinated neural

integration; (v) that the extent of phase-alignment inhibition predicts the acuity of dissociating full from

partial perception.

The first result in the present study, that is, the late (~250 ms) enhancement of gamma oscillations

signaling full conscious perception in both experiments, is congruent with the current state of

knowledge regarding conscious perception which is well accommodated by the Global Neuronal

Workspace (GNW) model (for a review, see Dehaene and Changeux, 2011). It cannot, however,

relate to other predictions made by the GNW model nor can it relate to other findings which are

explained by the model (e.g. long-distance cortico-cortical synchronization at beta and gamma

frequencies and ‘‘ignition’’ of a large-scale prefronto-parietal network) due to the local outlook of the

present study. However, the second finding, that is, the early (~100 ms) inhibition of gamma

25

synchrony and power, is not accommodated by the GNW model which strongly argues that "early

bottom-up sensory events, prior to global ignition (~100 ms) contribute solely to nonconscious percept

construction and do not systematically distinguish consciously seen from unseen stimuli" (for a

review, see Dehaene and Changeux, 2011).

Other authors coined these early events the "feedforward sweep" of information processing due to

their spreading from low-level to high-level areas of the visual cortical hierarchy; it is claimed that this

early sweep in itself reflects non-conscious processing (Lamme and Roelfsema, 2000; Lamme,

2006). More recent studies further suggest that both cortical reactivity (Aru et al., 2012; Vidal et al.,

2015) and synchrony (Bosman et al., 2012; Bastos et al., 2014; Bastos et al., 2015) of early occipital

gamma-band responses may functionally mark this sweep. From a local perspective, the increase in

phase-synchrony could reflect heightened efficiency of information broadcast (Salinas and Sejnowski,

2001). The present study shows that early gamma-power (Figure S3) and -synchrony (Figure S4)

increase in the VWFA is driven by sensory processing (Null perception), and that during piecemeal

conscious (script) perception this local increase is selectively inhibited. Thus, the enhanced gamma-

band sweep in the VWFA under strong masking (Null perception) could be interpreted as the

broadcasting of sensory information (i.e. masks) to which the VWFA is not tuned to. By doing so, the

VWFA is using resources for processing sensory information at the expense of processing selective

information (i.e. script). Thus, by locally (i.e. in the VWFA) inhibiting the early sweep, irrelevant

information could be filtered-out by acting on the efficiency of its downstream broadcast. The current

study therefore predicts not only that conscious perception is piecemeal, but also that it is achieved

via a mechanism operating in opposite ways: first filtering-out non-specific information and then

tuning into the targeted percepts. Although the current analyses revealed that in both experiments the

26

effects were not modulated by low-level masking parameters, future efforts should probe whether the

early inhibition reported here may reflect a general mechanism related to conscious perception (also

visible under other visual paradigms) or alternatively solely restricted to masking paradigms.

Moreover, it would be most interesting to probe the piecemeal hypothesis through the perspective of

the large-scale brain network (e.g. the GNW) and to its recurrent loops (see Lamme, 2006).

Although the main focus of the present study was to probe piecemeal conscious perception, the

present findings can also be of interest to the domain of reading. Namely, the assumption of an

intermediate conscious level was approached here by exploiting the two perceptual levels involved in

word reading (i.e. letter perception and whole word perception) as two levels of conscious perception

(i.e. partial and full blown perception, respectively). Hence the findings relate to the long-lasting

debate regarding the processing of words' sub-features during reading. Traditionally, it has been

assumed that words are recognized as a whole and not letter by letter. The assumption was driven by

several observations: letters are more quickly and accurately identified within the context of words

(i.e. the 'word superiority effect') (Cattell, 1886; Reicher, 1969; Wheeler, 1970), and word recognition

speed is insensitive to the number of letters in (3–6-letter) words (Nazir et al., 1998). The contrasting

approach has stressed the importance of processing letters (McClelland and Rumelhart,1981),

thereby corroborating other observations: separately identifying letters constrains word recognition

(Pelli et al., 2003); clinical cases showing an impairment of letters identification with a preservation of

word identification in patients with specific brain lesions (Patterson and Kay, 1982); the deficiency of

dyslexic children to identify objects (e.g. letters) in clutter (e.g. a word) (Martelli and Di Filippo, 2009)

which is substantially remediated when increasing letter spacing (Zorzi et al., 2012). Recently,

neuroimaging studies contrasted letter strings and whole words and thereby suggested that despite a

27

cortical whole word selectivity (Glezer et al., 2009) it is highly probable that visual stimuli are first

encoded as letters before their combinations are encoded as words (Thesen et al., 2012). The

present findings provide further support to this assumption by modulating perceptual levels of single

words.

Future studies of the VWFA can also benefit from the current results: Despite extensive research on

the functionality and expertise of the VWFA during reading, very little is known about the functional

dynamics of this area especially due to the scarce neuroimaging methods combining high temporal

sampling with good spatial resolution. The present findings altogether suggest that orthographic

features by themselves trigger a first gamma response (at ca. 200 ms) in the VWFA, regardless of

pronounceability, but also that genuine word forms induce a concurrent, yet much stronger response

in that area (Figure 2C). These findings corroborate similar recent intracranial reports of the dynamics

of different orthographic categories in the VWFA (Vidal et al., 2010; Hamame et al., 2013; Vidal et al.,

2014). Furthermore, in both Experiments I and II, we note power enhancement at the descent from

the peak during Full Perception, although in Experiment II it is mostly masked out by the earlier

activation peak (Figure 2C, middle panel). A similar pattern was also observed in recent intracranial

studies (Thesen et al., 2012; Hamame et al., 2013) and may reflect top-down feedback from anterior

(lexical and phonological) areas (Song et al., 2012) which are activated earlier than the occipito-

temporal cortex during word reading (Pammer et al., 2004; Cornelissen et al., 2009) and object

perception (Bar et al., 2006). Finally, this two-staged scenario may reconcile the two major (opposing)

views on the role of the occipito-temporal junction during word perception (for a recent review see

Carreiras et al., 2014). Specifically, it confirms an early first bottom up sweep in this region highly

sensitive to word representational units (for a review see Dehaene and Cohen, 2011; for an empirical

28

demonstration see Hamame et al, 2013), followed by a top-down phonological/semantic input from

higher areas which does not necessarily assume a selective tuning to those representational units

(for a review see Price and Devlin, 2011; for an empirical demonstration see Woodhead et al., 2014).

In the future, combining time-resolved approaches with network dynamics should further investigate

this novel outlook on word processing in the brain.

References

Aru J, Axmacher N, Do Lam AT, Fell J, Elger CE, Singer W, Melloni L. 2012. Local category-specific

gamma band responses in the visual cortex do not reflect conscious perception. J Neurosci.

32:14909–14914.

Bar M, Tootell RB, Schacter DL, Greve DN, Fischl B, Mendola JD, Rosen BR, Dale AM. 2001.

Cortical mechanisms specific to explicit visual object recognition. Neuron. 29(2):529-535.

Bar M, Kassam KS, Ghuman AS, Boshyan J, Schmid AM, Dale AM, Hämäläinen MS, Marinkovic K,

Schacter DL, Rosen BR, Halgren E. 2006. Top-down facilitation of visual recognition. Proc Natl Acad

Sci U S A. 103(2):449-454.

Bastos AM, Vezoli J, Fries P. 2014. Communication through coherence with inter-areal delays.

Curr Opin Neurobiol. 31C:173-180.

29

Bastos AM, Vezoli J, Bosman CA, Schoffelen JM, Oostenveld R, Dowdall JR, De Weerd P, Kennedy

H, Fries P. 2015. Visual Areas Exert Feedforward and Feedback Influences through Distinct

Frequency Channels. Neuron. 85:390-401.

Bosman CA, Schoffelen JM, Brunet N, Oostenveld R, Bastos AM, Womelsdorf T, Rubehn B, Stieglitz

T, De Weerd P, Fries P. 2012. Attentional stimulus selection through selective synchronization

between monkey visual areas. Neuron. 75:875-888.

Carreiras M, Armstrong BC, Perea M, Frost R. 2014. The what, when, where, and how of visual word

recognition. Trends Cogn Sci. 18:90–98.

Cattell JM. 1886. The time it takes to see and name objects. Mind. 11:53–65.

de Gardelle V, Sackur J, Kouider S. 2009. Perceptual illusions in brief visual presentations.

Conscious Cogn. 18(3):569-577.

Dehaene S, Cohen L. 2011. The unique role of the visual word form area in reading. Trends Cogn

Sci. 15:254–262.

30

Dehaene S, Changeux JP, Naccache L, Sackur J, Sergent C. 2006. Conscious, preconscious, and

subliminal processing:a testable taxonomy. Trends Cogn Sci. 10:204–211.

Del Cul A, Baillet S, Dehaene S. 2007. Brain dynamics underlying the nonlinear threshold for access

to consciousness. PLoS Biol. 5:e260.

Driver J, Baylis GC. 1996. Edge-assignment and figure-ground segmentation in short-term visual

matching. Cognitive Psychology. 31: 248–306.

Fisch L, Privman E, Ramot M, Harel M, Nir Y, Kipervasser S, Andelman F, Neufeld MY, Kramer U,

Fried I, Malach R. 2009. Neural ‘‘ignition’’: Enhanced activation linked to perceptual awareness in

human ventral stream visual cortex. Neuron. 64: 562–574.

Fries P. 2005. A mechanism for cognitive dynamics:neuronal communication through neuronal

coherence. Trends Cogn Sci. 9:474–480

Fries P. 2009. Neuronal gamma-band synchronization as a fundamental process in cortical

computation. Annu Rev Neurosci. 32:209–224

31

Frost R, Plaut D. 2005. The word–frequency database for printed Hebrew. Retrieved from http://word–

freq.mscc.huji.ac.il/index.html

Glezer LS, Jiang X, Riesenhuber M. 2009. Evidence for Highly Selective Neuronal Tuning to Whole

Words in the ‘‘Visual Word Form Area". Neuron. 62:192–204.

Greene MR, Oliva A. 2009. Recognition of natural scenes from global properties: seeing the forest

without representing the trees. Cogn Psychol. 2009. 58:137-176.

Grill-Spector K, Kanwisher N. 2005. Visual recognition: as soon as you know it is there, you know

what it is. Psychol Sci. 16(2):152-160.

Hamame CM, Szwed M, Sharman M, Vidal JR, Perrone–Bertolotti M, Kahane P, Bertrand O,

Lachaux JP. 2013. Dejerine's reading area revisited with intracranial EEG:selective responses to

letter strings. Neurology. 80:602–603.

Haynes JD, Driver J, Rees G. 2005. Visibility reflects dynamic changes of effective connectivity

between V1 and fusiform cortex. Neuron. 46:811-21.

32

Imamoglu F, Heinzle J, Imfeld A, Haynes JD. 2014. Activity in high-level brain regions reflects

visibility of low-level stimuli. Neuroimage. 102:688-694.

Kouider S, Dupoux E. 2004. Partial awareness creates the "illusion" of subliminal semantic priming.

Psychol Sci. 15:75-81.

Kouider S, de Gardelle V, Sackur J, Dupoux E. 2010. How rich is consciousness? The partial

awareness hypothesis. Trends Cogn Sci. 14:301–307.

Lamme VA. 2006. Towards a true neural stance on consciousness. Trends Cogn Sci. 10:494–501.

Levy J, Pernet C, Treserras S, Boulanouar K, Berry I, Aubry F, Démonet JF and P Celsis. 2008.

Piecemeal recruitment of left–lateralized brain areas during reading:a spatio–functional account.

Neuroimage. 43:581–591.

Levy J, Vidal JR, Oostenveld R, Fitzpatrick I, Démonet JF and Fries P. 2013. Alpha–band

Suppression in the Visual Word Form Area as a Functional Bottleneck to Consciousness.

Neuroimage. 78:33–45.

33

Macmillan NA, Kaplan HL. 1985. Detection theory analysis of group data: Estimating sensitivity from

average hit and false-alarm rates. Psychological Bulletin. 98:185–199.

Mainy N, Jung J, Baciu M, Kahane P, Schoendorff B, Minotti L, Hoffmann D, Bertrand O, Lachaux JP.

2008. Cortical dynamics of word recognition. Hum Brain Mapp. 29:1215–1230.

Mangan B. 2001. Sensation’s ghost—the non-sensory ‘‘fringe’’ of consciousness. Psyche. 7:18.

Martelli M, Di Filippo G. 2009. Crowding, reading, and developmental dyslexia. J Vis. 9:1–18.

McClelland JL, Rumelhart DE. 1981. An interactive activation model of context effects in letter

perception: Part 1. An account of basic findings. Psychological Review. 88:375–407.

Nakayama K, He ZJ, Shimojo S. 1995. Visual surface representation: A critical link between lower-

level and higher-level vision. In SM Kosslyn & DN Osherson (Eds.), An invitation to cognitive science:

Visual cognition (pp. 1–70). Cambridge, MA: MIT Press.

34

Nazir TA, Jacobs AM, O'Regan JK. 1998. Letter legibility and visual word recognition. Mem Cogn.

26:810–821.

Overgaard M, Rote J, Mouridsen K, Ramsøy TZ. 2006. Is conscious perception gradual or

dichotomous? A comparison of reportmethodologies during a visual task. Conscious Cogn. 15:700-

708.

Overgaard M, Mogensen J. 2014. Visual perception from the perspective of a representational, non-

reductionistic, level-dependent account of perception and conscious awareness. Philos Trans R Soc

Lond B Biol Sci. 369:20130209.

Patterson K, Kay J. 1982. Letter–by–letter reading:psychological descriptions of a neurological

syndrome. Q J Exp Psychol. A 34:411–441.

Pelli DG, Farell B, Moore DC. 2003. The remarkable inefficiency of word recognition. Nature.

423:752–756.

Price C, Devlin JT. 2011. The Interactive Account of ventral occipito–temporal contributions to

reading. Trends Cogn Sci. 15:246–253.

35

Reicher GM. 1969. Perceptual recognition as a function of meaningfulness of stimulus material. J Exp

Psychol. 81:275–280.

Salinas E, Sejnowski TJ. 2001. Correlated neuronal activity and the flow of neural information.

Nat Rev Neurosci. 2:539-550.

Sergent C, Baillet S, Dehaene S. 2005. Timing of the brain events underlying access to

consciousness during the attentional blink. Nat Neurosci. 8:1391–1400.

Stanislaw H, Todorov N. 1999. Calculation of signal detection theory measures. Behav Res Methods

Instrum Comput. 31:137–149.

Tarkiainen A, Cornelissen PL, Salmelin R. 2002. Dynamics of visual feature analysis and object-level

processing in face versus letter-string perception. Brain. 125:1125–1136.

Tal I, Abeles M. 2013. Cleaning MEG artifacts by external cues. J Neurosci Methods. 217:31–38.

36

Thesen T, McDonald CR, Carlson C, Doyle W, Cash S, Sherfey J, Felsovalyi O, Girard H, Barr W,

Devinsky O, Kuzniecky R, Halgren E. 2012. Sequential then interactive processing of letters and

words in the left fusiform gyrus. Nat Commun. 3:1284.

Vidal JR, Ossandon T, Jerbi K, Dalal SS, Minotti L, Ryvlin P, Kahane P, Lachaux JP. 2010.

Category–specific visual responses:an intracranial study comparing gamma, beta, alpha, and ERP

response selectivity. Front Hum Neurosci. 4:195.

Vidal JR, Perrone-Bertolotti M, Levy J, De Palma L, Minotti L, Kahane P, Bertrand O, Lutz A, Jerbi K,

Lachaux JP. 2014. Neural repetition suppression in ventral occipito-temporal cortex occurs during

conscious and unconscious processing of frequent stimuli. Neuroimage. 95:129-135.

Vidal JR, Perrone-Bertolotti M, Kahane P, Lachaux JP. 2015. Intracranial spectral amplitude

dynamics of perceptual suppression in fronto-insular, occipito-temporal, and primary visual cortex.

Front Psychol. 5:1545.

Wheeler DD. 1970. Processes in word recognition. Cogn Psychol. 1:59–85.

Woodhead ZV, Barnes GR, Penny W, Moran R, Teki S, Price CJ, Leff AP. 2014. Reading front to

back:MEG evidence for early feedback effects during word recognition. Cereb Cortex. 24:817–825.

37

Wu CT, Crouzet SM, Thorpe SJ, Fabre-Thorpe M. 2015. At 120 msec you can spot the animal but

you don't yet know it's a dog. J Cogn Neurosci. 27:141-149.

Zorzi M, Barbiero C, Facoetti A, Lonciari I, Carrozzi M, Montico M, Bravar L, George F, Pech–Georgel

C, Ziegler JC. 2012. Extra–large letter spacing improves reading in dyslexia. Proc Natl Acad Sci U S

A. 109:11455–11459.

Figure captions

Figure 1. Experimental Design. (A) Experiment I: Masked Dutch words were presented in an online

staircase procedure resulting in two levels of conscious perception: Full and Partial in a 50/50%

proportion, respectively. The two levels were reflected by either a forced-choice task of semantic

decision ("is the word animate/inanimate?") or of orthographic detection ("did the visual input contain

any letters or none?"), respectively. The bottom panel shows the luminance of the forward and

backward masks which fluctuated between three levels, while depending on the reported percept,

only the level with a 50/50% proportion was selected for further spectral analyses. (B) Experiment II:

Masked Hebrew words were presented in an offline staircase procedure resulting in three levels of

conscious perception: Full, Partial and Null, with their corresponding d' values. The three levels were

then used online in a semi-passive paradigm (filler-words' detection; frequency of occurrence ca. 5%).

38

Figure 2. Time and frequency representations of piecemeal conscious perception in the VWFA and

along the gamma-band spectrum. (A) Statistical maps illustrate time-frequency dynamics in the

VWFA for the contrast between Full and Partial in Experiments I (left) and II (middle), and for the

contrast between Partial and Null (right). Rectangles illustrate the presentation times of the masks

('M', black rectangles) and the stimulus ('S', gray rectangles). (B) Plots of the temporal evolution of

MG power (normalized to baseline activity) in the VWFA for the different perceptual levels, in

Experiments I (left) and II (middle). Shades represent ±1 SEM, while purple and gray rectangles

indicate statistically significant effects on the time axis. Rectangles illustrate the presentation times of

the masks ('M', black rectangles) and the stimulus ('S', gray rectangles). (C) Source-reconstruction of

gamma oscillatory activity (at 250-500ms and 60-80 Hz) during Full perception is represented on

overlaid cortical surface (MNI template) in Experiment I (left) and in Experiment II (right).

Figure 3. Statistical maps illustrate the masked significant (*: p < 0.005 uncorrected; **: p < 0.05

corrected ; ***: p < 0.001 corrected) time-frequency phase (Rayleigh) dynamics in the VWFA for the

contrast between Partial and Null (left), and for the contrast between Full and Partial in Experiment II

(middle) and I (right). Rectangles illustrate the presentation times of the masks ('M', black rectangles)

and the stimulus ('S', gray rectangles).

Figure 4. Power and synchrony correlations during Full perception in Experiments I (left) and II

(middle) and during Partial perception (right). Dashed lines represent the confidence interval (p <

0.05) for the regression curve.

39

Figure 5. (A) The group-mean middle gamma/ high gamma (MG/HG) ratio is shown across the

different perceptual levels in Experiments I (left panel) and II (middle panel), as well as at the single-

subject level (right panel for an illustration of an exemplar subject). Error-bars represent ±1 SEM, and

asterisks convey statistically significant (p < 0.05) differences between the levels. (B) The temporal

curves of the corrected significant (p < 0.05 corrected) t/z-values for the three neural processes.

Rectangles illustrate the presentation times of the masks ('M', black rectangles) and the stimulus ('S',

gray rectangles). (C) The correlation between phase-synchrony during full perception and the

behavioral acuity (Δd' of full and partial perception).

Supporting Information

SI: Results

Processing levels of word perception along key coordinates in the visual system

In order to further probe the conjecture that the time-frequency dynamics in the gamma band

originate from the left occipito-temporal junction in general, and the VWFA in particular

(Figure 2), additional virtual channel analyses were conducted at three more coordinates in

the occipito-temporal cortex: the VWFA right hemispheric homolog (Figure S1, upper

panels), the left posterior occipital cortex (Figure S1, middle panels) and the right posterior

occipital cortex (Figure S1, bottom panels). The following analyses examined the latency

during which (corrected) significant effects could be detected. First, to test for hemispheric

laterality selectivity which is preponderantly assumed for this region, we tested for differences

in the right homolog of the VWFA: there were no significant effects across conditions in the

two experiments (p > 0.2 in both experiments) (Figure S1, upper panels).

Second, in order to test whether those effects were visual effects propagating throughout the

visual cortex, in Experiment I, we selected the coordinates in the left occipital cortex (the left

middle occipital gyrus, BA18), wherein the broad band gamma (60-140 Hz) peaked when

contrasting all stimuli versus baseline. At 600 – 700 ms, broad band gamma in Full perception

was significantly more robust than Partial perception (p = 0.01 corrected) (Figure S1, middle

panels). Similarly, in Experiment II, we also selected the coordinates in the left occipital

cortex (the left lingual gyrus, BA17), wherein the broad band gamma peaked when

contrasting all stimuli versus baseline (Figure 3B, upper middle panel). Hence, we expected

that activity in those coordinate would not reflect word perception but rather general visual

processing. The result matched our expectation; An ANOVA was computed across the three

conditions (p = 0.009 corrected) in that left occipital region, and post-hoc tests revealed

insignificant power change between Full and Partial (p > 0.2 uncorrected), and instead, a

significant power increase of the Null condition relative to the Partial condition (p < 0. 01

corrected) at 100-250ms. Furthermore, contrast luminance modulation had a clear-cut effect

on broad band gamma in those coordinates at 100-250 ms (p < 0.0005 corrected), that is, the

blank condition with higher contrast yielded stronger response than the blank condition with

the lower contrast at 50-300ms. Hence, we conclude that word perception, Full or Partial,

could not be explained by broad band gamma power modulation in the left occipital cortex.

Instead, broad band gamma modulation in this region was highly sensitive and driven by

visual masking strength alone.

Finally, in Experiment I, we also tested the right homologue of the previous coordinates in the

left occipital cortex, that is, the right lingual gyrus (BA17). There was no significant

difference in the broad band gamma between Full and Partial at these coordinates (p = 0.10

uncorrected) (Figure S1, bottom panels). In Experiment II, we also tested the right homologue

of the previous coordinates in the left occipital cortex, that is, the right lingual gyrus (BA17).

An ANOVA was computed across the three conditions (p<0.03 corrected) in that left occipital

region, and post-hoc tests revealed a significant power increase of the Null condition relative

to the Full condition (p < 0. 05 corrected) at 100-150ms, and relative to the Partial condition

(p < 0. 05 corrected) at 100-250ms. There was no power modulation between Full and Partial

(p = 0.09 uncorrected). Similarly to its left homolog, also in this area the masked blank

conditions were significantly enhanced (p < 0. 0005 corrected) for the stronger masking at 50-

300ms. Hence, we conclude that in this right occipital region, power was unchanged in the

Full and Partial conditions, and instead it increased for Null perception at an early time

interval, and that this increase was clearly attributed to bottom up load increase (masking

strength).

The synchrony effects (Figure 3) were also tested along the occipito-temporal cortex. The

contrast Partial versus Null (Experiment II) did not yield any significant effect in either the

rVWFA (p = 0.25), the right (p = 0.45) or the left occipital (p = 0.24). Furthermore, the

contrast Full versus Partial did not yield any significant effect in either the rVWFA

(Experiment I: p = 0.71, Experiment II: p = 0.70), or the left occipital (Experiment I: p = 0.26,

Experiment II: p = 0.13). In the right occipital, there was no significant effect in Experiment I

(p = 0.43), however, in Experiment II there was a significant suppression effect (p = 0.02

corrected) peaking at 40-50Hz 55-75ms; yet, this latter effect was not overlapping with either

effects found in the VWFA (Figure 3, middle and right panels). Hence, the findings suggest

that the synchrony effects located in the VWFA (Figure 3) in both experiments were selective

to this area, at least along the occipito-temporal cortex.

Supplementary Figures

Figure S1. Source-reconstruction of gamma oscillatory activity (at 0-250ms and 80-

100 Hz) during Full perception is represented on overlaid cortical surface (MNI

template) in Experiment I (upper panel) and in Experiment II (lower panel).

Figure S2. Plots of the temporal evolution of MG power (normalized to baseline

activity) in the right-VWFA (upper panel), left (middle panel) and right (lower panel)

posterior occipital cortex, for the different perceptual levels, in Experiments I (left) and

II (middle). Shades represent ±1 SEM, while purple and gray rectangles indicate

statistically significant effects on the time axis.

Figure S3. Statistical (masked at p < 0.05 corrected) maps illustrate time-frequency

dynamics in the VWFA for the conditions Full, partial, Null and the two blank controls

in Experiment II.

Figure S4. Statistical maps illustrate the extent of phase-locking (Rayleigh) of all

conditions in Experiment II (9>z-values >8.5) (left upper panel), as well as each

experimental and control conditions, separately in Experiment II (upper panel and left

lower panel), and in Experiment I (right lower panel).

Supplementary Tables

Table S1 Perceptual parameters for each level

Perceptual level d' Hit rate Y False-alarm rate

Full Word Perception

Mean 2.87 0.92 9 0.08

Standard deviation 0.42 0.04 9 0.05

Null Word Perception

Mean 0.51 0.46 9 0.30

Standard deviation 0.77 0.20 9 0.20

Full Letter Perception

Mean 2.45 0.84 9 0.10

Standard deviation 0.68 0.13 9 0.07

Null Letter Perception

Mean 0.28 0.39 9 0.27

Standard deviation 0.77 0.28 9 0.14