PhD grants
PhD topic proposed in co-supervision between:
Benjamin Morillon (Institut de Neurosciences des Systèmes, INS)
Robert Zatorre (McGill University, Montreal, Canada)
(potential) ILCB collaborators: Daniele Schon (INS), Andrea Brovelli (INT), Pascal Belin (INT)
(potential) external collaborators: Anne-Lise Giraud (Geneva, Switzerland), Philippe Albouy (Quebec, Canada), Luc Arnal (Paris, France)
A major debate in cognitive neuroscience concerns whether brain asymmetry for speech and music emerges from differential sensitivity to acoustical cues or from domain-specific neural networks. This debate is closely related to the question of the origins of hemispheric specialization. Despite years of debate and empirical work, these issues have remained unresolved, and indeed have generated intense disagreement in the literature. We believe this situation is due to the insufficiently precise computational specification of prior models, and to a lack of clear grounding in neurophysiology.
This PhD project will tackle these questions by taking advantage of the spectrotemporal modulation framework, a rigorous approach that has received much support from single-neuron recordings and human imaging. According to this framework, auditory cortical neurons are best characterized functionally in terms of their responses to spectral and temporal power fluctuations (Singh and Theunissen 2003, Chi et al. 2005, Flinker et al. 2019).
In a set of inter-related studies involving human participants, the PhD candidate will investigate the respective sensitivity of the left and right hemispheres to low-level acoustical cues. The neural dynamics underlying auditory processing in the left and right hemispheres will be characterized, and their selective roles in the processing of speech and music will be highlighted. This will be done by 1- taking advantage of the spectrotemporal modulation framework, 2- capitalizing on a recently created corpus of sung speech stimuli in which melodic and verbal content are crossed and balanced, and 3- recording human brain activity with intracranial and scalp methods (intracranial electroencephalography, iEEG, and magnetoencephalography, MEG).
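To make the spectrotemporal modulation framework concrete, the sketch below (our illustration; the window sizes and test signal are arbitrary assumptions) computes a sound's modulation power spectrum as the 2D Fourier transform of its log-spectrogram:

```python
# Minimal sketch: modulation power spectrum (MPS) of a sound, i.e. the
# 2D Fourier transform of its log-spectrogram. Parameters are illustrative.
import numpy as np
from scipy import signal

def modulation_power_spectrum(x, fs, nperseg=512, noverlap=384):
    # Spectrogram: spectral and temporal power fluctuations of the sound.
    freqs, times, Zxx = signal.stft(x, fs=fs, nperseg=nperseg, noverlap=noverlap)
    log_spec = np.log(np.abs(Zxx) + 1e-10)
    log_spec -= log_spec.mean()  # remove DC so it does not dominate the MPS
    # 2D FFT: one axis gives spectral modulations (cycles/Hz),
    # the other temporal modulations (Hz).
    mps = np.fft.fftshift(np.abs(np.fft.fft2(log_spec)) ** 2)
    dt = times[1] - times[0]   # spectrogram frame step (s)
    df = freqs[1] - freqs[0]   # frequency bin width (Hz)
    temp_mod = np.fft.fftshift(np.fft.fftfreq(log_spec.shape[1], d=dt))  # Hz
    spec_mod = np.fft.fftshift(np.fft.fftfreq(log_spec.shape[0], d=df))  # cyc/Hz
    return spec_mod, temp_mod, mps

fs = 16000
x = np.random.randn(fs)  # one second of noise as a stand-in for a stimulus
spec_mod, temp_mod, mps = modulation_power_spectrum(x, fs)
```

Comparing where different classes of stimuli concentrate energy in this 2D modulation space is one way to relate hemispheric responses to low-level acoustical cues.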
PhD cosupervisors: Nicolas Claidière (laboratoire de Psychologie cognitive) and Noël Nguyen (laboratoire Parole et langage), with the participation of Leonardo Lancia (laboratoire de phonétique et phonologie, Univ Paris 3 & CNRS)
Abstract
In spoken language interactions, and for people to understand each other, speech sounds must be categorized consistently across listeners. Within a linguistic community, a common set of criteria must therefore be agreed upon, as regards how phonemic categories are delineated in the speech sound space. In spite of their central importance for social cognition and speech sciences, little attention has been devoted so far to the mechanisms that allow this shared perceptual landscape to emerge. The goal of this project will be to explore these mechanisms as they deploy within a group of participants, in an experimental framework.
The PhD student will contribute to devising an ensemble of innovative, joint-perception experiments, in which each listener's perceptual behavior can be affected by that of the other listeners. This will consist, for example, in having groups of listeners construct a mapping between unfamiliar speech sounds and sets of entities (e.g., visual shapes) in a coordinated way, within an experimental set-up that will make it possible for information to flow between listeners. Issues of interest will include the impact of local, pairwise interactions on the dynamics of the entire group, the geometry of the shared speech sound space and how it evolves over the course of the interactions between listeners, the potential benefit of performing a speech perception task collectively rather than individually.
To a very large extent, these issues remain unexplored in the speech perception domain. However, fruitful connections can be made with related, albeit different lines of work, which are concerned with cultural transmission in both humans and non-human species (Claidière, Smith, Kirby & Fagot, 2014) and with experimental approaches to the emergence and evolution of language (e.g. Kirby et al., 2008, Xu et al, 2013). In this project, a bridge will be established between these different domains, which will allow the PhD candidate to exploit the experimental methods and mathematical tools developed in studies on cultural transmission, to further our understanding of how speech perception works, and how it contributes to the evolution of phonological systems.
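As a purely illustrative example of such group dynamics (a toy model of our own, not the planned experimental design), the sketch below shows how local, pairwise adjustments of a category boundary on a one-dimensional sound continuum can drive a group of listeners toward a shared boundary:

```python
# Toy simulation: listeners hold a category boundary on a 1D sound
# continuum and, after each paired trial where their responses differ,
# shift their boundaries toward a common value. All parameters are made up.
import random

def simulate(n_listeners=10, n_rounds=200, step=0.1, seed=1):
    rng = random.Random(seed)
    boundaries = [rng.uniform(0.2, 0.8) for _ in range(n_listeners)]
    for _ in range(n_rounds):
        i, j = rng.sample(range(n_listeners), 2)   # local pairwise interaction
        stimulus = rng.random()
        resp_i = stimulus > boundaries[i]
        resp_j = stimulus > boundaries[j]
        if resp_i != resp_j:
            # Each partner nudges its boundary toward the other's.
            mid = (boundaries[i] + boundaries[j]) / 2
            boundaries[i] += step * (mid - boundaries[i])
            boundaries[j] += step * (mid - boundaries[j])
    return boundaries

print(simulate())  # boundaries cluster: a shared perceptual landscape emerges
```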
For more information:
Nicolas Claidière’s website: http://www.nicolas.claidiere.fr/
Noël Nguyen’s webpage: https://cv.archives-ouvertes.fr/noel-nguyen
Leonardo Lancia’s list of publications: http://lpp.in2p3.fr/Publications-776
Advisors:
B. Torrésani, Institut de Mathématiques de Marseille
C. Bénar, Institut de Neurosciences des Systèmes
Collaborations:
Agnès Trébuchon, AP-HM, Marseille
Jean-Marc Lina, Centre de Recherches Mathématiques, Montréal, Canada
Abstract
Current techniques for extracting spatio-temporal networks in MEG and EEG suffer from the inherent difficulties arising from solving the inverse problem (i.e. projecting the data from surface sensors to brain sources). We propose here to use a novel wavelet analysis approach in order to improve the extraction of language networks from MEG signals. The methods will be validated using simultaneous MEG-intracerebral EEG recordings.
Rationale
Brain function involves complex interactions between cortical areas at different spatial and temporal scales. Thus, the spatio-temporal definition of brain networks is one of the main current challenges in neuroscience. With this objective in view, electrophysiological techniques such as electroencephalography (EEG) and magnetoencephalography (MEG) offer a fine temporal resolution that allows capturing fast changes (at the millisecond level) across a wide range of frequencies (up to 100 Hz).
However, the spatial aspects require solving a difficult (extremely ill-posed) inverse problem that projects the signals recorded at the level of surface sensors to the cortex. Most existing methods suffer from several drawbacks, two of which will be addressed in this project:
• Data are processed at each time sample independently, disregarding temporal correlations. This is not optimal in terms of robustness to noise, a key issue in such ill-posed inverse problems, where noise sensitivity is extremely high. In addition, not accounting for temporal correlations at the sensor level is extremely penalizing when the aim is to estimate spatio-temporal networks at the source level.
• Current methods suffer severely from 'leakage', i.e. the activity in a given region 'spills' onto neighbouring regions. This is mainly a consequence of the ill-posedness of the inverse problem, which imposes regularizations that tend to oversmooth the solution. Leakage can be reduced by using sparsity-enforcing spatial regularizations; however, defining such regularizations for signals supported on the cortical surface requires adequate representations of these signals.
Recent advances in computing power, algorithm design and computational statistics now make it possible to handle the space, time and frequency aspects of brain signals in a single combined approach, instead of treating them in consecutive steps. Wavelet representations in the time domain have been shown to yield very significant dimension reduction, and wavelet representations of functions supported on surfaces are expected to allow similar reductions in the spatial domain. The use of (large) spatio-temporal covariance matrices makes it possible to take advantage of temporal correlations, provided the "curse of dimensionality" can be properly controlled. Multivariate tensor techniques can now extend classical principal component analysis and handle data with many dimensions (time, space, frequency, trials, conditions, subjects). Taken together, these advances have the potential to considerably improve the signal-to-noise ratio, and also to address source leakage by capturing all leaked (zero-lag) activity originating from one region in a single component, thus providing a much finer spatio-temporal resolution.
Objectives
The objective of this PhD project is to develop algorithms and data analysis procedures for the spatio-temporal characterization of brain networks across multiple frequencies in EEG and MEG signals, to validate them on simulated and real signals, and to apply the resulting methodology to language protocols in the framework of the ILCB.
In terms of algorithms and data analysis procedures, two directions will be investigated.
• On the one hand, the Bayesian combined space-time inverse problem approaches (the KwMEM algorithm) currently developed at I2M (Roubaud et al 2018) will be extended. These exploit sophisticated dimension reduction (in time and space), matrix factorization and optimization techniques to control the curse of dimensionality and to process space-time measurements directly. A main extension will involve the use of cortical wavelets, i.e. spatial-domain wavelets (Özkaya 2013, Özkaya & Van De Ville 2011), to describe spatial variations of activity on the cortical surface. The use of time-domain wavelet frames (which are translation invariant) instead of bases, and several (space-time) prior distributions for cortical sources (in addition to the currently used Gaussian mixture priors), will also be investigated; a toy sketch of a sparsity-enforcing inverse solution is given after this list. In addition, sparse multivariate techniques will be applied to the estimated sources to infer space-time graphs for modeling brain networks. Again, given the size of these data, the curse of dimensionality will have to be handled appropriately, which the cortical wavelet expansions should make possible.
• On the other hand, multivariate analysis techniques will be considered as alternatives to solving the inverse problem. These have been shown to provide simple tools for source localization and separation at the sensor level, which can be exploited further for localization. Modern multivariate approaches developed at I2M in other contexts (NMR and fluorescence spectroscopy), namely sparse tensor factorizations, are expected to provide simple approaches that can handle higher-dimensional data (such as time-frequency-space, or time-frequency-space-trial).
• The developed approaches will be compared with classical methods (beamformer, minimum norm estimates), first on simulated data and then on real data obtained in language tasks.
In particular, we will use simultaneous EEG-MEG-intracerebral data obtained at the INS in patients during presurgical evaluation of epilepsy (Figure 1). These data will provide an intracerebral "ground truth" to which non-invasive results can be compared.
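As an illustration of the sparsity-enforcing regularization mentioned above (a toy problem with made-up dimensions, not the KwMEM algorithm itself), the following sketch recovers a sparse source vector from sensor data y = G s + noise by iterative soft-thresholding (ISTA):

```python
# Toy sketch: l1-regularized source estimation via ISTA, where G is a
# (sensors x sources) leadfield matrix. All sizes and values are made up.
import numpy as np

rng = np.random.default_rng(0)
n_sensors, n_sources = 64, 500
G = rng.standard_normal((n_sensors, n_sources))
s_true = np.zeros(n_sources)
s_true[rng.choice(n_sources, 5, replace=False)] = rng.standard_normal(5) * 5
y = G @ s_true + 0.1 * rng.standard_normal(n_sensors)

def ista(G, y, lam=1.0, n_iter=500):
    L = np.linalg.norm(G, 2) ** 2          # Lipschitz constant of the gradient
    s = np.zeros(G.shape[1])
    for _ in range(n_iter):
        grad = G.T @ (G @ s - y)           # gradient of the data-fit term
        z = s - grad / L
        s = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return s

s_hat = ista(G, y)
print("nonzero sources recovered:", np.flatnonzero(np.abs(s_hat) > 0.5))
```

The soft-thresholding step is what enforces sparsity and thereby limits the spatial spread ("leakage") of the solution.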
Context
This project will be a collaboration between the Institut de Mathématiques de Marseille (I2M; B Torrésani) and the Institut de Neuroscience des Systèmes (INS; C Bénar, JM Badier, A Trébuchon). The I2M Signal-Image team is specialized in the design of state-of-the-art signal processing algorithms, involving sparsity constraints and wavelet/time-frequency analysis, together with computational statistics. The INS has extensive experience in the recording and analysis of brain signals, including trimodal EEG-MEG-intracerebral acquisitions.
Figure 1: Multivariate and multimodal graph characterization in an auditory language task. A non-linear correlation (h2) graph was computed between the SEEG signal (H' electrode, in the auditory cortex) and MEG signals (sources obtained from independent component analysis), in response to an auditory language protocol ("Ba" and "Pa" sounds; Trebuchon-Da Fonseca et al., 2005) (figure credit: S. Medina, Dynamap team, INS). ICA makes the problem sparse by reducing the data dimension, but it rests on an independence constraint that is not fully justified; the methods proposed in this project will help introduce physiologically relevant sparsity constraints based on a multiscale (wavelet) approach.
References
Badier J M, Dubarry A S, Gavaret M, Chen S, Trebuchon A S, Marquis P, Regis J, Bartolomei F, Benar C G and Carron R 2017 Technical solutions for simultaneous MEG and SEEG recordings: towards routine clinical use Physiol Meas 38 N118-N27
Cong F, Lin Q H, Kuang L D, Gong X F, Astikainen P and Ristaniemi T 2015 Tensor decomposition of EEG signals: a brief review J Neurosci Methods 248 59-69
Dubarry A S, Badier J M, Trebuchon-Da Fonseca A, Gavaret M, Carron R, Bartolomei F, Liegeois-Chauvel C, Regis J, Chauvel P, Alario F X and Benar C G 2014 Simultaneous recording of MEG, EEG and intracerebral EEG during visual stimulation: from feasibility to single-trial analysis Neuroimage 99 548-58
Lina JM, Chowdhury R, Lemay E, Kobayashi E and Grova C 2014 Wavelet-based localization of oscillatory sources from magnetoencephalography, IEEE Trans Biomed Eng 61 2350-64
Özkaya SG 2013 Randomized Wavelets on Arbitrary Domains and Applications to Functional MRI Analysis, PhD Thesis, Princeton University, Program in Applied and Computational Mathematics
Özkaya SG and Van De Ville D 2011 Anatomically adapted wavelets for integrated statistical analysis of fMRI data, 2011 IEEE International Symposium on Biomedical Imaging: From Nano to Macro
Palva J M, Wang S H, Palva S, Zhigalov A, Monto S, Brookes M J, Schoffelen J M and Jerbi K 2018 Ghost interactions in MEG/EEG source space: A note of caution on inter-areal coupling measures Neuroimage 173 632-643
Roubaud MC, Carrier J, Lina JM and Torrésani B 2018 Space-time extension of the MEM approach for electromagnetic neuroimaging, IEEE conference on Machine Learning and Signal Processing (MLSP 2018)
Toumi I, Torresani B and Caldarelli S 2013 Effective Processing of Pulse Field Gradient NMR of Mixtures by Blind Source Separation Anal Chem 85 11344-51
Vu XT, Chaux C, Thirion-Moreau N, Maire S and Carstea EM 2017 Journal of Chemometrics 31(4)
Supervisors: Sophie Dufour & Amandine Michelas
In contrast to languages such as Spanish, in French the position of accent within a word does not change its meaning (e.g. /'bebe/ "s/he drinks" vs. /be'be/ "baby" in Spanish, whereas in French both forms mean the same word, 'baby'). In French, the main accent, called primary accent, falls on the last syllable of a unit larger than the word, namely the accentual phrase. For instance, the monosyllabic word chat "cat" receives primary accent in the phrase un petit 'chat "a little cat" because it is the last full syllable of the accentual phrase. In contrast, it is unaccented in un chat 'triste "a sad cat" because it is not in final position within the accentual phrase. To date, there have been numerous demonstrations that, in French, accent is used in syntactic parsing and in the segmentation of continuous speech into words (Christophe et al., 2004; Spinelli et al., 2010). However, the role of accent in spoken word recognition remains poorly documented. Since French speakers are inevitably exposed to both the accented and unaccented versions of words, models assuming the storage of multiple variants (Connine, 2004; Goldinger, 1998) predict that accent in French could be represented in the mental lexicon. In this PhD project, using both EEG and behavioral experiments, we will examine how accent is represented in French and how it affects spoken word recognition.
PhD candidates will be expected to have a background in psycholinguistics and/or phonetics and to demonstrate an interest in word recognition and prosody.
Christophe, A., Peperkamp, S., Pallier, C., Block, E., & Mehler, J. (2004). Phonological phrase boundaries constrain lexical access I. Adult data. Journal of Memory and Language, 51, 523-547.
Connine, C. M. (2004). It's not what you hear, but how often you hear it: On the neglected role of phonological variant frequency in auditory word recognition. Psychonomic Bulletin & Review, 11, 1084–1089.
Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review, 105, 251–279.
Spinelli, E., Grimault, N., Meunier, F., & Welby, P. (2010). An intonational cue to word segmentation in phonemically identical sequences. Attention, Perception, & Psychophysics, 72, 775-787.
Supervisors: Joël Fagot, Pascal Belin
Behavioural Studies of Voice Perception in Baboons
PhD topic proposed in co-supervision between
Joël Fagot (Laboratoire de Psychologie Cognitive, https://lpc.univ-amu.fr/fr/profile/fagot-joel ) and
Pascal Belin (Institut de Neurosciences de La Timone, https://neuralbasesofcommunication.eu/people/pascal-belin/ )
Speech perception—the ability to extract and process linguistic information in the voice—may be unique to humans, but other voice perception abilities are widely shared in the animal kingdom.
This PhD project will adopt a comparative approach to investigate the perceptual mechanisms of voice perception in baboons and compare them with ours.
Behavioural testing will be performed on the CCDP platform of the Rousset primatology station, where a group of semi-free-ranging baboons interacts ad libitum with automated testing systems.
Two basic building blocks of voice perception abilities, ecologically relevant for both humans and baboons, will be focused on:
(1) the detection of conspecific vocalizations (CV) amongst other sounds; and (2) the discrimination of different speakers.
It is anticipated that the results of this investigation will bring crucial new knowledge relevant to the evolution of voice perception abilities in primates.
Supervisors: M. Bonnard (INS), C. Pattamadilok (LPL)
What does the dorsal premotor cortex do during reading and writing?
Reading and writing are closely related activities, and several brain imaging studies have provided data suggesting a close relationship between them.
For instance, activation of the left dorsal premotor cortex (dPM, known as Exner's area: Exner, 1881), a key area in writing (Planton et al., 2013), has also been reported during the visual processing of words and letters (Longcamp et al., 2003, 2011; Nakamura et al., 2012).
Our recent study (Pattamadilok et al., 2016), in which Transcranial Magnetic Stimulation (TMS) was used to interrupt the function of the left dPM during a visual lexical decision task, showed that this area contributes to fluent reading and, therefore, has a functional role in this activity. However, the nature of its contribution remains unclear.
According to the “motor hypothesis”, the left dPM might have a motor function, i.e., learning to read and write strengthens the connectivity between visual and motor systems such that the presence of visual words/letters would automatically activate the associated gestures.
This implicit evocation of writing motor processes would, in turn, reinforce the recognition of written stimuli.
This view is nevertheless challenged by the observation that the left dPM is also activated during keyboard typing, that is, when handwriting gestures are not explicitly or implicitly required (Purcell et al., 2011). These observations have led to an alternative hypothesis, according to which the area may play a more central role in language processing.
According to this “cognitive hypothesis”, the contribution of this area to reading would be due to shared cognitive components between writing and reading, more specifically, the sublexical and serial processes. The main goal of the thesis is to investigate the properties of the left dPM during reading and writing.
More specifically, three issues will be addressed using three applications of stereotaxic TMS: 1) the functional role of this area, with particular attention to the involvement of the left dPM in the motor vs. cognitive aspects of these activities (using an interruptive TMS protocol); 2) the properties of neuronal populations in the left dPM, testing the hypotheses that the area either contains a homogeneous population of neurons with both cognitive and motor functions, or contains two functionally segregated subpopulations (using a TMS adaptation paradigm; Pattamadilok et al., submitted); 3) the functional connectivity of this area with other brain regions (motor and visual cortices, primary and supplementary, and other regions within the language network) (using combined TMS-EEG).
References
Exner S. 1881. Untersuchungen über die Localisation der Functionen in der Grosshirnrinde des Menschen. Wilhelm Braumüller.
Longcamp M, Anton JL, Roth M, Velay JL (2003): Visual presentation of single letters activates a premotor area involved in writing. Neuroimage 19:1492–1500.
Nakamura K, Kuo WJ, Pegado F, Cohen L, Tzeng OJL, Dehaene S (2012): Universal brain systems for recognizing word shapes and handwriting gestures during reading. Proc Natl Acad Sci USA 109:20762–20767.
Pattamadilok, C., Planton, S., & Bonnard, M. (submitted). Phonology-coding neurons in the 'Visual Word Form Area': Evidence from a TMS adaptation paradigm.
Pattamadilok, C., Ponz, A., Planton, S., & Bonnard, M. (2016). Contribution of writing to reading: Dissociation between cognitive and motor process in the left dorsal premotor cortex. Human Brain Mapping, 37, 1531–1543.
Planton S, Jucla M, Roux F-E, Démonet J-F (2013): The "handwriting brain": a meta-analysis of neuroimaging studies of motor versus orthographic processes. Cortex 49:2772–2787.
Purcell JJ, Napoliello EM, Eden GF (2011): A combined fMRI study of typed spelling and reading. Neuroimage 55:750–762.
Supervisors: Pascale Colé (Laboratoire de Psychologie Cognitive, UMR 7290) & Christine Assaiante (Laboratoire de Neurosciences Cognitives, UMR 7291)
Cognitive functioning in adults with dyslexia
Reading in adults with dyslexia poses a real scientific challenge: despite significant deficits in low-level reading processes (decoding), some of these adults manage to pursue higher education.
Although some of the deficits shown during childhood persist in dyslexic adults, the few studies conducted with these participants suggest a particular cognitive profile that is, in part, the result of cognitive compensations developed naturally or through rehabilitation. The proposed research focuses in particular on the role of cognitive control and semantic memory in this compensation system, using a wide variety of techniques: EEG, eye movements and classical behavioral measures (reaction times).
The research also addresses the links between language representations and sensorimotor processes in explaining the phonological deficits of adults with dyslexia.
Supervisors: Noël Nguyen, Elin Runnqvist
Adaptive prediction in the joint production of speech
Supervisor: Noël Nguyen (LPL)
ILCB collaborators: Elin Runnqvist (LPL), Kristof Strijkers (LPL), Mireille Besson (LNC)
External collaborators: Alessandro D'Ausilio (IIT / University of Ferrara, Italy), Cristina Baus (UPF, Barcelona)
ILCB Transversal Question #4 (Cerebral and cognitive underpinnings of conversational interactions)
In conversational interactions, the mechanisms by which speakers predict what their interlocutors will say next are an essential feature.
These predictive mechanisms can account for the fact that turn-taking between conversational partners is performed both smoothly and rapidly.
They are also assumed to contribute to making it easier for each partner to process and understand the other partner’s utterances.
In current influential theoretical frameworks (e.g. Pickering & Garrod, 2013), they rely on a close perception-production link, as it is assumed that prediction of what the other is about to say is based on the speaker’s own spoken language production system.
The goal of this project will be to further explore the brain and cognitive underpinnings of prediction in conversational interactions.
We will use a joint-action experimental paradigm, in which participants will perform a speech production task in conjunction with another human partner or a robot. Recent EEG studies (e.g. Baus et al, 2014) on joint speech production in dyads of human participants have provided evidence that participants predict their partner’s upcoming word using processes that they also use in producing words themselves.
The question at the heart of the present project will be: to what extent is prediction adaptive, i.e., fine-tuned to the partner's individual speech production characteristics? If tuning of prediction to the partner's idiosyncratic speech behavior does take place, over what time scale does it arise, and on which speech properties does it focus?
EEG will be used to explore components (such as the N100, P200, and N400) associated with prediction processes when speaking with a human partner. Building upon previous work on the somatotopic activation of the motor cortex in both speech production and perception (e.g., D'Ausilio et al., 2009, 2014), we will also employ MEG combined with MRI for source localization (Strijkers et al., 2017) to determine to what extent participants predict the articulatory make-up of their partner's upcoming word.
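As an indication of how such components could be extracted in practice, here is a minimal MNE-Python sketch; the file name, stimulus channel and event codes are hypothetical placeholders:

```python
# Minimal ERP sketch with MNE-Python; file and event names are hypothetical.
import mne

raw = mne.io.read_raw_fif("joint_naming_raw.fif", preload=True)
raw.filter(0.1, 30.0)  # band-pass keeping slow ERP components such as the N400
events = mne.find_events(raw, stim_channel="STI 014")
event_id = {"predictable": 1, "unpredictable": 2}
epochs = mne.Epochs(raw, events, event_id, tmin=-0.2, tmax=0.8,
                    baseline=(None, 0), preload=True)
# N400-like effect: difference wave between unpredictable and predictable words.
diff = mne.combine_evoked([epochs["unpredictable"].average(),
                           epochs["predictable"].average()], weights=[1, -1])
diff.plot(picks="eeg")
```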
A robot (Furhat) will be used as the participant's partner in some experiments, with a view to accurately manipulating both the timing and sound shape of the robot's utterances.
The project will contribute to a better understanding of the brain mechanisms that allow us to anticipate our partner's upcoming utterance in conversational interactions.
PhD candidates will be expected to have a solid background in neurolinguistics. Experience with EEG and/or MEG, and with speech processing techniques, will be appreciated. Candidates should be prepared to undertake research stays at both IIT / Ferrara and UPF / Barcelona in the course of the PhD.
Supervisors: Elin Runnqvist, Sonja Kotz
Why the basal ganglia, cerebellum and medial frontal cortex are critical for the learning of speech sequences
• PhD project proposal supervised by Elin Runnqvist (LPL) and Sonja Kotz (University of Maastricht)
• Collaborators: Andrea Brovelli (INT)
• QT5: “Temporal Networks” (but also related to QT3 “Language & motor control”)
The ability to interpret and produce structured sound/motor sequences is at the core of human language.
This aspect of language learning most frequently involves auditory input leading to articulatory output (i.e., speech perception used to learn speech production).
The basal ganglia (BG), cerebellum (CB), and medial frontal cortex (MFC) are important neural pillars of reward-based, error-based, and unsupervised learning, respectively.
The main aim of the current research is to shed light on the involvement of each type of learning as well as their potential interactions in the acquisition of novel speech sequences. An interesting and open question is to what extent the successful acquisition of novel speech motor sequences engages all learning mechanisms and systems.
Furthermore, a growing body of evidence concerning the reciprocal structural and functional connectivity between BG and CB as well as between these subcortical structures and the MFC raises the question as to what extent the three learning mechanisms work independently or in concert (e.g., Hoshi et al., 2005; Akkal et al., 2007; Bostan et al, 2010; 2013).
For example, it has been proposed that specific behaviors or functions can be realized by a combination of multiple learning modules (Doya, 2000). Some authors have also argued that such cooperation between the BG, CB, and cortex could be beneficial for solving the so-called "credit assignment" problem in learning (Minsky, 1963), that is, getting the right information to the right place at the right time for it to be effective in guiding the learning process (e.g., Houk and Wise, 1995).
Others have argued that the BG and CB may be involved to different extents at different stages of learning (e.g., Doyon et al., 2003). Concretely, in the case of motor sequence learning, the contribution of the CB would precede that of the BG, such that with extended practice the CB would no longer be essential, and long-lasting retention of a skill should involve representational changes in the BG and its associated cortical structures.
In this project, we aim to shed light on the implication of the three learning mechanisms in speech motor sequence learning, as indexed by the involvement of the BG, CB, and MFC, with special emphasis on the functional and dynamical interactions of these brain areas. A multi-method approach using both fMRI and MEG while participants engage in a shadowing task (i.e., auditory+visual perception followed by overt production) will allow gathering information about the involvement of these regions as well as their structural (diffusion tensor imaging) and functional connectivity.
Within the shadowing task, we will (a) manipulate reward by providing feedback on accuracy (i.e., "well done!"), maximizing the possibility of relying on reward-based learning, and (b) manipulate sensorimotor predictability through the level of noise in the auditory feedback of participants' own speech, modulating the extent to which error-based learning can be relied upon. Participants will be tested behaviorally over several training sessions, and we will manipulate the quantity of training across two conditions so as to obtain an index of early and late learning stages during a final testing session with fMRI or MEG. Time-series analyses of both fMRI and MEG data will also be conducted in order to examine learning as a continuum.
The results will advance our knowledge of the human ability to acquire and produce speech sequences, and will clarify how two of the most important learning and monitoring systems in the human brain (the basal ganglia and cerebellum) might be functionally interconnected and work in concert with the cerebral cortex to sustain learning in cognition.
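To make the contrast between the learning mechanisms concrete, the toy sketch below (our own illustration, not the experimental design) compares an error-based update, which follows a signed error signal as attributed to the CB, with a reward-based rule that can only explore and reinforce from a scalar reward, as attributed to the BG:

```python
# Toy contrast between error-based and reward-based learning of a single
# parameter. The target value, rates and noise level are all made up.
import random

TARGET = 0.7  # hypothetical optimal value of an articulatory parameter

def learn(mode, n_trials=100, lr=0.3, sigma=0.1, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(n_trials):
        if mode == "error":
            # Cerebellar-style: a signed error signal is available,
            # so the parameter can follow it directly.
            theta += lr * (TARGET - theta)
        else:
            # Basal-ganglia-style: only a scalar reward is available,
            # so the system explores and reinforces lucky probes.
            probe = theta + rng.gauss(0.0, sigma)
            if abs(TARGET - probe) < abs(TARGET - theta):  # higher reward
                theta += lr * (probe - theta)
    return theta

print(learn("error"), learn("reward"))  # both approach 0.7, at different rates
```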
Supervisors: Bruno Torrésani, Christian Bénar
Multidimensional characterization of brain networks in language tasks
B. Torrésani, HDR, Institut de Mathématiques de Marseille C. Bénar, HDR, Institut de Neurosciences des Systèmes
Related transverse question: "Temporal networks"
Rationale
Brain function involves complex interactions between cortical areas at different spatial and temporal scales.
Thus, the spatio-temporal definition of brain networks is one of the main current challenges in neuroscience.
With this objective in view, electrophysiological techniques such as electroencephalography (EEG) and magnetoencephalography (MEG) offer a fine temporal resolution that allows capturing fast changes (at the millisecond level) across a wide range of frequencies (up to 100 Hz). However, the spatial aspects require solving a difficult (extremely ill-posed) inverse problem that projects the signals recorded at the level of surface sensors to the cortex. So far, most existing methods process the data at each time sample separately.
This is not optimal in terms of robustness to noise, a key issue in this ill-posed inverse problem, which is very sensitive to noise. Moreover, current methods suffer severely from 'leakage', i.e. the activity in a given region 'spills' onto neighbouring regions because of the blurring introduced by typical inverse problem algorithms (Palva et al., 2018).
Recent advances in computing power, algorithm design and computational statistics now make it possible to handle the space, time and frequency aspects of brain signals in a single combined approach, instead of treating them in consecutive steps.
The use of (large) spatio-temporal covariance matrices makes it possible to take advantage of temporal correlations, provided the curse of dimensionality can be properly controlled. Multivariate tensor techniques can now extend classical principal component analysis and handle data with many dimensions (time, space, frequency, trials, conditions, subjects) (reviewed in Cong et al., 2015). Taken together, these advances have the potential to considerably improve the signal-to-noise ratio, and also to address source leakage by capturing all leaked (zero-lag) activity originating from one region in a single component, thus providing a much finer spatio-temporal resolution.
Objectives
The objective of this PhD project is to develop algorithms and data analysis procedures for the spatio-temporal characterization of brain networks across multiple frequencies in EEG and MEG signals. In terms of algorithms and data analysis procedures, two directions will be investigated.
On the one hand, the Bayesian combined space-time inverse problem approaches currently developed at I2M will be extended. These exploit sophisticated dimension reduction (in time and space), matrix factorization and optimization techniques to control the curse of dimensionality and to process space-time measurements directly.
Extensions of these techniques will involve the use of wavelet frames (which are translation invariant) instead of bases and the investigation of several (space-time) prior distributions for cortical sources (in addition to the currently used Gaussian mixture priors).
Also, sparse multivariate techniques will be applied to the estimated sources to infer space-time graphs for modeling brain networks. Again, given the size of these data, the curse of dimensionality will have to be handled appropriately.
On the other hand, multivariate analysis techniques will be considered as alternatives to solving the inverse problem. These have been shown to provide simple tools for source localization and separation at the sensor level, which can be exploited further for localization.
Modern multivariate approaches developed at I2M in other contexts (NMR and fluorescence spectroscopy; Toumi et al., 2013), namely sparse tensor factorizations (Vu et al., 2017), are expected to provide simple approaches that can handle higher-dimensional data (such as time-frequency-space, or time-frequency-space-trial).
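As a minimal illustration of such factorizations (synthetic data; the shapes, rank and use of the tensorly library are assumptions for the example), a channels x times x trials tensor can be decomposed into per-mode components:

```python
# Minimal sketch: CP/PARAFAC decomposition of a space x time x trial tensor.
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)
n_channels, n_times, n_trials = 32, 200, 50
# Synthetic rank-2 data: two "components" with distinct topographies,
# time courses and trial profiles, plus additive noise.
A = rng.standard_normal((n_channels, 2))
B = np.stack([np.sin(np.linspace(0, 8, n_times)),
              np.cos(np.linspace(0, 5, n_times))], axis=1)
C = rng.random((n_trials, 2))
X = (np.einsum('ir,jr,kr->ijk', A, B, C)
     + 0.1 * rng.standard_normal((n_channels, n_times, n_trials)))

cp = parafac(tl.tensor(X), rank=2)
space_f, time_f, trial_f = cp.factors   # per-mode factor matrices
print(space_f.shape, time_f.shape, trial_f.shape)  # (32, 2) (200, 2) (50, 2)
```

Because each component comes with a single spatial topography, such factorizations naturally group zero-lag (leaked) activity into one component.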
The developed approaches will be compared with classical methods (beamformer, minimum norm estimates), first on simulated data and then on real data obtained in language tasks.
In particular, we will use simultaneous EEG-MEG-intracerebral data obtained at the INS in patients during presurgical evaluation of epilepsy (Figure 1).
These data will provide an intracerebral "ground truth" to which non-invasive results can be compared (Dubarry et al., 2014; Badier et al., 2017).
We will use data from language protocols that have either already been acquired (Ba/Pa; collaboration with JM Badier) or will be acquired in the coming months in the framework of the 'Scales' project (FLAG-ERA, PI Bénar).
Context
This project will be a collaboration between the Institut de Mathématiques de Marseille (I2M; B Torrésani) and the Institut de Neuroscience des Systèmes (INS; C Bénar, JM Badier, A Trébuchon).
The I2M Signal-Image team is specialized in the design of state-of-the-art signal processing algorithms, involving sparsity constraints and time-frequency analysis, together with computational statistics.
The INS has extensive experience in the recording and analysis of brain signals, including trimodal EEG-MEG-intracerebral acquisitions.
Figure 1: Multivariate and multimodal graph characterization in an auditory language task.
A non-linear correlation (h2) graph was computed between the SEEG signal (H' electrode, in the auditory cortex) and simultaneously recorded MEG signals (sources obtained from independent component analysis), in response to an auditory language protocol ("Ba" and "Pa" sounds; Trebuchon-Da Fonseca et al., 2005).
References
Badier J M, Dubarry A S, Gavaret M, Chen S, Trebuchon A S, Marquis P, Regis J, Bartolomei F, Benar C G and Carron R 2017 Technical solutions for simultaneous MEG and SEEG recordings: towards routine clinical use Physiol Meas 38 N118-N27
Cong F, Lin Q H, Kuang L D, Gong X F, Astikainen P and Ristaniemi T 2015 Tensor decomposition of EEG signals: a brief review J Neurosci Methods 248 59-69
Dubarry A S, Badier J M, Trebuchon-Da Fonseca A, Gavaret M, Carron R, Bartolomei F, Liegeois-Chauvel C, Regis J, Chauvel P, Alario F X and Benar C G 2014 Simultaneous recording of MEG, EEG and intracerebral EEG during visual stimulation: from feasibility to single-trial analysis Neuroimage 99 548-58
Palva J M, Wang S H, Palva S, Zhigalov A, Monto S, Brookes M J, Schoffelen J M and Jerbi K 2018 Ghost interactions in MEG/EEG source space: A note of caution on inter-areal coupling measures Neuroimage 173 632-643
Toumi I, Torresani B and Caldarelli S 2013 Effective Processing of Pulse Field Gradient NMR of Mixtures by Blind Source Separation Anal Chem 85 11344-51
Vu XT, Chaux C, Thirion-Moreau N, Maire S and Carstea EM 2017 Journal of Chemometrics 31(4)
Supervisors: Jean-François Bonastre, Christine Meunier
Supervisors: Alexis Nasr, Françoise Vitu
This thesis topic lies at the intersection of natural language processing (NLP) and eye-tracking.
It will be jointly supervised by Françoise Vitu of the Laboratoire de Psychologie Cognitive and Alexis Nasr of the Laboratoire Informatique et Systèmes.
NLP models perform linguistic analyses of utterances, such as morphological, syntactic, semantic or discourse analysis.
These models predict abstract representations (syntactic, semantic, ...) from observables such as text or the speech signal. Analyzing the behavior of these tools makes it possible to identify zones of uncertainty: points where processing could continue in several different directions.
These zones generally correspond to ambiguities, and it is often at such points that the computer makes errors.
Eye-tracking data, for their part, constitute a trace of eye movements during reading.
These data reveal that an utterance is not processed strictly sequentially, i.e., from one word to the next.
Part of these movements is sensitive to linguistic influences.
They have proven to be indices of syntactic or semantic ambiguity. These include the total fixation time on a word, the probability that the word will later (i.e., once the eyes have moved further along in the sentence) be the target of a second pass or re-reading, and the total viewing time of the word (the sum of the durations of all fixations on the word across all passes).
The thesis topic lies at the intersection of these two types of observations.
It aims to study how eye-tracking data could be integrated into NLP models and, conversely, how NLP models could help to understand, and even to model and predict, eye behavior during reading. The work will build on three existing assets:
- The MASC model (Model of Attention in the Superior Colliculus), developed by F. Vitu (LPC) in collaboration with H. Adeli & G. Zelinsky (NY, USA), which makes it possible to determine the part of oculomotor behavior that purely reflects (non-linguistic) visuo-motor mechanisms.
- The eye-tracking data recorded in the framework of the "BLRI book reading Corpus" project.
- The MACAON software, developed at the Laboratoire d'Informatique et Systèmes, which performs various linguistic processing tasks.
This thesis topic relates to transversal question 5 (Deep Learning) insofar as the predictions made by the NLP models rely on deep neural networks.
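As a very simple illustration of the kind of bridge envisaged (a toy example with made-up numbers, not the thesis model), fixation durations can be regressed on lexical features that NLP models also manipulate:

```python
# Toy sketch: predicting total fixation time on a word from simple
# lexical features. All data values below are invented for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical corpus rows: (word length, log lexical frequency)
X = np.array([[3, 5.2], [9, 2.1], [5, 4.0], [12, 1.5], [4, 4.8]])
y = np.array([180.0, 320.0, 210.0, 390.0, 195.0])  # fixation times (ms)

model = LinearRegression().fit(X, y)
print(model.coef_)                 # longer/rarer words -> longer fixations
print(model.predict([[7, 3.0]]))   # predicted fixation time for a new word
```

In the other direction, such gaze-derived measures could be fed back into an NLP parser as additional signals at points of ambiguity.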
Supervisors: Marie Montant, Christine Deruelle
The role of emotions in the perception of abstract words: An embodied perspective
Co-direction: Marie Montant (1) & Christine Deruelle (2)
(1) Laboratoire de Psychologie Cognitive, LPC, UMR 7290, Marie.Montant@univ-amu.fr
(2) Institut de Neurosciences de la Timone, INT, UMR 7289, Christine.Deruelle@amu-univ.fr
Since the emergence of cognitivism in the 1950s, thought has been viewed as the result of computations akin to those carried out by a computer, detached from the organic body and from the environment in which that body senses and acts.
At the opposite pole from cognitivism, an embodied conception of thought emerged in the 1990s.
On this view, cognition is approached from an empiricist perspective according to which objects of thought (for example, the concept of a dog, or of freedom) are the product of a constant dialogue between the perceiving/acting body and its environment.
This PhD project aims to rethink the representations carried by the term "abstraction" as it is used today in neurolinguistics, and more precisely to examine how abstract words are encoded in the human brain.
Indeed, semantic abstractness is often defined by default, as the negative of the concrete or the imageable: an abstract word, such as freedom or truth, is one that is not directly tied to sensory experience.
A dog can be petted, whereas freedom is impalpable.
The embodiment thesis implies that there are no "abstract" terms per se: on the one hand, freedom has a rather concrete meaning for someone just released from prison; on the other hand, "a dog", like "freedom", can be regarded as singular terms generalized for purposes of classification, economy and communication: the term "dog" designates all animals that share a "family resemblance".
Our hypothesis is that the "family resemblance" that allows quite different situations (far more variable across individuals than those associated with the word dog) to be categorized under the same abstract word (freedom) rests, among other things, on the emotions these situations generate: the accumulation of situations (scenes, events) in which the word freedom is used would lead to a neural coding of this word involving, among others, the neural network underlying emotional experience.
Emotions, with their train of physiological manifestations, would thus provide the bodily anchoring of abstract words.
Our goal is to show that the very recognition of abstract words (lexical access) can be affected when their emotional component is acted upon by modifying the bodily state of participants.
The aim is thus to demonstrate a causal (bottom-up) chain linking bodily modifications (physiological, mechanical), emotions, and the processing of abstract words.
We will manipulate bodily states in ways likely to induce emotional perturbations, which in turn should affect the perception and comprehension of abstract words.
These perturbations, which will be physiological (e.g., heart rate) or mechanical (e.g., constraints applied to the expressive muscles of the face), should affect, in a facilitatory or inhibitory manner, the visual recognition of abstract words associated with emotions, depending on whether or not the valence of these emotions matches that induced by the perturbations.
The empirical studies will be carried out using two brain imaging techniques: fMRI, for the spatial precision of its activation maps, and TMS, for its temporal precision and its potential disruptive effects on the early stages of abstract word recognition. Candidates should have a solid background in neuroscience, a demonstrated interest in multidisciplinary approaches, and an openness to international collaborations (a good command of at least one foreign language is required).
Supervisors: Benjamin Morillon (INS), Kristof Strijkers (LPL)
(potential) ILCB collaborators: Daniele Schon (INS), Andrea Brovelli (INT), Elin Runnqvist (LPL), Marie Montant (LPC)
(potential) external collaborators: Anne-Lise Giraud (UNIGE), Sonja Kotz (UM), Friedemann Pulvermuller (FUB)
ILCB PhD & Postdoctoral Topic Proposal
Primary QT: QT3
Secondary QT: QT5
While traditional models proposed a strict separation between the activation of motor systems for speech production and of sensory systems for speech perception, most researchers now agree that there is far more functional interaction between sensory and motor activation during language behavior. Despite this growing consensus that the integration of sensorimotor knowledge plays an important role in the processing of speech and language, there is much less consensus on what that exact role may be, or on the functional mechanics that could underpin it.
Indeed, many questions from various perspectives remain open issues in the current state-of-the-art: Is the role of sensorimotor activation modality-specific in that it serves a different functionality in perception than production?
Is it only relevant for the processing of speech sounds or does it also play a role in language processing and meaning understanding in general?
Can sensory codes be used to predict motor behavior (production) and can motor codes be used to predict sensory outcomes (perception)?
And if so, how are such predictions implemented at the mechanistic level (e.g., does differential oscillatory entrainment between sensory and motor systems reflect different dynamical and/or representational properties of speech and language processing)?
And in which manner can such sensorimotor integration go from arbitrary speech sounds to well-structured meaningful words and language behavior?
The goal of this project is to advance our understanding of these open questions (in different 'sub-topics'), taking advantage of the complementary expertise of the supervisors: B. Morillon is an expert on the cortical dynamics of sensorimotor activation in the perception of speech, and K. Strijkers is an expert on the cortical dynamics of sensorimotor activation in the production and perception of language.
At the center of the project, as its connecting red thread, is the supervisors' shared interest in the role of 'time' (temporal coding) as a potential key factor that binds sensorimotor activation during the processing of speech and language. On this view, 'time' transcends its classical notion as a processing vehicle (i.e., the simple propagation of activation from sensory to motor systems and vice versa) and may reflect representational knowledge of speech and language.
One of the main goals of the current project is thus to test the hypothesis that temporal information between sensory and motor codes serves a key role in the production and perception of speech and language.
More specifically, we will explore whether sensorimotor integration during speech and language processing reflects: (a) the prediction of temporal information; (b) the temporal structuring of speech sounds and articulatory movements; (c) the temporal binding of phonemic and even lexical elements in language.
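As an indication of how oscillatory entrainment can be quantified, the sketch below (toy signals; the theta band and filter settings are assumptions) computes the phase-locking value between a speech envelope and a neural signal:

```python
# Toy sketch: phase-locking value (PLV) between a "speech envelope"
# and a "neural signal" in the theta band. Signals are synthetic.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

fs = 1000
t = np.arange(0, 10, 1 / fs)
envelope = np.sin(2 * np.pi * 5 * t)                    # 5 Hz speech-like rhythm
neural = np.sin(2 * np.pi * 5 * t + 0.4) + 0.5 * np.random.randn(t.size)

def theta_phase(x, fs, lo=4.0, hi=8.0):
    # Band-pass in the theta range, then extract instantaneous phase.
    b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, x)))

plv = np.abs(np.mean(np.exp(1j * (theta_phase(envelope, fs) -
                                  theta_phase(neural, fs)))))
print(f"theta-band PLV: {plv:.2f}")   # values near 1 = strong phase alignment
```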
We will consider PhD candidates and postdoctoral researchers to conduct research on any of the three topics specified above (a-c). Interested candidates can contact us by email (Benjamin Morillon: bnmorillon@gmail.com; Kristof Strijkers: Kristof.strijkers@gmail.com), including a CV and a motivation letter (1-2 pages).
A strong background in speech and language processing and/or knowledge of spatiotemporal neurophysiological techniques and analyses will be considered a strong plus.
Supervisors: Florence Gaunet (LPC), Thierry Legou (LPL) & Prof. Anne-Lise Giraud (Geneva Univ / IMERA position from Feb to June 2019)
Implications: QT1 (primary: involvement of motricity/motor representations in speech perception), QT3 (secondary: the animal as a model for the study of language)
Requested: postdoc or doctoral grant.
Summary: We intend to explore dogs' neural and perceptual responses to syllabic speech, in order to understand auditory speech processing in a species with reduced articulatory production capabilities, and therefore reduced motor control. It might be the case that dogs perceive speech using only the acoustic cues they can themselves produce, i.e. short intonated "syllable-like" sounds. Alternatively, they might be sensitive to cues that they cannot produce at all.
Given dogs' expertise in using human speech, the findings will provide insights into the mechanisms of speech processing in the brain, i.e. the extent to which motor representations are involved in speech perception.
Supervisors: Magalie Ochs
Human-Machine Interaction, Artificial agent, Affective computing, Social signal processing.
Supervisors: Eric Castet
Efficiency of a Virtual Reality Headset to improve reading in persons with low vision.
People with low vision, in contrast to blind people, have not lost their visual functions entirely.
The leading cause of low vision in Western countries is AMD (Age-related Macular Degeneration), a degenerative, non-curable retinal disease occurring mostly after the age of 60. Recent projections estimate that the total number of people with AMD in Europe will be between 19 and 26 million by 2040.
The most important wish of people with AMD is to improve their ability to read by using their remaining functional vision.
Capitalizing on recent technological developments in virtual reality headsets, we have developed a VR reading platform (implemented on the Samsung Gear VR headset).
This platform allows us to create a dynamic system in which readers use augmented-vision tools specifically designed for reading (Aguilar & Castet, 2017), as well as text simplification techniques currently being tested in our lab.
Our project is to assess whether this reading platform can improve reading performance both quantitatively (reading speed, accuracy, ...) and qualitatively (comfort, stamina, ...).
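On the quantitative side, the outcome measures reduce to simple computations over per-trial logs; a minimal sketch (with assumed log fields) follows:

```python
# Minimal sketch of the quantitative outcome measures: reading speed in
# words per minute and accuracy. The log fields and values are assumed.
def reading_metrics(n_words_read, n_errors, duration_s):
    wpm = 60.0 * n_words_read / duration_s       # reading speed
    accuracy = 1.0 - n_errors / n_words_read     # proportion read correctly
    return wpm, accuracy

print(reading_metrics(n_words_read=120, n_errors=6, duration_s=95.0))
```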
Experiments will be performed in the ophthalmology department of the University Hospital of La Timone (Marseille).
Postdoctoral
Supervisors: Richard Kronland-Martinet (PRISM) & Valentin Emiya (LIS) / Stéphane Ayache (LIS)
Collaborations: Bruno Torresani (I2M)
Summary
Extracting meaning from sounds is a crucial ability for humans to communicate.
Although the transformations of physical vibrations into neural activity carried out by the peripheral auditory system are reasonably well known, the non-linear transformations carried out at higher cortical levels remain remarkably poorly understood.
However, understanding these transformations is crucial to significantly improving our knowledge of auditory cognition, by linking the properties of sounds to their perceptual and behavioral outcomes.
Simultaneously, progress in the field of machine learning now allows for the training of deep neural networks that are able to perform complex cognitive tasks, such as musical genre classification or word recognition (Kell et al., 2018), and even to generate realistic sounds such as human speech (Van Den Oord et al., 2016).
These frameworks hence provide artificial auditory systems that compete with and sometimes outmatch human abilities.
However, interpreting the transformations carried out by these "black boxes" remains a crucial challenge, in particular to understand which acoustic information they use to achieve these tasks.
This post-doc project aims to capitalize on the unique expertise of three ILCB laboratories to address this challenge, with Richard Kronland-Martinet at the PRISM lab bringing expertise in the field of auditory cognition and sound synthesis, and with Valentin Emiya and Stéphane Ayache at the LIS, experts in the fields of signal processing and machine learning, respectively.
In addition to this supervision, the post-doc will benefit from collaborations with other ILCB members, in particular with Bruno Torrésani at I2M, an expert on mathematical representations of sounds.
The project will be organized around three tasks: (1) training deep networks and metrics to achieve performance similar to that of humans (LIS); (2) interpreting these computational frameworks in the light of neuromimetic mathematical representations of sounds (LIS/PRISM); (3) evaluating the perceptual plausibility of these representations through the production of sounds (PRISM).
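As a minimal sketch of task (1) (an illustrative PyTorch architecture with assumed input shapes, not the project's final model), a small convolutional network classifying log-mel spectrograms could serve as the artificial auditory system whose internal representations tasks (2) and (3) would then probe:

```python
# Illustrative sketch: a tiny CNN classifying sounds from spectrograms.
# Input shape, layer sizes and class count are assumptions for the example.
import torch
import torch.nn as nn

class AudioClassifier(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(32, n_classes))

    def forward(self, x):            # x: (batch, 1, n_mels, n_frames)
        return self.head(self.features(x))

model = AudioClassifier()
logits = model(torch.randn(8, 1, 64, 128))   # batch of 8 fake spectrograms
print(logits.shape)                          # torch.Size([8, 10])
```

The intermediate activations of `model.features` are the "black box" representations that the interpretation tasks would compare with neuromimetic sound representations.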
This project, at the intersection of computational auditory cognition, machine learning, and signal processing, will set the foundation for a systematic investigation of auditory representations by developing a methodology to train networks, probe their internal representations, and evaluate their perceptual relevance.
In this sense, the project will leverage new transdisciplinary synergies within the ILCB through the interlocking of complementary scientific methodologies.
Kell, A. J., Yamins, D. L., Shook, E. N., Norman-Haignere, S. V., & McDermott, J. H. (2018). A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy. Neuron, 98(3), 630-644.
Van Den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., ... & Kavukcuoglu, K. (2016). WaveNet: A generative model for raw audio. SSW, 125.
PhD co-supervisors: Nicolas Claidière (Laboratoire de Psychologie Cognitive) and Noël Nguyen (Laboratoire Parole et Langage), with the participation of Leonardo Lancia (Laboratoire de Phonétique et Phonologie, Univ Paris 3 & CNRS)
Abstract
In spoken language interactions, and for people to understand each other, speech sounds must be categorized consistently across listeners. Within a linguistic community, a common set of criteria must therefore be agreed upon, as regards how phonemic categories are delineated in the speech sound space. In spite of their central importance for social cognition and speech sciences, little attention has been devoted so far to the mechanisms that allow this shared perceptual landscape to emerge. The goal of this project will be to explore these mechanisms as they deploy within a group of participants, in an experimental framework.
The PhD student will contribute to devising an ensemble of innovative, joint-perception experiments, in which each listener's perceptual behavior can be affected by that of the other listeners. This will consist, for example, of having groups of listeners construct a mapping between unfamiliar speech sounds and sets of entities (e.g., visual shapes) in a coordinated way, within an experimental set-up that will make it possible for information to flow between listeners. Issues of interest will include the impact of local, pairwise interactions on the dynamics of the entire group, the geometry of the shared speech sound space and how it evolves over the course of the interactions between listeners, and the potential benefit of performing a speech perception task collectively rather than individually.
To a very large extent, these issues remain unexplored in the speech perception domain. However, fruitful connections can be made with related, albeit different, lines of work concerned with cultural transmission in both humans and non-human species (Claidière, Smith, Kirby & Fagot, 2014) and with experimental approaches to the emergence and evolution of language (e.g., Kirby et al., 2008; Xu et al., 2013). In this project, a bridge will be established between these different domains, which will allow the PhD candidate to exploit the experimental methods and mathematical tools developed in studies on cultural transmission to further our understanding of how speech perception works and how it contributes to the evolution of phonological systems.
For more information:
Nicolas Claidière’s website: http://www.nicolas.claidiere.fr/
Noël Nguyen’s webpage: https://cv.archives-ouvertes.fr/noel-nguyen
Leonardo Lancia’s list of publications: http://lpp.in2p3.fr/Publications-776
Postdoctoral project proposed under the supervision of Sophie Dufour (LPL) and Jean-Luc Schwartz (GIPSA-lab)
Duration: 2 years
Speech perception and production involve a series of cognitive processes that can be observed and functionally characterized through psycholinguistic experiments.
Curiously, however, these processes are most often studied independently.
The ambition of this project is to examine whether links exist between speech perception and production within a single phonological process (the manipulation of computational rules and the categorization of phonemes).
In continental French, there are two varieties that differ in their phonological systems.
Southern French (SF) does not oppose the phonemes /ɛ/ and /e/, whereas they contrast in Northern French (NF). In SF, both variants exist, but they are allophones derived by a computational rule governed by syllable structure: [ɛ] appears in CVC contexts, e.g. [fɛt] "fête" ("party"), and [e] in CV contexts, e.g. [fete] "fêter" ("to celebrate").
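Purely as an illustration of what such a computational rule looks like (this is not project code, and the function name is ours), the SF allophony can be expressed as a function of syllable structure:

# Toy illustration of the SF allophony rule described above:
# closed (CVC) syllables take [ɛ], open (CV) syllables take [e].
def sf_front_mid_vowel(syllable_is_closed: bool) -> str:
    return "ɛ" if syllable_is_closed else "e"

# [fɛt] "fête": closed syllable; [fe.te] "fêter": open syllables.
assert sf_front_mid_vowel(True) == "ɛ"
assert sf_front_mid_vowel(False) == "e"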
Behavioral studies can bring to light possible differences in response times between phonological processes that slow down lexical access and, consequently, speech production and/or perception.
In parallel, brain-imaging approaches based on electroencephalography (EEG) make it possible to qualify, confirm, or refine behavioral results. Most often, EEG studies focus on speech perception.
Nevertheless, a few studies show that EEG paradigms can be adapted to speech production (Indefrey & Levelt, 2004; Sahin et al., 2009; Sato & Shiller, 2018).
This type of experiment requires substantial post-processing work to remove the muscle-movement artifacts of speech production from the EEG signals (Vos et al., 2010) in order to obtain clean ERPs.
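One common way to attenuate such artifacts, shown here purely for illustration, is independent component analysis (ICA). The minimal MNE-Python sketch below assumes a hypothetical raw recording file and hand-picked component indices; the dedicated method of Vos et al. (2010) differs in its details:

# Minimal sketch (not the Vos et al., 2010 pipeline): ICA-based removal of
# speech-related muscle artifacts from EEG with MNE-Python.
# The file name and excluded component indices are hypothetical.
import mne

raw = mne.io.read_raw_fif("production_task_raw.fif", preload=True)
raw.filter(l_freq=1.0, h_freq=None)  # high-pass filtering helps ICA converge

ica = mne.preprocessing.ICA(n_components=20, random_state=42)
ica.fit(raw)

# EMG-dominated components show broadband high-frequency power; in practice
# they are selected by visual inspection or automated criteria.
ica.exclude = [0, 3]  # hypothetical artifact components
raw_clean = ica.apply(raw.copy())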
Two series of experiments have been carried out at LPL (Aix-en-Provence) and at GIPSA-lab (Grenoble) in collaboration with Sophie Dufour, Noël Nguyen and Jean-Luc Schwartz, one presented at LabPhon 2018 (a behavioral experiment on production) and the other submitted to Neuroscience Letters (EEG data on perception).
The results of this work lead us to pursue further investigations centered on perception-production links, combining different approaches in the same participants in order to assess correlations across paradigms and to characterize their common principles and shared representations.
References
Indefrey, P., & Levelt, W. J. M. (2004). The spatial and temporal signatures of word production components. Cognition, 92(1), 101–144.
Sahin, N. T., Pinker, S., Cash, S. S., Schomer, D., & Halgren, E. (2009). Sequential processing of lexical, grammatical, and phonological information within Broca's area. Science, 326(5951), 445-449.
Sato, M., & Shiller, D. M. (2018). Auditory prediction during speaking and listening. Brain & Language, 187, 92-103.
Vos, D. M., Riès, S., Vanderperren, K., Vanrumste, B., Alario, F. X., Huffel, V. S., & Burle, B. (2010). Removal of muscle artifacts from EEG recordings of spoken language production. Neuroinformatics, 8(2), 135-150.
Reading acquisition establishes the functional and anatomical connections between the auditory and visual systems.
Interestingly, this recurrent communication between the two systems also induces more profound changes in the activity and properties of neurons within each sensory system itself (Dehaene et al., 2010).
Our recent study using a combination of Transcranial Magnetic Stimulation and an adaptation protocol (Pattamadilok, Planton, & Bonnard, 2019) showed that the Visual Word Form Area (VWFA), i.e., the key area of the reading network, not only contains neurons that encode orthographic information as currently assumed, but also those that encode phonological information.
The emergence of these spoken language coding neurons in the ventral visual pathway could be considered as cortical reorganization subsequent to learning to read.
The present proposal aims to further investigate 1) the fine-scale spatial organization of different (functionally segregated) neuronal populations within the VWFA and 2) the temporal dynamics of the communication between this area and those that belong to the spoken language network.
The first issue will be addressed in an fMRI study using a cross-modal activation protocol. Both univariate analysis and multivariate Representational Similarity Analysis (RSA) will be applied to examine fine-grained patterns of activity within the VWFA.
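As a minimal sketch of the RSA step, assuming a (conditions × voxels) matrix of VWFA activity patterns and a model RDM (both replaced by stand-in data here):

# Minimal RSA sketch: correlate a neural RDM computed from activity patterns
# with a model RDM. All inputs below are stand-in placeholders.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
patterns = rng.normal(size=(8, 200))  # 8 conditions x 200 voxels (stand-in)
model_rdm = rng.random(8 * 7 // 2)    # stand-in model dissimilarities (28 pairs)

neural_rdm = pdist(patterns, metric="correlation")  # 1 - Pearson r per pair
rho, p = spearmanr(neural_rdm, model_rdm)
print(f"model-neural RDM correlation: rho={rho:.2f}, p={p:.3f}")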
The second issue will be addressed using an intracerebral EEG recording in epileptic patients.
In addition to a better understanding of the theoretical questions mentioned above, the present project will contribute to the ongoing elaboration of a cerebral cartography for pre-surgical evaluations of epileptic patients and to the development of the MIA toolbox, a software package for the analysis of intracerebral EEG signals across multiple patients (https://github.com/annesodub/mia).
We are looking for a candidate with a background in cognitive neuroscience, experience in functional MRI (experimental design, data acquisition, preprocessing, analysis) and relevant programming skills (e.g., Matlab).
Experience with MEG or EEG is a plus.
This 2-year project will be supervised by Chotiga Pattamadilok (Laboratoire Parole et Langage, Aix-en-Provence), Dr. Agnès Trébuchon (Institut de Neurosciences des Systèmes; Timone Hospital, Marseille) and Anne-Sophie Dubarry (Laboratoire Parole et Langage, Aix-en-Provence). Interested candidates can contact C. Pattamadilok via email (chotiga.pattamadilok@lpl-aix.fr).
A CV with complete list of publications, a letter of motivation (1-2 pages) and a letter of recommendation or contact information of a potential referee will be requested at a later stage.
References
Dehaene, S., Pegado, F., Braga, L. W., Ventura, P., Nunes Filho, G., Jobert, A., … Cohen, L. (2010). How Learning to Read Changes the Cortical Networks for Vision and Language. Science, 330(6009), 1359–1364.
Pattamadilok, C., Planton, S., & Bonnard, M. (2019). Spoken language coding neurons in the Visual Word Form Area: Evidence from a TMS adaptation paradigm. NeuroImage, 186, 278–285.
Supervisors: Sophie Dufour & Amandine Michelas
In contrast to languages such as Spanish, in French the position of accent within a word does not change its meaning (e.g., Spanish /'bebe/ "s/he drinks" vs. /be'be/ "baby", whereas in French both forms mean the same word, "baby").
In French, the main accent, called primary accent, affects the last syllable of a larger unit than the word, that is, the accentual phrase.
For instance, the monosyllabic word chat "cat" receives primary accent in the phrase un petit 'chat "a little cat" because it is the last full syllable of the accentual phrase.
In contrast, it is unaccented in the phrase un chat 'triste "a sad cat" because it is not in final position within the accentual phrase.
To date, there have been numerous demonstrations that, in French, accent is used in syntactic parsing and in the segmentation of continuous speech into words (Christophe et al., 2004; Spinelli et al., 2010).
However, the role of accent in spoken word recognition is still poorly documented.
Since French speakers are inevitably exposed to both the accented and unaccented versions of words, models assuming the storage of multiple variants (Connine, 2004; Goldinger, 1998) predict that accent in French could be represented in the mental lexicon. In this PhD project, using both EEG and behavioral experiments, we will examine how accent is represented in French and how it affects spoken word recognition.
PhD candidates will be expected to have a background in psycholinguistics and/or phonetics and to demonstrate an interest in word recognition and prosody.
References
Christophe, A., Peperkamp, S., Pallier, C., Block, E., & Mehler, J. (2004). Phonological phrase boundaries constrain lexical access I. Adult data. Journal of Memory and Language, 51, 523-547.
Connine, C. M. (2004). It's not what you hear, but how often you hear it: On the neglected role of phonological variant frequency in auditory word recognition. Psychonomic Bulletin & Review, 11, 1084–1089.
Goldinger, S. D. (1998). Echoes of echoes? An episodic theory of lexical access. Psychological Review, 105, 251–279.
Spinelli, E., Grimault, N., Meunier, F., & Welby, P. (2010). An intonational cue to word segmentation in phonemically identical sequences. Attention, Perception, & Psychophysics, 72, 775-787.
Location: Institut de Neurosciences de la Timone, Marseille, France.
Principal Investigators: Dr. Bruno L. Giordano (Institut de Neurosciences de la Timone, Marseille);
Prof. Thierry Artières (Laboratoire d’Informatique et Systèmes, Marseille).
Collaborator: Dr. Christian G. Bénar (Institut de Neurosciences de Systèmes, Marseille).
We learn about the acoustical environment through a variety of tasks, such as discriminating, categorizing and identifying diverse sound sources across many domains (environmental sounds, music, voice, speech, etc.; Giordano et al., 2013, 2014).
The ability to perform many different tasks across multiple domains is a key aspect of behavioral flexibility, and is thought to rely on the function of the prefrontal cortex, a structure that subserves flexible inference and task representations for effective learning (Cao et al. 2019; Cole et al., 2013).
Computational models, however, often struggle with such flexibility (catastrophic forgetting in deep neural networks - DNNs; cf. Yang et al., 2019), and overlook prefrontal functions in the perception and learning of sound-generating sources (Kell et al., 2018; cf. Jiang et al., 2018).
As a consequence, it is currently unknown how the auditory system readily learns to perform multiple tasks in a short time, and how cerebral representations are formed to guide behavior (e.g., learning rate) throughout the learning process.
One candidate strategy to achieve this flexibility is that the brain represents multiple tasks as a small number of their underlying components and uses knowledge of this generalizable structure to facilitate learning (Reverberi et al., 2012).
To investigate this hypothesis, we will carry out a task-rich magnetoencephalography (MEG) study using sounds of natural sources (speakers, musical instruments, non-music non-living objects).
This project will in particular capitalize on the superior unsupervised learning ability of the auditory system (Goudbeek et al., 2017) to disentangle statistical and rule-based learning effects on the trial-by-trial evolution of cerebral and behavioral responses, and to examine the role of dissimilarity estimation as a core component of task representation in the brain (Ashby & Valentin, 2017).
Key aims include: (1) using MVPA to track the trial-by-trial evolution of task representations in source-localized MEG data (Cao et al., 2019; GIORDANO, BÉNAR); (2) developing novel neural-network models of task compositionality (e.g., DNNs trained with continual learning and unsupervised methods; Chen et al., 2018; Yang et al., 2019; ARTIÈRES); (3) using connectivity methods to pinpoint network hubs associated with task representations and task-modulated information transfer (Cole et al., 2013; Giordano et al., 2017; GIORDANO).
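As an illustrative sketch of aim (1), time-resolved decoding of task identity from MEG epochs can be set up with MNE-Python's SlidingEstimator; the data shapes and labels below are hypothetical placeholders:

# Sketch of time-resolved decoding (MNE-Python + scikit-learn); the data are
# random stand-ins with the shape of an epoched MEG recording.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from mne.decoding import SlidingEstimator, cross_val_multiscore

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 64, 100))  # trials x sensors x time samples
y = rng.integers(0, 2, size=120)     # task label per trial

clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
decoder = SlidingEstimator(clf, scoring="roc_auc")
scores = cross_val_multiscore(decoder, X, y, cv=5)  # folds x time points
print("mean AUC over time:", scores.mean(axis=0))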
The post-doctoral fellow will lead the design, execution and analysis of this MEG study on task representation in the human auditory system. The ideal candidate will have a strong background in computational modelling of behavior as applied to the multivariate analysis of MEG data, be proficient in Matlab and Python, and show evidence of the ability to lead a scientific project under the supervision of multiple PIs, as well as a commitment to publish in high-profile journals.
Candidates should send their CV, two reference letters and a motivation letter to:
Bruno L. Giordano (bruno.giordano@univ-amu.fr)
Thierry Artières (thierry.artieres@lis-lab.fr)
References
Ashby, G., & Valentin, V. (2017). Multiple systems of perceptual category learning: Theory and cognitive tests. In H. Cohen & C. Lefebvre (Eds.), Handbook of categorization in cognitive science (pp. 157-188). San Diego, CA, US: Elsevier Academic Press.
Cao, Y., Summerfield, C., Park, H., Giordano, B. L., & Kayser, C. (2019). Causal inference in the multisensory brain. Neuron, in press.
Chen, M., Denoyer, L., Artières, T. (2018). Multi-view data generation without view supervision. International Conference on Learning Representations (ICLR).
Cole, M. W., Reynolds, J. R., Power, J. D., Repovs, G., Anticevic, A., & Braver, T. S. (2013). Multi-task connectivity reveals flexible hubs for adaptive task control. Nature Neuroscience, 16(9), 1348.
Giordano, B. L., McAdams, S., Zatorre, R. J., Kriegeskorte, N., & Belin, P. (2013). Abstract encoding of auditory objects in cortical activity patterns. Cerebral Cortex, 23(9), 2025-2037.
Giordano, B. L., Pernet, C., Charest, I., Belizaire, G., Zatorre, R. J., & Belin, P. (2014). Automatic domain-general processing of sound source identity in the left posterior middle frontal gyrus. Cortex, 58, 170-185.
Giordano, B.L., Ince, R.A., Gross, J., Schyns, P.G., Panzeri, S. and Kayser, C. (2017). Contributions of local speech encoding and functional connectivity to audio-visual speech perception. Elife, 6, p.e24763.
Goudbeek, M., Smits, R., Cutler, A., & Swingley, D. (2017). Auditory and phonetic category formation. In Handbook of Categorization in Cognitive Science (pp. 687-708). Elsevier.
Jiang, X., Chevillet, M. A., Rauschecker, J. P., & Riesenhuber, M. (2018). Training humans to categorize monkey calls: auditory feature-and category-selective neural tuning changes. Neuron, 98(2), 405-416.
Kell, A. J., Yamins, D. L., Shook, E. N., Norman-Haignere, S. V., & McDermott, J. H. (2018). A task-optimized neural network replicates human auditory behavior, predicts brain responses, and reveals a cortical processing hierarchy. Neuron, 98(3), 630-644.
Reverberi, C., Görgen, K. & Haynes, J.-D. (2012). Compositionality of rule representations in human prefrontal cortex. Cerebral Cortex, 22, 1237–1246.
Yang, G. R., Joglekar, M. R., Song, H. F., Newsome, W. T., & Wang, X. J. (2019). Task representations in neural networks trained to perform many cognitive tasks. Nature Neuroscience, 22(2), 297.
Supervisors: Olivier Coulon (Institut de Neurosciences de la Timone),
Adrien Meguerditchian (Laboratoire de Psychologie Cognitive, AMU, Marseille, France), W. Hopkins (Neuroscience Institute and Language Research Center, Georgia State University, Atlanta USA)
location: MeCA team, Institut de Neurosciences de la Timone, Marseille, France.
The MeCA team at INT has developed a human cortical organization model that provides a statistical description of the relative position, orientation, and long-range alignment of cortical sulci on the surface of the cortex [1].
This model can be instantiated on the cortical surface of any individual (extracted from MR images), and supports inter-subject comparisons and cortical parcellation [2].
The goal of this project is to build new models for non-human primate species.
Starting from the human model, a nested sub-model can be developed for chimpanzees, from which in turn a model for baboons can be built, then again for macaques.
This series of nested models will define a hierarchy of cortical complexity and will provide the means to transport any cortical information (functional, anatomical, geometrical) from one species to another and to perform direct inter-species comparisons.
A proof of concept has already been proposed for humans and chimpanzees [3, Fig.1].
The post-doctoral fellow will develop complete models for chimpanzees, baboons, and macaques, and apply them to study local cortical expansions across species, as well as to compare the localization of known cortical asymmetries across species.
Models and associated tools will be made available to the neuroimaging community via the BrainVisa software platform.
The candidate will use existing tools and adapt them to new species.
MR image databases will be provided for each species.
Basic knowledge of programming languages such as Matlab or Python is expected, as well as a strong interest in neuroimaging and/or computational anatomy.
References
[1] Auzias G, Lefèvre J, Le Troter A, Fischer C, Perrot M, Régis J, Coulon O (2013). Model-driven Harmonic Parameterization of the Cortical Surface: HIP-HOP. IEEE Trans Med Imaging, 32(5):873-887.
[2] Auzias G, Coulon O, Brovelli A (2016). MarsAtlas: A cortical parcellation atlas for functional mapping. Human Brain Mapping, 37(4):1573-1592.
[3] Coulon O, Auzias G, Lemercier P, Hopkins W (2018). Nested cortical organization models for human and non-human primate inter-species comparisons. Int. Conference of the Organization for Human Brain Mapping.
Supervisors: Joël Fagot, Nicolas Claidière (Laboratoire de Psychologie Cognitive), Noel Nguyen (Laboratoire Parole et Langage)
Transverse question 1: “Precursors of Language”
Contact: Dr. J. Fagot, joel.fagot@univ-amu.fr. Webpage: https://lpc.univ-amu.fr/fr/profile/fagot-joel
Cumulative culture in nonhuman primates and the evolution of language
Children learn a language by being exposed to the speech produced by speakers of that language; they then become speakers themselves.
This process of iterated learning largely explains why languages evolve through time: at each generation, the changes introduced by new speakers are passed on to future generations.
Experiments involving transmission chains can capture such a process.
For instance, Kirby, Cornish, and Smith (2008) introduced a non-structured language (random associations between a set of visual stimuli and artificially constructed labels) as input in transmission chains and found that this language became progressively more structured and easier to learn. However, the importance of iterated learning in determining the structure of a language is difficult to evaluate in humans, because humans have necessarily already acquired a language before participating in experiments.
That first acquisition will then inevitably guide the evolution of the experimental language according to the principles just described (participants will be biased by their first language).
Studies with non-human animals, such as baboons, can overcome this difficulty, and the proposed project is to explore the effect of iterated learning on language-like structures in the baboon, a nonhuman, nonlinguistic primate species.
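To make the iterated-learning logic concrete, here is a toy simulation (illustrative only; it is not the experimental paradigm, and all design choices are ours). Labels for (shape, color) meanings are transmitted through learners with limited exposure, and a crude measure of compositional structure tends to increase across generations:

# Toy transmission-chain simulation of iterated learning (illustrative only).
import random
from collections import Counter

random.seed(1)
SHAPES, COLORS, LETTERS = range(4), range(4), "abcd"
meanings = [(s, c) for s in SHAPES for c in COLORS]

def learn(language, n_seen=10):
    # The learner sees only n_seen of the 16 (meaning, label) pairs, keeps
    # them verbatim, and generalizes to unseen meanings by reusing the most
    # frequent letter observed for each shape and each color.
    seen = dict(random.sample(list(language.items()), n_seen))
    shape_counts = {s: Counter(lab[0] for (s2, _), lab in seen.items() if s2 == s)
                    for s in SHAPES}
    color_counts = {c: Counter(lab[1] for (_, c2), lab in seen.items() if c2 == c)
                    for c in COLORS}
    def pick(counts):
        return counts.most_common(1)[0][0] if counts else random.choice(LETTERS)
    labels = dict(seen)
    for (s, c) in meanings:
        if (s, c) not in labels:
            labels[(s, c)] = pick(shape_counts[s]) + pick(color_counts[c])
    return labels

def structure(language):
    # Fraction of labels consistent with a one-letter-per-shape plus
    # one-letter-per-color code: a crude compositionality measure.
    best_s = {s: Counter(language[(s, c)][0] for c in COLORS).most_common(1)[0][0]
              for s in SHAPES}
    best_c = {c: Counter(language[(s, c)][1] for s in SHAPES).most_common(1)[0][0]
              for c in COLORS}
    return sum(language[(s, c)] == best_s[s] + best_c[c]
               for (s, c) in meanings) / len(meanings)

language = {m: "".join(random.choices(LETTERS, k=2)) for m in meanings}
for generation in range(10):
    print(f"generation {generation}: structure = {structure(language):.2f}")
    language = learn(language)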
The post-doc will be based at the CNRS primate station in Rousset (near Aix-en-Provence), and will work with a world-unique "primate cognition and behavior platform" where baboons can interact freely with experiments presented on touch screens (for a range of experiments using this system see https://www.youtube.com/watch?v=6Ofd8cHVCYM).
This platform has previously been used to present transmission-chain experiments to baboons.
In relation to this project, previous experiments have revealed that transmission chains promote the appearance of typically linguistic features (structure, systematicity and lineage specificity; see e.g. Claidière et al., 2014).
The post-doc will take this line of research further.
A major challenge will be to extend our previously used visual pattern reproduction task to sound patterns, which may lend themselves to the emergence of a combinatorial structure along the transmission chain.
We are looking for highly motivated candidates with a PhD in Biology or Psychology, preferably with a focus on evolutionary mechanisms and/or language-related issues.
Candidates are also expected to have very good skills in programming and data analysis.
A previous experience with nonhuman primates would be a plus.
Candidates should contact Dr. Joël Fagot at joel.fagot@univ-amu.fr
References
Claidière, N., Smith, K., Kirby, S., & Fagot, J. (2014). Cultural evolution of systematically structured behaviour in a non-human primate. Proc. R. Soc. B, 281, 20141541.
Kirby, S., Cornish, H., & Smith, K. (2008). Cumulative cultural evolution in the laboratory: an experimental approach to the origins of structure in human language. Proc. Natl Acad. Sci. USA, 105, 10681–10686. (doi:10.1073/pnas.0707835105)
Supervisors: Andrea Brovelli (Institut de Neurosciences de la Timone - www.andrea-brovelli.net/), Demian Battaglia (Institut de Neurosciences des Systèmes - www.demian-battaglia.net), Frédéric Richard (Institut de Mathématiques de Marseille - www.latp.univ-mrs.fr/~richard/)
Scientific context and state-of-the-art
Language is a network process arising from the complex interaction of regions of the frontal and temporal lobes, connected anatomically via the dorsal and ventral pathways (Friederici and Gierhan, 2013; Fedorenko and Thompson-Schill, 2014; Chai et al., 2016).
An open question is how these brain areas coordinate to support language. Functional Connectivity (FC) analysis can provide the methodological framework to address this question.
FC analysis includes various forms of statistical dependencies between neural signals, ranging from linear correlation to more sophisticated measures quantifying directional influences between brain regions, such as Granger causality (Brovelli et al., 2004, 2015). Recently, however, it has become clear that a time-resolved analysis of FC, also known as Functional Connectivity Dynamics (FCD), can yield a novel perspective on brain network dynamics (Hutchison et al., 2013; Allen et al., 2014). Indeed, we have shown that non-trivial resting-state FCD is expected to stem from complex dynamics in cortical networks (Hansen et al., 2015) and that the fluency of FCD correlates with cognitive performance at the single-subject level across the human lifespan (Battaglia et al., 2017). In task-related conditions, FCD analyses have shown that visuomotor transformations follow a schedule of recruitment of different networks over time intervals on the order of hundreds of milliseconds (Brovelli et al., 2017).
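For concreteness, here is a minimal sketch of the sliding-window approach to FCD on stand-in source signals (window length, step, and data are arbitrary placeholders):

# Minimal sliding-window FCD sketch: a stream of FC matrices, then the
# window-by-window FCD similarity matrix. Data are random stand-ins.
import numpy as np

rng = np.random.default_rng(0)
n_regions, n_samples, win, step = 10, 2000, 200, 50
data = rng.standard_normal((n_regions, n_samples))  # stand-in source signals

fc_stream = []
for start in range(0, n_samples - win + 1, step):
    fc_stream.append(np.corrcoef(data[:, start:start + win]))
fc_stream = np.array(fc_stream)  # windows x regions x regions

# FCD matrix: similarity between the FC patterns of every pair of windows.
iu = np.triu_indices(n_regions, k=1)
fcd = np.corrcoef(fc_stream[:, iu[0], iu[1]])
print("FC stream:", fc_stream.shape, "| FCD matrix:", fcd.shape)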
Objective of the research project
These recent advances open up the possibility to tackle one of the long-term objectives of the ILCB, which is to characterise how language-related brain regions communicate.
This challenge, however, is limited by the lack of knowledge about the underlying neurophysiological mechanisms.
The objective of the Post-Doc research project is to characterise the neural correlates that could be used to track information transfer between brain regions in task-related conditions. First, the post-doc researcher will optimise current tools for estimating source-level brain activity (both power and phase information of neural oscillations) from magnetoencephalographic (MEG) data using an atlas-based approach (Auzias et al., 2016). Information transfer between brain regions will be quantified by means of FC and FCD analyses based on different metrics, including multivariate spectral methods, directional influences such as Granger causality, and information-theoretical quantities, which can track information storage, sharing and transfer (Kirst et al., 2016).
These metrics will be applied to different potential correlates of brain communication, such as power-to-power correlations, phase-to-phase relations and phase-to-amplitude couplings.
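As an example of the last family of measures, a minimal sketch of phase-amplitude coupling quantified as a mean vector length, computed on a stand-in signal (the frequency bands and data are illustrative placeholders):

# Minimal phase-amplitude coupling sketch: low-frequency phase modulating
# high-frequency amplitude, summarized by the mean vector length.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def bandpass(x, lo, hi, fs):
    sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

fs = 500.0
t = np.arange(0, 10, 1 / fs)
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 6 * t) + 0.5 * rng.standard_normal(t.size)  # stand-in

phase = np.angle(hilbert(bandpass(x, 4, 8, fs)))  # low-frequency phase
amp = np.abs(hilbert(bandpass(x, 60, 90, fs)))    # high-frequency amplitude
mvl = np.abs(np.mean(amp * np.exp(1j * phase)))   # coupling strength
print(f"phase-amplitude coupling (mean vector length): {mvl:.3f}")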
The analysis of FC and FCD representations and extraction of functional modules will then be performed using graph theory and temporal network representations (Holme and Saramäki, 2012; Brovelli et al., 2017).
To do so, we will exploit two MEG datasets: a first dataset collected by Andrea Brovelli, in which participants performed finger movements in response to the presentation of numerical digits (a simple visuomotor task), and a second collected by Xavier Alario, in which participants were required to name objects depicted on a screen (a naming task).
Profile of the Post-Doc candidate
The Post-Doc candidate will have a PhD in cognitive and computational neuroscience, bioengineering, physics or applied mathematics.
Proficient computational skills (Matlab and/or Python) and experience in the analysis of MEG data are required. Experience in the cognitive bases of language is welcome.
Contacts
Candidates should send their CV, 1 or 2 reference letters, and a motivation letter to: Andrea Brovelli (andrea.brovelli@univ-amu.fr), Demian Battaglia (demian.battaglia@univ-amu.fr), Frédéric Richard (frederic.richard@univ-amu.fr)
References
Allen EA, Damaraju E, Plis SM, Erhardt EB, Eichele T, Calhoun VD (2014). Tracking whole-brain connectivity dynamics in the resting state. Cereb Cortex, 24:663–676.
Auzias G, Coulon O, Brovelli A (2016). MarsAtlas: A cortical parcellation atlas for functional mapping. Hum Brain Mapp, 37:1573–1592.
Battaglia D, Thomas B, Hansen ECA, Chettouf S, Daffertshofer A, McIntosh AR, Zimmermann J, Ritter P, Jirsa V (2017). Functional Connectivity Dynamics of the Resting State across the Human Adult Lifespan. Available at: http://dx.doi.org/10.1101/107243.
Brovelli A, Badier J-M, Bonini F, Bartolomei F, Coulon O, Auzias G (2017). Dynamic Reconfiguration of Visuomotor-Related Functional Connectivity Networks. J Neurosci, 37:839–853.
Brovelli A, Chicharro D, Badier J-M, Wang H, Jirsa V (2015). Characterization of Cortical Networks and Corticocortical Functional Connectivity Mediating Arbitrary Visuomotor Mapping. J Neurosci, 35:12643–12658.
Brovelli A, Ding M, Ledberg A, Chen Y, Nakamura R, Bressler SL (2004). Beta oscillations in a large-scale sensorimotor cortical network: directional influences revealed by Granger causality. Proc Natl Acad Sci U S A, 101:9849–9854.
Chai LR, Mattar MG, Blank IA, Fedorenko E, Bassett DS (2016). Functional Network Dynamics of the Language System. Cereb Cortex, 26:4148–4159.
Fedorenko E, Thompson-Schill SL (2014). Reworking the language network. Trends Cogn Sci, 18:120–126.
Friederici AD, Gierhan SME (2013). The language network. Curr Opin Neurobiol, 23:250–254.
Holme P, Saramäki J (2012). Temporal networks. Phys Rep, 519:97–125.
Hutchison RM, Womelsdorf T, Allen EA, Bandettini PA, Calhoun VD, Corbetta M, Della Penna S, Duyn JH, Glover GH, Gonzalez-Castillo J, Handwerker DA, Keilholz S, Kiviniemi V, Leopold DA, de Pasquale F, Sporns O, Walter M, Chang C (2013). Dynamic functional connectivity: promise, issues, and interpretations. Neuroimage, 80:360–378.
Kirst C, Timme M, Battaglia D (2016). Dynamic information routing in complex networks. Nat Commun, 7:11061.
Supervisors: Elin Runnqvist (LPL) and Magalie Ochs (LSIS)
Collaborators: Noël Nguyen (LPL), Kristof Strijkers, (LPL) & Martin Pickering (University of Edinburgh)
QT4: “Cerebral and cognitive underpinnings of conversational interactions”
Traditionally, researchers have focused on either production or comprehension to investigate the underlying mechanisms of language processing.
However, in recent years a switch in focus has occurred towards the examination of both production and comprehension by looking at language processing in a conversational setting.
While this trend has started in many key fields of language processing, not all research domains have taken up this exciting new challenge.
With the current project, we would examine how the interaction with another interlocutor might impact the processes involved in error monitoring (i.e., detection and repair of errors) during language production.
While little to no research has examined monitoring in a conversational setting, there are monitoring models that take dialogue into account (e.g., Pickering & Garrod, 2014).
In the current proposal, we would test the predictions put forward by these models by employing several different tasks (e.g., the SLIP task, Runnqvist et al., 2016; the network description task, Declerck et al., 2016), and by manipulating several variables related to the speaker, to the task demands, and to the different levels of linguistic representation, using both behavioral and electrophysiological methods.
Furthermore, the use of an artificial agent as a conversational partner for parts of the project will allow for the manipulation of conversational variables (e.g., location or type of feedback). It will further allow us to examine whether the patterns observed with human partners are similar with artificial ones, speaking to the issue of whether monitoring is an automatic or a controlled process.
Both a virtual agent and a humanoid robot (Furhat) would be used to measure the effect of physical presence. Finally, multimodal aspects such as head nodding and smiling would be manipulated (e.g., Ochs et al., 2017).
The end goal of this project would be twofold: Concerning language processing, the objective is to have a better understanding of monitoring in conversation and its relation to monitoring in isolation. Concerning artificial intelligence, the end-goal would be to further improve our understanding of the linguistic, social and emotional factors that are essential for successful human-robot interactions.
Supervisors: Sylvain Takerkart (INT), Hachem Kadri (LIS), François-Xavier Dupé (LIS)
In neuroimaging, traditional group analyses rely on warping the functional data recorded in different individuals onto a template brain.
This template brain is constructed from the anatomy of the brain, either using standard templates (such as the ones provided in software libraries such as SPM or FSL) or using a population-specific template (which can, e.g., be computed using tools included in the ANTS and freesurfer packages).
Once projected onto such a common space, the General Linear Model (GLM) is applied to identify commonalities across subjects.
In other terms, this can be viewed as two successive averaging steps: first, the anatomical averaging that produces the template brain; second, the functional averaging that is performed through the GLM. Because the computation of the template brain is not a linear operation, these two steps are not commutative.
The final result is therefore biased by the choice of this order, a bias which can be very important in regions where the inter-individual anatomical variability is very strong.
In particular, brain regions involved in language processing, such as the inferior frontal gyrus, are strongly impacted by this bias.
We propose here a new framework that frees us from this methodological bias by performing both averaging operations simultaneously. Intuitively, this means that the anatomical averaging will exploit the functional information, and that the functional group analysis will directly draw on individual brain anatomy.
We frame this problem as a multi-view machine learning question.
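As one possible, simplified instantiation of this multi-view framing (illustrative only; this is not the algorithm to be designed in the project), canonical correlation analysis can relate an anatomical view and a functional view of the same subjects:

# Multi-view sketch: CCA between an anatomical and a functional view of the
# same subjects (scikit-learn). All data below are random placeholders.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_subjects = 50
anatomy = rng.normal(size=(n_subjects, 30))   # e.g., sulcal/shape features
function = rng.normal(size=(n_subjects, 40))  # e.g., GLM contrast features

cca = CCA(n_components=2)
a_scores, f_scores = cca.fit_transform(anatomy, function)
r = np.corrcoef(a_scores[:, 0], f_scores[:, 0])[0, 1]
print(f"first canonical correlation: {r:.2f}")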
The tasks of the post-doctoral fellow will consist of (1) designing and implementing an algorithm that can efficiently address this question, and (2) testing it on a variety of real MRI datasets available throughout the ILCB teams.
The first task will be conducted under the supervision of Sylvain Takerkart (INT, Banco team, Neuro-Computing Center), as well as François-Xavier Dupé and Hachem Kadri (LIS, Qarma team), who have been collaborating for several years on the design of new machine learning methods for neuroimaging.
The second task will involve applying this new method to various existing fMRI datasets recorded by the ILCB teams, such as experiments dedicated to: (1) studying plasticity in the auditory cortex, with a comparison of pianists and controls using a tonotopy paradigm (D. Schön, INS; S. Takerkart, INT), (2) understanding speaker recognition processes in the vocal brain (V. Aglieri, S. Takerkart, P. Belin, INT), (3) examining hierarchical processing in the inferior frontal gyrus (T. Chaminade, INT).
The expected benefit is an improved sensitivity of group studies, in both univariate and multivariate settings. Finally, a software tool will be released publicly so that all ILCB members, as well as the scientific community at large, can benefit from this new method.
Supervisors: Jean-François Bonastre, Christine Meunier
The aim is to compare the processes at work in the recognition of familiar voices, ordinary voices, and "alert signals" (which may be specific voices or predator calls).
Experiments recording brain activity (MEG?) would be of strong interest to verify whether the same cognitive processes are involved. Simulation and comparison with neural networks will broaden the scope of the results.
Supervisors: Benjamin Morillon (Institut de Neurosciences des Systèmes, INS), Kristof Strijkers (Laboratoire Parole et Langage, LPL)
(potential) ILCB collaborators: Daniele Schon (INS), Andrea Brovelli (INT), Elin Runnqvist (LPL), Marie Montant (LPC)
(potential) external collaborators: Anne-Lise Giraud (Geneva, Switzerland), Sonja Kotz (Maastricht, Netherlands), Friedemann Pulvermuller (Berlin, Germany)
ILCB PhD & Postdoctoral Topic Proposal
Primary QT: QT3
Secondary QT: QT5
While traditional models proposed a strict separation between the activation of motor and sensory systems for the production versus perception of speech, respectively, by now most researchers agree that there is much more functional interaction between sensory and motor activation during language behavior. Despite this increasing consensus that the integration of sensorimotor knowledge plays an important role in the processing of speech and language, much less consensus exists on what that exact role may be as well as the functional mechanics that could underpin it.
Indeed, many questions from various perspectives remain open issues in the current state-of-the-art: Is the role of sensorimotor activation modality-specific, in that it serves a different functionality in perception than in production?
Is it only relevant for the processing of speech sounds or does it also play a role in language processing and meaning understanding in general?
Can sensory codes be used to predict motor behavior (production) and can motor codes be used to predict sensory outcomes (perception)?
And if so, how are such predictions implemented at the mechanistic level (e.g., do different oscillatory entrainment between sensory and motor systems reflect different dynamical and/or representational properties of speech and language processing)?
And in which manner can such sensorimotor integration go from arbitrary speech sounds to well-structured meaningful words and language behavior?
The goal of this project is to advance our understanding of these open questions (in different ‘sub-topics’) by taking advantage of the complementary knowledge of the supervisors, with B. Morillon being an expert on the cortical dynamics of sensorimotor activation in the production and perception of speech, and K. Strijkers being an expert on the cortical dynamics of sensorimotor activation in the production and perception of language.
At the center of the project, and as connecting Red Thread, is the shared interest of the supervisors in the role of ‘time’ (temporal coding) as a potential key factor that causes sensorimotor activation to bind during the processing of speech and language. Upon this view, ‘time’ transcends its classical notion as a processing vehicle (i.e., simple propagation of activation from sensory to motor systems and vice versa) and may reflect representational knowledge of speech and language.
One of the main goals of the current project is thus to test the hypothesis that temporal information between sensory and motor codes serves a key role in the production and perception of speech and language.
More specifically, we will explore whether sensorimotor integration during speech and language processing reflects: (a) the prediction of temporal information; (b) the temporal structuring of speech sounds and articulatory movement; (c) the temporal binding of phonemic and even lexical elements in language.
We will consider PhD candidates and post-doctoral researchers to conduct research around any of the three topics specified above (a-c). Interested candidates can contact us via email (Benjamin Morillon: bnmorillon@gmail.com; Kristof Strijkers: Kristof.strijkers@gmail.com), including a CV and a motivation letter (1-2 pages).
A strong background in speech and language processing and/or knowledge of spatiotemporal neurophysiological techniques and analyses will be considered a strong plus.
Supervisors: Florence Gaunet (LPC), Thierry Legou (LPL) & Prof. Anne-Lise Giraud (Geneva Univ; IMERA position from Feb to June 2019)
Implications: QT1 (main: involvement of motricity/motor representations in speech perception),
QT3 (secondary: the animal as a model for the study of language)
Requested: postdoc or doctoral grant
Summary: We intend to explore dogs' neural and perceptual responses to syllabic speech, in order to understand auditory speech processing in a species with reduced articulated production capabilities, and therefore reduced motor control. Dogs might rely, for perceiving speech, only on the acoustic cues they can themselves produce, i.e. short "syllable-like" intonated sounds. Alternatively, they might be sensitive to cues that they cannot produce at all.
Given the expertise of dogs in using human speech, the findings will provide insights into the brain mechanisms of speech processing, i.e. the extent to which motor representations are involved in speech perception.
Supervisors: Magalie Ochs
Keywords: Human-Machine Interaction, Artificial agent, Affective computing, Social signal processing
Supervisors: Eric Castet
Efficiency of a Virtual Reality Headset to improve reading in low vision persons.
People with low vision, in contrast to blind people, have not lost the entirety of their visual functions.
The leading cause of low vision in Western countries is AMD (Age-related Macular Degeneration), a degenerative, non-curable retinal disease occurring mostly after the age of 60. Recent projections estimate that the total number of people with AMD in Europe will be between 19 and 26 million in 2040.
The most important wish of people with AMD is to improve their ability to read by using their remaining functional vision.
Capitalizing on recent technological developments in virtual reality headsets, we have developed a VR reading platform (implemented in the Samsung Gear VR headset).
This platform provides a dynamic system that lets readers use augmented-vision tools specifically designed for reading (Aguilar & Castet, 2017), as well as text simplification techniques currently tested in our lab.
Our project is to assess whether this reading platform can improve reading performance both quantitatively (reading speed, accuracy, ...) and qualitatively (comfort, stamina, ...).
Experiments will be performed in the ophthalmology department of the University Hospital of La Timone (Marseille).