PhD projects

1/10/2020 - 30/09/2023

Bridging communication in behavioural and neural dynamics

Isaïh Mohamed
Supervisors: Daniele Schön, Institut de Neurosciences des Systèmes & Leonardo Lancia, Laboratoire de Phonétique et Phonologie

The aim of this project is to bridge interpersonal verbal coordination and neural dynamics. In practice, we will collect neurophysiological data from individuals (mostly patients with intracranial recordings) performing different interactive language tasks. We will use natural language processing methods to estimate objective features of verbal coordination from speech/language signals. We will then use machine learning and information-theoretic approaches to relate the dynamics of the coordinative verbal behaviour to spatio-temporal neural dynamics.
More precisely, we plan to use several tasks that have proven efficient in the study of verbal interaction. Some tasks are rather constrained and controlled (allowing us to manipulate the coordinative dynamics), while others assess conversation under more natural conditions. Speech recordings allow coordination to be quantified at different linguistic levels in a time-resolved manner. These metrics can then be used to interpret changes in neural dynamics as a function of verbal coordination. We plan to use two complementary approaches: a machine learning approach (decoding the speech signal of the speaker from the neural signal of the listener) and an information-theoretic approach (modelling the extent to which the relation between neural signals and upcoming speech is influenced by the current level of coordination, estimated by convergence, for instance).
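As an illustration of the decoding idea (predicting the speaker's speech signal from the listener's neural activity), here is a minimal sketch using ridge regression on time-lagged features. The data are synthetic and the setup deliberately simplified; it is not the project's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_times, n_channels, n_lags = 5000, 16, 5

# Synthetic "neural" data, and a speech envelope that depends on it at several lags.
neural = rng.standard_normal((n_times, n_channels))
true_w = rng.standard_normal((n_lags, n_channels))
envelope = sum(np.roll(neural, lag, axis=0) @ true_w[lag]
               for lag in range(n_lags)) + 0.5 * rng.standard_normal(n_times)

# Design matrix of time-lagged neural features (lags 0..n_lags-1).
X = np.hstack([np.roll(neural, lag, axis=0) for lag in range(n_lags)])
X, y = X[n_lags:], envelope[n_lags:]          # drop wrap-around samples

X_tr, X_te, y_tr, y_te = train_test_split(X, y, shuffle=False, test_size=0.25)
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
r = np.corrcoef(model.predict(X_te), y_te)[0, 1]
print(f"decoding correlation: {r:.2f}")
```

In a real analysis the envelope would come from the speaker's audio and the lagged features from the listener's intracranial signal, with decoding accuracy compared across coordination conditions.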
Overall, this project will yield a better understanding of the link between behavioural coordinative dynamics and neural dynamics. For instance, compared to simple coordinative dynamics, more difficult coordinative behaviour will probably require a change in the ratio between top-down and bottom-up connections between frontal and temporal regions in specific frequency bands (an increase in top-down beta and a decrease in bottom-up gamma).
The strength of this project lies in merging sophisticated coordination designs, advanced analysis of verbal coordination dynamics and cutting-edge neuroscience tools with unique neural data in humans.


1/10/2020 - 30/09/2023

Opening a window onto readers' minds: determining with TMS and EEG the cortical network involved in oculomotor behaviour during reading

Régis Mancini
Supervisors: Françoise Vitu, Laboratoire de Psychologie Cognitive & Boris Burle, Laboratoire de Neurosciences cognitives



Eye movements during reading have been studied for more than a century, revealing a very stereotyped behaviour despite significant variability in saccade amplitude and fixation positions on the lines of text. Most of the models proposed to account for this behaviour are based on cognitive guidance of the gaze, and therefore presuppose essentially top-down control. These top-down models are nevertheless contradicted by the recent finding that an "illiterate" model of saccade programming in the superior colliculus, a multi-integrative subcortical structure, fairly accurately predicts readers' oculomotor behaviour simply from early visual processing starting in the retina (luminance contrast). This result instead suggests a secondary role for the neocortex in oculomotor control during reading.
The thesis aims, on the one hand, to characterize the cortical network involved in oculomotor control during reading and, on the other, to determine the temporal dynamics of activation of these cortical areas. This research will first rely on transcranial magnetic stimulation (TMS), which transiently inactivates a given cortical area in healthy participants, combined with the recording of eye movements during a sentence-reading task. The effect of inactivating a given cortical area on classically observed oculomotor behaviours would thus indicate its involvement in reading. In a second step, the TMS studies will be complemented by an approach based on electroencephalographic (EEG) recordings.

1/10/2020 - 30/09/2023

Development of Children's Communicative Coordination

Mitja Nikolaus
Supervisors : Abdellah Fourtassi (Aix-Marseille Université, Laboratoire d'Informatique Système), Laurent Prévot (Aix-Marseille Université, Laboratoire Parole et Langage)

Research Lab: CoCoDev
The study of how the ability for coordinated communication emerges in development is both an exciting scientific frontier, at the heart of debates about the uniqueness of human cognition (Tomasello, 2014), and an important applied issue for AI (Antle, 2013).
Early signs of coordination (e.g., through gaze and smile) can be found in preverbal infants (Yale, 2003), but the ability to engage in coordinated verbal communication (Clark, H. & Brennan, 1991) takes years to mature.
Learning such coordination, especially with the caregivers, is crucial for the child's healthy cognitive development (Hoff, 2006; Gelman, 2009).
Very few studies examined the nature of children's communicative coordination and its development in the natural environment (that is, outside controlled laboratory studies).
Further, existing naturalistic studies (e.g., Clark, E. 2015), though insightful, have been based on anecdotal observations, leading to rather qualitative conclusions.
Thus, previous work has not provided a theoretical model that could explain, quantitatively, the naturally occurring data, let alone provide a basis for theory-informed applications. This project will contribute to filling this gap.
We will combine AI tools from NLP and Computer Vision to study the multimodal dynamics of children's communicative coordination with caregivers, laying the foundation for a data-driven model that would 1) provide us with a scientific understanding of the natural phenomena and 2) guide us through the design of child-computer interaction systems that can be used to test and evaluate the model.

Antle (2013). Research opportunities: Embodied child-computer interaction. International Journal of Child-Computer Interaction.
Clark, E. (2015) Common ground. The Handbook of Language Emergence.
Clark, H. & Brennan (1991). Grounding in communication. Perspectives on socially shared cognition.
Gelman (2009). Learning from others: Children's construction of concepts. Annual review of psychology.
Hoff (2006). How social contexts support and shape language development. Developmental Review.
Tomasello (2014). A natural history of human thinking. Cambridge, MA: Harvard University Press.
Yale, Messinger, Cobo-Lewis, & Delgado (2003). The temporal coordination of early infant communication. Developmental Psychology.

1/11/2019 - 31/10/2022

The emergence of social conventions in the categorisation of speech sounds

Elliot Huggett
Supervisors : Noël Nguyen (Aix-Marseille Université, Laboratoire Parole et Langage) , Nicolas Claidière (Aix-Marseille Université, Laboratoire de Psychologie Cognitive)

The ability to establish shared conventions is a fundamental part of linguistic categorisation behaviour. At all levels of a language, its speakers must agree on the ways in which the world is divided up in order for successful communication to be possible. This behaviour is often studied in the domain of semantics, with colour words and kinship terms being prime examples of problems that must be solved through categorisation but admit many different solutions across the world's languages. A categorisation problem with similarly diverse cross-linguistic solutions that remains unstudied, however, is the system that must be established for a language to have a shared, consistent phonology. The space of possible vowels is continuous and must be divided into discrete categories, not only as a function of one speaker's individual perceptions but taking into account the perceptions of all speakers of the language, so that a shared, consistent solution can be reached. This solution is the product of generations of speakers repeatedly solving the problem through interaction: the categorisation system of a given language is a culturally evolved behaviour, shaped not only by the dynamics between its speakers at any given time but also by dynamics stretching back generations.
In this project we aim to investigate the effects that different group dynamics have upon the conventionalisation of speech sound categories. This will be done through a series of innovative speech categorisation experiments, in which participants are trained on categories at the extremes of an acoustic space bounded by sounds unfamiliar to them from their native language, and then asked to categorise other sounds within this space. Initially, this will be performed individually to establish their prior tendencies. Following from this, they will be assigned to pairs, larger groups, or iteration chains, and the ways in which their categorisation becomes conventionalised under these different social dynamics will be studied. The results from these experiments will inform the creation of multi-agent models, to help us better understand the dynamics at play and the ways in which interaction and transmission lead to the conventionalisation of categories in the unfamiliar acoustic space. Over the course of the project, new methods and analyses will be developed, drawing on and combining the literatures on phonology, categorisation behaviour and cultural evolution, providing key insights into a fundamental part of linguistic behaviour and highlighting further areas of interest.
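As a hedged illustration of the multi-agent modelling mentioned above, the toy simulation below gives each agent a single category boundary in a normalized acoustic space; whenever two agents categorize a stimulus differently, both shift their boundaries towards the midpoint. The update rule and all parameters are invented for illustration, not taken from the project.

```python
import numpy as np

rng = np.random.default_rng(1)
n_agents, n_rounds, rate = 20, 2000, 0.2

# Each agent's category boundary in a normalized acoustic space [0, 1].
boundaries = rng.uniform(0.2, 0.8, n_agents)
initial_spread = boundaries.std()

for _ in range(n_rounds):
    a, b = rng.choice(n_agents, size=2, replace=False)
    stim = rng.uniform(0, 1)
    # If the two agents categorize the stimulus differently, each shifts
    # its boundary towards the pair's midpoint (alignment after failure).
    if (stim < boundaries[a]) != (stim < boundaries[b]):
        mid = (boundaries[a] + boundaries[b]) / 2
        boundaries[a] += rate * (mid - boundaries[a])
        boundaries[b] += rate * (mid - boundaries[b])

final_spread = boundaries.std()
print(f"boundary spread: {initial_spread:.3f} -> {final_spread:.3f}")
```

Even this minimal rule produces convergence of the population towards a shared boundary; richer variants (iteration chains, larger groups, perceptual biases) would modify the interaction loop.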

1/10/2019 - 30/09/2022

Wavelet-based multidimensional characterization of brain networks in language tasks

Clément Verrier
Supervisors: Bruno Torrésani (Institut de Mathématiques de Marseille), Christian Bénar (Institut de Neurosciences des Systèmes)

Brain function involves complex interactions between cortical areas at different spatial and temporal scales. Thus, the spatio-temporal definition of brain networks is one of the main current challenges in neuroscience. With this objective in view, electrophysiological techniques such as electroencephalography (EEG) and magnetoencephalography (MEG) offer a fine temporal resolution that allows capturing fast changes (at the level of the millisecond) across a wide range of frequencies (up to 100 Hz).
However, the spatial aspects require solving a difficult (extremely ill-posed) inverse problem that projects the signals recorded at the surface sensors onto the cortex. Current techniques for extracting spatio-temporal networks in MEG and EEG suffer from the inherent difficulties of solving this inverse problem. We propose to use a novel wavelet analysis approach to improve the extraction of language networks from MEG signals. The methods will be validated using simultaneous MEG-intracerebral EEG recordings. More precisely, the objective is to develop algorithms and data-analysis procedures for the spatio-temporal characterization of brain networks across multiple frequencies in EEG and MEG signals, to validate them on simulated and real signals, and to apply the developed methodology to language protocols within the framework of ILCB.
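As background for the wavelet approach, the sketch below computes a time-frequency representation of a synthetic MEG-like trace by convolution with complex Morlet wavelets. This is a generic textbook construction, not the novel method the project proposes to develop.

```python
import numpy as np

def morlet_tfr(signal, sfreq, freqs, n_cycles=7.0):
    """Time-frequency power via convolution with complex Morlet wavelets.
    Returns an array of shape (len(freqs), len(signal))."""
    power = np.empty((len(freqs), len(signal)))
    for i, f in enumerate(freqs):
        sigma_t = n_cycles / (2 * np.pi * f)          # wavelet width in time
        t = np.arange(-3 * sigma_t, 3 * sigma_t, 1 / sfreq)
        wavelet = np.exp(2j * np.pi * f * t) * np.exp(-t**2 / (2 * sigma_t**2))
        wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # unit-energy norm
        power[i] = np.abs(np.convolve(signal, wavelet, mode="same")) ** 2
    return power

# Synthetic MEG-like trace: a 40 Hz burst in the middle of 1 s of noise.
sfreq = 500.0
t = np.arange(0, 1, 1 / sfreq)
sig = 0.2 * np.random.default_rng(2).standard_normal(t.size)
sig[200:300] += np.sin(2 * np.pi * 40 * t[200:300])

freqs = np.arange(10, 90, 5)
tfr = morlet_tfr(sig, sfreq, freqs)
peak_freq = freqs[np.argmax(tfr[:, 250])]   # frequency of max power mid-burst
print(f"peak frequency during burst: {peak_freq} Hz")
```

The number of cycles trades temporal against spectral resolution, which is exactly the kind of trade-off the project's multidimensional characterization must manage across the frequency range of interest.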

1/11/2018 - 31/10/2021

Implication of Subcortical Brain Structures and Cortico-Subcortical Loops in Early and Late Stages of Speech Motor Sequence Learning: a Within Subject fMRI/MEG Study

Snežana Todorović
Supervisors : Elin Runnqvist (Aix-Marseille Université, CNRS-LPL), Sonja Kotz (University of Maastricht)
Collaborator: Andrea Brovelli (Aix-Marseille Université, CNRS-INT)

The ability to interpret and produce structured, meaningful sequences is undoubtedly at the core of human language. Most frequently, the learning of such sequences in language consists of auditory speech input being used to learn articulatory speech output. In my PhD project, I use MEG and fMRI to examine the implication, and potential cooperation across different timescales, of the cortical and subcortical structures involved in skills that are related to, part of, or a prerequisite for the acquisition of new speech motor sequences, such as error monitoring, motor control, vocal learning in songbirds, and motor sequence learning in humans and non-human primates. MEG will enable us to investigate the dynamics of functional connectivity between brain regions across different timescales, which is especially relevant when studying an intrinsically dynamical process such as learning, which likely unfolds on multiple timescales, while fMRI will give us the spatial resolution necessary for detecting activation and functional networks in subcortical regions.

1/11/2018 - 31/10/2021

Understanding the vocal brain using new deep learning approaches

Shinji Saget
Supervisors : Thierry ARTIERES (PR1 AMU), Pascal BELIN (INT)

01/10/17 - 30/09/20

Rational exploitation of available intonational cues:  

The case of the signaling relationship between the Initial Rise of F0 and the (non-corrective) contrastive focus

Axel Barrault
Supervisors: James Sneed GERMAN (LPL), Pauline WELBY (LPL)

In languages where intonation has a post-lexical function, as in French, intonational contours extend beyond the word level and signal information structure and discourse relations (Ladd, 2008), in spite of pervasive variability: there is no one-to-one mapping between an intonational contour and a discourse function (e.g., Grice et al., 2017). Yet listeners make use of intonational cues to speed up the processing of continuous speech. This lack of invariance (Liberman et al., 1967) has led to a probabilistic view of perceptual processing, in which listeners continuously and incrementally integrate several interacting parameters, linguistic or not. For some, prediction is a central mechanism in speech processing, as has been shown at the syntactic (Kleinschmidt & Jaeger, 2015; Kamide, 2012; Fine et al., 2013) and pragmatic (Grodner & Sedivy, 2011; Yildirim et al., 2016) levels and in speech perception (Clarke & Garrett, 2004; Bradlow & Bent, 2008; Creel et al., 2008). Intonation processing likewise involves the integration of 'bottom-up' acoustic cues interacting with 'top-down' predictions that accelerate speech processing (Ito & Speer, 2008; Ip & Cutler, 2017), as well as a rational adaptation mechanism (Kurumada et al., 2014; Roettger & Franke, 2019).

In this project, we seek to determine which factors modulate listeners' evaluation of the evidential strength of intonational cues in order to infer communicative intent. In line with the literature on pragmatic reasoning, we assume that listeners' interpretation can be modeled in terms of Bayesian inferences (Goodman & Frank, 2016). We build on the hierarchical Bayesian model of the evidential strength of intonational cues proposed in Roettger & Franke's (2019) study on German. We attempt to refine the predictions of this probabilistic model by providing data in French on the role of factors contributing to the development of hierarchical prior beliefs about the speaker's production behavior, as well as on the interaction of factors conditioning their actualization through exposure.
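The Bayesian framing above can be made concrete with a toy computation: a listener combines a prior over discourse functions with the likelihood of hearing an initial rise under each function. All probabilities below are invented for illustration; they are not estimates from French or German data.

```python
# Toy Bayesian cue interpretation: posterior probability of contrastive
# focus given that an initial rise (IR) was heard. All numbers are
# invented for illustration only.
prior_focus = 0.3                    # P(contrastive focus)
p_ir_given_focus = 0.7               # P(IR | contrastive focus)
p_ir_given_other = 0.4               # P(IR | other discourse function)

# Marginal probability of hearing an IR, then Bayes' rule.
p_ir = (p_ir_given_focus * prior_focus
        + p_ir_given_other * (1 - prior_focus))
posterior = p_ir_given_focus * prior_focus / p_ir
print(f"P(focus | IR) = {posterior:.2f}")
```

A weak cue (likelihoods close across functions) moves the posterior only slightly away from the prior, which is how a "relatively weak association" can still be informative once multiple probabilistic inputs are integrated; hierarchical versions let the likelihoods themselves adapt with exposure to a speaker.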

To this end, we have been studying, in perception and production experiments, a case of probabilistic association between the so-called "initial" rise (IR) in fundamental frequency (F0) and contrastive (non-corrective) focus in French, reported in production (German & D'Imperio, 2016). Several factors influence the presence of the IR (Jun & Fougeron, 2000; Welby, 2006). Yet German and D'Imperio (2016) concluded that this "relatively weak association can nevertheless be informative in a model of interpretation that integrates multiple probabilistic inputs to initial rise occurrence". Indeed, the IR has consistently been shown to be used in word segmentation (Welby, 2007; Spinelli, Grimault, Meunier & Welby, 2010).

Because the prosodic system of French has its own particularities, it offers a distinctive angle from which to study the flexibility of abstract representations. Moreover, the associations between discourse functions and intonational patterns are less regular in French than in the Germanic languages from which most studies of listeners' adaptation to intonational cue variability originate. In other words, what constitutes a reliable signalling relationship between intonational cues and discourse functions in French differs from English. This research will allow a better understanding of the link between production and perception by providing information on the phonology of French intonation. Despite the specificities of the French system, however, the prosodic implementation of contrastive focus examined in our study is observed in many, if not all, of the languages studied, and the same is true of the adaptation mechanism observed at all levels of processing. This research will thus contribute to the discussion of the flexibility of abstract representations and their relationship with variability, in interaction with other constraints, and thereby to the development of a model of the cognitive architecture of human communication.


01/10/17 - 30/09/20

Understanding the vocal brain using new deep learning approaches

Tom Dagens
Supervisors: Thierry ARTIERES (PR1 AMU), head of the QARMA team on Machine Learning, Laboratoire d'Informatique Fondamentale (LIF)
Pascal BELIN (PR1 AMU), head of the team "Neural Bases of Communication", Institut de Neurosciences de La Timone (INT)

01/10/16 - 30/09/19

Defended on 10/07/2020

Language production deficits and electrophysiological changes after speech therapy in drug-resistant epilepsy

Fasola Alexia
Supervisor: Agnès TREBUCHON (INS), F.-Xavier ALARIO (LPC)

Patients with focal drug-resistant epilepsy may present language production deficits in various situations. We focused on inter- and post-ictal deficits assessed with standardized tests before surgery, with the aim of describing these patients' language production deficits and proposing pre-surgical care. Given the lack of studies on the neuroplasticity induced by language therapy in epileptic patients, we reviewed studies of electrophysiological markers of this plasticity in patients with vascular aphasia. Our review shows that the success of therapy is linked to neural changes in the left temporal lobe at early latencies, and in left frontal areas for later processes involved in language production. For the first time, we studied the effects of language therapy on behavioural skills and neural activity in epileptic patients with anomia. Some patients improved their naming skills; in these patients, we found changes in fronto-temporal activity, notably in the left inferior frontal gyrus around 500 ms post-stimulus onset. This result may be linked to its role in lexical selection, which is impaired in the included patients. On the other hand, we showed that a global study of language production provides information about compensatory mechanisms occurring during post-ictal aphasia: we observed an increase in the production of gestures that support lexical access. To conclude, our work shows that the epileptic model is well suited to the study of language production because it allows different contexts of deficit to be investigated, and it points the way towards the use of pre-surgical language therapy.


01/10/16 - 30/09/19

What if it makes me laugh? The impact on human-machine communication of incorporating human social-interaction mechanisms, such as humour, into a virtual conversational agent

Riou Mathieu

01/10/15 - 30/09/18

Defended on 6/12/2019


Clementine Bodin
Supervisor: Pascal Belin (INT), Olivier Coulon (LSIS)

Vocal communication, which is integrated into human language, is also found in the behaviour of other primates. The question is whether these similarities are reflected in similar cerebral processing of this information. This thesis explored the anatomical and functional substrates of voice perception in primates through a comparative approach, structured around two main research axes. I. The anatomo-functional investigation of the temporal voice areas (TVAs) in relation to the anatomy of the superior temporal sulcus (STS) in humans. Functional activity in the TVAs was found to be maximal in the deepest region of the STS bilaterally. However, this relationship was less systematic at the individual level, mainly due to the presence of plis de passage (PPs), which constitute an important source of variability. Investigation of the underlying structural connectivity revealed that they constitute preferential white-matter crossing points connecting the two banks of the sulcus. II. In a second axis, we performed a comparative functional-imaging study of voice areas in humans and monkeys. Several voice areas (brain areas more sensitive to the human voice in humans, and to monkey vocalizations in monkeys, than to vocalizations of other species or non-vocal sounds) were found in both species, mainly in the temporal lobe. Together, the results suggest the existence of a complex cortical network dedicated to the processing of conspecific vocalizations, relatively conserved across primates and exhibiting a high individual variability inherent to its high-level social functions.


01/10/15 - 30/09/18

Defended on 7/09/2018

Readers are parallel processors

Joshua Snell
Supervisor: Jonathan Grainger (LPC)

This thesis addresses one of the most hotly debated issues in reading research: Are words processed serially or in parallel during reading? One could argue that this is primarily a question of visuo-spatial attention: is attention distributed across multiple words during reading? The research presented here suggests that attention can indeed be allocated to multiple words at once. It is further established that attention is a key factor driving (sub-lexical) orthographic processing. The next question, then, is whether multiple lexical representations can be activated in parallel. This thesis comprises a wealth of evidence for parallel lexical activation: firstly we have found that readers activate embedded words (e.g., ‘use’ in ‘houses’) alongside the word that is to be recognized, indicating that parallel lexical processing would occur even if readers could effectively focus their attention on single words. Moreover, we have found that semantic and syntactic categorization decisions about foveal target words are influenced by the semantic and syntactic aspects of surrounding words, even when all these words are presented for a duration shorter than the average time needed to recognize a single word. Hence, given that readers’ attention is spread across multiple words and that multiple lexical representations can be activated in parallel, it seems reasonable to claim that the reading system is in principle a parallel processing system.


01/11/14 - 31/10/17
Defended on 18/12/2018

From auditory perception to memory: musicianship as a window into novel word learning

Eva Maria Dittinger
Supervisor: Mireille Besson (LNC), Mariapaola D’Imperio (LPL)

Building on results showing music-training-related advantages in speech processing and in perceptual and cognitive functions, we examine whether music training facilitates novel word learning throughout the lifespan. We show that musically trained children and young professional musicians outperform controls in a series of experiments, with faster brain plasticity and stronger functional connectivity, as measured by electroencephalography. By contrast, advantages for older adult musicians are less clear-cut, suggesting a limited impact of music training in counteracting cognitive decline. Finally, young musicians show better long-term memory for novel words, which possibly contributes, along with better auditory perception and attention, to their advantage in word learning. By showing transfer effects from music training to semantic processing and long-term memory, the results reveal the importance of domain-general cognitive functions and open new perspectives for education and rehabilitation.


01/10/14 - 30/09/17

Defended on 13/12/2018

Interactions between spoken and written language in the processing of isolated words and sentences: a comparison of dyslexic and typical adult readers

Ambre Denis-Noël
Supervisor: Chotiga Pattamadilok (LPL), Pascale Colé (LPC)

The aim of this work is to examine the interactions between orthographic and phonological representations in dyslexic and typical readers at university level. These interactions were examined in the spoken and written modalities, during the processing of isolated words and of sentences. The influence of orthographic representations on speech processing was examined by manipulating orthographic consistency while recording EEG activity; the influence of phonological representations during reading, by manipulating phonological consistency while recording eye movements. The results suggest the existence of bidirectional connections between orthography and phonology that influence both spoken and written language processing in both populations. In isolated-word processing, these influences arise later in dyslexic adults than in typical readers, in both modalities. In sentence reading, both populations show an early influence of phonological representations, suggesting the use of compensatory mechanisms by dyslexic individuals. In spoken-sentence comprehension, orthographic representations seem to influence different processing stages depending on the population: typical readers seem to rely on these representations when disambiguating phonological representations, whereas dyslexic individuals do so only when the sentence context allows the pre-activation of the various representations associated with the words. These results are discussed notably within the framework of the connectionist triangle model.



01/11/13 - 30/09/16

Defended on 17/02/2017

Evaluation of dysarthric speech: the contribution of automatic speech processing compared with human expertise

Imed Laaridh
Supervisor: Jean-François Bonastre (LIA), Corinne Fredouille (LIA), Christine Meunier (LPL)

Dysarthria is a speech disorder affecting the motor realization of speech, caused by lesions of the central or peripheral nervous system. It can be linked to different pathologies: Parkinson's disease, amyotrophic lateral sclerosis (ALS), stroke, etc. Several research studies have characterized the alterations associated with each pathology in order to group them into dysarthria classes; the most widespread classification is the one established by F. L. Darley in 1969, comprising six classes (completed by two additional classes in 2005). Currently, perceptual (auditory) evaluation remains the standard used in clinical practice for the diagnosis and therapeutic follow-up of patients. This approach is nevertheless recognized as subjective, non-reproducible and time-consuming. These limits make it unsuitable for the evaluation of large corpora (in the context of phonetic studies, for example) or for the longitudinal follow-up of dysarthric patients. Faced with these limits, professionals constantly express their need for objective methods of evaluating dysarthric speech, and automatic speech processing (ASP) tools were quickly seen as potential solutions to this demand. The work presented in this thesis falls within this framework and studies the contribution these tools can make to the evaluation of dysarthric, and more generally pathological, speech. An approach for the automatic detection of abnormal phonemes in dysarthric speech is proposed, and its behaviour is analysed on different corpora covering different pathologies, dysarthria classes, severity levels and speech styles.
Unlike the majority of approaches proposed in the literature, which evaluate the overall quality of speech (severity, intelligibility, etc.), the proposed approach focuses on the phoneme level, in order to achieve a finer characterization of the dysarthria and to provide more precise and useful feedback to the user (clinician, phonetician, patient). The approach is built around two essential phases: (1) automatic alignment of the speech at the phoneme level; (2) classification of these phonemes into two classes, normal and abnormal. Evaluating the system's annotation against the perceptual evaluation of a human expert taken as the reference shows very encouraging results and confirms the approach's ability to detect anomalies at the phoneme level. The approach also proved capable of capturing the evolution of dysarthria severity, suggesting potential applications in the longitudinal follow-up of patients or in the automatic prediction of the severity of their dysarthria. Furthermore, analysis of the behaviour of the automatic speech alignment tool on dysarthric speech revealed behaviours dependent on the pathologies and dysarthria classes, as well as differences between phonetic categories. In addition, an important effect of speech style (read vs. spontaneous speech) was observed on the behaviour of both the alignment tool and the automatic anomaly-detection approach. Finally, the results of an evaluation campaign of the anomaly-detection approach by a panel of experts are presented and discussed, highlighting the strengths and limits of the system.
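The two-stage pipeline described above (phoneme-level alignment followed by normal/abnormal classification) can be sketched generically as follows; the features, labels and classifier choice are illustrative assumptions, not those of the actual system.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_phones = 1000

# Stage 1 (assumed already done): forced alignment yields one feature
# vector per phoneme, e.g. duration, alignment log-likelihood, spectral
# distance to a reference model.
features = rng.standard_normal((n_phones, 3))
# Synthetic labels: "abnormal" phonemes (1) tend to have longer durations
# and lower alignment scores, plus noise.
labels = (features[:, 0] - features[:, 1]
          + rng.standard_normal(n_phones) > 1.0).astype(int)

# Stage 2: binary classification of each phoneme as normal vs abnormal.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
acc = cross_val_score(clf, features, labels, cv=5).mean()
print(f"cross-validated accuracy: {acc:.2f}")
```

The phoneme-level granularity is the point: per-phoneme decisions can be aggregated into severity estimates over time, or returned directly as feedback to the clinician or phonetician.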


01/10/13 - 01/10/16

Defended on 15/01/2018

Multimodal analyses of patient-doctor interaction during training in breaking the news of a serious adverse event: modelling with a view to implementing a virtual-reality training tool

Jorane Saubesty
Supervisor: Marion Tellier (LPL), Daniel Mestre (CRVM)

This thesis is part of the ANR project ACORFORMed, whose objective is the creation (by computer scientists) of an animated conversational "patient" agent as a tool for training doctors in breaking bad news, through simulation in a virtual environment. Using methodology from gesture studies and drawing on the literature on the organization of interactions, we address the following question: what is the global structural organization of the patient-doctor interaction when the doctor is being trained to announce care-related harm? The analyses carried out in this thesis allow us to describe the patient-doctor interaction during announcement training by identifying the different phases that make up the interaction, together with details of their segmentation and articulation. They constitute an indispensable basis that computer scientists can use to design and implement a credible conversational "patient" agent for training doctors. Situated at the heart of an interdisciplinary project, this linguistics thesis thus makes doctors' interactional practices available for the implementation of a virtual agent by computer scientists.