Expressive gesture synthesis & recognition

Gesture communication and expression through advanced technologies such as new sensors, mobile devices, and specialized interactive systems have opened a new dimension for a broad range of applications, including entertainment, pedagogical and artistic applications, and rehabilitation. The study of gestures requires an increasingly deep understanding of the different levels of representation that underlie their production, from meaning to motion performance characterized by high-dimensional time-series data. This is even more true for skilled and expressive gestures, and for communicative gestures, which involve high-level semiotic and cognitive representations and require extreme rapidity, accuracy, and physical engagement with the environment.

Our line of research focuses specifically on the study of variability in motion capture data, linked to different forms of expressiveness or to the sequencing of semantic actions in selected scenarios. We use motion capture to find relevant features that encode the main spatio-temporal characteristics of gestures: low-level features are extracted from the raw data, whereas high-level features reflect structural patterns that encode linguistic aspects of gestures.
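
As an illustration of what low-level features extracted from raw motion capture data can look like, here is a minimal sketch that derives per-joint speed, acceleration, and jerk magnitudes by finite differences. The function name `kinematic_features`, the array layout (frames, joints, 3), and the choice of these three descriptors are assumptions made for the example; they are not taken from the project itself.

```python
import numpy as np


def kinematic_features(positions, fps):
    """Compute simple kinematic descriptors from joint trajectories.

    positions: array of shape (frames, joints, 3), joint positions in metres
               (hypothetical layout chosen for this sketch).
    fps:       capture rate in frames per second.
    Returns a dict of arrays of shape (frames, joints).
    """
    dt = 1.0 / fps
    # Successive time derivatives along the frame axis (finite differences).
    velocity = np.gradient(positions, dt, axis=0)
    acceleration = np.gradient(velocity, dt, axis=0)
    jerk = np.gradient(acceleration, dt, axis=0)
    # Reduce each 3D vector to its Euclidean norm.
    return {
        "speed": np.linalg.norm(velocity, axis=-1),
        "acceleration": np.linalg.norm(acceleration, axis=-1),
        "jerk": np.linalg.norm(jerk, axis=-1),
    }


if __name__ == "__main__":
    # Synthetic example: one joint moving along x at a constant 2 m/s.
    fps = 100
    t = np.arange(50) / fps
    positions = np.zeros((50, 1, 3))
    positions[:, 0, 0] = 2.0 * t
    features = kinematic_features(positions, fps)
    print(features["speed"].mean())
```

Descriptors of this kind are only one possible starting point; high-level features that capture structural or linguistic patterns would need annotation or segmentation on top of them.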


  •  Concerning expressive gestures, the first challenge is to define and characterize expressiveness and variability in human movement. As stated above, expressiveness is considered at all levels of gesture generation. It involves a semantic dimension, ranging from actions that convey a specific meaning to sign languages with their linguistic aspects (phonetics, phonology, prosody, etc.), and an expressive dimension, induced by intentional variations or emotional states of the actor. Both dimensions produce variations in the recorded signals.
  •  The second challenge is to explore new motion representation spaces that reflect the expressiveness and variability contained in the data. This requires reducing the complexity of the high-dimensional motion data by proposing different embeddings for these data. Such embeddings should make it possible to characterize and parameterize specific action sequences, and should give rise to original approaches, inspired by sensorimotor biological processes, for recognizing behaviors or generating new ones.
  •  The third challenge is to link the different levels of representation, from narrative scenarios, through structural patterns of actions, to continuous streams of motion data. More precisely, the aim is to extract structural patterns from data and to understand how these discrete patterns influence the synthesis of gestures while preserving the semantics of actions as well as subtle expressive variations.
  •  The fourth challenge concerns the definition of protocols for evaluating the different hypotheses and models that are constructed at all levels of the perception-production loop. Our approach follows the motor theory of perception, in which motor production is necessarily involved in the recognition of sensory cues (audio, visual, etc.) and encoded actions. These evaluations will be carried out both quantitatively and perceptually.
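
To make the idea of embedding high-dimensional motion data (the second challenge above) more concrete, here is a minimal sketch that uses principal component analysis as a stand-in technique. PCA is chosen only because it is the simplest linear embedding; nothing in this description says it is the method used in the project. The function names `pca_embed` and `reconstruct`, and the flattened (frames, dimensions) pose layout, are assumptions made for the example.

```python
import numpy as np


def pca_embed(frames, n_components):
    """Project flattened poses onto their leading principal components.

    frames: array of shape (n_frames, n_dims), one flattened pose per row
            (hypothetical layout chosen for this sketch).
    Returns (embedding, components, mean, explained_ratio).
    """
    mean = frames.mean(axis=0)
    centered = frames - mean
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    embedding = centered @ components.T
    variance = singular_values ** 2
    explained_ratio = variance[:n_components] / variance.sum()
    return embedding, components, mean, explained_ratio


def reconstruct(embedding, components, mean):
    """Map low-dimensional coordinates back to the original pose space."""
    return embedding @ components + mean


if __name__ == "__main__":
    # Synthetic data: 100 poses of 30 dimensions driven by 2 latent factors.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(100, 2))
    mixing = rng.normal(size=(2, 30))
    frames = latent @ mixing + 5.0
    embedding, components, mean, ratio = pca_embed(frames, 2)
    error = np.abs(reconstruct(embedding, components, mean) - frames).max()
    print(embedding.shape, ratio.sum(), error)
```

A linear projection like this gives a baseline against which nonlinear or biologically inspired representations could be compared, for instance by checking how much expressive variation survives reconstruction.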
