MULTIMODAL SIGNAL PROCESSING AND MULTIDIMENSIONAL INTEGRATION


This talk presents an overview of ideas, methods and recent research results in multimodal signal processing, with emphasis on audio-visual fusion and multimodal attention-based event detection. We shall begin with a brief synopsis of important findings from audio-visual perception. Then we shall outline multimodal signal front-ends and computational models for sensor fusion in two application fields: i) audio-visual speech recognition and inversion, and ii) multimodal saliency-based video summarization. We envision the multi-dimensionality of the underlying conceptual framework as a space-time analogy: the horizontal plane is the space for multi-sensory integration, the vertical direction carries multilevel integration between low-level cues and high-level semantics, and the time direction tracks the evolving dynamics.


Professor Petros Maragos
National Technical University of Athens
School of Electrical and Computer Engineering
Athens, Greece