Multimodal Information Processing For Interface And Multimedia Technologies

  • Speaker: Jong-Seok LEE from École Polytechnique Fédérale de Lausanne (EPFL)
  • Abstract: Humans are inherently multimodal information processors, i.e. they tend to use multiple sensory channels simultaneously for perception whenever possible. In fact, the information humans perceive often arrives through multiple channels, e.g. audio, visual, and haptic. This talk deals with multimodal information processing techniques applied to the implementation of intelligent interfaces and to multimedia analysis. First, the concept and motivation of multimodal information processing are briefly introduced. Then, specific applications are described, including audio-visual speech recognition, perceptual video compression based on audio-visual focus of attention, biosignal-based emotion recognition, and content-tag-user analysis in social networks.
  • Time: 16:00 ~ 17:15 on July 6 (Wednesday)
  • Location: Engineering Building 1, E204
  • Contact: Giljin Jang, Ph.D. (2119)


  • Why is multimodal processing important?
    • Biological reasons
      • Humans and animals inherently integrate multiple senses
      • Different senses converge on the same area in the brain
    • Statistical reasons
    • Cognitive reasons
      • Cross-modal attention

Audio-visual speech recognition

Perceptual video compression based on audio-visual focus of attention

Biosignal-based emotion recognition

Topic revision: r1 - 06 Jul 2011, ToanMai