Location
Room 120, New Computer Science
Event Description

Title: Representation learning in image and audio processing: from sparse
models to deep learning.

Speaker: Pablo Sprechmann, New York University, http://cims.nyu.edu/~pablo/

Abstract: Representation learning refers to a set of techniques that aim to
transform raw input data into a representation that can be effectively
exploited in a high-level task such as restoration, prediction, or
classification. In this talk I will discuss two successful techniques for
learning representations from natural audio and image data: sparse modeling
and deep learning.

In the first part of the talk, I will discuss interesting connections between
these two approaches. Sparse models have received a lot of attention in recent
years, achieving numerous state-of-the-art results in various signal
processing applications. Traditionally, such modeling approaches rely on an
iterative algorithm that minimizes an objective function. The inherently
sequential structure and the data-dependent complexity and latency of
iterative optimization tools often constitute a major computational
bottleneck. Another limitation of these modeling techniques is the difficulty
of including them in discriminative learning scenarios. To overcome these
limitations, we develop a process-centric view of sparse modeling, in which a
learned deterministic fixed-complexity pursuit process is used in lieu of
iterative optimization, establishing connections with representations learned
using deep neural networks (a minimal sketch of this unrolling idea follows
the abstract). I will illustrate these ideas on several audio and image
processing tasks.

A fundamental ingredient in the success of sparse models is their ability to
capture local regularity in the data. However, these methods are not designed
to model global properties of the signal, which are key to capturing complex
geometrical structures and textured regions. On the other hand,
representations learned for solving discriminative tasks, such as object
recognition, are global representations that are stable to local deformations.
In the second part of the talk I will discuss a new method for exploiting
representations learned from discriminative tasks in the context of generative
models. I will illustrate our method on an image super-resolution task. The
idea is to use as the conditional model a Gibbs distribution whose sufficient
statistics are given by deep convolutional neural networks (CNNs); a schematic
form of such a model is sketched after the abstract. The resulting sufficient
statistics minimize the uncertainty of the target signal given the degraded
observations, while being highly informative.
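
As a rough illustration of the unrolled-pursuit idea mentioned in the first
part, the minimal NumPy sketch below shows a fixed-depth encoder in the spirit
of learned ISTA (LISTA). The matrices W and S, the thresholds theta, and the
depth n_layers are illustrative placeholders; in a learned pursuit they would
be trained end to end rather than derived from a fixed dictionary.

    import numpy as np

    def soft_threshold(z, theta):
        # Elementwise shrinkage nonlinearity used by ISTA/LISTA-style encoders.
        return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)

    def unrolled_pursuit(x, W, S, theta, n_layers=3):
        # Fixed-depth encoder obtained by truncating (unrolling) ISTA:
        #   z_{k+1} = soft_threshold(W x + S z_k, theta)
        # W, S and theta stand in for the learned filter matrix, the lateral
        # "explaining away" matrix and the shrinkage thresholds.
        b = W @ x
        z = soft_threshold(b, theta)
        for _ in range(n_layers - 1):
            z = soft_threshold(b + S @ z, theta)
        return z

    # Illustrative shapes only: a 64-dimensional patch encoded into 128 coefficients.
    rng = np.random.default_rng(0)
    x = rng.standard_normal(64)
    W = 0.1 * rng.standard_normal((128, 64))
    S = 0.1 * rng.standard_normal((128, 128))
    codes = unrolled_pursuit(x, W, S, theta=0.05)

Because the depth is fixed, the cost of encoding an input is constant and
data-independent, in contrast with the data-dependent latency of iterative
solvers.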
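
For the second part, one generic way to write a conditional Gibbs model with
CNN-based sufficient statistics is

    p(x \mid y) \propto \exp\left( -\| \Phi(x) - \Psi(y) \|_2^2 \right),

where x is the target signal, y the degraded observation, \Phi the CNN
computing the sufficient statistics, and \Psi a predictor of those statistics
from y. This is an assumed schematic form for illustration, not necessarily
the speaker's specific construction.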

Bio: Pablo Sprechmann is currently a postdoctoral researcher in Yann LeCun's
group at the CILVR lab, Computer Science Department, Courant Institute of
Mathematical Sciences, New York University. He received an MSc degree from the
Universidad de la República, Uruguay, in 2009, and a PhD degree in 2012 from
the Department of Electrical and Computer Engineering, University of
Minnesota. He worked as a postdoctoral researcher at the ECE Department of
Duke University during 2013. His main research interests include machine
learning and its applications to computer vision, signal processing, and music
information retrieval.

Refreshments will follow the talk.

Event Title
Fac Cand & CSE 600: Pablo Sprechmann