International Symposium "Vision by Brains and Machines"
November 13th - 17th
Montevideo, Uruguay
In the following paragraphs there is a brief summary of the main
interests of the invited speakers according to their web pages.

To make progress in understanding the operations of the human brain, we will need to understand its basic functions at an abstract level. One way to achieve such an understanding is to create a model of a human that has a sufficient amount of complexity so as to be capable of interpreting abstract behavioral models. Recent technological advances allow progress to be made in this direction. Virtual reality (VR) graphics models that simulate extensive human capabilities can be used as platforms from which to develop synthetic models of visuo-motor behavior. Currently such models can capture only a small portion of a full behavioral repertoire, but for the behaviors that they do model, they can describe complete visuo-motor subsystems at a useful level of detail. The value in doing so is that the elaborate visuo-motor structures greatly constrain and simplify the specification of the abstract behaviors that guide them. The result is that, essentially, one is left with proposing an operating model for picking the right set of abstract behaviors at each instant. This paper outlines one such model. A centerpiece of the model uses vision to aid the behavior that has the most to gain from taking environmental measurements. Preliminary tests of the model against human performance in realistic VR environments show that the main features of the model show up in human behavior.

Perception as an active process is well described by Helmholtz's excellent phrase, "perception as unconscious inference". The information provided to our senses about the external world is incomplete and often capable of more than one interpretation. We make use of memories, expectations of various sorts, and combination of inputs from more than one sensory modality to restrict the set of possible interpretations and to reach a conclusion about the distal cause of proximal input to our sensory receptors. Unconscious perceptual inference about the properties of external objects based on incomplete information has parallels with more cognitive levels of information processing such as the construction of knowledge or belief systems and the development of language competence in children.
Active perception, as examined at the behavioral level, has parallels in the neuroanatomy of sensory pathways. Thus, all sensory regions of the brain receive input not only from the periphery or from structures that are closer to the periphery, but also from other brain structures. This central input to sensory structures includes input from higher-level structures of the same sensory modality, from structures associated with other sensory modalities, and from motor-command-related structures. Such central input is especially prominent in several cerebellum-like structures that process sensory signals from the periphery.
My talk will begin with examples of active perception and a description of the general features of descending input to sensory structures. The talk will then continue with examples of central control of sensory processing with an emphasis on cerebellum-like sensory structures.
Neurophysiological studies over several decades have observed that visual cortical neurons respond to oriented structures in the visual field and that the firing rates of these cells can be characterized in terms of a linear filter response to a visual stimulus which then undergoes a non-linear transformation followed by a stochastic spike generation process. We argue that this simplified model is flawed and present two new pieces of evidence suggesting the need for a different model of visual cortical activity. First, current methods for measuring visual receptive field properties using reverse correlation ignore spatial correlations induced at earlier stages of visual processing (in retina and thalamus). When we model this spatial propagation, we find that the classical Gabor-like tuning of V1 neurons can be explained by various types of random filters. Second, computational models of natural scene patches have also led researchers to Gabor-like representations of visual scenes. These models fail to account for the spatial overlap of receptive fields found in the visual system. Instead we formulate a probabilistic model of natural scenes as a Markov random field with overlapping neighborhoods. The overlap results in a very different representation from previous patch-based models but one that is still consistent with neurophysiological evidence. Together these results suggest a need for new mathematical tools for measuring the properties of V1 cells and new computational models of visual cortical processing.
Joint work with Stefan Roth.
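The classical cascade that the abstract above argues against (linear filter, static non-linearity, stochastic spike generation) is commonly called a linear-nonlinear-Poisson model. The following is a minimal sketch of that cascade and of reverse correlation by spike-triggered averaging, assuming white-noise stimuli, a Gabor receptive field, and half-wave rectification. The function names and all parameter values are illustrative assumptions and are not taken from the talk.

```python
import numpy as np


def gabor(size=15, wavelength=6.0, sigma=3.0, theta=0.0):
    """Return a size x size Gabor patch used as an illustrative linear filter."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    return envelope * np.cos(2.0 * np.pi * xr / wavelength)


def lnp_response(stimuli, rf, gain=1.0, dt=0.01, rng=None):
    """Simulate a linear-nonlinear-Poisson cell.

    stimuli has shape (n_trials, h, w) and rf has shape (h, w).
    Returns the firing rate and the spike count for each trial.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Linear stage: inner product of each stimulus with the filter.
    drive = np.tensordot(stimuli, rf, axes=([1, 2], [0, 1]))
    # Static non-linearity: half-wave rectification (an assumed choice).
    rate = gain * np.maximum(drive, 0.0)
    # Stochastic spike generation: Poisson counts in a window of length dt.
    counts = rng.poisson(rate * dt)
    return rate, counts


def spike_triggered_average(stimuli, counts):
    """Reverse correlation: the spike-weighted mean stimulus."""
    total = counts.sum()
    if total == 0:
        return np.zeros(stimuli.shape[1:])
    return np.tensordot(counts, stimuli, axes=(0, 0)) / total


# Usage: with uncorrelated (white) noise the average resembles the filter.
rng = np.random.default_rng(0)
rf = gabor()
stimuli = rng.standard_normal((10000, 15, 15))
rate, counts = lnp_response(stimuli, rf, gain=50.0, dt=0.1, rng=rng)
sta = spike_triggered_average(stimuli, counts)
print(np.corrcoef(sta.ravel(), rf.ravel())[0, 1])
```

The sketch feeds the cell spatially uncorrelated noise directly. It therefore does not model the correlations introduced by retina and thalamus, which is the effect the abstract says standard reverse correlation ignores.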
Rachid DERICHE graduated from Ecole Nationale Supérieure des Télécommunications, Paris, in 1979 and received the Ph.D. degree in Mathematics from the University of Paris IX, Dauphine in 1982.
He is currently a Research Director at INRIA Sophia-Antipolis and performs his research activities in the Odyssée laboratory located in Sophia-Antipolis and Ecole Normale Supérieure, Paris.
His research interests are in Computer Vision and Biological Vision and include Partial Differential Equations, Level-Set techniques, Variational and Geometrical approaches applied to Image Processing and Computer and Biological Vision, and also the use of functional imaging with an emphasis on Diffusion MRI for brain image analysis. More generally, he is very interested in the application of mathematics to Computer and Biological Vision and Image Processing. He has authored and co-authored more than 120 scientific papers and graduated more than 15 PhD students.
He has served as the principal investigator in many European projects since the mid 1980's and acted as area chair, co-organiser or member of the program committees of the main conferences in his domain (ICCV, ECCV, CVPR, Scale-Space, ICIP, ICPR, CAIP, VLSM...). Recently, he has been local Chair of the 9th ICCV in Nice (Oct 13-17, 2003), co-organizer of VLSM'2003 (Oct. 11-12, 2003) and area chair for the 7th, 8th and 9th ECCV to be held in May 2006.

Diffusion MRI is a Magnetic Resonance Imaging (MRI) modality able to quantify, in vivo and non-invasively, the diffusion of water molecules in biological tissues such as the white matter in the brain. This relatively new imaging modality, pioneered twenty years ago by Denis Le Bihan (CEA-SHFJ, Paris), acquires at each voxel image intensities, referred to as diffusion, that are related to the relative mobility of endogenous tissue water molecules and reflect the structure of the underlying biological tissues at a microscopic scale, well beyond the usual image resolution.
In 1994, Peter Basser (NIH, Bethesda), together with J. Mattiello and D. Le Bihan, introduced the formalism of the Diffusion Tensor (DT) and what is known as DT-MRI. P. Basser proposed to characterize the orientation dependence of diffusion by an effective self-diffusion tensor given by a 3 × 3 symmetric positive definite tensor D and to estimate it directly from the signal intensities.
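As an illustration of estimating the tensor from signal intensities, the sketch below uses the standard Stejskal-Tanner relation S = S0 exp(-b g^T D g), where g is a unit gradient direction and b is the diffusion weighting, and fits the six unique entries of D by log-linear least squares. This is a plain least-squares fit, not the variational approach of the talk. The gradient set BVECS, the b-value, and the function names are assumptions chosen for the example.

```python
import numpy as np

# Six non-collinear unit gradient directions (an assumed acquisition scheme).
BVECS = np.array(
    [[1, 1, 0], [1, -1, 0], [1, 0, 1], [1, 0, -1], [0, 1, 1], [0, 1, -1]],
    dtype=float,
) / np.sqrt(2.0)


def design_matrix(bvecs, bval):
    """Map (Dxx, Dyy, Dzz, Dxy, Dxz, Dyz) to -log(S / S0) for each direction."""
    gx, gy, gz = bvecs[:, 0], bvecs[:, 1], bvecs[:, 2]
    return bval * np.column_stack(
        [gx ** 2, gy ** 2, gz ** 2, 2 * gx * gy, 2 * gx * gz, 2 * gy * gz]
    )


def simulate_signals(tensor, s0, bvecs, bval):
    """Noise-free signals from S = S0 * exp(-b * g^T D g)."""
    adc = np.einsum("ki,ij,kj->k", bvecs, tensor, bvecs)
    return s0 * np.exp(-bval * adc)


def fit_tensor(signals, s0, bvecs, bval):
    """Log-linear least-squares estimate of the symmetric 3 x 3 tensor."""
    y = -np.log(signals / s0)
    d, *_ = np.linalg.lstsq(design_matrix(bvecs, bval), y, rcond=None)
    dxx, dyy, dzz, dxy, dxz, dyz = d
    return np.array([[dxx, dxy, dxz], [dxy, dyy, dyz], [dxz, dyz, dzz]])


# Usage: recover an anisotropic tensor (in mm^2/s) at b = 1000 s/mm^2.
true_tensor = np.diag([1.7e-3, 0.3e-3, 0.3e-3])
signals = simulate_signals(true_tensor, 1.0, BVECS, 1000.0)
estimate = fit_tensor(signals, 1.0, BVECS, 1000.0)
print(np.linalg.eigvalsh(estimate))
```

The fit returns a symmetric matrix but does not enforce positive definiteness, so on noisy data it can produce negative eigenvalues.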
In this talk, I will first introduce this recent and exciting imaging modality and then present the variational approaches we developed for the estimation, regularization and segmentation of diffusion tensor images (DTI). These algorithms open the possibility of recovering a detailed geometric description of the anatomical connectivity between brain areas and of distinguishing the anatomical structures of the cerebral white matter. Applications to synthetic, phantom and real data sets from human cerebral white matter structures will illustrate the results obtained.
In the last part of my talk, I'll conclude by presenting and discussing some parts of our recent work on tractography and HARDI (High Angular Resolution Diffusion Imaging).
he explains their research: "Using a
multidisciplinary approach our research targets the biophysical and
connectional substrate underlying the genesis and expression of the
functional properties of receptive fields of primary visual cortex
neurons. Our aim is to explore different possible models of receptive
field functional selectivity, using combined techniques of
intracellular electrophysiology (whole cell patch and sharp
electrodes) and neuromimetic simulation. By developing synaptic
activity functional imaging techniques, we are making it possible to
explore the spatio-temporal dynamics of cortical receptive fields. In
parallel, we have also developed compartmental models of visual
cortical neurons and appropriate analytical tools for the fine study
of interactions between various post-synaptic mechanisms and
spatio-temporal patterns of input signals, and for the simulation of
coordinated activity in large assemblies of neurons (involving several
hundred neurons)."
In recent years, because cameras have become inexpensive and ever more prevalent, there has been increasing interest in video-based modeling of shape and motion. This has many potential applications in areas such as electronic publishing, entertainment, sports medicine and athletic training. It is, however, an inherently difficult task because the image data is often incomplete, noisy, and ambiguous.
We will present real-time techniques for detecting rigid and deformable surfaces in 2-D and 3-D, with no need whatsoever for a priori pose knowledge. To this end we introduce novel approaches to creating low-dimensional deformation models and to fitting them robustly to image data.
We will demonstrate our approach in the context of Augmented Reality applications and shape analysis in sport environments.

Completion phenomena are theoretically important because they reveal how the visual system overcomes the local gaps of optic information. Gestalt theorists proposed that amodal completion is driven by a tendency towards simplicity. I will discuss strengths and weaknesses of such an idea and refer to specific cases of 2D and 3D completion, supporting the following specific hypotheses: surface-level processes integrate contour-level processes; retinal constraints play a non-trivial role; approximation explains the perceived shape of partially occluded surfaces better than interpolation.

By most accounts, we are not born with the abilities to perceive and recognize. To a baby, the world is just a "booming and buzzing confusion". Yet, in the course of several months, our brains learn to interpret the 3D world based on the 2D retinal images and become able to recognize a multitude of objects and object categories in spite of their significant variations in pattern appearance. Statistical regularities in the visual events in our natural experience are likely a key factor driving this developmental process. There are a variety of statistical regularities, such as correlation of signals in time, in space, and between different visual modalities. I will describe some of the statistical regularities that we observed from the analysis of 3D scenes and 2D images, and discuss how they can be exploited to develop data-driven computational approaches for shape inference. These statistical regularities from natural scenes can be related to a number of correlational structures we observed in neurons and neuronal ensembles in macaque V1. We will discuss the significance of these correlational structures for visual inference and representational learning.
Joint work with Brian Potetz and Jason Samonds.

We have analyzed datasets coming from electrode recordings of evoked neural activity in V1, where the stimuli were gratings with varying orientation and spatial phase. We relate certain properties (topological invariants) of the space of stimuli to those of the space of neural responses.

"Computational
gestalt, meaningful multisegments k-gons"
Joint work with R. Grompone, J. Jakubowicz, J. M. Morel and G. Randall

The classic colorimetric approach considers color to be pointwise information, but this is contradicted by perceptual experience. There is a growing family of algorithms that treat color information in its visual context, also known as spatial color methods (e.g. Retinex, ACE).
In this work we aim at presenting some common characteristics of these models and their relationship with human color perception. Two main variables affect the final result of these algorithms: their parameters and the visual characteristics of the image they work on. By visual characteristics we mean not only imaging parameters, such as pixel mean value or used dynamic range, but also their spatial distribution in the image.
A survey of the more significant visual configurations will be presented and discussed, hopefully allowing a deeper understanding of their behavior and stimulating further research directions.
Alessandro Rizzi, J.J. McCann

In this talk I will show how fundamental problems in image and video editing can be efficiently addressed with fundamental mathematics. After a brief introduction to the area of image inpainting, meaning the modification of an image in a non-detectable form, I will concentrate on image and video colorization, the art of colorizing black and white data. I will show the connection of this problem with Hamilton-Jacobi equations and distance functions, and a technique to compute them in linear time. I will also describe how to use similar techniques for other special effects and user-oriented image segmentation. The talk will have a combination of fundamental math, computational tools, and a lot of examples. Related open questions to the biology side of the street will be presented as well.
I will also spend a few minutes introducing the posters to be presented by other members of my group, including work on sparse representation of color images (Julien Mairal) and stratification learning (Gloria Haro).
The works covered in this talk and the posters are in collaboration with Liron Yatziv, Julien Mairal, Gloria Haro, Alberto Bartesaghi, Gregory Randall, Michael Elad, and Alexis Protiere.
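To make the link between colorization and distance functions concrete, here is a minimal sketch under stated assumptions. User scribbles carry a chrominance value, each pixel receives a distance to every scribble, measured along paths that are cheap where luminance is constant, and the final chrominance is a distance-weighted blend. The sketch uses Dijkstra's algorithm on the pixel grid as a discrete stand-in for the continuous distance function. That runs in O(N log N), so it does not reproduce the linear-time technique mentioned in the abstract. The function names, the step cost, and the blending exponent are all illustrative assumptions.

```python
import heapq

import numpy as np


def geodesic_distance(luma, seeds):
    """Shortest-path distance from the seed pixels on the 4-connected grid.

    The cost of a step between neighbours is their absolute luminance
    difference, so paths inside a uniform region cost nothing.
    """
    h, w = luma.shape
    dist = np.full((h, w), np.inf)
    heap = []
    for r, c in seeds:
        dist[r, c] = 0.0
        heap.append((0.0, r, c))
    heapq.heapify(heap)
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist[r, c]:
            continue  # stale queue entry, a shorter path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + abs(float(luma[nr, nc]) - float(luma[r, c]))
                if nd < dist[nr, nc]:
                    dist[nr, nc] = nd
                    heapq.heappush(heap, (nd, nr, nc))
    return dist


def colorize(luma, scribbles, power=4.0, eps=1e-6):
    """Blend scribble chrominances with weights (distance + eps) ** -power.

    scribbles is a list of (seed_pixels, (cb, cr)) pairs.
    Returns an (h, w, 2) chrominance map; luminance is left unchanged.
    """
    weights = np.stack(
        [(geodesic_distance(luma, seeds) + eps) ** -power for seeds, _ in scribbles]
    )
    colors = np.array([color for _, color in scribbles], dtype=float)
    weights /= weights.sum(axis=0, keepdims=True)
    return np.tensordot(weights, colors, axes=(0, 0))


# Usage: two flat regions separated by a luminance edge, one scribble in each.
luma = np.zeros((4, 6))
luma[:, 3:] = 1.0
scribbles = [([(0, 0)], (1.0, 0.0)), ([(0, 5)], (0.0, 1.0))]
chroma = colorize(luma, scribbles)
print(chroma[0, 0], chroma[0, 5])
```

Because crossing the luminance edge is the only costly step, each region takes on the chrominance of the scribble that lies inside it.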

The external world is represented in the brain as spatiotemporal patterns of electrical activity. Sensory signals, such as light, sound, and touch, are transduced at the periphery and subsequently transformed by various stages of neural circuitry, resulting in increasingly abstract representations through the sensory pathways of the brain. It is these representations that ultimately give rise to sensory perception. Deciphering the messages conveyed in the representations is often referred to as reading the "neural code". The fundamental goal of sensory prostheses is to induce sensory percepts through surrogate stimulation of neurons in the sensory pathway. Given the rather abstract representations of the sensory world within the neural circuitry, a clear understanding of the neural code is necessary for achieving the desired percept. With this motivation, our laboratory has focused on various challenges posed by this problem, two of which will be discussed. First, a ubiquitous property of neurons throughout the various sensory pathways of the brain is the ability to adapt their response properties to changes in the external environment. Adaptation therefore poses a unique challenge in reading the neural code, as the "words" constantly change meaning. Secondly, in interpreting neural representations, it is necessary to define the relevant temporal and spatial scales. Time scales at the millisecond level are important in neural coding, even in situations where the sensory world is changing over much slower time scales. Taken together, an understanding of these complexities and others is critical for ultimately relating spatiotemporal patterns of neural activity to sensory perception, and thus for the development of engineered devices for replacing or augmenting neural function lost to trauma or disease.