Vision and Visual Attention

Exploring a domain-specific approach to object recognition

Background. Vision science is among the best-developed and most vigorous research areas in the cognitive sciences. The study of low-level visual processing is concerned with issues such as how the retina transduces light and how the visual system extracts features and surfaces from retinal stimulation. The study of higher-level visual processing is concerned with how those features are used to recognize objects. The issue is not merely how surfaces are segmented into separable objects (sometimes called “mid-level vision”), but how the visual system recognizes what category each object belongs to: that this one is a “dog”, that one a “car”, that one a “cup”. Whereas the study of low-level vision has flourished, the study of high-level object recognition has not. How we recognize objects remains something of a mystery.

Why is object recognition so poorly understood? Most vision scientists say progress has been slow because the problem is so difficult: Reverse engineering computational processes that can reconstruct and categorize whole objects on the basis of low-level smears on the retina is no small task (as work in artificial intelligence has shown). But progress may be slow for another reason as well: Vision scientists may be failing to understand object recognition because they are seeking processes that are too general. It could be that the visual system has several functionally distinct category-specific recognition systems – one for animals, another for plants, another for man-made tools, and so on.

Changing assumptions. Most vision scientists reason as follows: since we are able to recognize objects from all kinds of categories – even evolutionarily novel ones – the processes of object recognition must themselves be domain-general. As a result, many scientists, such as Biederman and Marr, have taken approaches based on the general geometry of objects. This research has been very interesting and valuable. Yet it has not led to a breakthrough in our understanding of object recognition.

An alternative view is that the object recognition system is composed of a variety of subsystems, at least some of which are specialized for recognizing categories of entities that have been very important during our evolutionary history. There is, for example, a reasonable amount of evidence that the visual system contains a subsystem specialized for recognizing human faces, and this claim is in no way inconsistent with the observation that we are capable of recognizing evolutionarily novel objects as well. In fact, neuropsychological, fMRI, and developmental evidence suggests that semantic memory contains dissociable, category-specific subsystems for several evolutionarily important domains: animals, people, plants, landscapes and topographical landmarks, human-made artifacts, and personality traits (e.g., Caramazza & Shelton, 1998; Klein et al., 2002a, b). The visual system could have dedicated recognition processes for some of these categories as well. Perhaps the problem of object recognition has been difficult to crack because researchers have not been looking for these specialized systems.

At the CEP, we have been investigating whether the visual system has functionally specialized, domain-specific object recognition systems. Take face recognition: the notion that it relies on specialized systems has come under attack. Moreover, while some admit there are computational specializations for face recognition, they claim these were built by domain-general cognitive processes. Brad Duchaine, formerly of the CEP and now at Harvard's Vision Science Lab, has produced evidence not only that faces are recognized by a specialized system, but that this system arose from developmental processes that are themselves specialized for faces. His evidence comes from the study of people with developmental prosopagnosia: people who, due to genetic or congenital anomalies, never developed the ability to recognize faces. Duchaine points out that, if a domain-general learning process were responsible for the development of face recognition, then damage to that general learning mechanism would impair not only the ability to recognize faces but also the ability to develop expertise in recognizing other classes of objects. Yet he finds that many people with developmental prosopagnosia can recognize individual members of non-face object classes. Indeed, they can even develop expertise in recognizing individuals from completely novel classes of objects. This pattern of preserved and impaired abilities would be impossible if face recognition developed via a domain-general learning process. In pursuing these studies, Duchaine has also elucidated the logic of developmental dissociations for finding specialized learning systems (see Klein, Cosmides, Costabile & Mei, 2002 for an application to learning about personality traits). Duchaine's website contains many more details and PDFs of papers.

The Economist has run an article on Brad Duchaine's work on prosopagnosia.

Are there domain-specific processes within visual attention?

Other CEP research involves the discovery of domain-specialized systems governing visual attention. The most recent example is evidence of a system specialized for monitoring animals for changes in their state and location.