Article summary of Modeling visual recognition from neurobiological constraints by Oram & Perrett - Chapter

Anatomy
Neurophysiology of visual pattern processes
Grandmother cells
Learning associations

Vision is the most extensively researched function of the human brain. Nevertheless, the available data show that we still know very little about visual processes. The purpose of this paper is to present the biological aspects of vision.

Anatomy

The anatomy of the visual system is structured as follows. The brain consists of different lobes, and its parts are connected to each other in complex ways. The visual cortex occupies 40% of the brain. The visual system can be divided into two main streams. Visual information first travels from the eye via the thalamus to area V1 in the occipital lobe. From there it can go via V2 to MT (the dorsal route; the 'where' path) or via V2, V3 and V4 to the IT (the ventral route; the 'what' path). The functions of these areas have only become clear through studies of people with brain damage. V1 has the most important visual role.

The cortex consists of six layers, which differ in their inputs and outputs. Input arrives mainly in layer 4 and output leaves mainly from layer 6. The connections between cortical areas are probably reciprocal: two connected areas can send information to each other, so there is no one-way traffic.

Neurophysiology of visual pattern processes

Contours

Neurons in V1 respond primarily to orientation and spatial scale. Each cell responds only to stimuli within its own receptive field. The cells can be divided into two groups: dependent and independent cells. Dependent cells respond to only one kind of stimulus (for example, a line on a dark background or a line on a light background, but not both), while independent cells respond more generally (for example, to a line regardless of the background). Both groups respond to the orientation of lines. A third group, the hyper-complex cells, responds not only to the orientation but also to the length of lines.
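
The difference between dependent and independent cells can be illustrated with a toy model. The sketch below is not taken from the paper: the 3x3 receptive field, its weights and the function names are illustrative assumptions. A dependent cell is modeled as a rectified filter (it only fires for one contrast polarity), an independent cell as the absolute value of the same filter output.

```python
# Hypothetical 3x3 receptive field tuned to a vertical line.
# The weights sum to zero, so a uniform patch gives no drive.
VERTICAL_RF = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]


def linear_drive(rf, patch):
    """Weighted sum of the patch under the receptive field."""
    return sum(
        w * p
        for rf_row, patch_row in zip(rf, patch)
        for w, p in zip(rf_row, patch_row)
    )


def dependent_cell(rf, patch):
    """Responds only to the preferred contrast (bright line on dark)."""
    return max(0, linear_drive(rf, patch))


def independent_cell(rf, patch):
    """Responds to the line regardless of the background."""
    return abs(linear_drive(rf, patch))


bright_vertical = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]    # bright line, dark background
dark_vertical = [[1, 0, 1], [1, 0, 1], [1, 0, 1]]      # dark line, bright background
bright_horizontal = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]  # wrong orientation

for name, patch in [
    ("bright vertical", bright_vertical),
    ("dark vertical", dark_vertical),
    ("bright horizontal", bright_horizontal),
]:
    print(name, dependent_cell(VERTICAL_RF, patch), independent_cell(VERTICAL_RF, patch))
# bright vertical 6 6
# dark vertical 0 6
# bright horizontal 0 0
```

Both model cells ignore the horizontal line, which reflects the shared orientation tuning; only the independent cell responds to the dark line on a bright background.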

Later research showed that V1 cells can also respond to the surroundings outside the receptive field, but only when the surroundings match the stimulus. The V1 cells are arranged retinotopically: cells that lie next to each other in the cortex also view neighboring parts of the visual field. The receptive fields of neighboring neurons therefore partly overlap.
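
The retinotopic arrangement and the resulting overlap can be made concrete with a one-dimensional sketch. The field width of 4 positions and the shift of 2 positions between neighboring cells are arbitrary assumptions chosen for illustration, not values from the paper.

```python
def receptive_field(cell_index, width=4, step=2):
    """Visual-field positions covered by a cell (hypothetical 1-D layout).

    Neighboring cells are shifted by `step` positions; because `step` is
    smaller than `width`, adjacent fields share part of the visual field.
    """
    start = cell_index * step
    return range(start, start + width)


def overlap(cell_a, cell_b, width=4, step=2):
    """Positions seen by both cells."""
    field_a = set(receptive_field(cell_a, width, step))
    field_b = set(receptive_field(cell_b, width, step))
    return sorted(field_a & field_b)


print(list(receptive_field(0)))  # [0, 1, 2, 3]
print(list(receptive_field(1)))  # [2, 3, 4, 5]
print(overlap(0, 1))             # [2, 3]  neighbors share positions
print(overlap(0, 2))             # []      more distant cells do not
```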

Groups of cells with related functions (such as those that respond to movement or color) also lie together in the cortex and are organized in columns. The deeper layers of the cortex do not show such a clear distinction. Most neurons respond to input from one eye.

The work of Hubel and Wiesel showed that V1 neurons respond to the lines and edges of visual stimuli: contours. It also showed that detecting the edges of an object requires integrating information across different spatial scales. This evidence led to the assumption that the analysis of spatial frequencies is part of visual processing. In conclusion, V1 ensures that the contours of an object are determined.
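
Why edge detection benefits from several spatial scales can be shown with a small one-dimensional sketch. The detector (mean brightness to the right minus mean brightness to the left), the example signal and the rule for combining scales are all illustrative assumptions, not the mechanism proposed in the paper.

```python
def edge_response(signal, position, scale):
    """Mean of `scale` samples right of `position` minus mean of `scale` samples left of it."""
    left = signal[position - scale:position]
    right = signal[position:position + scale]
    return sum(right) / scale - sum(left) / scale


def edge_profile(signal, scale):
    """Edge response at every position where the detector fits completely."""
    positions = range(scale, len(signal) - scale + 1)
    return {p: edge_response(signal, p, scale) for p in positions}


def strongest_edge(signal, scales):
    """Position with the largest summed response magnitude across all scales."""
    profiles = [edge_profile(signal, s) for s in scales]
    # Compare only positions that every scale can evaluate.
    shared = set(profiles[0]).intersection(*profiles[1:])
    return max(sorted(shared), key=lambda p: sum(abs(prof[p]) for prof in profiles))


# A noisy step: fine-scale texture on both sides of one real edge at index 5.
signal = [0, 1, 0, 1, 0, 4, 5, 4, 5, 4]

print(edge_profile(signal, 1))         # fine scale also reacts to the texture (values of +1 and -1)
print(strongest_edge(signal, [1, 3]))  # 5
```

At the fine scale alone, every texture wiggle produces a response; adding the coarse scale singles out the position where both scales agree.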

V1 is the most general area of the visual cortex; processing becomes more detailed in the areas that follow. The V2 and V3 areas are more complex and have their own receptive fields, which are somewhat larger than those of V1 neurons. The V2 and V3 neurons are also arranged retinotopically. V2 is sensitive to collinear and continuous contours (these are more general contours). In addition, V2 plays a role in processing direction and movement. In conclusion, contour is determined by neurons in V1, V2 and V3.

Item attributes

The V4 area responds to color and shape and facilitates the recognition process. Most V4 neurons respond to a preferred color of their own. This area is also arranged retinotopically, but less sharply than V1, V2 and V3. Just as in V2, neurons with closely related functions lie close together.

The last cortical areas of the ventral path are in the IT. The IT responds to both color and shape and has large receptive fields. In V4 and the PIT (posterior inferotemporal), 70% of neurons are sensitive to specific objects. The combination of active neurons indicates which object is seen; this can be a person. The neurons in the CIT (central inferotemporal) and AIT (anterior inferotemporal) have the most complex sensitivity to objects. Together, these areas (V1, V2, V3, V4 and the IT with its subdivisions PIT, CIT and AIT) form the ventral path and determine what you see. The further along the path, the more sensitive and specific the neurons are.

Grandmother cells

Neurons have been found in the AIT and STPa that respond very sensitively to biologically important visual patterns. They respond strongly to faces, hands and limbs and carry complex and abstract information. Their specificity was demonstrated by presenting heads in different ways. Multiple areas are involved in forming this representation. If one area is damaged, its function can still be taken over by other areas (with the exception of V1).

Learning associations

Research showed that after training, some neurons became more sensitive to learned information. Their responses became faster and stronger. These studies show that the brain can learn and is plastic, and also that associative learning takes place at the level of single cells. From the above information, recognition can be described as four phases:

collecting contour information;
creating a form dictionary (from which the correct form is recognized);
retrieving all possible descriptions;
recognizing the object.
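
The four phases above can be read as a processing pipeline. The sketch below is a toy illustration only: the feature names, the contents of the form dictionary and the object descriptions are invented for the example and do not come from the paper.

```python
# Hypothetical form dictionary: which contour features define each form.
FORM_DICTIONARY = {
    "circle": {"closed_curve"},
    "rectangle": {"horizontal_edge", "vertical_edge", "corner"},
    "triangle": {"oblique_edge", "horizontal_edge", "corner"},
}

# Hypothetical object descriptions: which forms make up each object.
OBJECT_DESCRIPTIONS = {
    "ball": {"circle"},
    "door": {"rectangle"},
    "house": {"rectangle", "triangle"},
}


def collect_contours(image_features):
    """Phase 1: gather contour information (stand-in for V1-V3 processing)."""
    return set(image_features)


def match_forms(contours):
    """Phase 2: look up every form whose defining contours are all present."""
    return {form for form, needed in FORM_DICTIONARY.items() if needed <= contours}


def retrieve_descriptions(forms):
    """Phase 3: retrieve all objects that contain at least one detected form."""
    return {obj: parts for obj, parts in OBJECT_DESCRIPTIONS.items() if parts & forms}


def recognize(candidates, forms):
    """Phase 4: pick the fully matched candidate that explains the most forms."""
    complete = [obj for obj, parts in candidates.items() if parts <= forms]
    return max(complete, key=lambda obj: len(candidates[obj])) if complete else None


def recognize_object(image_features):
    contours = collect_contours(image_features)
    forms = match_forms(contours)
    return recognize(retrieve_descriptions(forms), forms)


print(recognize_object(["horizontal_edge", "vertical_edge", "corner"]))                  # door
print(recognize_object(["horizontal_edge", "vertical_edge", "corner", "oblique_edge"]))  # house
```

Note that "house" is retrieved as a possible description in both cases, but it is only recognized when all of its forms are present.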

Role of feedback connections

Within psychological models of recognition, top-down influence plays a major role in the visual process. Here we mainly look at expectations. Stimuli outside the receptive field are often completed using existing knowledge, and people anticipate what they will see. Different responses have also been demonstrated in the V3 area depending on whether the expectation was right or wrong.

Movement plays a major role in this. Cells hardly respond when someone moves their own hand into the receptive field, whereas they respond strongly when something else suddenly enters it. Moreover, the cells often begin to respond before the object is actually in the receptive field, which is why responses are often very fast. The expectation concerns not only seeing your own hand, but also where the hand should come into view and what it should look like. So there is no response if you know what is coming and the expectation is correct.
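
One simple way to picture this behavior is a cell whose response reflects the mismatch between what is expected and what is observed. The sketch below is an assumption made for illustration (the attribute names and the counting rule are hypothetical), not a model described in the paper.

```python
def cell_response(observed, expected):
    """Count the attributes on which the observed stimulus deviates from the expectation."""
    return sum(1 for key in observed if observed[key] != expected.get(key))


# Hypothetical expectation: your own hand entering the field from the left.
expected_hand = {"object": "own hand", "entry_side": "left"}

print(cell_response({"object": "own hand", "entry_side": "left"}, expected_hand))        # 0
print(cell_response({"object": "unknown object", "entry_side": "left"}, expected_hand))  # 1
print(cell_response({"object": "own hand", "entry_side": "right"}, expected_hand))       # 1
```

In this toy version, a stimulus that fully matches the expectation produces no response, while a deviation in either identity or location does.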

However, expectations are not limited to the visual system. When you feel something different from what you expect, you also respond strongly to it. All of this is linked to memory.

The visual system is often described as a feed-forward process, in which information is processed further and further and does not return. However, there are many more feedback connections from V1 to the LGN than connections in the other direction. The feedback probably creates the "I told you so" idea. This can mainly be tested with prototype objects. Why prototypes are recognized faster is not yet known. It probably has something to do with the columns in the IT: activity in a whole column then stands for features of the prototype.

Follow the author: Vintage Supporter