Study guide with article summaries for Artificial Intelligence at Leiden University

Summaries per article for Artificial Intelligence


  • Summaries of the 14 prescribed articles for AI, 2022/2023
  • Summaries of the prescribed articles for AI from previous years
  • See the supporting content of this study guide for how to use the article summaries

Table of contents

  • Modeling visual recognition from neurobiological constraints
  • Is a machine realization of truly human-like intelligence achievable?
  • Untangling invariant object recognition
  • Computing machinery and intelligence
  • Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project
  • Robots with instincts
  • Male and female robots
  • Speed of processing in the human visual system
  • Sparse but not "Grandmother-cell" coding in the medial temporal lobe
  • Perceptrons
  • Learning and neural plasticity in visual object recognition
  • Breaking position-invariant object recognition
  • A feedforward architecture accounts for rapid categorization
  • Hierarchical models of object recognition in cortex
  • Article summaries of prescribed articles for AI - 2020/2021

Related summaries and study assistance

Supporting content I (full)
Article summary of Modeling visual recognition from neurobiological constraints by Oram & Perrett - Chapter



The most researched subject of the human brain is vision. However, all available data show that we still know very little about visual processes. The purpose of this paper is to present the biological aspects of vision.

Anatomy

The anatomy of the visual system is structured as follows. The brain has different lobes, and all parts of the brain are connected to each other in a complex way. The visual cortex occupies 40% of the brain. The visual system can be divided into two main streams. Visual information first travels from the eye via the thalamus to area V1. From there it can go via V2 to MT (the dorsal stream; the 'where' pathway) or via the V2/V3/V4 regions to IT (the ventral stream; the 'what' pathway). The functions of all these areas have largely become clear through studies of people with brain damage. V1 has the most important visual role.

The cortex consists of six layers, with distinct input and output layers: input arrives mainly in layer 4 and output leaves mainly from layer 6. The connections between cortical areas are probably reciprocal: areas can send information to each other, so there is no one-way traffic.

Neurophysiology of visual pattern processes

Contours

Neurons in V1 respond primarily to orientation and spatial scale, and only to stimuli within their own receptive field. They can be divided into two groups: dependent and independent cells. Dependent cells respond to only one kind of stimulus (for example, a line on a dark OR a light background), while independent cells respond more generally (for example, to a line regardless of the background). Both groups respond to the orientation of lines. A third group responds not only to the orientation but also to the length of lines: the hyper-complex cells.

Later research showed that V1 cells can also respond to the surround outside their receptive field, but only when the surround matches the stimulus. V1 cells are arranged retinotopically: cells that lie next to each other represent neighbouring parts of the visual field. The receptive fields of neighbouring neurons therefore partly overlap.

Groups of cells that belong together (such as those that respond to movement or to color) also lie together in the cortex, organized in columns. The deeper layers of the cortex do not show such a clear distinction. Most neurons respond mainly to input from one eye.

The work of Hubel and Wiesel showed that V1 neurons respond to lines and edges of visual stimuli: contours. It also showed that detecting the edges of an object requires integrating information across different spatial scales, which led to the assumption that the analysis of spatial frequencies is a core visual process. In short, V1 determines the contours of an object.

V1 is the first and most general area of the visual cortex; after it, processing becomes more specialized. The V2 and V3 areas are more complex and have their own receptive fields, somewhat larger than those of V1 neurons. The V2 and V3 neurons are also arranged retinotopically. The V2 area is sensitive to collinear and continuous contours (more general contours) and also plays a role in direction and movement. In conclusion, contour is determined by neurons in the V1, V2 and V3 areas.

Object attributes

The V4 area responds to color and shape and facilitates the recognition process. Most V4 neurons respond to a preferred color. This area is also arranged retinotopically, though not as sharply as V1, V2 and V3. Just as in the V2 region, neurons that have a lot to do with each other lie close together.

The last cortical areas of the ventral pathway lie in IT. IT responds to both color and shape and has large receptive fields. In V4 and PIT (posterior inferotemporal cortex), 70% of neurons are sensitive to specific objects. Which object is represented is indicated by combinations of neurons; the object can even be a specific person. Neurons in CIT (central inferotemporal) and AIT (anterior inferotemporal) show the most complex object sensitivity. Together, all these areas (V1, V2, V3, V4 and IT, including PIT, CIT and AIT) form the ventral pathway and determine what you see. The further along the pathway, the more sensitive and specific the neurons become.

Grandmother cells

Neurons have been found in AIT and STPa that respond very selectively to biologically important visual patterns. They respond strongly to faces, hands and limbs and carry complex, abstract information. Their specificity has been demonstrated by presenting heads from different viewpoints. Multiple areas are involved in the recognition process; if an area is damaged, its function can still be taken over by other areas (with the exception of V1).

Learning associations

Research showed that after training, some neurons became more sensitive to learned incoming information: their responses became faster and more active. These studies show that the brain can learn and is plastic, and also that associations can be learned at the cell level. From the above, recognition can be said to proceed in four phases:

  1. collecting contour information;

  2. creating a form dictionary (from which the correct form is recognized);

  3. retrieving all possible descriptions;

  4. recognizing the object.

Role of feedback connections

Within psychological models of recognition, top-down influence plays a major role in the visual process; here we mainly consider expectations. Stimuli outside the receptive field are often completed with existing knowledge, and people anticipate what they will see. Different responses have also been demonstrated in the V3 area depending on whether the expectation turned out right or wrong.

Movement plays a major role here. There is almost no response when someone's own hand enters the receptive field, whereas something unexpected suddenly entering produces strong responses. Often, however, cells begin to respond before the object is actually in the receptive field, which is why reactions are often very fast. The expectation is linked not only to seeing your own hand, but also to where the hand should come into view and what it should look like. So there is no response if you know what is coming and that prediction turns out to be correct.

Expectations do not apply only to the visual system: the moment you feel something different from what you expected, you also respond strongly to it. All this is linked to memory.

The visual system is often described as a feedforward process: information is processed onward and does not return. Yet there are many more feedback connections from V1 to the LGN than feedforward connections the other way around. This feedback probably creates the "I told you so" signal. This can best be tested with prototype objects. Why prototypes are recognized faster is not yet known; it probably has something to do with the columns in IT, where activity in a whole column stands for features of the prototype.

Article summary of Is a machine realization of truly human-like intelligence achievable? by McClelland - Chapter



Why are people smarter than machines? This used to be a relevant question: research has long tried to mimic human cognitive abilities, yet we have not come very far. For example, Newell and Simon built a mathematical system that is generally better at solving formulas than a human; still, in 1980 the smooth, adaptive intelligence of humans could not yet be imitated. That intelligence includes perceiving objects in a natural environment and the relationships between them, understanding language, retrieving relevant information, and carrying out suitable actions. Because this was some time ago, the relevant question now is: is it still true that people are smarter than computers, and if so, why?

There has certainly been progress since the 1980s, for example with a chess program (Deep Fritz). This program has defeated the world chess champion. However, the question is whether Deep Fritz has now learned how to play chess. Is it more than just a smart way to program and let a machine look up tables efficiently?

Serre has designed a program based on a feed-forward neural network. The program first learns a certain algorithm and is then able to perform a categorization task (assessing whether something is an animal or not). The results correspond to how a person would do this; this is called a natural cognitive task. Similar progress can be seen in other natural cognitive tasks, such as language processing, memory, planning and choosing actions.
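
To make this concrete, here is a minimal sketch of the general idea, not Serre's actual model: the random feature stage, the toy data and the trained readout are all illustrative assumptions.

```python
import numpy as np

# A minimal illustration: a fixed feed-forward feature stage followed by
# a trained linear readout that answers "animal or not".
rng = np.random.default_rng(0)

W_features = rng.normal(size=(64, 16))        # fixed random "feature hierarchy"

def features(x):
    return np.maximum(0.0, x @ W_features)    # one feed-forward sweep (ReLU)

# Toy data: two Gaussian clusters stand in for animal / non-animal images.
X = np.vstack([rng.normal(0.5, 1.0, (100, 64)),
               rng.normal(-0.5, 1.0, (100, 64))])
y = np.array([1.0] * 100 + [0.0] * 100)

# Train only the readout (logistic regression by gradient descent).
F = features(X)
w = np.zeros(16)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-F @ w))
    w -= 0.1 * F.T @ (p - y) / len(y)

def is_animal(image_vec):
    """Single feed-forward pass: no recurrence, no feedback."""
    return features(image_vec) @ w > 0.0

print(is_animal(rng.normal(0.5, 1.0, 64)))    # most likely True
```

The point of the sketch is the architecture: one forward sweep through fixed features plus a simple decision, much as the summary describes.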

However, a number of questions remain about the progress of machine intelligence. According to the author, most systems claiming artificial intelligence do not look at intelligence broadly enough. What people can do, but computers cannot, is think out of the box and reason from that. People can draw conclusions from many different sources at once; even the best current artificial intelligence systems cannot.

Why are people smarter than machines?

The problem with machines is that they are unable to think in an open-ended way; they still depend on their human programming. What they lack is fluency, adaptability, creativity, and so on. You could therefore say that the "smart" part of the machine is still supplied by humans. What is this about? Compared to the human neural network, a machine needs a week to process the information that takes a human ten minutes. More computing power helps, but the question is whether that is enough. Progress is needed in several areas, which are discussed below (partly based on Marr's well-known levels).

Computational theory

Marr made a taxonomy with three levels, making it easy to distinguish between the goal and core of computational theories on the one hand and the algorithms and their realization on the other. He also emphasizes the level of the physical implementation itself. We must look at what information is in the stimulus.

Various studies have shown that little is yet clear about the relationship between stimulus characteristics and the underlying truth. It is very important to find out more about this cognitive computation.

It is quite difficult to find out how the computational problem works. It would be best if there were a program that could use data to discover the relationship between situation and consequence. But how can you best specify what needs to be learned in such a situation? There are two approaches:

  1. The goal must be interpreted as a structured statistical model of the environment, which must find exactly the right format to represent the data.

  2. The goal must be interpreted in terms of optimal expectation, so that the internal model can remain rudimentary (instead of explicit, as in point 1).

Both approaches need further investigation: the first is too restrictive, while the second is too open. We must look at how constraints can guide the search toward the optimal solution. We currently know a lot about the use of simple ("flat") solutions, but this does not provide enough information; instead, we need information about representations at multiple levels.

Algorithm and representation

We need to understand not only what makes a stimulus a stimulus and what the best strategy is for using it, but also how a computational mechanism can apply this as efficiently as possible.

The systematic structure (computational basis) of the characteristics of brain representations is currently being examined. This makes it possible to visualize how lower-level representations in the visual and auditory system are natural solutions in response to the structure of natural visual and auditory stimuli. This approach can be used to understand higher-level representations.

Architecture

Some architectures are intended to map human cognition, others are intended to build modern artificial cognitive systems.

The literature emphasizes a combination of explicit symbols on the one hand and implicit sub-symbols on the other (with more attention to the connections between them). An example is SAL (a combination of the ACT-R model and the LEABRA architecture). According to the author, the best outcome would be a system that is mainly sub-symbolic, in which the cognitive processes now seen as symbolic are the output of the computations of the sub-symbolic processes.

The von Neumann computer is currently used as the underlying computer architecture. Although there is a need for a more brain-like system, the von Neumann architecture underlies most computer programs of human cognition.

Nurturance, culture and education

The fourth and final part of understanding cognitive computation is the role of nurturance, culture and education in structuring human cognitive capacities. Human mental capacity is formed by experience, and experience is shaped by culture and by social and governmental influences. Understanding how cognitive capacity arises will also give us more information about how to achieve artificial intelligence.

Article summary of Untangling invariant object recognition by DiCarlo & Cox - Chapter



This article provides a graphical perspective on the computational challenges of object recognition. It also considers which neuronal populations are responsible for representing objects. Our daily activities depend on quick and accurate recognition of visual stimuli: we can recognize thousands of objects within seconds. Which brain mechanisms are involved in this process, however, is still unknown.

Object recognition is defined as being able to accurately distinguish objects or categories from all possible retinal images, despite identity-preserving transformations. Object recognition is difficult for a variety of reasons. The main reason is that every object can produce an infinite number of different images on the retina, yet is recognized from every angle. This is also called the invariance problem: we never see exactly the same image twice and can nevertheless recognize things.

Computational processes

When solving a recognition task, the brain must use internal neuronal representations. These representations arise from visual input, and from them the brain makes a choice. The brain must use a decision function to separate the neural activity fired when object A is presented from the activity fired when it is not. Somewhere in the brain are the right neurons reacting to the object, and these must be read out. The central question in this process remains: which neuronal representation format supports the choice, and which decision functions belong to that representation?

You can view this problem from two sides. On the one hand, object recognition is the problem of finding complex decision functions; on the other, it is the problem of finding operations that progressively transform the retinal representation into a new representation, followed by simple decision functions. The latter view is well suited to investigating the architecture of the visual system (especially the ventral pathway).

Object recognition is difficult

Our eyes fixate the world for an average of 300 ms and then move on. During each glimpse, a visual image is captured and represented by at least 100 million cells. Such a representation is high-dimensional. Within that space, a single object such as a face corresponds to a low-dimensional representation.

That is, a fixed object can be seen in many different ways; the set of all images an object can produce is called its manifold. Different objects have different manifolds.

The manifolds of the different objects are tangled together in the brain's input. This means that the retina does not itself recognize what we see, but it does pass on the information needed to decide what we see.

You can see the brain's recognition mechanism as a transformation of the incoming visual representation into one from which recognition is easy to read out. How this transformation works, however, has not yet been decoded.

The ventral visual path

This pathway untangles the manifolds into objects. The ventral pathway runs from V1 to V2 to V4 to IT. Gross's studies showed that IT contains the most specific, complex neurons. These neurons likely underlie object recognition, because they respond specifically to certain shapes while being reasonably insensitive to changes in object position.

Recognition performance is not the result of a complex readout, but of how good the visual representation in IT cortex is. This means that the manifolds are far less tangled in IT cortex, whereas in V1 they are still heavily tangled (just as in the retinal representation). In short: the ventral pathway enables object recognition by untangling the manifolds. How this happens is not yet known.

Meanwhile, there are several ideas about, and investigations into, the workings of the ventral pathway. Some neurophysiologists have focused on characterizing the tolerance of IT neurons to transformations of objects, which is closely related to object (un)tangling. Other research aims to understand the shape dimensions to which neurons are tuned. These studies are important for defining the complex tuning properties of the ventral visual pathway that relate to manifold untangling.

The object-tangling perspective leads to a different approach. Recognition is attributed not to individual IT neurons but to population representations. In addition, this perspective assumes that the goal being pursued determines how well the ventral visual pathway untangles the manifolds. This perspective offers a better way to build computational models, because populations can be more meaningful than individual neurons.

This perspective also states that focusing on the cause of the tangling is better than focusing on the features or shapes to which a neuron responds. Finally, with this perspective hypotheses can be tested, which can lead to new biological hypotheses.

Flattened manifolds

By flattening manifolds, one can perhaps see what happens. We are looking for transformations that flatten one manifold without interfering with others; this allows the correct neurons to be identified. At the IT level, untangling would ideally fold each object manifold into a single point. This suggests that untangled IT representations not only directly support object recognition, but also other tasks such as reporting position, location and size. IT neurons accordingly have large but limited receptive fields; here the limitation works to their advantage. The untangling of manifolds can in principle be observed in neuronal imaging, yet it is very difficult to see.

Further analysis shows that V1 sees the world through a narrow aperture, and V2 does much the same; after that, recognition gets better and better. Three mutually consistent computational ideas could implement the untangling behind this physiology:

  • Idea 1: the visual system projects incoming information to higher dimensional places so that the data spreads more in the space.

  • Idea 2: neuronal sources are present at every stage that correspond to the distribution of visual information from the real world.

  • Idea 3: time implicitly supervises manifold flattening.

Article summary of Computing machinery and intelligence by Turing - Chapter



The imitation game

If you want to answer the question "Can machines think?", you have to start by defining your terms. But if you define them by everyday usage, the answer becomes a matter of statistical survey, and that is not the intention.

A new form of the problem can be explained by the imitation game. There are three people: a man (A), a woman (B) and an interrogator (C, whose own gender does not matter). The interrogator sits in a different room from A and B and must discover, through a series of questions, which of the two is the man and which is the woman. A's task is to make C fail; B's task is to help C succeed. The interrogator has no knowledge of their voices or appearance and ideally none of their handwriting.

The question of the article thus becomes: Will the computer be as good at misleading C as a real person?

Criticism of the new form of the problem

Is this question worth investigating? The new formulation of the problem means that it is not necessary to involve characteristics of people that are difficult to imitate. To investigate thinking, you don't have to "dress" the robot humanly. For example, you don't have to mimic human-like skin.

One criticism of this formulation is that the machine is at a disadvantage: if a man had to imitate a computer, he would quickly be exposed by mathematical questions (for example, by responding too slowly). According to the author, however, there is no reason to believe that the machine is unable to play the imitation game appropriately.

A final point of criticism is that the machine may adopt a different strategy than imitating a man's behavior. However, according to the author, there is no reason to believe that imitating a man is not the best strategy.

The machines in the game

In the article, the following is meant by machine: an electronic computer or a digital computer. These are the only machines that participate in the imitation game.

Digital computers

Digital computers are intended to carry out the operations of a human computer. A human computer must abide by fixed rules and may not deviate from them. You can imagine these rules written down in a book with an unlimited number of pages; the book can change per task. A digital computer can be divided into three parts:

  1. Storage: you can compare this with the paper (on which you perform your calculations or on which the book is printed). If the calculations are done by heart, then you can compare it with the memory. Information in the storage can be divided into small packages.

  2. Execution: these are the different actions that an individual must perform during a calculation. This can vary per machine.

  3. Control: here it is checked whether the rules are enforced.

The digital computer must be able to hold an order and repeat it, so that a new order does not constantly have to be given. Compare it with the following example: Bart is sick and has to take a pill every day at 5 pm. His mother can remind him every day, but she can also put up a note telling him to take the pill every day. Once Bart is better, the note can be removed. This is the way a computer works.

If you want to imitate the human computer as precisely as possible, you must know how the action was carried out and then translate the answer into a specific instruction table. This is called programming.

The storage of digital computers is generally limited. However, it is not difficult to imagine computers that have infinite storage. We refer to these computers as infinite capacity computers.

The idea of a digital computer is not new: between 1828 and 1839 Charles Babbage worked on his Analytical Engine. The Analytical Engine was entirely mechanical, not electrical. The nervous system and all computers built since the Analytical Engine are electrical. However, since chemical activity in the nervous system is just as important, and in certain computers storage is acoustic, the use of electricity is only a superficial similarity.

Universality of digital computers

The digital computers discussed earlier belong to the class of discrete state machines: machines that move in sudden jumps between distinct states. Strictly speaking, no machine is built this way, but in many cases it is useful to treat them as if they were (think of a light that is either on or off).

An abstract explanation: consider a machine whose state is the position of a wheel. There are position 1, position 2 and position 3 (q1, q2 and q3). A light shows which position the wheel is in. The input signal is a lever (i0 or i1). Given the last state and an input signal, you can put the next state in a table:

 

              Last state
              q1    q2    q3
Input   i0    q2    q3    q1
        i1    q1    q2    q3

You can write down the output (namely the light you see) as follows:

State     q1    q2    q3
Output    o0    o0    o1

In other words, if the machine was last in state q1 and the lever gives input i1, the machine stays in q1 and the output (the light you see) is o0.
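
The wheel-and-lever machine can be implemented directly as a lookup in the two tables above. A minimal sketch; the dictionary encoding is our own choice, while the state, input and output names follow the tables:

```python
# Transition and output tables of the wheel-and-lever example.
TRANSITIONS = {
    ('q1', 'i0'): 'q2', ('q2', 'i0'): 'q3', ('q3', 'i0'): 'q1',
    ('q1', 'i1'): 'q1', ('q2', 'i1'): 'q2', ('q3', 'i1'): 'q3',
}
OUTPUT = {'q1': 'o0', 'q2': 'o0', 'q3': 'o1'}

def step(state, signal):
    """One jump of the discrete state machine: new state plus the light we see."""
    new_state = TRANSITIONS[(state, signal)]
    return new_state, OUTPUT[new_state]

# Because both tables are finite, every future state is predictable from the
# current state and the input sequence -- the Laplacian property noted below.
state = 'q1'
for signal in ['i0', 'i0', 'i1', 'i0']:
    state, light = step(state, signal)
    print(state, light)   # q2 o0 / q3 o1 / q3 o1 / q1 o0
```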

Only a limited number of states is possible, so you can predict all possible outcomes. This is in line with Laplace's view that you could predict all future states of the universe, provided you knew its complete state at one moment. In practice, however, the smallest deviation can have the greatest consequences. A discrete state machine has the feature that this cannot happen.

If a machine plays the imitation game, it can play either part A or part B, and it becomes difficult for the interrogator to see the difference. However, the machine must have sufficient storage capacity, work fast enough and be reprogrammed appropriately, depending on the role it must perform.

A universal machine means that a digital computer can imitate any machine with a discrete state. The advantage of this is that you only have one machine that can perform various calculations.

Contradictory views on the main question

According to the author, within 50 years computers can be programmed so well that after five minutes of questioning the interrogator has no more than a 70% chance of making the right identification. Although the author considers the question "Can machines think?" itself too meaningless to discuss, he expects that in the future there will be a general consensus that machines can indeed think. He also emphasizes that scientists do not merely jump from scientific fact to scientific fact; there is also guesswork. As long as a distinction is made between facts and conjectures, he believes conjectures make a relevant contribution to science.

Now 9 conflicting views are discussed:

1. Theological objection

Thinking is part of the immortal soul. Because God has given an immortal soul to every human, but not to animals or machines, no animal or machine is capable of thinking. This argument would be more convincing if animals were classified together with humans. It also implies a limitation on the ability of God: there are things that even God is generally accepted to be unable to do, but is it not very restrictive to say that he could not give an animal a soul?

2. The "head-in-the-sand-objection"

The consequences of thinking machines are frightening; let's just hope this won't happen. Although this argument is rarely so frankly expressed, it does affect the people who deal with this topic. Man tends to think he is the superior species in the world, but thinking machines endanger this position. This way of thinking (the superiority of man) is probably also the reason that the Theological Objection gets a lot of support.

3. The mathematical objection

There are several mathematical results showing that the power of discrete state machines is limited. An example is Gödel's theorem: in sufficiently powerful logical systems, statements can be formulated that can be neither proved nor disproved within the system, unless the system itself is inconsistent. To apply Gödel's theorem here, the logical system must be described in terms of machines and machines in terms of logical systems. In this case we are talking about a digital computer with infinite capacity. Such a machine has limitations when playing the imitation game: it will sometimes give the wrong answer or no answer at all. The mathematical objection seizes on this: machines have limitations to which human intellect is supposedly not subject.

Suppose a machine gives the wrong answer; then (according to this objection) people get a feeling of superiority. According to the author, not much value should be attached to machines occasionally giving wrong answers: people give plenty of wrong answers themselves. Moreover, such a triumph is only over one particular machine at one moment; it says nothing about all machines, and there could well be cleverer ones.

4. The argument of consciousness

The moment a machine can write a poem from emotions and thoughts, and is aware of having written it, we could grant that a machine equals the brain. In its most extreme form this argument says that you can only be sure that thinking occurs in yourself (the solipsist position). On that view we could never find out whether a machine thinks, because to do so we would have to be the machine ourselves. The imitation game is often used in the form of a viva voce, to find out whether someone actually understands a question or has only learned a pattern. According to the author, most supporters of this argument would sooner abandon it than be driven to the extreme, solipsist position.

5. Arguments of various incapacities

There are things machines can indeed do, but "you will never get a machine to do X". X stands for many different things: being kind, being nice, being funny, learning from experience, enjoying ice cream, making mistakes, and so on. There is no real support for these claims; according to the author they arise from scientific induction: a person has seen a limited number of machines in his life and draws general conclusions from them. The claims listed do not follow from such induction. A number of them are elaborated below:

  1. A machine cannot make mistakes: the definition of "making mistakes" must be examined. Errors of functioning are due to a mechanical or electrical fault that makes the machine do something it was not designed to do; in a philosophical discussion we talk about abstract machines, which cannot make such errors. Errors of conclusion can only occur when a meaning is attached to the machine's output; for example, if a machine types out calculations, it can type a false one.

  2. A machine cannot be the subject of its own thoughts: you can only show that a machine can be the subject of its own thoughts if you can show that it has thoughts with at least some content. For example, if you give the machine an equation to work on, you could say that the machine is at that moment thinking about the equation.

  3. A machine has no diversity in its behavior: it is the same as saying that a machine does not have enough storage capacity. The statements mentioned above are mainly related to the concept of consciousness.

6. The objection of Lady Lovelace

Lady Lovelace said: "The Analytical Engine does not pretend to originate anything of its own; it can do whatever we know how to order it to perform." Hartree adds that it may become possible to make a machine think in the future, but that this seemed impossible at the time. One form of Lady Lovelace's objection is that a machine can never really do anything new; another form is that a machine can never surprise us.

7. The argument of continuity in the nervous system

Because the nervous system certainly does not resemble a discrete state machine (on the contrary, it is continuous), you could never reconstruct the nervous system with a discrete state machine. Discrete is indeed not the same as continuous, but this difference gives the interrogator no advantage in the imitation game.

8. The argument of informality of behavior

It is not possible to make a program in which every conceivable situation is anticipated. The argument runs: if every person had a definite set of rules for how to behave, he would be no better than a machine; but there are no such rules, so people cannot be machines. A distinction must be made between two expressions:

  1. Rules of conduct: knowing, for example, that you must stop at a red traffic light. You can follow these rules and be consciously aware of them.

  2. Laws of behavior: these are laws of nature as applied to a person's body. When you get pinched, your body responds.

We can never finish investigating all laws of behavior. That is why you can never say definitively "we are not machines": there are no circumstances under which one could say that enough has been investigated.

9. The argument of extra-sensory perception (ESP)

The last argument is about extra-sensory perception and terms such as telepathy, clairvoyance, precognition and psychokinesis. These ideas go against our scientific insights, yet there is said to be evidence (for example for telepathy). Suppose you play the imitation game with a computer and a telepath. The interrogator can ask: "What is the suit of the card I am holding?" Out of 400 cards, the telepath gives, say, 130 correct answers, while the computer, guessing at random among four suits, gets about 100 (say 104) right. This way you can tell which is the computer. If the interrogator has psychokinetic powers, the computer may score better than chance would predict; if the interrogator is clairvoyant, he can identify the players anyway. In short, with ESP basically anything can happen.

Learning machines

Building on Lady Lovelace's argument, you could say that when you put an idea into a machine, something briefly happens and then the machine stops responding. Another comparison is with an atomic pile. Suppose the pile is of sub-critical size: an incoming neutron causes a disturbance that eventually dies out. If the pile is of critical size, the disturbance continues to grow until the whole pile is affected. There is an analogous system in human minds: most are sub-critical, but a small portion is super-critical, and an idea entering there gives rise to a whole "theory" consisting of further ideas at different levels. The question then remains: can machines be made super-critical?

Different parts of the brain provide different functions of the mind. You can compare this with an onion: to get at the "real" mind, you have to peel off the layers one by one, in the hope of eventually arriving at the "real" mind.

Compared to nerve cells, modern machines that mimic the neural network work a thousand times faster, so speed is not the issue. What must be sought is the way the machines should be programmed to play the game. To achieve this, we must consider three components:

  1. The initial state of the brain (around birth)

  2. The education to which the brain is subject

  3. Other experiences, apart from education, to which the brain is subject

The idea is that if you focus on mimicking the brain of a child, education will eventually make it resemble the brain of an adult. The child's brain is hardly programmed. In this way the problem is divided into two parts: the child program on the one hand and the education process on the other.

The comparison with evolution is as follows:

  • The structure of the child machine = genetic material

  • Changes to the child machine = mutations

  • The assessment of the researcher = natural selection

The researcher can ensure that mutations and the process of natural selection proceed quickly and efficiently. The machine must be able to learn through conditioning. But if, apart from rewards and punishments, there is no other way to communicate, the amount of information the machine receives can never exceed the number of rewards and punishments given; by the time the student finally learned to say a word, it would be black and blue from punishment. So there must be a non-emotional channel of communication, for example a (symbolic) language in which orders can be given. The system must then contain definitions and assertions consisting of facts, formulas, propositions given by an authority figure, and so on. The teacher can say to the machine: "Go and do your homework".

The imperatives that a machine executes can include commands that regulate the application of the rules of the logical system itself. Think of rules such as: "If one method is faster than another, never opt for the slower method." Such statements can be learned either from authority figures or by the machine itself (scientific induction).

The paradox of the learning machine is that a machine learns rules it must adhere to, yet these rules can also change. The resolution is that the rules that change are only short-lived ones.

In this way, the teacher may not always understand what is "going on in the machine." The argument that the machine can only do what we say it should do is no longer valid.

A random element in a learning machine can have advantages. If you search systematically, you may first have to work through a large region in which no good solutions exist at all. With a random element, there is a greater chance of reaching a good answer sooner. The systematic method is impossible in evolution: you cannot keep track of which genetic combinations have already been tried.

Intellectually, machines may be able to compete with humans in the future, but the question remains what type of machine is most suitable for this.

Article summary of Rule-Based Expert Systems: The MYCIN Experiments of the Stanford Heuristic Programming Project by Buchanan & Shortliffe - Chapter



The branch of computer science concerned with symbolic, non-algorithmic methods of problem solving is called artificial intelligence (AI). An algorithm is a procedure guaranteed either to find a fitting solution to a problem or to show that no solution exists. For many years, computers have performed numerical calculations for data processing. However, knowledge about a subject is rarely numerical; it is usually symbolic, and its problem-solving methods are rarely mathematical. AI therefore relies on heuristics, which are not guaranteed to work but often find solutions more quickly than trial and error. MYCIN is an expert system designed to:

  1. Provide expert-level solutions to complex problems.

  2. Be understandable.

  3. Be flexible enough to easily accommodate new knowledge.

What is the context of the MYCIN system?

The MYCIN system consists of two parts: a knowledge base and an inference mechanism. Sometimes subprograms are added to facilitate interaction with users. There are two flows of information between the user and the system: the user inputs information in the form of a description of a new case, and the expert system outputs advice and explanation. These interactions all go through the user interface, which communicates with the system.

The underlying knowledge base is the program's store of facts and associations about a subject. The inference mechanism, or control structure, can take many forms, for example a chained-together set of rules about the facts and statements in the knowledge base. Chaining forward from known data to a solution is called forward chaining or data-directed inference. MYCIN, however, primarily uses backward chaining, a goal-directed control strategy: the system starts with a statement (the goal) and works backwards through the inference rules to find the data that would establish that goal. Because of the many rule chains and the data the system must inquire about, MYCIN is sometimes referred to as an evidence-gathering program. The goal of MYCIN is to provide diagnostic and therapeutic advice about a patient; this advice-giving part is referred to as the performance system, since the other subsystems are not directly involved in giving advice.
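
To make the contrast concrete, here is a minimal sketch of goal-directed backward chaining in the style described above. The rules, fact names and goal are invented for illustration; they are not MYCIN's actual rule base or rule format.

```python
# Each rule: if all premises hold, the conclusion holds (illustrative rules only).
RULES = [
    ({'gram_negative', 'rod_shaped'}, 'enterobacteriaceae'),
    ({'enterobacteriaceae', 'lactose_fermenter'}, 'e_coli'),
]

def backchain(goal, known_facts, ask):
    """Try to establish `goal`: recurse on the premises of rules that
    conclude it, and ask the user only when no rule concludes a fact."""
    if goal in known_facts:
        return True
    concluding = [rule for rule in RULES if rule[1] == goal]
    if not concluding:                 # no rule concludes it: gather evidence
        return ask(goal)
    for premises, _ in concluding:
        if all(backchain(p, known_facts, ask) for p in premises):
            known_facts.add(goal)
            return True
    return False

# The questions the user sees are driven by the goal, not by the data:
answers = {'gram_negative': True, 'rod_shaped': True, 'lactose_fermenter': True}
print(backchain('e_coli', set(), lambda q: answers[q]))   # True
```

Note how the interrogation order falls out of the goal structure; this is why a backward-chaining consultation can stay focused on one line of reasoning at a time.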

How was the programming language chosen?

For the creation of MYCIN, the LISP programming language was used, because of its extreme flexibility based on a small number of simple constructs, and because it allows rapid testing and modification. This way, the medical rules in the knowledge base could easily be kept separate from the inference procedures that use them. LISP also does not require recompilation of programs to test them, and it removes the distinction between program and data, so parts of the program can be examined and edited as if they were data structures. Later, with the release of Interlisp, additional tools such as EMYCIN were built on top of Interlisp.

What is the historical perspective on MYCIN?

Production rules were probably introduced into artificial intelligence by Allen Newell, who saw in them an elegant formalism useful for psychological modelling. Production rules can be used to encode domain-specific knowledge; for example, Waterman (1970) worked with heuristics and rules of a poker game. This line of work led to the DENDRAL program, the first AI program to emphasize specialized knowledge over generalized problem-solving capabilities. The program was able to construct explanations of analytic data about molecular structures. A major concern was representing the specialized knowledge of the chemistry domain so that the computer could use it in problem solving. In a sense, MYCIN is an outgrowth of DENDRAL, because its design and implementation are based upon it.

Before a system can make appropriate therapeutic decisions, it should contain all the knowledge needed for antimicrobial selection. By changing the input from patient databases to direct data entry, a consultation with physicians was realized, making the system interactive. The model for MYCIN should be able both to diagnose and to suggest therapy. One difficulty was seeing how a long dialogue between physician and system could be kept focused on one line of reasoning at a time. Furthermore, translating the ill-structured knowledge of infectious disease into semantic networks remained difficult. The first grant application of MYCIN described the goals of the project:

  • A consultation program that provides physicians with advice on antimicrobial therapies, based on microbiological data and the physician's direct observations.

  • It should have interactive explanation capabilities to explain its knowledge of disease therapy and justify recommendations.

  • It should support computer acquisition of judgemental knowledge: the MYCIN system should be able to be taught therapeutic decision rules useful in clinical practice.

At the time of building MYCIN, researchers also worked on other subprojects in AI, such as question answering (QA), inference, explanation, evaluation and knowledge acquisition.

What about MYCIN’s task domain; antimicrobial selection?

The nature of the decision problem in MYCIN originates in its task domain: antimicrobial selection. An antimicrobial agent is a drug designed to kill bacteria or arrest their growth. Selecting antimicrobial therapy means choosing an agent (or combination of agents) to treat a patient with an infection. An antimicrobial agent produced naturally by bacteria or fungi is called an antibiotic; some of these are too toxic for treating infectious diseases. Besides antibiotics, synthetic antimicrobials can be used to treat infections. The cause of the infection serves as a clue for deciding which drugs are beneficial. The selection of therapy proceeds in four parts:

  1. The physician decides whether the patient has a significant infection.

  2. If this is the case, the organism causing the infection needs to be identified.

  3. Select a set of drugs that might be appropriate.

  4. Finally, the drug (or combination of drugs) should be chosen for treatment.

How can useful drugs be selected for the patient?

First, the isolation of bacteria from a patient is not in itself evidence of a significant infection: bacteria are often important to the homeostasis of the patient's body. A second challenge is that samples can be contaminated with external organisms, so it is wise to use several samples. Significance is then judged on clinical criteria, which allow the physician to assess the seriousness of the infection. Second, there are several laboratory tests to identify the organism causing an infection, but the complete test and definitive identification can take 24-48 hours, and the patient cannot wait several days before therapy. Early data therefore become important for narrowing down the possible identities, as does historical information about the patient. Lastly, to discover the range of antimicrobial sensitivities of the organism, in vitro tests are run: the bacterium is exposed to several commonly used antimicrobial agents and its sensitivity is analysed. The physician then knows which drugs can be effective in vivo (in the patient). Since this data is only available after several days, a decision is often based on statistical data from hospital laboratories. Once a list of potentially useful drugs is created, the likelihood of their effect should be considered: the patient's allergies, sex, age and kidney status should be examined, as should the administration of the drug. As the patient's status can vary, so should the recommended dosage of the drug.

Where did the evidence for needing assistance come from?

The importance of sulfonamides and penicillin cannot be overstated. In the 1950s it became clear that antibiotics were being misused, and at the time MYCIN was developed this misuse was receiving a lot of attention. Many antibiotics were prescribed without identifying the offending organism, and antibiotics were overprescribed partly because patients demand a prescription with every visit. Improved public education is one step toward diminishing the problem. Studies have shown that one-third of hospitalized patients receive some kind of antibiotic, and the monetary cost is enormous. The issue was summarized by Simmons and Stolley in 1974 in six questions:

  1. Have new resistant bacterial strains emerged because of the wide use of antibiotics?

  2. Has the hospital ecology changed because of this use?

  3. Have infections changed because of antibiotics misuse?

  4. What are trends in the use of antibiotics?

  5. Is there a proper use of antibiotics?

  6. Is the more frequent use of antibiotics presenting new hazards?

The answers to these questions are frightening, to the extent that the consequences of misuse might be worse than the diseases being treated. This raises a new question: are physicians basing their prescription habits on rational decisions? In a study by Roberts and Visconti (1972), only 13% of prescriptions were judged to be rational. The goal of MYCIN is to improve therapy decisions; an automated consultation system could provide a partial solution to the therapy selection problem.

Article summary of Robots with instincts by Adami - Chapter



The ability to predict the future is in some cases seen as a form of intelligence. Our brains are adjusted so that they can quickly compare the different outcomes of our actions. How do they do this?

Cully et al. show that robots can learn to recover quickly from physical damage. The underlying idea is that they must adopt a new strategy to be able to continue. This looks like instinct, but what the robots do is compare their previously learned strategies with each other so that the best strategy is chosen.

Three things are needed to make an accurate prediction about your behavior: experience, understanding how the world works and being able to judge how your own actions contrast with those of others.

Previous studies state that the ability to plan depends on the ability to program a representation of the world in a robot. If this succeeds, how can future actions be looked up quickly and efficiently?

In the research by Cully et al., the robots were asked to find the best strategy after being damaged. Before being damaged, the robots had built a baseline of possible solutions. After the damage, the possible movements were tried out before the robot decided which behaviour would best compensate for the damage.

Because of the embodiment of the robot, there is only a limited number of actions it can perform. The authors mapped out the many kinds of actions a robot could perform and assessed the suitability of each motor movement, measured, for example, as "distance the robot can travel". The robots learned the new actions through special-purpose machine-learning algorithms.

The special-purpose machine-learning algorithms are not the same as our cognitive systems, but both are strongly constrained by embodiment. What you can and cannot do with your body is something you have to learn through experience and trial and error.

It is very difficult to imitate the functioning of our brains: the aforementioned algorithm was designed by researchers, while the way our brains have adapted is the result of millions of years of survival of the fittest. So far it has not been possible to imitate our brains' fast, intuitive, situation-specific behaviour. That is why it is better to focus on adaptive and evolutionary algorithms.

Article summary of Male and female robots by Da Rold, Petrosino, & Parisi - Chapter



The difference between male and female robots is that after mating, the female is nonreproductive for a set period, while the male can mate again. Males therefore have a greater variance in reproductive success, because they actively search for a scarce resource. Reproductive females are also less active than males: they wait for males to find them. Nonreproductive females, on the other hand, are as active as males when it comes to looking for food, but they look only for food and are not interested in anything else. A final difference was found in the preferred type of food and in offspring care, the latter arising only when males lack parental certainty.

What is the objective of this study?

Mating has different consequences for males and females. The objective of this study is to describe the behavioural consequences of the single difference that males can reproduce again immediately after mating while females cannot. The idea is to reproduce this phenomenon in artificial systems: if the system behaves the same as the real phenomenon, the model used can help explain it. The robots' behaviour is not hand-designed by the researchers; rather, male and female robots live and reproduce in an artificial environment, which reveals the differences in evolved behaviour between male and female robots. Robots are useful for illustrating differences between males and females because, when constructing robots, the behavioural assumptions must be stated explicitly.

One of the differences is that reproductive female robots are less active than male robots; however, when females mate successfully this is not the case. One must guard against the stereotype of active males versus reactive females: it should be measured objectively. The robots used here are highly simplified, and the results of the study may apply to some species but not to others; the nature of sexual differences depends on the adaptive pattern of each species. The robots in this experiment provide a better understanding of behaviour observed in real animals and of its evolutionary origin.

Few experiments have been done on sexual differences in robots or on computational models of the behaviour of male and female organisms. Studies investigating genetic algorithms typically recombine the genotypes of two different animals. The only difference between the male and female robots in this experiment is the consequence of mating: the species is assumed to be sexually different without specifying in what ways, the only dissimilarity being appearance. The goal is to find out whether male and female robots behave differently and what those differences are.

How is the experiment executed?

A group of one hundred Khepera robots was used in this study. Each robot has an energy level ranging from zero to one; when the energy level reaches zero, the robot disappears. Food tokens are used to represent eating. Half of the robots are male, the other half female, and mating happens when two robots of different sex touch. At the end of a generation, the number of mating events is counted, and the top ten males and top ten females generate offspring. This evolutionary cycle is repeated for 1,000 generations, and the whole simulation is run 10 times. The behaviour of each robot is controlled by a neural network encoded in that robot's genes.
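
A minimal sketch of the selection loop just described may help. The genome encoding, fitness function and mutation rate are illustrative assumptions, not the paper's actual parameters or neural network; the structure does follow the article's setup in that each sex inherits only from parents of the same sex, with no crossover.

```python
import random

GENOME_LEN = 20          # stand-in for the weights of a robot's neural network

def random_genome():
    return [random.gauss(0.0, 1.0) for _ in range(GENOME_LEN)]

def fitness(genome):
    # Placeholder for "mating events / food eaten" measured in the simulation.
    return -sum(g * g for g in genome)

def mutate(genome, rate=0.05):
    # Each gene has a small chance of being perturbed (illustrative rate).
    return [g + random.gauss(0.0, 0.2) if random.random() < rate else g
            for g in genome]

males = [random_genome() for _ in range(50)]
females = [random_genome() for _ in range(50)]

for generation in range(1000):
    # The top ten males and top ten females generate the next generation.
    best_m = sorted(males, key=fitness, reverse=True)[:10]
    best_f = sorted(females, key=fitness, reverse=True)[:10]
    males = [mutate(random.choice(best_m)) for _ in range(50)]
    females = [mutate(random.choice(best_f)) for _ in range(50)]

print(max(fitness(g) for g in males + females))
```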

What were the results of the experiment?

The variance in reproductive success is the same for males and females, although male robots have a higher number of mating events. The number of mating events increased over generations, so the robots evolved the ability to mate and eat successfully. Length of life and number of food tokens are correlated, but not perfectly; females seem to eat more efficiently. Length of life and mating success are correlated differently for the sexes: female mating success is strongly correlated with length of life, which is not true for males. This implies that two different adaptive strategies are in use.

Males and females produce different results in terms of mating and eating. Females behave differently depending on whether they are reproductive or nonreproductive. Males were always active, looking for food or for reproductive females. Reproductive females were not active and waited for males; nonreproductive females were active but looked only for food, not for males.

The behaviour of the robots was also studied in a controlled laboratory experiment in which a robot is given the choice between two items: a male, a reproductive female, a nonreproductive female or a food token. Males prefer a reproductive female over food, and even more strongly over a nonreproductive female; when a male must choose between a nonreproductive female and food, it prefers food. Reproductive females prefer food over a male, and when choosing between a male and a nonreproductive female, they prefer the male only slightly.

Finally, the choice between different types of food was simulated by colouring the available food. Males do not prefer one type of food over another, whereas females have specific food preferences that depend on their environment: reproductive females prefer more energetic food. When female robots are present, a male tends to eat less (energetic) food. This can be explained by males being socially aware of their environment and of the presence of females, and behaving accordingly.

What is suggested for further research?

In the article, five main suggestions are made to elaborate on the current research.

  • The emergence of families could be studied. An important implication is that the lives of the robots would then have to be linked to specific places, which the robots should be able to recognize.

  • The robots are described as male and female but are not individually different. The influence of sexual attractiveness could be measured to mirror sexual selection. One factor to consider is recreational sex, i.e. mating between a male and a nonreproductive female; sexual attractiveness might also differ between nonreproductive and reproductive females.

  • Animals have several motivations, and because no more than one motivation can be pursued at a time, emotional states and their expression help guide which decisions are made; these states are based on the interaction of the brain with the rest of the body. The expression of emotions related to sex and parenting might, however, be a separate subject for research.

  • The genetics of the current study could be made more realistic. At present, male robots inherit only the genotype of their father and females only that of their mother; there is no crossover.

  • Lastly, female menopause and grandmothers could be considered. In real animals, females can become permanently nonreproductive. Female robots in menopause could be used to explore the grandmother hypothesis: that females continue to live after menopause to help their daughters care for their offspring.

Article summary of Speed of processing in the human visual system by Thorpe, Fize & Marlot - Chapter


Neurophysiological measurements of the latency of selective visual responses are used to determine how long visual processing takes. With the help of such measurements, various parts of the brain and their functions have been identified. These studies are mainly about brain processes; much of the earlier work concerns face recognition.

Problems with ERPs

Face recognition has been measured with ERPs, but several problems arise with such measurements. One problem is that faces are recognized via highly specialized neuronal pathways, and no studies are known that have measured the relevant neurons precisely. A second problem is that, at the moment a response is measured, the recognition process may not yet be complete: a response can also arise during processing, for example during structural encoding. This last problem can be solved by having the subject perform another task at the same time.

Method and results of the current study: go / no-go task

This study solves this last problem by using a go/no-go task in which a picture is flashed for only 20 ms and the subject must decide whether or not it contains an animal. ERPs were used to measure the corresponding responses. Each picture is used only once, so that no familiarity effects occur. Analyses showed that subjects could decide very accurately whether an animal was present: 94% of the responses were correct, and the average reaction time on go trials was 445 ms. The reaction time gives a reasonable indication of the duration of visual processing, but the ERP measurements are more precise.

Further analysis showed a differential ERP response about 150 ms after the stimulus, which was considerably more negative for no-go trials than for go trials. This difference appears well before the overt reaction is given. It cannot be due to the difference between animal and non-animal pictures as such, because both are visual stimuli; it more likely reflects a rapid decision process that starts immediately after the visual analysis is complete. On this account, the response on go trials should be faster than on no-go trials; the difference indeed arises on the no-go trials, because reacting to something you see (go) is faster than withholding a response to something you were watching for (no-go).

This research has shown that a large part of the visual information is already processed within 150 ms. To discover exactly which parts of the brain respond, however, follow-up studies are required; fMRI studies are recommended.

Article summary of Sparse but not "Grandmother-cell" coding in the medial temporal lobe by Quiroga, Kreiman, et al. - Chapter


The medial temporal lobe (MTL) plays an important role in memory. The study discussed in this article examines how patterns of cell activity transform visual information, such as faces, into long-term memories. The question of how this happens has occupied neuroscientists for decades. Evidence comes from electrophysiology and lesion studies in monkeys, which show that there is a hierarchical organization along the ventral visual pathway from visual cortex V1 to the inferior temporal cortex (IT). Visual stimuli enter along this pathway and are processed and stored, but exactly how the information is represented there remains a mystery.

Hypotheses

Two hypotheses about this representation have been put forward:

  • Idea 1: The "distributed population coding" hypothesis states that a stimulus is represented by the combined activity of a large number of neurons, each of which codes some feature and contributes to the representation of many different stimuli.

  • Idea 2: The "sparse coding" hypothesis states that a percept is represented by a much smaller set of neurons that respond specifically to particular figures, objects, or concepts. Taken to its extreme, this led neuroscientists to the idea that a single neuron corresponds to one object or person, regardless of how it is observed; such cells were called grandmother cells. The question is whether these cells exist at all (a toy contrast between the two coding schemes is sketched below).
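The contrast between the two hypotheses can be illustrated with a small sketch; all numbers here (1000 neurons, 20 stimuli, 10 active cells per stimulus) are illustrative choices, not estimates from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons, n_stimuli = 1000, 20

# Distributed population coding: every stimulus drives many neurons
# at graded levels.
distributed = rng.random((n_stimuli, n_neurons))

# Sparse coding: each stimulus drives only a small subset of neurons.
sparse = np.zeros((n_stimuli, n_neurons))
for s in range(n_stimuli):
    active = rng.choice(n_neurons, size=10, replace=False)
    sparse[s, active] = 1.0

# Fraction of neurons active (above threshold) per stimulus
print("distributed:", (distributed > 0.5).mean())  # about 0.5
print("sparse:     ", (sparse > 0.5).mean())       # exactly 0.01
```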

Neurons in IT and MTL

The IT sends information to the MTL. It is important to note that the MTL is not a homogeneous structure with a single function. Neurons in the MTL respond selectively to particular individuals, across different photographs, poses, and facial expressions, and for both acquaintances and strangers. This selectivity yields an explicit representation in which a single cell can serve as an indicator of which person is being depicted.

These cells resemble grandmother cells, but it is implausible that only a single cell responds to a given person, and even more implausible that an experiment would happen to find exactly that cell. In addition, a cell may in fact respond to several people, since no study can test every person in the world; indeed, some units have been shown to fire for several individuals. Finally, theoretical estimates suggest that each stimulus is represented by on the order of 50-150 cells rather than by a single one.

It is important to investigate whether such abstract cells can also be found in other species. In rats, they would be expected in the neocortical areas for object recognition, analogous to the IT cortex. However, it is very difficult to detect such a group of firing neurons.

Research with learning tasks showed that, after an unsupervised learning task, units that previously fired randomly came to fire together for one object from the learning task; in this way that object or individual would be recognized. Most of these units respond uniquely to a single individual.

Findings from neurological patients showed that the MTL is not necessary for visual recognition; the hippocampal-rhinal system is involved in long-term declarative memory. The MTL cells described in the previous sections link visual perception to memory, so they do not themselves provide recognition. Recognition happens after about 130 ms, while the units described above only start firing to a given stimulus between 250 and 350 ms.

If the MTL is involved in memory, it is likely that these neurons do not encode all the fine visual details. The existence of category cells such as these units (which respond to individuals) fits the view that they encode aspects of the meaning of a stimulus that we want to remember. These cells are probably also involved in learning associations. These units are therefore not grandmother cells.

Article summary of Perceptrons by Van der Velde - Chapter


The purpose of this article is to provide an overview of how perceptrons classify patterns, to highlight the importance of squashing functions as activation functions, and to describe the learning capabilities of perceptrons.

Basic principles of perceptrons

A perceptron is a neural network in which neurons in different layers are connected to each other. A basic perceptron is shown in Figure 1 of the article. That network consists of two input neurons (x and y) and one output neuron (U); in general, perceptrons can have multiple input and output neurons. The output of neuron x is its output activation, which is also denoted x (so x can refer either to the neuron or to its output activation), and likewise the output of neuron y is its activation y.

Normally a neuron transforms the activation it receives by using an activation function (AF). The input for the AF is generally given by (total input - Activation Threshold).
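In code, the unit just described might look like the following minimal sketch. The weight names w_x and w_y and the example values are our own illustration; the article only specifies the general form AF(total input − θ).

```python
import math

def logistic(z):
    # a squashing activation function (discussed in the next section)
    return 1.0 / (1.0 + math.exp(-z))

def perceptron_output(x, y, w_x, w_y, theta):
    # AF applied to (total input - activation threshold)
    return logistic(w_x * x + w_y * y - theta)

print(perceptron_output(1, 1, 1.0, 1.0, 1.5))  # ~0.62 (above 0.5)
print(perceptron_output(1, 0, 1.0, 1.0, 1.5))  # ~0.38 (below 0.5)
```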

Activation function ('squashing function')

In general, an activation function AF is a so-called 'squashing function'. Two important squashing functions are:

  • logistic function

  • hyperbolic tangent function

These functions can be seen in Figure 2 in the article.

The input can vary over a large interval (from strongly negative to strongly positive), but the output is limited, either between 0 and 1 or between -1 and 1. This is an important feature of activation functions: it shows that the logistic function and the hyperbolic tangent function are both squashing functions in the sense that they reduce ('squash') a (potentially) large input to a relatively small output.

The squashing behavior of both functions means that the most important input values are those around 0 (around the threshold value). For both functions, the steepness with which the function values change around 0 can be manipulated. Moreover, the logistic function only produces positive output activations.

The logistic function

The formula of the logistic function is:

Lf(x) = 1 / (1 + e^(-x))

In this formula, the following applies:

  • Lf(x): logistic function

  • x: input variable

  • e^(-x): the exponential function e raised to the power -x

Figure 3 shows e^x. The behavior of e^(-x) is the opposite: when x is strongly positive, e^(-x) is small, so the logistic function Lf(x) approaches 1 for strongly positive input. When x is strongly negative, e^(-x) is large, so Lf(x) approaches 0 for strongly negative input. When x = 0, e^(-x) = e^0 = 1, so Lf(x) = 0.5: the logistic function gives 0.5 at zero input.

The hyperbolic tangent function

The formula for the hyperbolic tangent function tanh(x) is:

tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))

Both e^x and e^(-x) play an important role here. When x is strongly positive, e^x is large and e^(-x) is small, so tanh(x) approaches 1 for strongly positive input. When x is strongly negative, e^x is small and e^(-x) is large, so tanh(x) approaches -1 for strongly negative input. When x = 0, e^x = 1 and e^(-x) = 1, so tanh(x) = 0 at zero input.
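This limiting behavior is easy to verify numerically. The function is written out explicitly below to mirror the formula in the text; Python's built-in math.tanh gives the same values.

```python
import math

def tanh(x):
    # explicit form of the hyperbolic tangent, as in the text
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

for x in (-5, 0, 5):
    print(x, round(tanh(x), 3))  # -5 -> -1.0, 0 -> 0.0, 5 -> 1.0
```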

Threshold value

The activation of a neuron is AF(total input - θ), where θ denotes the activation threshold. When θ > 0, a larger positive input activation is required to obtain an output near 1. Conversely, when θ < 0, an output near 1 is reached at smaller input activations, because the effective input (total input - θ) is then shifted upward.

Classification in perceptrons

This section illustrates how perceptrons can classify patterns, and why squashing functions are such important activation functions in neural networks. Figure 5 of the article shows how the two-layer network of Figure 1 can be used to classify patterns according to the logical AND function. The figure shows the (x, y) patterns (0, 0), (1, 0) and (0, 1), which are classified as 0, and the pattern (1, 1), which is classified as 1. Together these points form the 'input space'.

Classification of the AND function can be achieved by drawing a line in the input space that separates the points (0, 0), (1, 0) and (0, 1) from the point (1, 1). All (x, y) inputs on one side of the line are then classified as 0, and all (x, y) inputs on the other side are classified as 1 (a concrete choice of line is sketched below).
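One valid separating line for AND is x + y = 1.5, i.e. weights 1 and 1 with threshold 1.5; these particular values are our own choice, not necessarily the ones in the article's figure.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

# Weights 1 and 1, threshold 1.5: the line x + y = 1.5 separates
# (1, 1) from the other three input patterns.
for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    u = logistic(1.0 * x + 1.0 * y - 1.5)
    print((x, y), "->", 1 if u > 0.5 else 0)   # only (1, 1) maps to 1
```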

EXOR classification

The EXOR problem is introduced in Figure 8 and is as follows:

x   y   x EXOR y
1   1   0
1   0   1
0   1   1
0   0   0

However, this problem cannot be solved by a two-layer network, and therefore not by a perceptron. The reason is that the problem is not linearly separable in the (x, y) space: no single line can separate the patterns labeled 1 from those labeled 0. The discovery that problems such as EXOR are not linearly separable was an important moment in network theory. The solution is a three-layer network, shown in Figure 9, with a 'hidden' layer (neurons a and b) between the input (x, y) layer and the output neuron U. The hidden layer performs two necessary intermediate classifications (see the sketch below).
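A minimal sketch of such a three-layer solution follows. The weights are hand-picked for illustration (not the article's values): hidden unit a acts roughly as OR, hidden unit b roughly as NAND, and the output combines them as AND. The steep gain of 10 pushes each unit's output close to 0 or 1 so the intermediate classifications are crisp.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def exor(x, y):
    a = logistic(10 * (x + y - 0.5))     # hidden unit a: roughly OR(x, y)
    b = logistic(10 * (1.5 - x - y))     # hidden unit b: roughly NAND(x, y)
    return logistic(10 * (a + b - 1.5))  # output U: roughly AND(a, b)

for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print((x, y), "->", 1 if exor(x, y) > 0.5 else 0)
```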

Learning with perceptrons

Figure 11 illustrates learning in a two-layer network (perceptron) that learns to classify patterns according to the logical AND and NOR rules. The learning procedure depends on the difference between the actual output U of the neuron for a given example and the output the network should produce for a correct classification (the desired output D). The difference D - U is therefore a measure of the error that the network makes. Learning procedures that use such an error measure are called 'supervised learning procedures'.
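A supervised learning loop of this kind can be sketched as follows. The update below is a standard delta-rule-style update driven by D - U; the article's exact procedure may differ in detail, and the learning rate and epoch count are illustrative.

```python
import math, random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, epochs=5000, lr=0.5):
    """Supervised learning from the error D - U."""
    w_x, w_y, theta = random.random(), random.random(), random.random()
    for _ in range(epochs):
        for (x, y), d in data:
            u = logistic(w_x * x + w_y * y - theta)
            err = d - u            # D - U: the error measure
            w_x += lr * err * x    # move weights to reduce the error
            w_y += lr * err * y
            theta -= lr * err      # the threshold shifts the other way
    return w_x, w_y, theta

AND = [((0, 0), 0), ((1, 0), 0), ((0, 1), 0), ((1, 1), 1)]
w_x, w_y, theta = train(AND)
for (x, y), d in AND:
    u = logistic(w_x * x + w_y * y - theta)
    print((x, y), "target", d, "output", round(u, 2))
```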

Article summary of Learning and neural plasticity in visual object recognition by Kourtzi & DiCarlo - Chapter


Detecting and recognizing objects in the context

Detecting and recognizing meaningful objects in complex environments is a crucial skill for survival. The recognition process is fast and automatic and tends to be taken for granted; however, it is not as easily achieved as people think. Much research has shown that the recognition process takes place in the ventral visual system, broadly through a series of stages: V1 to V2 to V4 to PIT to AIT (the PIT and AIT together form the IT referred to in other articles). The highest stage of this system is the AIT, where neurons are thought to recognize specific objects.

On a theoretical level there is a growing appreciation of the role of learning in building robust object representations. This role is approached by studying the visual system, in particular its development and the way it is used early in life and in adulthood. This paper focuses on experience-related plasticity in the adult visual system. Many studies of adults have shown learning-dependent changes in the ability to distinguish and recognize task stimuli, and recent studies have searched for the brain locations that underlie this.

Objects

Object recognition rests on the ability to form neuronal representations that are sensitive to particular combinations of shapes and colors. This is the hallmark of vision, and it poses a hard computational problem: the representation must remain highly sensitive to object identity despite image changes. Object recognition requires the visual system to discriminate between different patterns of input. This discrimination is probably achieved by combining the outputs of neurons in the early stages of the process, so that neurons at higher stages learn when to fire for certain patterns, which leads to discrimination and recognition. Computational models have shown that such explicit object recognition can be built from connections between groups of neurons that fire together for similar image features.

Learning selectivity

The plasticity of neuronal connections, combined with suitable learning rules, is a potential mechanism. For example, learning to respond selectively goes hand in hand with learning which things frequently occur (together) in the real world. On a mechanistic level, this happens when particular neurons strengthen their responses to such input. Repeating this strategy at each stage of the visual system yields a complex stimulus hierarchy that supports recognition. Evidence for this theory has recently been found, and the timeframe over which such recognition is learned suggests a strong link between neuroplasticity and behavioral improvement.

Learning tolerance

Most studies on visual learning have focused on changes in neuronal or behavioral selectivity, but learning selectivity is not enough for object recognition: selective object representations must also be tolerant of image changes (size, color, viewing angle). Learning from the natural world provides the solution. A central idea is that features and objects in the world do not suddenly pop in and out of existence but have (temporal) continuity: if you see an object and walk on, you still see that object, but from a continuously changing viewpoint. This is learning tolerance: you learn that it is still the same object, and in that way you come to recognize it from multiple points of view. Evidence has been found that tolerance is not automatic, but how it is implemented neuronally is not known.

The visual system must also learn to recognize objects among other objects, known as clutter. It has therefore been suggested that learning improves the correlations between neurons that respond to a feature or target against a background of noise. However, the background should not always be treated as mere noise: it can also help place an object in the right context.

Neuronal plasticity

It is often said that the plasticity underlying perceptual learning resides in early visual stages, because this learning is tied to retinal position. On this view, changes in receptive fields would reflect tuning changes of V1 neurons. Recent imaging studies have indeed found an influence of V1 when object characteristics are learned, but the evidence remains controversial. One possibility is that V1 learning effects only show up in the average response of a large group of neurons, as measured with fMRI.

Shape representation can shift from higher to lower visual areas. This supports fast, automatic search and detection under attentional control (in cluttered scenes). These findings are consistent with the suggestion that object representation is served not only by bottom-up but also by top-down processing: learning starts in higher visual areas for easy tasks and proceeds to lower visual areas when higher resolution is required for more difficult tasks.

One of the biggest advantages of fMRI is that it provides global images of the brain, which makes it a convenient method for studying the brain during visual tasks. It has been found that learning is supported by processes that select the crucial characteristics of objects. Recent neuroimaging studies have shown that learning is supported by functional interactions between occipito-temporal and parieto-frontal areas. These findings are consistent with top-down accounts of visual processing: together, these areas create a perceptual representation of the world.

In summary, current studies indicate that there is no single fixed locus of brain plasticity in visual learning. At the neuronal level, learning can arise from changes in the feedforward network, especially at higher stages, or from changes in the interactions between frontal cortical areas and local connections in the primary visual cortex. Such changes can be adaptive and efficient.

The four conclusions that can now be drawn are:

  1. The adult visual system is plastic.

  2. There is no single locus of plasticity that supports object-recognition learning.

  3. Learning is not always the result of simple, static change at the core of the feedforward network.

  4. The relationship between neuronal mechanisms and plasticity remains unknown.

Article summary of Breaking position-invariant object recognition by Cox, Meier, et al. - Chapter

Each object can produce an infinite number of images on the retina: objects can change in size, location, lighting, and so on. The ability to identify objects despite all these changes is central to human visual object recognition, yet the neuronal mechanisms behind this process remain a mystery. Several authors have suggested that a solution to the invariance problem is to learn invariant representations through experience in the real world.

Visual features that follow each other quickly in time are likely to correspond to different images of the same object. In this way one can gradually build an invariant representation by associating the patterns of neuronal activity evoked by successive retinal images of an object. Changes in the retinal position of an object occur constantly, for example through rapid eye movements (saccades). A possible strategy is therefore to associate representations across these temporal series of activity patterns, as the sketch below illustrates.
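A toy version of this temporal-association idea follows; the Hebbian-style update and all parameters are our own illustration, not taken from the article. Two response patterns that repeatedly follow each other in time (before and after a saccade) become linked, so the first retrieves the second.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 50

view_a = rng.random(dim)  # response pattern to an object before a saccade
view_b = rng.random(dim)  # response pattern to the same object just after

# Hebbian-style temporal association: repeatedly strengthen connections
# between what was just seen and what follows it.
W = np.zeros((dim, dim))
for _ in range(100):
    W += 0.01 * np.outer(view_b, view_a)

# After learning, the pre-saccade pattern retrieves the post-saccade one.
retrieved = W @ view_a
print(round(np.corrcoef(retrieved, view_b)[0, 1], 2))  # 1.0: associated
```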

If position invariance is normally learned correctly through experience, it should also be possible to create incorrect invariances. An experiment in which similar-looking images were swapped during eye movements demonstrates this: the swap was so fast that the test subjects did not notice it, and the movement made it appear to be the same image. Subjects were more often confused by object pairs that had been swapped across positions than by alternating images. Position invariance is thus acquired quickly. The confusion arises because people have learned to expect an object to stay the same while it moves; when the object changes during the saccade, the wrong association is formed.

A third study showed that specific manipulations of the spatiotemporal experience of objects can alter position-invariant recognition of test objects. This shows that visual processing can be changed by visual statistics that do not reach consciousness: the invariance comes from experience.

Article summary of A feedforward architecture accounts for rapid categorization by Serre, Oliva & Poggio - Chapter


Object recognition takes place in the ventral visual stream of the cortex, which runs from the visual area V1 to IT. From there, connections with the PFC link perception to memory. The further along the pathway, the more specific the neurons and the larger their receptive fields. Plasticity and learning are probably present at all stages of object recognition.

It is not known what feedback is sent between the stages. The hypothesis is that the basic flow of information is feedforward, supported by the short time limits required for rapid responses; this hypothesis does not, however, exclude feedback loops. The feedforward architecture is a reasonable starting point for a theory of the visual cortex that aims to explain immediate object recognition. The recognition phase itself can also play a role.

Model based on the feedforward theory

The model used here is a direct extension of the Hubel and Wiesel model and is a feedforward theory of visual processing. Every feedforward theory distinguishes two types of cells, simple and complex, creating a division of labour between selectivity and invariance. The model's input of 7 × 7 patches is first analyzed by a multidimensional array of simple S1 units that respond best to lines and edges at particular orientations; an S1 unit is active when the feature it represents is present. The next level is C1, which resembles the complex cells of the striate cortex. Each complex C1 unit receives information from a group of S1 units with the same preferred orientation but slightly different positions and scales, which makes the C1 unit less sensitive to position and size.

At the C level, the responses of the afferent S cells are thus combined into a position-tolerant representation of the visual image; in this way the model builds up a representation of what is seen. The pooling at a C level is performed by a MAX operation: the response of a C unit equals the response of its strongest afferent (a sketch follows below).
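The MAX pooling step can be written in a few lines; the 3 × 3 neighbourhood and the random responses are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy S1 responses: units with the same preferred orientation at nine
# neighbouring positions (a 3 x 3 neighbourhood, values illustrative).
s1 = rng.random((3, 3))

# C1 pooling by the MAX operation: the complex unit takes over the
# response of its strongest afferent, so a small shift of the stimulus
# (moving the peak to a neighbouring S1 unit) barely changes the output.
c1 = s1.max()
print(round(float(c1), 2))
```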

Not all features of a complete feedforward network are included in this model; that would be too complicated. Every model describes and explains only a part. Because we still know little about how everything works exactly, this remains complicated, and the different theories (and articles) therefore have to be placed side by side and applied to one another, just as the MAX operation recurs across models.

Results

The model was evaluated against a task in which test subjects had to indicate whether or not an animal was presented. The pictures were shown very briefly, yet people could decide quickly and accurately whether an animal was present, which shows that our object recognition works very fast. Showing the image slightly shorter had little effect. The model was then compared with human performance on this task.

The comparison between human performance and the feedforward model on the animal task was measured with d', a monotonic function of observer performance. The results showed that humans and the model performed about equally well and made similar responses. Recent studies have even shown that animal recognition still works when the image is shown upside down. People were somewhat slower when a masking component was used.
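d' is a standard signal-detection measure, computed from hit and false-alarm rates via the inverse normal CDF. The rates below are illustrative only, not the study's figures.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    # d' = z(hit rate) - z(false-alarm rate), with z the inverse normal CDF
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# Illustrative rates only:
print(round(d_prime(0.94, 0.06), 2))  # about 3.11
```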

Discussion

This model improves the old model on two important points.

  1. A learning phase has been added, and the hierarchy of the visual system is modeled more faithfully. Learning takes place in two independent phases: first during development, with unsupervised learning, and then in task-specific circuits.

  2. The new model is closer to the anatomy of the visual cortex. It also suggests (along with other theories) that an initial bottom-up feedforward sweep provides a basic representation, built from a basic dictionary of generic shape features.

Article summary of Hierarchical models of object recognition in cortex by Riesenhuber & Poggio - Chapter


Recognition of visual objects

The recognition of visual objects is fundamental. It is often studied with repeated cognitive tasks that impose two essential requirements: invariance and specificity. Cells in the inferotemporal cortex (IT, the highest visual area in the ventral visual pathway) appear to play a key role in object recognition. These cells respond to complex objects such as faces, and certain neurons respond specifically to certain faces and not to others. The question remains: how can they respond selectively to different faces while the stimulation on the retina is practically the same?

This is also reflected in the striate cortex of cats, where both simple and complex cells respond to a presented bar. Simple cells have narrow receptive fields and are strongly position-dependent, whereas complex cells have large receptive fields and are not position-dependent. Hubel and Wiesel proposed a model in which a complex cell pools over simple cells with neighbouring receptive fields, so that cells that sit next to each other also see the world next to each other and often fire together as a group. A direct extension of this model leads to a scheme of higher-order complex cells.

Cells in V4 can be modulated by attention and can adapt their responses within their receptive field, but there is little evidence that this mechanism underlies translation-invariant object recognition. Invariance to a transformation can instead be built up by pooling afferent cells tuned to different variations of the same stimulus. Evidence has now been found that groups of cells responding to whole or partial views are acquired through a learning process. The view-invariance problem can then be represented by a small number of neurons. This idea raises two problems.

Problem 1

In monkeys, learning previously unknown stimuli (such as unfamiliar faces) is possible because part of the invariance is learned from just a single view of the object. If the object is presented surrounded by many distractor objects, it may be learned in combination with those objects. The cells nevertheless become invariant to other positions.

Problem 2

The model does indicate how view-tuned units (VTUs, groups that fire for a specific view of an object) are built, but not how they arise.

Results

The model is based on a simple hierarchical feedforward architecture. It is assumed that the structure reflects the invariance and that feature specificity must be built up by different mechanisms. The pooling mechanism should provide robust feature detectors: it must allow detection of specific features without being confused by clutter or context in the receptive field.

There are two alternative pooling mechanisms.

Linear addition = SUM.

All afferents are weighted equally. The response of a complex cell is then invariant as long as the stimulus remains within the cell's receptive field. However, the response does not indicate whether there actually is a bar in the receptive field: because the output signal is simply the sum of the afferent cells, there is no feature specificity.

Non-linear maximum operation = MAX.

The strongest afferent cell determines the postsynaptic response. With MAX, the response is set by the most active afferent cell, and this signal is taken as the best match to a part of the stimulus. This makes the MAX response more informative.

In both cases, the response of a complex cell is invariant to the position of the bar in the receptive field. A non-linear MAX function, however, describes pooling correctly while preserving invariance: it implicitly scans the afferent cells of the same type, and the strongest responder is selected as the signal most consistent with the stimulus. Pooling over combinations of afferent cells by summation instead yields a mixed signal caused by different stimuli (compare the sketch below).
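The difference between the two pooling schemes can be shown with a toy comparison; the afferent response values are illustrative only.

```python
import numpy as np

# Afferent responses in two situations: one clear bar driving a single
# afferent strongly, versus scattered clutter driving all afferents weakly.
clear_bar = np.array([0.9, 0.1, 0.1, 0.1])
clutter   = np.array([0.3, 0.3, 0.3, 0.3])

for name, r in [("clear bar", clear_bar), ("clutter  ", clutter)]:
    print(name, "SUM:", r.sum(), "MAX:", r.max())
# SUM gives 1.2 in both cases, so it cannot distinguish a real bar from
# clutter; MAX clearly prefers the situation with the strong best match.
```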

MAX systems are in several respects consistent with neurophysiological data. For example, if two stimuli are presented in the receptive field of an IT neuron, the neuron's response is dominated by the stimulus that elicits the stronger response when presented separately, which is exactly what the MAX model predicts for its afferent neurons. A number of studies support the MAX model: they often find highly non-linear tuning of IT cells, which corresponds to the MAX response function, whereas a linear model cannot produce such strong response changes from a small change in input.

In some cases, clutter can change the value delivered by the MAX function: the quality of the match at the final stage then changes, and with it the strength of the VTU response. A solution is to add more specific features. Simulations have shown that the model is then able to recognize objects in context.

The MAX model lends itself well to describing brain processes. MAX responses probably arise from cortical microcircuits with lateral inhibition between neurons within a cortical layer. The MAX operation is thus important for object recognition.
