May 22, 2008

Perceptron neural network

Perceptrons are the simplest architecture to learn when studying neural networks. Picture a perceptron as one node in a wide, interconnected neural network, somewhat like a tree data structure, although a neural network does not necessarily have a top and a bottom. The connections among the nodes not only show how the nodes are related but also carry data between them, called a signal or impulse. The perceptron is a simple model of a neuron.


Since connecting perceptrons into a neural structure is a bit complicated, let's take a perceptron by itself. A perceptron has a number of external inputs, one internal input (called a bias), a threshold, and one output. To the right, you can see a picture of a simple perceptron. It resembles a neuron.


Usually, the input values are boolean (just two possible values: 1 and 0, true and false), but they can be any number. The output of the perceptron, however, is always boolean, like a switch. When the output is on (has the value 1), the perceptron is said to be activated.


Every input (including the bias) has a weight attached to it that scales the input value before the perceptron uses it. The weight is simply multiplied by the input.


The activation function is one of the key components of the perceptron, as it is in most common neural network architectures. It determines, based on the inputs, whether the perceptron activates or not. Basically, the perceptron takes all of the weighted input values and adds them together. If the sum is greater than or equal to some value (called the threshold), then the perceptron fires. Otherwise, it does not. So, it becomes active whenever the following equation is true (where w represents the weight, and there are n inputs):


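$$\sum_{i=1}^{n} w_i x_i \ge \theta$$

The perceptron activates when this equation is true (x_i is the value of input i, w_i is its weight, and \theta is the threshold).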
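As a minimal sketch of the firing rule just described, the weighted sum and the threshold test fit in a few lines of Python. The function name `fires` and the sample numbers are illustrative assumptions, not values from the original figure; the bias is treated here as one more input fixed at 1 with its own weight.

```python
def fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum >= threshold else 0


# Hypothetical example: two external inputs followed by a bias input fixed at 1.
inputs = [1, 0, 1]
weights = [0.6, 0.6, -0.2]   # assumed weights, chosen only for illustration
threshold = 0.3              # assumed threshold

output = fires(inputs, weights, threshold)
```

With these assumed numbers the weighted sum is about 0.4, which clears the threshold of 0.3, so the perceptron fires.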

The threshold is like a wall: if the "signal" has enough "energy" to jump over the wall, then it can keep going, but otherwise, it has to stop. Traditionally, the threshold value is represented either as the Greek letter theta (the symbol inside the circle in the picture above) or by a graphical symbol that looks like a square S:



The main feature of perceptrons is that, like all neural networks, they can be trained (that is, they can learn) to behave a certain way. One popular beginner's assignment is to have a perceptron model (that is, learn to behave like) a basic boolean function such as AND or OR. Perceptron learning is supervised: you have to supply examples of correct behavior that the perceptron can imitate. The perceptron learns like this: it produces an output, compares that output to what the output should be, and then adjusts itself a little bit. After repeating this cycle enough times, the perceptron will have converged (the technical term for having learned) to the correct behavior.


This learning method is called the delta rule, because of the way the perceptron checks its accuracy. The difference between the perceptron's output and the correct output is assigned the Greek letter delta, and the Weight i for Input i is altered like this (the i shows that the change is separate for each Weight, and each weight has its corresponding input):


Change in Weight i = Current Value of Input i × (Desired Output - Current Output)


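This can be elegantly summed up as:

$$\Delta w_i = x_i \, \delta, \qquad \delta = d - y$$

where x_i is the current value of input i, d is the desired output, and y is the current output.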



The delta rule works both if the perceptron's output is too large and if it is too small. The new Weight i is found simply by adding the change for Weight i to the current value of Weight i.
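The learning cycle and the delta rule can be sketched together in Python. This is only one possible arrangement under stated assumptions: the weights start at zero, the threshold is fixed at 0.5, the bias is an extra input fixed at 1, there is no learning-rate factor (matching the update as written above), and the names `fires` and `train` are hypothetical. The training data is the OR truth table.

```python
def fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def train(samples, threshold=0.5, max_epochs=100):
    """Apply the delta rule until one full pass over the samples makes no mistake.

    Returns (weights, converged). The starting weights, threshold and epoch
    cap are assumptions chosen for this sketch.
    """
    weights = [0.0] * len(samples[0][0])
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, desired in samples:
            delta = desired - fires(inputs, weights, threshold)
            if delta != 0:
                mistakes += 1
                # Change in weight i = input i * (desired output - current output)
                weights = [w + x * delta for w, x in zip(weights, inputs)]
        if mistakes == 0:
            return weights, True
    return weights, False


# OR truth table; the last input of each sample is the bias input, fixed at 1.
OR_SAMPLES = [
    ([0, 0, 1], 0),
    ([0, 1, 1], 1),
    ([1, 0, 1], 1),
    ([1, 1, 1], 1),
]

or_weights, or_converged = train(OR_SAMPLES)
```

Because delta is +1 when the output is too small and -1 when it is too large, the same update line handles both cases.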


Interestingly, if you graph the possible inputs on different axes of a mathematical graph, with pluses for where the perceptron fires and minuses where the perceptron doesn't, the weights for the perceptron make up the equation of a line that separates the pluses and the minuses.



For instance, in the picture above, the pluses and minuses represent the OR binary function. With a little bit of simple algebra, you can transform the equation in the diagram into the standard line form, in which the weights can be seen clearly. (You get the following equation of the line if you take the firing equation and replace the "greater than or equal to" symbol with an equals sign.)

$$w_1 x_1 + w_2 x_2 = \theta$$
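To make the algebra concrete, here is a short worked derivation for the two-input case. The weights and threshold in the second step are assumed values that happen to model OR; they are not taken from the original diagram, and the bias is assumed to be folded into the threshold.

```latex
% Firing equation with "greater than or equal to" replaced by "equals":
w_1 x_1 + w_2 x_2 = \theta
\quad\Longrightarrow\quad
x_2 = -\frac{w_1}{w_2}\, x_1 + \frac{\theta}{w_2}, \qquad w_2 \neq 0

% Illustrative (assumed) values w_1 = w_2 = 1, \theta = 0.5:
x_2 = -x_1 + 0.5
```

With these assumed values, the point (0, 0) gives a weighted sum of 0, which is below 0.5 (a minus), while (0, 1), (1, 0) and (1, 1) give 1, 1 and 2, which are all at or above 0.5 (pluses). The line therefore separates the OR pattern.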



This equation is significant, because a single perceptron can only model functions whose graphs are linearly separable. So, if there is no line (or plane, or hyperplane, depending on the number of dimensions) that divides the fires from the non-fires (the pluses from the minuses), then the perceptron cannot learn to behave with that pattern of firing. For instance, the boolean function XOR is not linearly separable, so you can't model it with only one perceptron. The weight values just keep shifting, and the perceptron never actually converges to one set of values.
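The difference between a separable and a non-separable function can be observed with a small experiment. The sketch below runs the same delta-rule loop on OR and on XOR and reports after how many passes, if any, the perceptron stops making mistakes. The zero starting weights, the fixed threshold of 0.5, the bias input fixed at 1, the cap of 1000 passes and the function names are all assumptions made for illustration.

```python
def fires(inputs, weights, threshold=0.5):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def epochs_until_converged(samples, max_epochs=1000):
    """Return the first pass with no mistakes, or None if the cap is reached."""
    weights = [0.0] * len(samples[0][0])
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for inputs, desired in samples:
            delta = desired - fires(inputs, weights)
            if delta != 0:
                mistakes += 1
                weights = [w + x * delta for w, x in zip(weights, inputs)]
        if mistakes == 0:
            return epoch
    return None


# The last input of each sample is the bias input, fixed at 1.
OR_SAMPLES = [([0, 0, 1], 0), ([0, 1, 1], 1), ([1, 0, 1], 1), ([1, 1, 1], 1)]
XOR_SAMPLES = [([0, 0, 1], 0), ([0, 1, 1], 1), ([1, 0, 1], 1), ([1, 1, 1], 0)]

or_epochs = epochs_until_converged(OR_SAMPLES)    # a small number of passes
xor_epochs = epochs_until_converged(XOR_SAMPLES)  # None: never a clean pass
```

A pass with no mistakes would mean one fixed set of weights classifies all four XOR rows correctly, which is exactly what the lack of a separating line rules out, so raising the cap does not help.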


So, by themselves, perceptrons are a bit limited, but that is their appeal. Perceptrons enable a pattern to be broken up into simpler parts that can each be modeled by a separate perceptron in a network. So, even though individual perceptrons are limited, they can be combined into one powerful network that can model a wide variety of patterns, such as XOR and many complex boolean expressions of more than one variable. Such networks, however, are more complex in arrangement, and thus their learning function is somewhat more complicated. For many problems (specifically, the linearly separable ones), a single perceptron will do, and its learning function is quite simple and easy to implement. The perceptron is an elegantly simple way to model a human neuron's behavior. All you need is the first two equations shown above.
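As an illustration of breaking a pattern into simpler parts, XOR can be built from three perceptrons: one computing OR, one computing NAND, and one computing AND of their two outputs. The weights and thresholds below are hand-picked assumptions rather than learned values, and the bias input is left out to keep the sketch short.

```python
def fires(inputs, weights, threshold):
    """Return 1 if the weighted sum of the inputs reaches the threshold, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


# Hand-picked (assumed) weights and thresholds for three linearly separable parts.
def or_unit(a, b):
    return fires([a, b], [1, 1], 0.5)


def nand_unit(a, b):
    return fires([a, b], [-1, -1], -1.5)


def and_unit(a, b):
    return fires([a, b], [1, 1], 1.5)


def xor_network(a, b):
    """XOR as AND(OR(a, b), NAND(a, b)): two perceptrons feeding a third."""
    return and_unit(or_unit(a, b), nand_unit(a, b))


truth_table = {(a, b): xor_network(a, b) for a in (0, 1) for b in (0, 1)}
```

Each unit on its own models a linearly separable function; only their combination produces the XOR pattern.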



A detailed explanation of perceptron neural networks can be found at Perceptron and Adaline Neural Networks.

5 comments:

Anonymous said...

nice post! thanks!

Anonymous said...

wht s dis?if u think u vil gave all news abt perceptrons?den where s de types of perceptrons..where???? pls send all de

Anonymous said...

its too precise for a beginner to understand!!!!

chandu said...

my doubt, is activation function used here is fixed or can we take other