Talk:Perceptron


I hope this is more accurate than the previous version: can someone who knows more than me fact-check this article, and if necessary bring neural networks into sync?


As discussed on Talk:Artificial neuron, there is a lot of overlap between articles on this and various related topics. See that page for some suggestions on how they might best be separated. - IMSoP 18:29, 11 Dec 2003 (UTC)


Illustrating images for Gaussian data don't differ

Hello, the three pictures (Gaussian data in 2D, Gaussian data with a linear classifier, Gaussian data in a higher-dimensional space) do not differ. Is this intended? —The preceding unsigned comment was added by 87.160.196.168 (talk) 19:45, 26 February 2007 (UTC).

Is my pic OK?

Please see the discussion page for the XOR perceptron net


Pocket ratchet algorithm

Having never heard of this algorithm, I'm disappointed it doesn't have more discussion, either here or on its own page.

McCulloch-Pitts redirect

I'll try and tidy up this article when I get the chance, but McCulloch-Pitts neuron should NOT redirect here. MP neurons are threshold units, whereas neurons in a perceptron model are linear functions of their summed inputs. An MP neuron's activation is essentially u(x + b), where u is the Heaviside step function, x is the sum of inputs (inputs can either be +1 excitatory or -1 inhibitory), and b is a bias or threshold. DaveWF 07:33, 4 April 2006 (UTC)

According to chapter 2 of Rojas' book, the only difference between the classical Rosenblatt perceptron and the McCulloch-Pitts neuron is that the perceptron has weighted inputs. This concurs with Russell & Norvig's AI: A Modern Approach, which says that perceptrons also use the Heaviside step function. Also, the McCulloch-Pitts activation isn't a straight sum, because the inhibitory inputs are absolute: a single active inhibitory connection will force the whole output to 0. There's no such thing as a -1 input for MP neurons. 129.215.37.38 17:25, 15 August 2007 (UTC)
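
For anyone comparing the two formulations above, here is a minimal sketch, assuming the textbook versions being discussed (unweighted MP inputs with absolute inhibition versus weighted perceptron inputs through a step function); the function names and numbers are illustrative only:

<pre>
# Sketch contrasting a McCulloch-Pitts unit with a perceptron unit,
# following the textbook formulations discussed above (illustrative only).

def mcculloch_pitts(excitatory, inhibitory, threshold):
    """Unweighted inputs; any active inhibitory input forces the output to 0."""
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0

def perceptron_unit(x, w, b):
    """Weighted sum plus bias, passed through the Heaviside step."""
    s = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if s >= 0 else 0

print(mcculloch_pitts([1, 1], [0], threshold=2))    # 1
print(mcculloch_pitts([1, 1], [1], threshold=2))    # 0 (absolute inhibition)
print(perceptron_unit([1, 1], [0.6, 0.6], b=-1.0))  # 1
</pre>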

What does b represent?

My lay understanding of scientific, computer, and math topics is good but my understanding of jargon and formulae is poor...

In the definition section we have f(x) = <w,x> + b - all terms are defined except b — what am I missing? — Hippietrail 16:53, 10 April 2006 (UTC)

I've tried to clear this up. Hope that helps. DaveWF 06:22, 5 September 2006 (UTC)
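
For readers with the same question, a small illustrative sketch of the definition f(x) = <w,x> + b, where b is the bias term that shifts the decision boundary away from the origin; the numbers here are arbitrary examples:

<pre>
# Illustrative sketch of f(x) = <w, x> + b, where b is the bias term
# that shifts the decision boundary away from the origin.

def f(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

w, b = [1.0, 1.0], -1.5
print(f([1, 1], w, b))  # positive -> classified as 1 (above the boundary)
print(f([0, 0], w, b))  # negative -> classified as 0 (below the boundary)
</pre>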

Learning a linear MLP that does XOR

Multiple layers don't help unless they have non-linear activation functions, since any number of linear layers will still give you a linear decision boundary. I'll fix this up later tonight. DaveWF 22:26, 4 September 2006 (UTC)
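
A quick numerical illustration of the point above, using arbitrary example matrices: composing two purely linear layers is equivalent to a single linear layer, so the decision boundary stays linear no matter how many such layers are stacked.

<pre>
import numpy as np

# Two "layers" with purely linear activations...
W1 = np.array([[1.0, -2.0], [0.5, 3.0]])
W2 = np.array([[2.0, 1.0]])
x = np.array([0.7, -1.3])

# ...compose into the single linear map W2 @ W1, so nothing is gained.
two_layers = W2 @ (W1 @ x)
one_layer = (W2 @ W1) @ x
print(two_layers, one_layer)  # identical results
</pre>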

Okay, I've fixed some bias-related stuff but I'm going to do a major rewrite over the next couple of days. The superficial discussion is fine but some of the other stuff is inaccurate or misleading. DaveWF 06:22, 5 September 2006 (UTC)
Where does the α < 1 limitation come from? I bet any positive α will work for training. 91.124.83.32

Learning rule & bias?

The article suggests that the bias is not adjusted by the perceptron learning rule, but AFAIK this is usually not the case. Can someone who knows more about NNs confirm this? Neilc 23:48, 15 October 2006 (UTC)

The bias can be learned just like any other weight, and often is, by introducing an extra input which is always '1'. In this way it's quite similar to linear regression. DaveWF 04:03, 18 October 2006 (UTC)
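
A minimal sketch of the trick described above, with made-up numbers: appending a constant input of 1 turns the bias into an ordinary weight.

<pre>
# Sketch of the trick described above: append a constant input of 1 so the
# bias is learned exactly like any other weight (illustrative values only).

x = [0.3, -1.2]
w = [0.8, 0.1]
b = -0.5

x_aug = x + [1.0]   # extra input fixed at 1
w_aug = w + [b]     # the bias becomes just another weight

print(sum(wi * xi for wi, xi in zip(w, x)) + b)
print(sum(wi * xi for wi, xi in zip(w_aug, x_aug)))  # same value
</pre>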

Content difficult to understand

I think the article should be updated a little. Even the second paragraph is a little confusing, when it calls the perceptron a 'binary classifier'. I will try to draw some pictures, and I think some of the mathematical text should either be cleaned up or replaced with pictures. Paskari 17:37, 29 November 2006 (UTC)

I updated the 'learning algorithm' section, as I found it to be a little difficult to follow. My version is better in that it is tabular, but I still think it's too difficult to follow. Could someone look it over and update as need be. Paskari 19:28, 29 November 2006 (UTC)
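
For anyone else finding the tabular presentation hard to follow, here is a hedged sketch of the standard per-example update rule, where w is nudged by alpha*(t - o)*x and b by alpha*(t - o); it is not the article's exact pseudocode, just an illustration on a linearly separable example:

<pre>
# Hedged sketch of the standard perceptron learning rule, not the article's
# exact pseudocode: per example, w <- w + alpha*(t - o)*x, b <- b + alpha*(t - o).

def train_perceptron(samples, alpha=0.1, epochs=50):
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0              # zero initialization; random works too
    for _ in range(epochs):
        errors = 0
        for x, t in samples:
            o = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0
            if o != t:
                w = [wi + alpha * (t - o) * xi for wi, xi in zip(w, x)]
                b += alpha * (t - o)
                errors += 1
        if errors == 0:                # converged: every training point correct
            break
    return w, b

# OR function (linearly separable), so the rule is guaranteed to converge.
print(train_perceptron([([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]))
</pre>

Note that on a linearly separable set like OR the loop above terminates; on XOR it would cycle forever, which is the point made in the section above about needing more than a single linear unit.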

Multi-layer perceptron

Could someone who is knowledgable enough create a page for multi-layer perceptrons? there is a good section on it under Artificial Neural Networks, but it isn't detailed enough. Paskari 19:34, 29 November 2006 (UTC)

Running time

I am going to create a section outlining the running time and tractability of the algorithm; I hope this is OK with everyone. Paskari 16:53, 1 December 2006 (UTC)

First we consider a network with only one neuron. In this neuron, if we assume the length of the largest element in x or in w to be of size n, then the running time is simply the running time of the dot product, which is O(n³). The reason for this is that the dot product is the rate-determining step. If we extend the example further, we find that, in a network with k neurons, each with a running time of O(n³), we have an overall running time of O(kn³).
I've removed this section as it seems completely wrong to me and is unsourced. The number of inputs to a single perceptron defines the O time. Since all the inputs can be inputs to a single perceptron, the runtime is O(n). There are n multiplications; I'm not sure where the power of 3 comes from. 172.188.190.67 17:33, 20 August 2007 (UTC)
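
To illustrate the point just made: the forward pass of a single unit touches each of its n inputs once, so it is linear in n. The loop below is only a sketch with made-up numbers:

<pre>
# Illustrative: the dot product in a single unit's forward pass touches each
# of the n inputs once, so it is O(n), not O(n^3).

def forward(w, x, b=0.0):
    s = b
    for wi, xi in zip(w, x):   # one multiply-add per input -> O(n)
        s += wi * xi
    return 1 if s >= 0 else 0

print(forward([0.2, -0.5, 0.1], [1.0, 0.0, 1.0], b=0.05))
</pre>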

Initial weight vector

What is the weight vector set to in the first run through the training set? Is it equal to your x vector?

It doesn't matter; random weights would work just as well. 91.124.83.32

Suspected excessive promotion of Herve Abdi

Another reference to Herve Abdi, inserted by an anonymous user with IP address 129.110.8.39, which seems to belong to the University of Texas at Dallas. Apparently the only editing activity so far has been to insert excessive references to publications by Herve Abdi, of the University of Texas at Dallas. The effect is that many Wikipedia articles on serious scientific topics are currently citing numerous rather obscure publications by Abdi et al., while ignoring much more influential original publications by others. I think this constitutes an abuse of Wikipedia. A while ago, as a matter of decency, I suggested to 129.110.8.39 to remove all the inappropriate references in the numerous articles edited by 129.110.8.39, before others do it. For several months nothing has happened. I think the time has come to delete the obscure reference. Truecobb 21:37, 15 July 2007 (UTC)

Minsky and Grossberg

I am looking throughout Wikipedia for inaccurate claims regarding Minsky and Papert's Perceptrons book, and I intend to correct all of them. One interesting fact regarding the contents of this page is that one of the exaggerated claims was inserted together with a reference to this Grossberg article, in an anonymous edit from an IP that never did anything else. Here is the edit in question:

http://en.wikipedia.org/w/index.php?title=Perceptron&oldid=61149477

Does anybody know who inserted this text?...

I don't know what this article has to do with that book. First of all, not only does Perceptrons give proofs of how to implement the XOR (parity) function for any number of inputs, but many other books before it already said that (for example, Rosenblatt's book). This article seems to deal with more complex networks, with feedback. I believe that this reference is just something that people hear and go on repeating without questioning.

My question is: should we just remove any reference to articles claiming to have “countered” the Perceptrons book, or should we keep them to show that research in ANNs never really stopped? -- NIC1138 (talk) 22:32, 24 March 2008 (UTC)

Very important confusion of names

It seems to me that this article is confusing what a perceptron is (a three-layered network) with a single neuron. The “associative” neurons, as Rosenblatt called them, reside in an intermediate layer (what today has the strange name of “hidden” layer). In what Rosenblatt called a “simple perceptron” the output is a single summation followed by a hard limiter, and more complex perceptrons have more outputs. We will need some intense rewrites to reflect this in the article...

I believe the training Rosenblatt did in the beginning was just using randomly assigned weights in the first layer, and then adjusting only the output weights. Perhaps this is what leads people to the confusion, believing that the early perceptron was just a simple linear classifier, when it was already a second-order structure. -- NIC1138 (talk) 17:16, 25 March 2008 (UTC)
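
A hedged sketch of the architecture described above, under illustrative assumptions (16 randomly weighted associative threshold units whose weights stay fixed, with only the output weights trained by the perceptron rule); this is not Rosenblatt's exact setup, but it shows how such a second-order structure can handle XOR, which a single linear unit on the raw inputs cannot:

<pre>
import random

def step(s):
    return 1 if s >= 0 else 0

def random_associative_layer(n_in, n_units):
    # fixed random weights and bias for each associative unit
    return [([random.uniform(-1, 1) for _ in range(n_in)], random.uniform(-1, 1))
            for _ in range(n_units)]

def project(layer, x):
    # binary feature vector produced by the fixed associative layer
    return [step(sum(w * xi for w, xi in zip(ws, x)) + b) for ws, b in layer]

# XOR, which a single linear unit on the raw inputs cannot represent.
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

layer = random_associative_layer(n_in=2, n_units=16)
w, bias = [0.0] * 16, 0.0
for _ in range(100):                      # train only the output weights
    for x, t in data:
        h = project(layer, x)
        o = step(sum(wi * hi for wi, hi in zip(w, h)) + bias)
        w = [wi + 0.5 * (t - o) * hi for wi, hi in zip(w, h)]
        bias += 0.5 * (t - o)

correct = sum(step(sum(wi * hi for wi, hi in zip(w, project(layer, x))) + bias) == t
              for x, t in data)
print(correct, "of 4 XOR points classified correctly")  # usually 4 with enough units
</pre>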