User:Mbuset14/Melanie's Sandbox

David E. Rumelhart

David E. Rumelhart is my topic for this assignment. I am interested in knowing more about his influence on mathematical psychology. One topic he researched was backpropagation.[1]


Introduction

David E. Rumelhart received his BA in psychology and mathematics from the University of South Dakota in 1963 and his Ph.D. in mathematical psychology from Stanford in 1967. He taught at the University of California, San Diego, and at Stanford. Rumelhart believed that mathematical and computer-based models of learning, as well as of linguistics, needed to be brought together[2].

One topic that interested Rumelhart was unsupervised learning, which he referred to as "competitive learning". He examined competitive learning through computer simulation and formal analysis. Rumelhart and Zipser found that when competitive learning was applied to parallel networks of neuron-like elements, many learning tasks could be accomplished. Jordan and Rumelhart (1992) explained that earlier algorithms made two assumptions about neural networks: the perceptron and LMS algorithms assumed that the only adaptive units in a network were the output units, and that a teacher provides the desired states of all of those output units[3]. Later research showed, however, that internal units adaptively recode the initial input representation of the environment, eliminating the early assumption that the only adaptive units in a network are the output units. Algorithms such as Boltzmann learning and backpropagation train networks that use nonlinear internal units[3]. Backpropagation strives to find the minimum of an error function, which is then taken to be the solution of the learning problem[4]. Backpropagation differs from the perceptron in its use of a smooth activation function rather than a step function[4]. Further research challenged the second assumption as well: unsupervised learning algorithms require no "teacher" and instead work by clustering the input data and extracting features that reflect its statistical or topological properties[5].
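As a rough illustration of these ideas, the Python sketch below (not taken from Rumelhart's papers; the network size, the XOR task, the learning rate, and the iteration count are arbitrary assumptions) trains a tiny two-layer network by backpropagation. The hidden units use a smooth sigmoid activation rather than the perceptron's step function, which keeps the error function differentiable so that gradient descent can search for its minimum.

  # Minimal backpropagation sketch (illustrative only; the network size, XOR task,
  # learning rate, and iteration count are arbitrary assumptions, not Rumelhart's setup).
  import numpy as np

  def sigmoid(x):
      return 1.0 / (1.0 + np.exp(-x))

  rng = np.random.default_rng(0)
  X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # input patterns
  t = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

  W1 = rng.normal(scale=1.0, size=(2, 4))   # input -> hidden weights
  b1 = np.zeros(4)
  W2 = rng.normal(scale=1.0, size=(4, 1))   # hidden -> output weights
  b2 = np.zeros(1)
  lr = 0.5                                  # learning rate

  for epoch in range(20000):
      # forward pass through smooth (sigmoid) activations
      h = sigmoid(X @ W1 + b1)              # internal (hidden) unit activations
      y = sigmoid(h @ W2 + b2)              # output unit activations
      # backward pass: propagate the derivative of the squared error
      delta_out = (y - t) * y * (1 - y)
      delta_hid = (delta_out @ W2.T) * h * (1 - h)
      # gradient-descent updates that reduce the error function
      W2 -= lr * h.T @ delta_out
      b2 -= lr * delta_out.sum(axis=0)
      W1 -= lr * X.T @ delta_hid
      b1 -= lr * delta_hid.sum(axis=0)

  # outputs should approach the targets [0, 1, 1, 0]
  print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2))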

Competitive Learning

Rumelhart explored the problems of the limited processing capabilities of one-level systems and the difficulty of developing learning schemes for multilayered systems.

Competitive learning is a scheme in which important features can be discovered at one level that a multilayered system can then use to classify pattern sets that cannot be classified by a single-level system.[6] Rumelhart reported that 35 years of experience had shown that getting neuron-like elements to learn something easy is often quite straightforward, but that designing systems with powerful general learning properties is a difficult problem, and the competitive learning paradigm does not change this fact.[6] Rumelhart hoped to show that competitive learning is a powerful strategy that cuts down the amount of time it takes to complete difficult tasks.

Thirty-five to forty years earlier, it was very difficult to see how anything resembling a neural network could learn at all. Prior to Hebb's work, it was believed that some physical change must occur in a network to support learning, but it was unclear what this change could be. Hebb proposed that a reasonable and biologically plausible change would be to strengthen the connections between elements of the network only when both the pre- and post-synaptic units were active simultaneously. The essential notion that the strength of connections between units must change in response to some function of the correlated activity of the connected units still dominates learning models[6]. Frank Rosenblatt invented a class of simple neuron-like learning networks which he called perceptrons. Researchers thought that perceptrons might actually correspond to parts of more extended networks and biological systems, in which case the results obtained would be directly applicable. More likely they represent extreme simplifications of the central nervous system, in which some properties are exaggerated and others suppressed; in that case, successive perturbations and refinements of the system may yield a closer approximation[7].

The essential structure that a competitive learning mechanism can discover is represented in the overlap of stimulus patterns. The "simplest stimulus population in which stimulus patterns could overlap with one another is one constructed out of dipoles-stimulus patterns consisting of exactly two active elements and the rest inactive"[6]. If there are a total of N input units, there are N(N-1)/2 possible dipole stimuli. If the actual stimulus population consists of all N(N-1)/2 possibilities, there is no structure to be discovered: there are no clusters for the units to point at (unless there is one unit for each possible stimulus, in which case a weight vector can be pointed at each possible input stimulus). If, however, the possible dipole stimuli are restricted in certain ways, then there can be meaningful groupings of the stimulus patterns that the system can find[6].
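To make the dipole example concrete, here is a toy winner-take-all sketch in Python (an illustration only, not the simulations Rumelhart and Zipser reported; the number of input lines, the learning rate, and the restriction of dipoles to two halves of the input are assumptions). Because the dipole population is restricted, two clusters exist, and each competing unit gradually concentrates its weight vector on the cluster of patterns it wins.

  # Toy winner-take-all competitive learning on dipole stimuli
  # (illustrative sketch; the parameters are assumptions, not Rumelhart & Zipser's setup).
  import numpy as np

  rng = np.random.default_rng(1)
  N = 8       # input lines
  M = 2       # competing units
  eta = 0.1   # learning rate

  # Restrict the dipole population: each pair of active lines is drawn entirely from
  # the first or the second half of the inputs, so there are two clusters to discover.
  def dipole():
      lo = rng.integers(2) * (N // 2)
      i, j = rng.choice(np.arange(lo, lo + N // 2), size=2, replace=False)
      x = np.zeros(N)
      x[[i, j]] = 1.0
      return x

  W = rng.random((M, N))
  W /= W.sum(axis=1, keepdims=True)    # each unit's weights sum to 1

  for _ in range(2000):
      x = dipole()
      winner = np.argmax(W @ x)        # the unit with the largest weighted input wins
      # shift a fraction of the winner's weight onto the currently active input lines
      W[winner] = (1 - eta) * W[winner] + eta * x / x.sum()

  # typically each row ends up concentrated on one half of the input lines
  print(np.round(W, 2))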

Interactive Activation and Competition Networks

Rumelhart worked extensively with his colleague James McClelland. Together they developed an early connectionist model of word recognition. This type of model is called an "interactive activation and competition" (IAC) network and has been used in competitive learning for a number of applications, such as speech perception, visual word recognition, and visual perception[2]. The IAC network comprises three interconnected hierarchical levels, and the connections between units are either excitatory or inhibitory: excitatory connections increase the activation of the units they reach, while inhibitory connections suppress it. The three levels are the input level, whose units represent visual features; the intermediate level, whose units represent individual letters; and the output level, where each unit represents a word[2]. At the input level, the visual features of a word are registered and connections are made based on the physical features of the first letter of the word. For example, if the word starts with an "M", all words beginning with "M" will be considered, producing excitatory connections from units detecting the letter "M", while words beginning with any other letter are inhibited. The same process takes place for the second letter of the word, and so on, until the final word is settled on. Each such pass of excitation and inhibition is called a processing cycle. An example of the IAC network is shown in Figure 2.
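A rough sense of how such excitatory and inhibitory interactions play out over processing cycles is given by the toy Python sketch below (a drastic simplification, not McClelland and Rumelhart's actual model; the three-word lexicon, connection strengths, and update rule are assumptions). Letter units that match the stimulus excite the word units consistent with them, word units inhibit one another, and after a few cycles one word unit dominates.

  # Simplified interactive activation and competition (IAC) sketch
  # (a toy, not McClelland & Rumelhart's actual model; the lexicon, connection
  # strengths, and update rule are assumptions).
  import numpy as np

  words = ["MAKE", "MOLE", "TAKE"]     # tiny hypothetical lexicon
  letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

  def letter_evidence(stimulus):
      # position-specific letter units: 1 where the stimulus letter is active
      ev = np.zeros((len(stimulus), len(letters)))
      for pos, ch in enumerate(stimulus):
          ev[pos, letters.index(ch)] = 1.0
      return ev

  def run(stimulus, cycles=20, excite=0.2, inhibit=0.5, decay=0.3):
      act = np.zeros(len(words))       # word-unit activations
      ev = letter_evidence(stimulus)
      for _ in range(cycles):          # one iteration = one processing cycle
          for w, word in enumerate(words):
              # excitation from letter units that match this word
              support = sum(ev[pos, letters.index(ch)] for pos, ch in enumerate(word))
              # inhibition from the other, competing word units
              competition = act.sum() - act[w]
              act[w] += excite * support - inhibit * competition - decay * act[w]
          act = np.clip(act, 0.0, 1.0)
      return dict(zip(words, np.round(act, 2)))

  # the unit for "MAKE" dominates; partially matching words are suppressed
  print(run("MAKE"))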

Human Information Processing

Rumelhart and McClelland also worked together on another neural network model, a competitor of the logogen model, as an account of human information processing[8]. Broadbent had proposed that memory is most likely diffuse, being represented in many different brain regions, and Rumelhart and McClelland developed their model in response to Broadbent's (1985) work on memory. They believed that Broadbent had a good start but were not completely satisfied with his proposal, explaining that his idea was much more mathematical and needed to be better understood by both psychologists and mathematicians in order to yield the best results. Rumelhart and McClelland recognized that Broadbent was building on earlier work by David Marr (1982). While Marr proposed three levels of theory, the computational, the algorithmic, and the implementational levels, Broadbent discussed only two, leaving out the algorithmic level. Rumelhart and McClelland's proposal focuses mainly on the algorithmic level and on the storing and retrieving of memory. The algorithmic level was of most importance to them, addressing issues such as "efficiency, degradation of performance under noise or other adverse conditions, whether a particular problem is hard or difficult, which problems are solved quickly, which take a long time to solve, how information is represented, and so on"[8]. They explained that these problems speak directly to psychologists, whereas at the computational level it does not matter "whether the theory is stated as a program for a Turing machine, as a set of axioms, or as a set of rewrite rules"[8]: what matters is what function is being computed, not necessarily how it is computed. Rumelhart and McClelland proposed that a proper algorithm must mimic neuronal functioning in the brain, where more complex ideas take more time to process and simpler ideas and memories take less. An example of the different levels is shown in Figure 1.

Figure 1

Figure 2

References


  1. Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (1986). "Learning representations by back-propagating errors". Nature. 323 (6088): 533–536. doi:10.1038/323533a0.
  2. Harley. "Theories of Learning in Educational Psychology".
  3. Jordan, M. I.; Rumelhart, D. E. (1992). "Forward models: Supervised learning with a distal teacher". Cognitive Science. 16 (3): 307–354. ISSN 0364-0213.
  4. Rojas, R. "The backpropagation algorithm" (PDF). Springer-Verlag. Retrieved 19 March 2012.
  5. Fahlman, Scott E. "An Empirical Study of Learning Speed in Back-Propagation Networks" (PDF). National Science Foundation. Retrieved 19 March 2012.
  6. Rumelhart, D. E.; Zipser, D. (1985). "Feature discovery by competitive learning". Cognitive Science. 9 (1): 75–112. doi:10.1207/s15516709cog0901_5.
  7. Rosenblatt, Frank (1962). Principles of Neurodynamics: Perceptrons and the Theory of Brain Mechanisms. Washington, DC: Spartan Books. p. 616.
  8. Rumelhart, David E.; McClelland, James L. (1985). "Levels indeed! A response to Broadbent". Journal of Experimental Psychology: General. 114 (2): 193–197. doi:10.1037/0096-3445.114.2.193.
  9. Rumelhart, D. E.; McClelland, J. L. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Cambridge, MA: MIT Press.
  10. McClelland, James L.; Rumelhart, David E. (1985). "Distributed memory and the representation of general and specific information". Journal of Experimental Psychology: General. 114 (2): 159–188. doi:10.1037/0096-3445.114.2.159. PMID 3159828.
  11. Rumelhart, David E.; Siple, Patricia (1974). "Process of recognizing tachistoscopically presented words". Psychological Review. 81 (2): 99–118. doi:10.1037/h0036117. PMID 4817613.
  12. Rumelhart, David E.; McClelland, James L. (1982). "An interactive activation model of context effects in letter perception: II. The contextual enhancement effect and some tests and extensions of the model". Psychological Review. 89 (1): 60–94. doi:10.1037/0033-295X.89.1.60.
  13. Rumelhart, D. E.; Bly, B. M. (1999). London: Academic Press. ISBN 9780126017304.
  14. Rumelhart, David E.; Widrow, Bernard; Lehr, Michael A. (1994). "The basic ideas in neural networks". Communications of the ACM. 37 (3): 87–92. doi:10.1145/175247.175256.
  15. Norman, Donald A.; Rumelhart, David E. (1981). "The LNR approach to human information processing". Cognition. 10 (1–3): 235–240. doi:10.1016/0010-0277(81)90051-2. PMID 7198542.
  16. Widrow, Bernard; Rumelhart, David E.; Lehr, Michael A. (March 1994). "Neural networks: applications in industry, business and science". Communications of the ACM. 37 (3): 93–105. doi:10.1145/175247.175257.
  17. Chauvin, Y.; Rumelhart, D. E., eds. (1995). Backpropagation: Theory, Architectures, and Applications. Hillsdale, NJ: Lawrence Erlbaum Associates.
  18. Franco, H.; Morgan, N.; Rumelhart, D.; Abrash, V. (1994). "Context-dependent connectionist probability estimation in a hybrid hidden Markov model-neural net speech recognition system". Computer Speech & Language. 8 (3): 211–222. doi:10.1006/csla.1994.1010.
  19. McClelland, James L.; Rumelhart, David E. (1981). "An interactive activation model of context effects in letter perception: I. An account of basic findings". Psychological Review. 88 (5): 375–407. doi:10.1037/0033-295X.88.5.375.
  20. Rumelhart, D. E. (2004). "Toward an Interactive Model of Reading". In R. B. Ruddell & N. J. Unrau (Eds.), Theoretical Models and Processes of Reading. Newark, DE: International Reading Association. ISBN 0-87207-502-8.

(Mbuset14 (talk) 01:55, 13 March 2012 (UTC))