In general, each lecture has assigned readings that are intended to prepare you to participate in that day's class discussion. In addition, there may be optional background readings (marked with "opt:" in the Syllabus) that serve as the basis for the lecture, present an alternative point of view, or simply make available relevant material that we won't have time to cover in class. Optional readings are also a good source of ideas for projects. There are no required readings on days when something is due, but you are still expected to attend class, hand in your homework, and draw on the material you have already learned in order to participate in the discussion.
of the course and how each is assessed
(listed in brackets).
There is no required text for the course. All assigned and optional readings
are available as downloadable PDF files from links in the Syllabus below.
Other course materials (e.g., handouts and assignments) will be made
available via links at the top of this web page. The following texts contain
some of the course readings and may be useful as general references:
We will be using a software package called "Lens" (for Light Efficient Network Simulator), developed by former CMU CS graduate student Doug Rohde. Lens runs under Windows, Mac OS X, and Linux. The main website for Lens is http://tedlab.mit.edu/~dr/Lens/, but do not install Lens from that site. You can download a file containing a precompiled version of Lens here:
If you have any problems getting Lens running, contact the instructor. After installing Lens, you should look at the online manual at http://tedlab.mit.edu/~dr/Lens/, particularly the instructions under "Running Lens" and the Tutorial Network under "Example Networks". The precompiled versions of Lens come with an offline (local) copy of the manual, which you can open by pointing your web browser at Manual/index.html in the Lens directory.
Section 3: Learning Internal Representations
Feb 13 (Thu): Back-propagation (slides) [HOMEWORK 3 POSTED]
Feb 18 (Tue): Temporal learning and recurrent networks (slides)
Feb 20 (Thu): Generalization and overfitting (slides)
- (opt: Morgan, N. & Bourlard, H. (1990). Generalization and parameter estimation in feedforward nets: Some experiments. In D. S. Touretzky (Ed.), Advances in neural information processing systems 2. San Mateo, CA: Morgan Kaufmann, 630-637.)
- (opt: le Cun, Y., Denker, J. S., & Solla, S. A. (1990). Optimal brain damage. In D. S. Touretzky (Ed.), Advances in neural information processing systems 2. San Mateo, CA: Morgan Kaufmann, 598-605.)
- (opt: Weigend, A. S., Rumelhart, D. E., & Huberman, B. A. (1991). Generalization by weight-elimination with application to forecasting. In R. P. Lippmann, J. E. Moody, & D. S. Touretzky (Eds.), Advances in neural information processing systems 3. San Mateo, CA: Morgan Kaufmann, 875-882.)
Feb 25 (Tue): Contrastive Hebbian learning (slides)
Feb 27 (Thu): Unsupervised learning (slides)
Mar 4 (Tue): Reinforcement learning and forward models (slides)
- Barto, A. G. (1995). Reinforcement learning; Reinforcement learning in motor control. In M. A. Arbib (Ed.), The handbook of brain theory and neural networks (pp. 804-813). Cambridge, MA: MIT Press.
- (opt: Tesauro, G. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38, 58-68.)
- (opt: Jordan, M. I., and Rumelhart, D. E. (1992). Forward models: Supervised learning with a distal teacher. Cognitive Science, 16, 307-354.)
Mar 6 (Thu): Psychological implications [PROJECT PROPOSAL DUE] [HOMEWORK 3 DUE]
- (opt: McClelland, J. L. (2001). Failures to learn and their remediation: A Hebbian account. In J. L. McClelland and R. S. Siegler (Eds.), Mechanisms of cognitive development: Behavioral and neural perspectives, (pp. 97-121). Mahwah, NJ: Lawrence Erlbaum Associates.)
- (opt: McCandliss, B. D., Fiez, J. A., Protopapas, A., Conway, M., & McClelland, J. L. (2002). Success and failure in teaching the [r]-[l] contrast to Japanese adults: Tests of a Hebbian model of plasticity and stabilization in spoken language perception. Cognitive, Affective, & Behavioral Neuroscience, 2, 89-108.)
Mar 11 (Tue): NO CLASS (Spring Break)
Mar 13 (Thu): NO CLASS (Spring Break)
Section 4: Applications
Mar 18 (Tue): Cognitive development (slides)
- Munakata, Y. and McClelland, J. L. (2003). Connectionist models of development. Developmental Science, 6, 413-429.
- (opt: Munakata, Y., McClelland, J. L., Johnson, M. H. & Siegler, R. (1997). Rethinking infant knowledge: Toward an adaptive process account of successes and failures in object permanence tasks. Psychological Review, 104, 686-713.)
Mar 20 (Thu): Semantics (slides)
Mar 25 (Tue): Language: Morphology (slides)
- The past-tense debate. Trends in Cognitive Sciences, 2002, 6, 456-474. [Pinker, S. & Ullman, M. T. The past and future of the past tense, 456-463; McClelland, J. L. & Patterson, K. 'Words or Rules' cannot exploit the regularity in exceptions: Reply to Pinker and Ullman, 464-465; McClelland, J. L. & Patterson, K. Rules or connections in past-tense inflections: What does the evidence rule out?, 465-472; Pinker, S. & Ullman, M. T. Combination and structure, not gradedness, is the issue: Reply to McClelland and Patterson, 472-474.]
- (opt: Rumelhart, D. E. & McClelland, J. L. (1986). On learning the past tenses of English verbs. PDP2, Chapter 18.)
- (opt: Plaut, D. C. & Gonnerman, L. M. (2000). Are non-semantic morphological effects incompatible with a distributed connectionist approach to lexical processing? Language and Cognitive Processes, 15, 445-485.)
Mar 27 (Thu): Language: Word reading (slides)
- Plaut, D. C. (1999). Computational modeling of word reading, acquired dyslexia, and remediation. In R. Klein and P. A. McMullen (Eds.), Converging methods in reading and dyslexia (pp. 339-372). Cambridge, MA: MIT Press.
- (opt: Plaut, D. C., McClelland, J. L., Seidenberg, M. S., & Patterson, K. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56-115.)
- (opt: Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204-256.)
- (opt: O'Reilly et al. (2014). Language, CCN, Chapter 9.)
Apr 1 (Tue): Language: Sentence processing (slides)
- McClelland, J. L., St. John, M., & Taraban, R. (1989). Sentence comprehension: A parallel distributed processing approach. Language and Cognitive Processes, 4, 287-335.
- (opt: Elman, J. L. (1993). Learning and development in neural networks: The importance of starting small. Cognition, 48, 71-99.)
- (opt: Rohde, D. L. T., and Plaut, D. C. (1999). Language acquisition in the absence of explicit negative evidence: How important is starting small? Cognition, 72, 67-109.)
- (opt: Frazier, L. (1987). Sentence processing: A tutorial review. In M. Coltheart (Ed.), Attention and performance XII: The psychology of reading (pp. 559-586). Hillsdale, NJ: Erlbaum.)
- (opt: O'Reilly et al. (2014). Language, CCN, Chapter 9.)
Apr 3 (Thu): Memory and the hippocampus (slides) [TAKE-HOME ESSAY POSTED]
Apr 8 (Tue): High-level vision and attention (slides) [TAKE-HOME ESSAY DUE]
- Mozer, M. C. and Sitton, M. (1998). Computational modeling of spatial attention. In H. Pashler (Ed.), Attention (pp. 341-393). Hove, England: Psychology Press/Erlbaum.
- (opt: Mozer, M. C. and Behrmann, M. (1990). On the interaction of selective attention and lexical knowledge: A connectionist account of neglect dyslexia. Journal of Cognitive Neuroscience, 2, 96-123.)
- (opt: Behrmann, M., Zemel, R. S. and Mozer, M. C. (1998). Object-based attention and occlusion: Evidence from normal participants and a computational model. Journal of Experimental Psychology: Human Perception and Performance, 24, 1011-1036.)
- (opt: O'Reilly et al. (2014). Perception, CCN, Chapter 6.)
Apr 10 (Thu): NO CLASS (Spring Carnival)
Apr 15 (Tue): NO CLASS (Passover)
Apr 17 (Thu): Cognitive control and executive function
- (opt: Rumelhart, D. E., Smolensky, P., McClelland, J. L., & Hinton, G. E. (1986). Schemata and sequential thought processes in PDP models. PDP2, Chapter 14, pages 38-57.)
- (opt: Cohen, J. D., Aston-Jones, G., and Gilzenrat, M. S. (2004). A system-level perspective on attention and cognitive control: Guided activation, adaptive gating, conflict monitoring, and exploitation versus exploration. In M. I. Posner (Ed.), Cognitive Neuroscience of Attention (pp. 71-90). New York: Guilford Press.)
- (opt: O'Reilly et al. (2014). Executive functions, CCN, Chapter 10.)