An Instructable Connectionist/Control Architecture: Using Rule-Based Instructions to Accomplish Connectionist Learning in a Human Time Scale

Presented by: Jim Ries for CECS 477, WS 2000

Paper by:

Walter Schneider and William L. Oliver

University of Pittsburgh, Learning Research and Development Center

Introduction

Overview

Task Decomposition

Gate Learning Example

CAP2 Architecture

CAP2 Rule Learning

Authors’ Conclusions

My Own Thoughts

Walter Schneider, Ph.D., Indiana University

Professor, Psychology

University of Pittsburgh, Pittsburgh, PA 15260

Phone: (412) 624-7061.

Fax: (412) 624-9149

Email: [email protected]

http://www.lrdc.pitt.edu/


Overview

Hybrid approach to blend rules with connectionist model.

Rules are “learned” (with instruction) and represented in a connectionist manner.

Learned rules are less “brittle”.

Attempts to decompose problems in order to hasten learning.

Supposedly general decomposition mechanism.

Claims to model human cognition.

Task Decomposition

As task complexity increases, learning times in both symbolic and connectionist systems can dramatically increase (perhaps exponentially).

Cognitive psychology indicates that basic cognitive processes can be decomposed into stages.

Task Decomposition (cont.)

A good decomposition reduces the number of problem states needed for consideration.

e.g., humans can do arbitrary addition by memorizing 100 addition facts and an algorithm for adding one column at a time.

Without this decomposition, one would need to learn 10^10 addend combinations (10^5 × 10^5 pairs of five-digit addends) to solve 5-column addition problems!
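The addition example can be made concrete with a short sketch. It is only an illustration of the decomposition idea, not anything from the paper: the names ADDITION_FACTS and add_by_columns are hypothetical, and the carry is handled by simple "counting on by one" rather than by a further table of facts.

```python
# Illustrative sketch only: the names below are hypothetical, not from the paper.

# The 100 memorized single-digit addition facts: (a, b) -> a + b.
ADDITION_FACTS = {(a, b): a + b for a in range(10) for b in range(10)}


def add_by_columns(x: str, y: str) -> str:
    """Add two non-negative decimal strings one column at a time."""
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)  # pad the shorter addend with zeros
    carry = 0
    digits = []
    # Work from the rightmost column to the leftmost.
    for dx, dy in zip(reversed(x), reversed(y)):
        total = ADDITION_FACTS[(int(dx), int(dy))]  # look up a memorized fact
        total += carry                              # assumption: carry is "count on by one"
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


if __name__ == "__main__":
    print(len(ADDITION_FACTS))               # number of facts to memorize
    print(add_by_columns("12345", "67890"))  # one of the 10**10 five-column problems
    print(10**5 * 10**5 == 10**10)           # pairs of five-digit addends
```

The same 100 facts and one column rule cover every problem size, which is the point of the decomposition.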

Gate Learning

Example of task decomposition.

In human version, subjects are instructed on the rules for each gate, and then do many trials to learn.

W/o task decomposition, number of states is: 2^i × g × n (where i is # of gate inputs, g is # of gate types, n is # of negation states)

W/ task decomposition, number of states is: 2^i + (g × r) + (n × o) (where r is # of recording states, o is # of output states of gate mapping stage)

Gate Learning (cont.)

For six-input gates:

w/o decomposition: 384 states

w/ decomposition: 77 states

Decomposition reduces state growth from a multiplicative function to an additive one.
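The two state counts can be checked with a few lines of arithmetic. The slides give only the totals for i = 6; the values g = 3, n = 2, r = 3 and o = 2 used below are assumptions, chosen because they reproduce 384 and 77, and are not stated in the slides.

```python
# Illustrative check of the state-count formulas.
# Assumed (not given in the slides): g = 3, n = 2, r = 3, o = 2.

def states_without_decomposition(i: int, g: int, n: int) -> int:
    """2^i x g x n: every input pattern crossed with gate type and negation."""
    return 2**i * g * n


def states_with_decomposition(i: int, g: int, r: int, n: int, o: int) -> int:
    """2^i + (g x r) + (n x o): each stage contributes its own states."""
    return 2**i + g * r + n * o


if __name__ == "__main__":
    i, g, n, r, o = 6, 3, 2, 3, 2  # i from the slides; the rest are assumptions
    print(states_without_decomposition(i, g, n))     # 64 * 3 * 2
    print(states_with_decomposition(i, g, r, n, o))  # 64 + 9 + 4
```

Under these assumed values, adding one more gate type adds 64 states to the undecomposed count per negation state but only r states to the decomposed count.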

Gate Learning (cont.)



Networks trained to 100% accuracy (since there is no “noise”)

Results for 6 gate trial:

w/o decomposition - 10,835 trials

w/ decomposition - 948 trials

human - 300 trials



Subjects begin by executing rules sequentially, and gradually switch to associative responses.

The stage taking the longest to converge (Recording) was the limiting factor.

The authors did not mention whether real time for a “trial” differed between a net using decomposition and one without decomposition.


CAP2 Architecture

Controlled Automatic Processing model 2.

Macro level: system of modules that pass vector and scalar messages.

Scalar messages used for control.

Vector messages encode perceptual and conceptual information.

CAP2 Architecture (cont.)

Components

Data Modules - transform and transmit vector messages (consistent with neurophysiology)

Control Signals - control activity of modules

Activity report - codes whether a data module is active and has a vector to transmit

Gain control - controls how strongly the output of a module activates the other modules to which it is connected

Feedback - controls the strength of the autoassociative feedback within the module




Controller Module - sequential rule net

Task input vector

Compare Result input vector

Context input vector

Outputs control operations

• Attend

• Compare (compare vectors from different modules)

• Receive (enable a module to receive a vector)

• Done

Currently implemented in C, rather than as a connectionist network!
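The three control signals listed above can be pictured with a toy data module. This is a loose sketch and not the CAP2 implementation: the class DataModule, its threshold field, and the use of a single scalar for autoassociative feedback (in place of a learned weight matrix) are all simplifying assumptions.

```python
# Toy sketch of a data module with the three control signals.
# Everything here is hypothetical; it is not the authors' C code.
from dataclasses import dataclass
from typing import List


@dataclass
class DataModule:
    vector: List[float]      # current vector message held by the module
    gain: float = 1.0        # how strongly the output drives other modules
    feedback: float = 0.0    # strength of feedback from the module's own state
    threshold: float = 0.5   # assumed cutoff for "has a vector to transmit"

    def output(self) -> List[float]:
        """Vector sent to connected modules, scaled by the gain signal."""
        return [self.gain * v for v in self.vector]

    def activity_report(self) -> bool:
        """Scalar report: is the module active enough to transmit?"""
        return sum(v * v for v in self.vector) > self.threshold

    def receive(self, message: List[float]) -> None:
        """Combine an incoming vector with fed-back current state."""
        self.vector = [m + self.feedback * v
                       for m, v in zip(message, self.vector)]


if __name__ == "__main__":
    sender = DataModule([1.0, 0.0])
    receiver = DataModule([0.0, 0.0])
    receiver.receive(sender.output())   # gain 1.0: the vector is passed on
    print(receiver.activity_report())
    sender.gain = 0.0                   # gain 0.0: the sender is silenced
    receiver.receive(sender.output())
    print(receiver.activity_report())
```

In this picture a controller would only need to read the scalar reports and set the gain and feedback values; it never has to inspect the vectors themselves.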




Authors are committed to structural assumptions of the architecture (as related to human cognition)

Processing substrate in humans akin to data network

Modular network structures serve as functional units of processing

Information passes between modules as vectors

Memory associations among vectors develop through learning similar to connectionist learning


Mechanisms for task decomposition

Configuring the data network (# of stages, # of modules/stage, etc. through control signals)

Specifying the number of states in each stage.

CAP2 captures knowledge specified in rules

However, the data network does not simply learn the rules stored in the controller, but will learn patterns in input data as well (driving instructor example).

To achieve the same level of tuning in a production system would require a huge set of rules.


Chunking - “C” “A” “T” → “CAT”

Degree of matching (Euclidean distance)

activity report = \sum_{i=1}^{n} (x_i + y_i)^2

Does this mean that CAP2 would be unable to represent concepts that were “close” or “distant” in a different sense (e.g., Taxicab distance, or other distance measures)?
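How a sum of squared sums relates to Euclidean distance can be checked numerically. The sketch below assumes vectors with ±1 elements, which is an illustrative choice and not something the slides state. It relies on the identity sum((x_i + y_i)^2) + sum((x_i - y_i)^2) = 2(|x|^2 + |y|^2), so for vectors of fixed length a larger activity report means a smaller Euclidean distance.

```python
# Illustrative sketch: activity report of two summed vectors vs. their
# squared Euclidean distance. The +/-1 coding is an assumption.
from typing import Sequence


def activity_report(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum over i of (x_i + y_i)^2."""
    return sum((xi + yi) ** 2 for xi, yi in zip(x, y))


def squared_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Euclidean distance: sum over i of (x_i - y_i)^2."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


if __name__ == "__main__":
    target = [1, -1, 1, -1]
    candidates = {
        "identical": [1, -1, 1, -1],
        "half match": [1, -1, -1, 1],
        "opposite": [-1, 1, -1, 1],
    }
    for name, vec in candidates.items():
        a = activity_report(target, vec)
        d = squared_distance(target, vec)
        # a + d is constant (2 * (4 + 4) = 16) for these fixed-length vectors
        print(name, a, d, a + d)
```

For vectors whose lengths differ, the two measures no longer rank matches identically, so the equivalence is specific to the fixed-length case.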


CAP2 Rule Learning

Rule learning should be achieved in a small number of trials (or why bother; just use connectionist learning).

Gate Learning example

CAP2 Rule Learning (cont.)



Sequential network learned rules even faster than humans

sequential network - 120 trials

humans - 216 trials

decomposed model - 932 trials

single stage model - 10,835 trials

Rule knowledge is brittle, and the network performs much as novices perform during the early stage of rule learning.

Authors’ Conclusions

Hybrid connectionist/control architecture illustrates the complementary nature of symbolic and connectionist processing.

Better than connectionist learning, because it benefits from instruction.

Better than symbolic processing because it captures rules in a connectionist network which can scale and is less brittle.

Authors’ Conclusions (cont.)

Closely models human cognition.

Not merely a connectionist implementation of a symbolic architecture.

My Own Thoughts

Unclear that this models human cognition, but I have no cognitive science background to truly evaluate this claim.

Is this really general? For example, they seemed to gloss over the fact that part of their rule system was implemented directly in C rather than in a connectionist manner.

The examples were generally done iteratively. How does parallelism change things (if at all)?

Full paper reference

Schneider, W. & Oliver, W. L. (1991). An instructable connectionist/control architecture: Using rule-based instructions to accomplish connectionist learning in a human time scale. In K. VanLehn (Ed.), Architectures for intelligence: The 22nd Carnegie Mellon symposium on cognition (pp. 113-145). Hillsdale, NJ: Erlbaum.
