14.1.1 Overview of the Bp Implementation

The basic structure of the backpropagation algorithm is reviewed in the tutorial (see section 4.1 Backpropagation and XOR). In short, there are two phases: an activation propagation phase and an error backpropagation phase. In the simplest version of Bp, the activation phase is strictly feed-forward and the error phase strictly feed-back, and both are computed sequentially layer-by-layer. Thus, the implementation assumes that the layers are organized sequentially in the order that activation flows.
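The two sequential phases can be sketched as follows. This is a minimal illustrative sketch, not the actual PDP++ source: the `Layer`, `forward`, and `backward` names here are assumptions for illustration, and a sigmoid activation with squared error is assumed.

```cpp
#include <cmath>
#include <vector>

// Illustrative sketch of the two strictly sequential phases of simple Bp.
// Names (Layer, forward, backward) are stand-ins, not the PDP++ API.
struct Layer {
  std::vector<std::vector<double>> w;  // w[j][i]: weight from unit i below to unit j
  std::vector<double> net, act, dEdA, dEdNet;
};

inline double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Phase 1: activation propagates strictly feed-forward, layer by layer.
void forward(std::vector<Layer>& layers, const std::vector<double>& input) {
  const std::vector<double>* below = &input;
  for (auto& lay : layers) {
    for (size_t j = 0; j < lay.w.size(); ++j) {
      lay.net[j] = 0.0;
      for (size_t i = 0; i < below->size(); ++i)
        lay.net[j] += lay.w[j][i] * (*below)[i];
      lay.act[j] = sigmoid(lay.net[j]);
    }
    below = &lay.act;
  }
}

// Phase 2: error propagates strictly feed-back, layer by layer.
void backward(std::vector<Layer>& layers, const std::vector<double>& target) {
  Layer& out = layers.back();
  for (size_t j = 0; j < out.act.size(); ++j) {
    out.dEdA[j] = out.act[j] - target[j];  // dE/dA for squared error
    out.dEdNet[j] = out.dEdA[j] * out.act[j] * (1.0 - out.act[j]);
  }
  for (int l = static_cast<int>(layers.size()) - 2; l >= 0; --l) {
    Layer& lay = layers[l];
    Layer& above = layers[l + 1];
    for (size_t i = 0; i < lay.act.size(); ++i) {
      lay.dEdA[i] = 0.0;
      for (size_t j = 0; j < above.w.size(); ++j)
        lay.dEdA[i] += above.w[j][i] * above.dEdNet[j];
      lay.dEdNet[i] = lay.dEdA[i] * lay.act[i] * (1.0 - lay.act[i]);
    }
  }
}
```

Note how `backward` walks the layer list in reverse, which is why the implementation can assume the layers are stored in activation-flow order.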

In the recurrent versions, both the activation and the error propagation are computed in two steps so that each unit is effectively being updated simultaneously with the other units. This is done in the activation phase by first computing the net input to each unit based on the other units' current activation values, and then updating the activation values based on this net input. Similarly, in the error phase, first the derivative of the error with respect to the activation (dEdA) of each unit is computed based on current dEdNet values, and then the dEdNet values are updated based on the new dEdA values.
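The two-step activation update can be sketched as below. This is a hedged illustration of the scheme described above, not the recurrent Bp source; `step_activations` and the fully-recurrent weight matrix are assumptions for the example.

```cpp
#include <cmath>
#include <vector>

// Illustrative sketch of the two-step simultaneous update: every net input
// is computed from the *current* activations before any activation changes.
void step_activations(const std::vector<std::vector<double>>& w,
                      std::vector<double>& act) {
  size_t n = act.size();
  std::vector<double> net(n, 0.0);
  // Step 1: compute all net inputs from the current activation values.
  for (size_t j = 0; j < n; ++j)
    for (size_t i = 0; i < n; ++i)
      net[j] += w[j][i] * act[i];
  // Step 2: only now overwrite the activations, so the units behave as if
  // they were all updated at the same instant.
  for (size_t j = 0; j < n; ++j)
    act[j] = 1.0 / (1.0 + std::exp(-net[j]));
}
```

If the activations were instead updated in place inside the first loop, later units would see a mixture of old and new values; separating the two steps is what makes the update effectively simultaneous. The error phase uses the same pattern with dEdA computed from current dEdNet values.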

To implement a new algorithm like Bp in PDP++, one creates new class types that derive from the basic object types that make up a network (see section 10 Networks (Layers, Units, etc)), and scheduling processes (see section 12 Processes and Statistics). These new classes inherit all the functionality from the basic types, and specify the details appropriate for a particular algorithm. There are two ways in which this specification happens--overloading (replacing) existing functions, and adding new ones (see section 8.1.1 What is an Object?).
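The two specification mechanisms can be sketched in miniature. This is a hypothetical stand-in, not the actual PDP++ class definitions: the base `UnitSpec` here and its linear default are assumptions made purely to illustrate overloading versus adding.

```cpp
#include <cmath>

// Hypothetical miniature of the PDP++ derivation pattern. A generic base
// type supplies default behavior via virtual functions; an algorithm-specific
// subclass either overloads (replaces) a function or adds a new one.
struct UnitSpec {  // illustrative stand-in for a generic base type
  virtual double Compute_Act(double net) { return net; }  // linear default (assumed)
  virtual ~UnitSpec() {}
};

struct BpUnitSpec : UnitSpec {  // algorithm-specific derivation
  // Overloaded: replaces the inherited activation function with a sigmoid.
  double Compute_Act(double net) override {
    return 1.0 / (1.0 + std::exp(-net));
  }
  // Added: a function the base type does not have at all.
  double Compute_dEdNet(double dEdA, double act) {
    return dEdA * act * (1.0 - act);
  }
};
```

Because `Compute_Act` is virtual, generic network code that holds a `UnitSpec*` automatically runs the Bp version when the spec is actually a `BpUnitSpec`, which is how the basic types can drive any algorithm's specifics.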

The new classes defined in the basic Bp implementation include: BpConSpec, BpCon, BpCon_Group, BpUnit, BpUnitSpec, and BpTrial, whose roles should be clear from their names. In addition, we have added a CE_Stat that computes the cross-entropy error statistic, in much the same way as SE_Stat does (see section 12.8.1 Summed-Error Statistics).
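The two statistics being contrasted can be sketched using their standard textbook definitions; the exact formulas used by SE_Stat and CE_Stat live in the PDP++ source, so treat these function names and forms as illustrative assumptions.

```cpp
#include <cmath>
#include <vector>

// Standard summed squared error over a pattern (illustrative; cf. SE_Stat).
double summed_error(const std::vector<double>& act,
                    const std::vector<double>& targ) {
  double se = 0.0;
  for (size_t i = 0; i < act.size(); ++i) {
    double d = targ[i] - act[i];
    se += d * d;
  }
  return se;
}

// Standard cross-entropy error for units with targets in [0,1]
// (illustrative; cf. CE_Stat). Assumes activations strictly in (0,1).
double cross_entropy(const std::vector<double>& act,
                     const std::vector<double>& targ) {
  double ce = 0.0;
  for (size_t i = 0; i < act.size(); ++i)
    ce += -(targ[i] * std::log(act[i]) +
            (1.0 - targ[i]) * std::log(1.0 - act[i]));
  return ce;
}
```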

Bias weights in PDP++ are implemented by adding a BpCon object to the BpUnit directly, rather than by allocating some kind of self projection or a similar scheme. In addition, the BpUnitSpec has a pointer to a BpConSpec to control the updating, etc., of the bias weight. Thus, while some code was written to support the special bias weights on units, it amounts to simply calling the appropriate function on the BpConSpec.

The processing hierarchy for feed-forward Bp requires only a specialized Trial process: BpTrial, which runs both the feed-forward activation updating and error backpropagation phases.