Many of the relevant details are discussed above in the descriptions of the basic Bp classes. This section provides a little more detail, which might be useful to someone who wants to define their own versions of the Bp classes, for example.
Support for the activation updating phase of Bp is present in the basic structure of the PDP++ Units and Connections types (see section 10.4 Units and section 10.5 Connections), specifically in the Compute_Net and Compute_Act functions. We overload Compute_Act to implement the sigmoidal activation function.
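As a rough illustration of what the overloaded activation function computes, the following is a minimal sketch of a logistic sigmoid applied to a unit's net input. The names SketchUnit, SketchSigmoid, SketchComputeAct, and the gain parameter are hypothetical and are not the PDP++ classes or their signatures.

```cpp
#include <cmath>

// Hypothetical stand-in for a Bp unit; only the fields used here.
struct SketchUnit {
  float net = 0.0f;  // net input, as produced by a Compute_Net step
  float act = 0.0f;  // activation, as produced by a Compute_Act step
};

// Logistic sigmoid of the net input. The gain argument is an assumption
// made for this sketch.
inline float SketchSigmoid(float net, float gain = 1.0f) {
  return 1.0f / (1.0f + std::exp(-gain * net));
}

// Sketch of an overloaded activation step: act = sigmoid(net).
inline void SketchComputeAct(SketchUnit& u) {
  u.act = SketchSigmoid(u.net);
}
```

With a net input of zero this gives an activation of 0.5, and large positive net inputs approach 1.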
The error backpropagation phase is implemented with three new functions at both the unit and connection level. The unit spec functions are:
Compute_dEdA(BpUnit* u)
Computes the derivative of the error with respect to the activation (dEdA). If the unit is an output unit (i.e., it has the ext_flag TARG set), then it just calls the error function to get the difference between the target and actual output activation. If it is not an output unit, then it iterates through the sending connection groups (i.e., through the connections to units that this one sends activation to), and accumulates its dEdA as a function of the connection weight times the other unit's dEdNet. This is done by calling the function Compute_dEdA on the sending connection groups, which calls this function on the BpConSpec, which is described below.
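The two cases described for Compute_dEdA can be sketched as follows. This is a simplified illustration, not the PDP++ code: DemoUnit, DemoCon, and DemoComputeDEdA are hypothetical names, the is_target flag stands in for the ext_flag TARG test, and the error function is assumed to be the plain difference between target and activation.

```cpp
#include <vector>

struct DemoUnit;

// Hypothetical sending connection: a weight and the unit on the other end.
struct DemoCon {
  float wt;        // connection weight
  DemoUnit* recv;  // unit that receives activation over this connection
};

// Hypothetical unit holding only what the error phase needs.
struct DemoUnit {
  bool is_target = false;     // stands in for the ext_flag TARG test
  float targ = 0.0f;          // target activation (output units only)
  float act = 0.0f;           // actual activation
  float dEdA = 0.0f;          // error derivative w.r.t. activation
  float dEdNet = 0.0f;        // error derivative w.r.t. net input
  std::vector<DemoCon> send;  // connections to units this one sends to
};

inline void DemoComputeDEdA(DemoUnit& u) {
  if (u.is_target) {
    // Output unit: assumed error function is target minus actual.
    u.dEdA = u.targ - u.act;
    return;
  }
  // Other units: accumulate weight times the receiving unit's dEdNet.
  u.dEdA = 0.0f;
  for (const DemoCon& cn : u.send) {
    u.dEdA += cn.wt * cn.recv->dEdNet;
  }
}
```

Because a non-output unit reads the dEdNet of the units it sends to, those units must already have been processed, which is why the error phase walks the layers backwards.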
Compute_dEdNet(BpUnit* u)
Applies the derivative of the activation function to dEdA to get the derivative with respect to the net input.
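For a logistic sigmoid with unit gain, the derivative of the activation with respect to the net input is act * (1 - act), so the step from dEdA to dEdNet is a single multiplication. The sketch below assumes that activation function; LogisticDeriv and ChainToNet are hypothetical helper names.

```cpp
// Derivative of a unit-gain logistic sigmoid, written in terms of its output.
// Assumption: the activation function is 1 / (1 + exp(-net)).
inline float LogisticDeriv(float act) {
  return act * (1.0f - act);
}

// Chain rule: dE/dNet = dE/dA * dA/dNet.
inline float ChainToNet(float dEdA, float act) {
  return dEdA * LogisticDeriv(act);
}
```

The derivative is largest at an activation of 0.5 and shrinks toward zero as the activation approaches 0 or 1.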
Compute_Error(BpUnit* u)
Compute_dWt(Unit* u)
Computes the derivative of the error with respect to the weight (dEdW) for all of the unit's connections. This is a function of the dEdNet of the unit, and the sending unit's activation. This function is defined as part of the standard UnitSpec interface, and it simply calls the corresponding Compute_dWt function on the ConSpec for all of the receiving connection groups. In Bp, it also calls Compute_dWt for the bias weight.
UpdateWeights(Unit* u)
Updates the weights based on the previously computed dEdW. Like Compute_dWt, this function is defined to call the corresponding one on connection specs. It also updates the bias weights. Note that this function is called by the EpochProcess, and not directly by the algorithm-specific BpTrial.
The corresponding connection spec functions are as follows. Note that, as described in section 10.5 Connections, there are two versions of every function defined in the ConSpec. The one with a C_ prefix operates on an individual Connection, while the other one iterates through a group of connections and calls the connection-specific one.
float C_Compute_dEdA(BpCon* cn, BpUnit* ru, BpUnit* su)
float Compute_dEdA(BpCon_Group* cg, BpUnit* su)
Compute the contribution to the sending unit's dEdA: the dEdNet of the unit that receives from the sending unit times the weight between them.
float C_Compute_dWt(BpCon* cn, BpUnit* ru, BpUnit* su)
float Compute_dWt(Con_Group* cg, Unit* ru)
Compute the dEdW variable on the receiving connections by multiplying the sending unit's activation value by the receiving unit's dEdNet.
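The pairing of a per-connection C_ function with a group-level function that loops over it can be sketched for the weight derivative as follows. All names here (ToyUnit, ToyCon, ToyC_Compute_dWt, ToyCompute_dWt) are hypothetical, and accumulating into dEdW with += is an assumption made so that derivatives can be summed over several trials before a weight update.

```cpp
#include <vector>

// Hypothetical unit: only the two values the weight derivative needs.
struct ToyUnit {
  float act = 0.0f;     // activation (read from the sending unit)
  float dEdNet = 0.0f;  // error derivative (read from the receiving unit)
};

// Hypothetical receiving connection with a pointer to its sending unit.
struct ToyCon {
  float dEdW = 0.0f;            // accumulated weight derivative
  const ToyUnit* su = nullptr;  // sending unit
};

// Connection-specific version: one product per connection.
inline void ToyC_Compute_dWt(ToyCon& cn, const ToyUnit& ru) {
  cn.dEdW += cn.su->act * ru.dEdNet;
}

// Group-level version: iterate over the receiving connections of ru.
inline void ToyCompute_dWt(std::vector<ToyCon>& cg, const ToyUnit& ru) {
  for (ToyCon& cn : cg) {
    ToyC_Compute_dWt(cn, ru);
  }
}
```

Calling the group-level function once per trial without resetting dEdW sums the derivatives across trials, which is what a batch-style update would consume.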
float B_Compute_dWt(BpCon* cn, BpUnit* ru)
float C_Compute_WtDecay(BpCon* cn, BpUnit* ru, BpUnit* su)
Computes the weight decay for a connection, for use in the C_UpdateWeights function.
float C_BEF_UpdateWeights(BpCon* cn, Unit* ru, Unit* su)
float C_AFT_UpdateWeights(BpCon* cn, Unit* ru, Unit* su)
float C_NRM_UpdateWeights(BpCon* cn, Unit* ru, Unit* su)
float UpdateWeights(Con_Group* cg, Unit* ru)
Update the weights based on the previously computed dEdW. There is a different version of the connection-specific code for each of the different momentum_type values, and the group-level function has a separate loop for each type, which is more efficient than checking the type at each connection.
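The design point above, one loop per momentum type rather than a type check inside the loop, can be sketched as below. This is an illustration only: the update formulas and the reading of BEF, AFT, and NRM as "momentum before the learning rate", "momentum after the learning rate", and "normalized momentum" are assumptions, and every name in the block (MomentumKind, WtCon, WtSpec, UpdateAft, UpdateBef, UpdateNrm, UpdateGroup) is hypothetical.

```cpp
#include <vector>

// Hypothetical stand-in for the momentum_type values.
enum class MomentumKind { AfterLrate, BeforeLrate, Normalized };

struct WtCon {
  float wt = 0.0f;    // weight
  float dEdW = 0.0f;  // accumulated weight derivative
  float dwt = 0.0f;   // previous weight change (momentum term)
};

struct WtSpec {
  float lrate = 0.1f;
  float momentum = 0.9f;
  MomentumKind kind = MomentumKind::AfterLrate;
};

// Assumed formula: learning rate applied before momentum is added.
inline void UpdateAft(WtCon& c, const WtSpec& s) {
  c.dwt = s.lrate * c.dEdW + s.momentum * c.dwt;
  c.wt += c.dwt;
  c.dEdW = 0.0f;
}

// Assumed formula: momentum added first, learning rate applied to the sum.
inline void UpdateBef(WtCon& c, const WtSpec& s) {
  c.dwt = c.dEdW + s.momentum * c.dwt;
  c.wt += s.lrate * c.dwt;
  c.dEdW = 0.0f;
}

// Assumed formula: derivative and momentum terms weighted to sum to one.
inline void UpdateNrm(WtCon& c, const WtSpec& s) {
  c.dwt = (1.0f - s.momentum) * c.dEdW + s.momentum * c.dwt;
  c.wt += s.lrate * c.dwt;
  c.dEdW = 0.0f;
}

// Group-level update: the type is tested once, then a dedicated loop runs.
inline void UpdateGroup(std::vector<WtCon>& cg, const WtSpec& s) {
  switch (s.kind) {
    case MomentumKind::AfterLrate:
      for (WtCon& c : cg) UpdateAft(c, s);
      break;
    case MomentumKind::BeforeLrate:
      for (WtCon& c : cg) UpdateBef(c, s);
      break;
    case MomentumKind::Normalized:
      for (WtCon& c : cg) UpdateNrm(c, s);
      break;
  }
}
```

Hoisting the switch out of the loop means the branch is taken once per connection group instead of once per connection, which is the efficiency argument made in the text.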
float B_UpdateWeights(BpCon* cn, Unit* ru)
Updates the bias weight: checks the momentum_type variable and calls the appropriate C_ function.
The following is a chart describing the flow of processing in the Bp algorithm, starting with the epoch process, since higher levels do not interact with the details of particular algorithms:
EpochProcess: {
  Init: {
    environment->InitEvents();                        // init events (if dynamic)
    event_list.Add() 0 to environment->EventCount();  // get list of events
    if(order == PERMUTE) event_list.Permute();        // permute if necessary
    GetCurEvent();                                    // get pointer to current event
  }
  Loop (trial): {                                     // loop over trials
    BpTrial: {                                        // trial process (one event)
      Init: {                                         // at start of trial
        cur_event = epoch_proc->cur_event;            // get cur event from epoch
      }
      Loop (once): {                                  // only process this once per trial
        network->InitExterns();                       // init external inputs to units
        cur_event->ApplyPatterns(network);            // apply patterns to network
        Compute_Act(): {                              // compute the activations
          network->layers: {                          // loop over layers
            if(!(layer->ext_flag & Unit::EXT))        // don't compute net for clamped
              layer->Compute_Net();                   // compute net inputs
            layer->Compute_Act();                     // compute activations from net in
          }
        }
        Compute_dEdA_dEdNet(): {                      // backpropagate error terms
          network->layers (backwards): {              // loop over layers backwards
            units->Compute_dEdA();                    // get error from other units or targets
            units->Compute_dEdNet();                  // add my unit error derivative
          }
        }
        network->Compute_dWt();                       // compute weight changes from error
      }
    }
    if(wt_update == ON_LINE or wt_update == SMALL_BATCH and trial.val % batch_n)
      network->UpdateWeights();                       // after trial, update weights if necessary
    GetCurEvent();                                    // get next event
  }
  Final:
    if(wt_update == BATCH) network->UpdateWeights();  // batch weight updates
}
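The weight update schedule at the end of the chart can be sketched as a small decision function. The reading of the SMALL_BATCH condition used here, an update after every batch_n-th trial, is an assumption, since the chart only shows the modulus test in abbreviated form. WtUpdate, UpdateAfterTrial, and CountUpdates are hypothetical names, and batch_n is assumed to be positive.

```cpp
// Hypothetical stand-in for the wt_update setting.
enum class WtUpdate { OnLine, SmallBatch, Batch };

// Decide whether weights are updated right after a given trial.
// Assumption: SmallBatch updates once every batch_n trials (batch_n > 0).
inline bool UpdateAfterTrial(WtUpdate mode, int trial, int batch_n) {
  if (mode == WtUpdate::OnLine) return true;
  if (mode == WtUpdate::SmallBatch) return (trial + 1) % batch_n == 0;
  return false;  // Batch: no per-trial update
}

// Count weight updates over one epoch, including the final batch update.
inline int CountUpdates(WtUpdate mode, int n_trials, int batch_n) {
  int updates = 0;
  for (int t = 0; t < n_trials; ++t) {
    if (UpdateAfterTrial(mode, t, batch_n)) ++updates;
  }
  if (mode == WtUpdate::Batch) ++updates;  // the single update in Final
  return updates;
}
```

Under these assumptions, an epoch of 10 trials with batch_n of 4 performs 10 updates in the on-line case, 2 in the small-batch case, and 1 in the batch case.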