14.1.8 Bp Implementation Details

Many of the relevant details are discussed above in the descriptions of the basic Bp classes. This section adds detail that may be useful to someone who wants to define their own versions of the Bp classes.

Support for the activation updating phase of Bp is present in the basic structure of the PDP++ Units (section 10.4) and Connections (section 10.5) types, specifically in the Compute_Net and Compute_Act functions. Bp overloads Compute_Act to implement the sigmoidal activation function.
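
As a rough illustration of this phase, the following sketch computes a net input and passes it through a logistic sigmoid. The SketchUnit struct and the free functions are hypothetical stand-ins, not the PDP++ Unit or UnitSpec classes; only the names Compute_Net and Compute_Act come from the text. Folding the bias into the net input is an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a unit; not the actual PDP++ Unit class.
struct SketchUnit {
  float net = 0.0f;   // net input
  float act = 0.0f;   // activation
  float bias = 0.0f;  // bias weight (assumed to be added into the net input)
};

// Net input: bias plus the weighted sum of the sending activations.
void Compute_Net(SketchUnit& u, const std::vector<float>& wts,
                 const std::vector<float>& send_acts) {
  assert(wts.size() == send_acts.size());
  u.net = u.bias;
  for (std::size_t i = 0; i < wts.size(); ++i)
    u.net += wts[i] * send_acts[i];
}

// Sigmoidal (logistic) activation of the net input.
void Compute_Act(SketchUnit& u) {
  u.act = 1.0f / (1.0f + std::exp(-u.net));
}
```

A unit whose weighted inputs cancel out has a net input of zero and therefore an activation of 0.5.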

The error backpropagation phase is implemented with three new functions at both the unit and connection level. The unit spec functions are:

Compute_dEdA(BpUnit* u)
Computes the derivative of the error with respect to the activation. If the unit is an output unit (i.e., it has the ext_flag TARG set), then it just calls the error function to get the difference between the target and actual output activation. If it is not an output unit, then it iterates through the sending connection groups (i.e., through the connections to units that this one sends activation to), and accumulates its dEdA as the sum of each connection weight times the receiving unit's dEdNet. This is done by calling Compute_dEdA on each sending connection group, which in turn calls the function of the same name on the BpConSpec, described below.
Compute_dEdNet(BpUnit* u)
Simply applies the derivative of the activation function to the already-computed dEdA to get the derivative with respect to the net input.
Compute_Error(BpUnit* u)
This function is not used in the standard training mode of Bp, but is defined so that the error can be computed when a network is being tested, but not trained.
Compute_dWt(Unit* u)
Computes the derivative of the error with respect to the weight (dEdW) for all of the unit's connections. This is a function of the dEdNet of the unit, and the sending unit's activation. This function is defined as part of the standard UnitSpec interface, and it simply calls the corresponding Compute_dWt function on the ConSpec for all of the receiving connection groups. In Bp, it also calls Compute_dWt for the bias weight.
UpdateWeights(Unit* u)
Updates the weights of the unit's connections based on the previously computed dEdW. Like Compute_dWt, this function is defined to call the corresponding one on connection specs. Also, it updates the bias weights. Note that this function is called by the EpochProcess, and not by the algorithm-specific BpTrial directly.
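
The two error-derivative steps at the unit level can be sketched as follows. SketchBpUnit, SendCon and the is_output flag are hypothetical stand-ins for BpUnit, BpCon and the ext_flag TARG test; the sketch also assumes a logistic activation function, whose derivative is act * (1 - act).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a Bp unit; not the actual PDP++ BpUnit class.
struct SketchBpUnit {
  float act = 0.0f;        // activation from the forward pass
  float targ = 0.0f;       // target value (output units only)
  bool is_output = false;  // stands in for the ext_flag TARG test
  float dEdA = 0.0f;       // error derivative w.r.t. activation
  float dEdNet = 0.0f;     // error derivative w.r.t. net input
};

// Hypothetical stand-in for one sending connection.
struct SendCon {
  float wt;                  // connection weight
  const SketchBpUnit* recv;  // unit that this connection sends to
};

// Output units: target minus actual activation.
// Other units: sum over sending connections of weight * receiver's dEdNet.
void Compute_dEdA(SketchBpUnit& u, const std::vector<SendCon>& send_cons) {
  if (u.is_output) {
    u.dEdA = u.targ - u.act;
    return;
  }
  u.dEdA = 0.0f;
  for (const SendCon& cn : send_cons) {
    assert(cn.recv != nullptr);
    u.dEdA += cn.wt * cn.recv->dEdNet;
  }
}

// Apply the derivative of the (assumed logistic) activation function.
void Compute_dEdNet(SketchBpUnit& u) {
  u.dEdNet = u.dEdA * u.act * (1.0f - u.act);
}
```

Because each unit reads the dEdNet of the units it sends to, the layers have to be processed from output back to input, so that those values are already in place.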

The corresponding connection spec functions are as follows. Note that, as described in section 10.5 Connections, there are two versions of every function defined in the ConSpec. The one with a C_ prefix operates on an individual Connection, while the other one iterates through a group of connections and calls the connection-specific one.

float C_Compute_dEdA(BpCon* cn, BpUnit* ru, BpUnit* su)
float Compute_dEdA(BpCon_Group* cg, BpUnit* su)
These accumulate the derivative of the error with respect to the sending unit's activation (dEdA) and return that value, which is used by the unit to increment its corresponding variable. Note that this is being called on the sending connection groups of a given unit, which is passed as an argument to the functions. The computation for each connection is simply the dEdNet of the unit that receives from the sending unit times the weight in between them.
float C_Compute_dWt(BpCon* cn, BpUnit* ru, BpUnit* su)
float Compute_dWt(Con_Group* cg, Unit* ru)
These increment the dEdW variable on the receiving connections by multiplying the sending unit's activation value times the receiving unit's dEdNet.
float B_Compute_dWt(BpCon* cn, BpUnit* ru)
The bias-weight version of this function. It does not multiply times the sender's activation value (since there isn't one!).
float C_Compute_WtDecay(BpCon* cn, BpUnit* ru, BpUnit* su)
This calls the weight decay function on the given connection, if it is not NULL. It is meant to be called as part of a C_UpdateWeights function.
float C_BEF_UpdateWeights(BpCon* cn, Unit* ru, Unit* su)
float C_AFT_UpdateWeights(BpCon* cn, Unit* ru, Unit* su)
float C_NRM_UpdateWeights(BpCon* cn, Unit* ru, Unit* su)
float UpdateWeights(Con_Group* cg, Unit* ru)
These are the functions that update the weights based on the accumulated dEdW. There is a different version of the connection-specific code for each of the different momentum_type values, and the group-level function has a separate loop for each type, which is more efficient than checking the type at each connection.
float B_UpdateWeights(BpCon* cn, Unit* ru)
The bias-weight version of the function, which checks the momentum_type variable and calls the appropriate C_ function.
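
The split between C_ functions and group-level functions, and the idea of testing momentum_type once outside the loop, can be sketched as below. SketchCon, SketchConSpec and the MOM_ enum names are hypothetical, and the three update rules are assumptions made for illustration; they are not taken from this section. Weight decay is left out, and weights are incremented because dEdA is taken as target minus activation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a connection; not the actual PDP++ BpCon class.
struct SketchCon {
  float wt = 0.0f;    // weight
  float dEdW = 0.0f;  // accumulated error derivative
  float dwt = 0.0f;   // previous weight change, used for momentum
};

// Hypothetical names for the three momentum_type values.
enum MomentumType { MOM_BEF, MOM_AFT, MOM_NRM };

struct SketchConSpec {
  float lrate = 0.1f;
  float momentum = 0.9f;
  MomentumType momentum_type = MOM_AFT;

  // Per-connection version: sender's activation times receiver's dEdNet.
  void C_Compute_dWt(SketchCon& cn, float ru_dEdNet, float su_act) const {
    cn.dEdW += su_act * ru_dEdNet;
  }

  // Assumed rule: momentum applied before the learning rate.
  void C_BEF_UpdateWeights(SketchCon& cn) const {
    cn.dwt = cn.dEdW + momentum * cn.dwt;
    cn.wt += lrate * cn.dwt;
    cn.dEdW = 0.0f;
  }
  // Assumed rule: momentum applied after the learning rate.
  void C_AFT_UpdateWeights(SketchCon& cn) const {
    cn.dwt = lrate * cn.dEdW + momentum * cn.dwt;
    cn.wt += cn.dwt;
    cn.dEdW = 0.0f;
  }
  // Assumed rule: normalized mix of the new gradient and the old change.
  void C_NRM_UpdateWeights(SketchCon& cn) const {
    cn.dwt = (1.0f - momentum) * cn.dEdW + momentum * cn.dwt;
    cn.wt += lrate * cn.dwt;
    cn.dEdW = 0.0f;
  }

  // Group-level version: iterate and call the C_ version on each connection.
  void Compute_dWt(std::vector<SketchCon>& cg, float ru_dEdNet,
                   const std::vector<float>& su_acts) const {
    assert(cg.size() == su_acts.size());
    for (std::size_t i = 0; i < cg.size(); ++i)
      C_Compute_dWt(cg[i], ru_dEdNet, su_acts[i]);
  }

  // The type is tested once here, and each case has its own loop.
  void UpdateWeights(std::vector<SketchCon>& cg) const {
    switch (momentum_type) {
      case MOM_BEF:
        for (SketchCon& cn : cg) C_BEF_UpdateWeights(cn);
        break;
      case MOM_AFT:
        for (SketchCon& cn : cg) C_AFT_UpdateWeights(cn);
        break;
      case MOM_NRM:
        for (SketchCon& cn : cg) C_NRM_UpdateWeights(cn);
        break;
    }
  }
};
```

Hoisting the switch out of the loop keeps the per-connection code free of branches, which is the efficiency argument made above.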

The following is a chart describing the flow of processing in the Bp algorithm, starting with the epoch process, since higher levels do not interact with the details of particular algorithms:

EpochProcess: {
  Init: {
    environment->InitEvents();          // init events (if dynamic)
    event_list.Add() 0 to environment->EventCount(); // get list of events
    if(order == PERMUTE) event_list.Permute();       // permute if necessary
    GetCurEvent();                      // get pointer to current event
  }
  Loop (trial): {                      // loop over trials
    BpTrial: {                         // trial process (one event)
      Init: {                          // at start of trial
        cur_event = epoch_proc->cur_event; // get cur event from epoch
      }
      Loop (once): {                   // only process this once per trial
        network->InitExterns();         // init external inputs to units
        cur_event->ApplyPatterns(network); // apply patterns to network
        Compute_Act(): {               // compute the activations
          network->layers: {           // loop over layers
            if(!(layer->ext_flag & Unit::EXT)) // don't compute net for clamped
              layer->Compute_Net();     // compute net inputs
            layer->Compute_Act();       // compute activations from net in
          }
        }
        Compute_dEdA_dEdNet(): {       // backpropagate error terms
          network->layers (backwards): { // loop over layers backwards
            units->Compute_dEdA();   // get error from other units or targets
            units->Compute_dEdNet(); // add my unit error derivative
          }
        }
        network->Compute_dWt();         // compute weight changes from error
      }
    }
    if(wt_update == ON_LINE or (wt_update == SMALL_BATCH and trial.val % batch_n == 0))
      network->UpdateWeights(); // after trial, update weights if necessary
    GetCurEvent();              // get next event
  }
  Final:
    if(wt_update == BATCH)  network->UpdateWeights(); // batch weight updates
}
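
The three weight-update schedules in the chart above can be compared with a small sketch. UpdateSchedule is a hypothetical helper, not a PDP++ function; it only records the points at which network->UpdateWeights() would be called. It counts completed trials, which may differ by one from how PDP++ numbers trial.val.

```cpp
#include <cassert>
#include <vector>

enum WtUpdate { ON_LINE, SMALL_BATCH, BATCH };

// Returns, for one epoch, the number of completed trials at each point
// where the weights would be updated.
std::vector<int> UpdateSchedule(WtUpdate wt_update, int n_trials,
                                int batch_n) {
  assert(batch_n > 0);
  std::vector<int> updates;
  for (int done = 1; done <= n_trials; ++done) {
    // The forward pass, backward pass and Compute_dWt for one trial
    // would run here, before the update decision.
    bool small_batch_due =
        (wt_update == SMALL_BATCH) && (done % batch_n == 0);
    if (wt_update == ON_LINE || small_batch_due)
      updates.push_back(done);
  }
  // Batch mode updates once, at the end of the epoch.
  if (wt_update == BATCH)
    updates.push_back(n_trials);
  return updates;
}
```

With five trials and batch_n set to 2, ON_LINE updates after every trial, SMALL_BATCH after trials 2 and 4, and BATCH once after trial 5.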