14.1.5 Variations Available in Bp

There are several different BpUnitSpec and BpConSpec types available that perform variations on the generic backpropagation algorithm.

LinearBpUnitSpec implements a linear activation function.

ThreshLinBpUnitSpec implements a threshold-linear activation function, with the threshold set by the parameter threshold. Activation is zero when the net input is below threshold, and net - threshold above that.
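
For concreteness, the threshold-linear computation can be sketched as follows (an illustrative C++ sketch of the formula above, not the actual simulator code; the function name is made up here):

     #include <algorithm>

     // Threshold-linear activation: zero below the threshold,
     // linear (net - threshold) above it.
     float thresh_lin_act(float net, float threshold) {
       return std::max(0.0f, net - threshold);
     }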

NoisyBpUnitSpec adds noise to the activations of units. The noise is specified by the noise member.

StochasticBpUnitSpec computes a binary activation, with the probability of being active a sigmoidal function of the net input (e.g., like a Boltzmann Machine unit).
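
A sketch of this computation (illustrative only; the gain parameter and the use of a uniform random number are assumptions about the details):

     #include <cmath>
     #include <cstdlib>

     // Binary activation with a sigmoidal probability of being active.
     float stochastic_act(float net, float gain = 1.0f) {
       float p = 1.0f / (1.0f + std::exp(-gain * net)); // sigmoid of net input
       float r = (float)std::rand() / (float)RAND_MAX;  // uniform [0, 1]
       return (r < p) ? 1.0f : 0.0f;                    // fire with probability p
     }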

RBFBpUnitSpec computes activation as a Gaussian function of the distance between the weights and the activations. The variance of the Gaussian is spherical (the same for all weights), and is given by the parameter var.
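
A sketch of this computation (illustrative only; the exact scaling of the exponent by var is an assumption):

     #include <cmath>
     #include <cstddef>
     #include <vector>

     // Spherical Gaussian RBF activation: a function of the squared
     // distance between the weight vector and the sending activations.
     float rbf_act(const std::vector<float>& wts,
                   const std::vector<float>& acts, float var) {
       float sq_dist = 0.0f;
       for (std::size_t i = 0; i < wts.size(); i++) {
         float d = wts[i] - acts[i];
         sq_dist += d * d;
       }
       return std::exp(-0.5f * sq_dist / var);
     }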

BumpBpUnitSpec computes activation as a Gaussian function of the standard dot-product net input (not the distance, as in the RBF). The mean of the effectively uni-dimensional Gaussian is specified by the mean parameter, with a standard deviation of std_dev.
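
A sketch of this computation (illustrative only; the exact form of the exponent is an assumption):

     #include <cmath>

     // "Bump" activation: a Gaussian of the dot-product net input.
     float bump_act(float net, float mean, float std_dev) {
       float z = (net - mean) / std_dev;
       return std::exp(-0.5f * z * z);
     }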

ExpBpUnitSpec computes activation as an exponential function of the net input (e^net). This is useful for implementing SoftMax units, among other things.

SoftMaxBpUnitSpec takes one-to-one input from a corresponding exponential unit, and another input from a LinearBpUnitSpec unit that computes the sum over all the exponential units, and computes the division of these two inputs. This results in a SoftMax unit. Note that the LinearBpUnitSpec must have fixed weights all of value 1, and that the SoftMax units must have the one-to-one projection from the exp units first, followed by the projection from the sum units. See `demo/bp_misc/bp_softmax.proj.gz' for a demonstration of how to configure a SoftMax network.
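
The overall computation that this wiring implements is the standard softmax function. The following sketch collapses the three unit types into a single routine for illustration (in the simulator the pieces are computed by separate exponential, linear sum, and SoftMax units, as described above):

     #include <cmath>
     #include <cstddef>
     #include <vector>

     std::vector<float> softmax(const std::vector<float>& nets) {
       std::vector<float> acts(nets.size());
       float sum = 0.0f;
       for (std::size_t i = 0; i < nets.size(); i++) {
         acts[i] = std::exp(nets[i]); // what each exponential unit computes
         sum += acts[i];              // what the linear sum unit computes
       }
       for (std::size_t i = 0; i < acts.size(); i++)
         acts[i] /= sum;              // what each SoftMax unit computes
       return acts;
     }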

HebbBpConSpec computes very simple Hebbian learning instead of backpropagation. It is useful for making comparisons between delta-rule and Hebbian learning. The rule is simply dwt = ru->act * su->act, where ru->act is the target value if present.
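
A sketch of the weight change (illustrative only; in the simulator the learning rate is applied to dwt as with the other connection specs):

     // Simple Hebbian weight change: the product of the receiving and
     // sending activations, with the receiving activation replaced by
     // the target value when one is present.
     float hebb_dwt(float ru_act, float ru_targ, bool has_targ, float su_act) {
       float r = has_targ ? ru_targ : ru_act;
       return r * su_act;
     }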

ErrScaleBpConSpec scales the error sent back to the sending units by the factor err_scale. This can be used, for example, in cases where there are multiple output layers, some of which are not supposed to influence learning in the hidden layer.

DeltaBarDeltaBpConSpec implements the delta-bar-delta learning rate adaptation scheme (Jacobs, 1988). It should only be used with batch-mode weight updating. The connection type must be DeltaBarDeltaBpCon, which contains a connection-wise learning rate parameter. This learning rate is incremented additively by lrate_incr when the signs of the current and previous weight changes agree, and decremented multiplicatively by lrate_decr when they do not. The demo project `demo/bp_misc/bp_ft_dbd.proj.gz' provides an example of how to set up delta-bar-delta learning. The defaults file `bp_dbd.def' provides a set of defaults that create delta-bar-delta connections by default.
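
A sketch of the learning rate adaptation step described above (illustrative only; the struct and member names here are hypothetical, not the actual DeltaBarDeltaBpCon fields):

     // Per-connection learning rate adaptation in the style of
     // delta-bar-delta (Jacobs, 1988).
     struct DbdCon {
       float lrate;    // connection-wise learning rate
       float prv_dwt;  // previous weight change
     };

     void dbd_update(DbdCon& cn, float cur_dwt,
                     float lrate_incr, float lrate_decr) {
       if (cur_dwt * cn.prv_dwt > 0.0f)
         cn.lrate += lrate_incr;   // signs agree: additive increase
       else if (cur_dwt * cn.prv_dwt < 0.0f)
         cn.lrate *= lrate_decr;   // signs disagree: multiplicative decrease
       cn.prv_dwt = cur_dwt;
     }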