14.2.3 RBp Unit Specifications

The unit-level specifications contain most of the RBp-specific parameters, though some important ones also appear in the RBpTrial process. Note that the dt parameter should be the same for all unit specs used in a given network. This parameter is copied automatically to the RBpTrial process, which also needs its value; thus the unit spec, not the trial process, is the place to change dt.

The unit object in RBp is essentially the same as the BpUnit, except for the addition of variables to hold the previous values of all the state variables, and special circular buffers to hold the entire record of activation state variables over the update trajectory. These are described in greater detail in section 14.2.9 RBp Implementation Details.

float dt
Controls the time-grain of activation settling and error backpropagation, as described above. In ACTIVATION mode, the activations are updated by an amount proportional to dt towards the raw activation value, which is computed as a sigmoid function of the current net input:
    u->da = dt * (u->act_raw - u->prv_act);
    u->act = u->prv_act + u->da;
Similarly, in NET_INPUT mode, the net-inputs are moved towards the current raw net input proportional to the size of dt:
    u->da = dt * (u->net - u->prv_net);
    u->net = u->prv_net + u->da;
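The two update rules above can be sketched as standalone functions; this is a minimal illustration in which the unit's fields are simplified into plain floats, with names following the snippets above:

```cpp
#include <cassert>
#include <cmath>

// Sigmoid squashing function used to compute the raw activation.
float sigmoid(float net) { return 1.0f / (1.0f + std::exp(-net)); }

// ACTIVATION mode: move the previous activation toward the raw
// activation by a fraction dt of the difference.
float act_update(float prv_act, float act_raw, float dt) {
  float da = dt * (act_raw - prv_act);
  return prv_act + da;
}

// NET_INPUT mode: move the previous net input toward the raw net
// input by a fraction dt of the difference.
float net_update(float prv_net, float raw_net, float dt) {
  float da = dt * (raw_net - prv_net);
  return prv_net + da;
}
```

With dt = 1, either rule jumps all the way to the raw value in one step; smaller dt values give a finer-grained settling trajectory.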
TimeAvgType time_avg
Controls the type of time-averaging to be performed. ACTIVATION-based time-averaging, as shown above, adapts the current activations towards the raw activation computed from the current net input, while NET_INPUT-based time-averaging adapts the net input towards the current raw value. The latter is generally preferred because it allows networks with large weights to update activations quickly. Activation-based updates have a strict ceiling on the update rate, since the maximum activation value is 1, whereas the net input is unbounded.
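The difference between the two modes can be seen with a single settling step under a large raw net input (as produced by large weights); this sketch starts both modes from a resting state of 0:

```cpp
#include <cassert>
#include <cmath>

float sigmoid(float net) { return 1.0f / (1.0f + std::exp(-net)); }

struct StepResult { float act_mode; float net_mode; };

// One settling step in each mode from a resting state of 0.
StepResult one_step(float raw_net, float dt) {
  // ACTIVATION mode: average the activations themselves.
  // The change is bounded by dt * 1, since activations max out at 1.
  float act_raw = sigmoid(raw_net);
  float act_mode = 0.0f + dt * (act_raw - 0.0f);
  // NET_INPUT mode: average the net inputs, then squash.
  // The change in net input is dt * raw_net, which is unbounded.
  float net = 0.0f + dt * (raw_net - 0.0f);
  float net_mode = sigmoid(net);
  return {act_mode, net_mode};
}
```

With raw_net = 10 and dt = 0.2, the activation-mode unit can only reach about 0.2 after one step, while the net-input-mode unit is already near 0.88.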
bool soft_clamp
Soft clamping refers to applying an environmental input to the network simply as an additional term in the unit's net input, as opposed to hard-clamping the activation to a pre-determined value. Soft clamping allows input units to behave a little more like hidden units, in that raw inputs are only one source of influence on their activation values.
float soft_clamp_gain
A strength multiplier that sets the level of influence the inputs have in soft-clamp mode. This allows the same environments to be used for both hard and soft clamping, while still giving the soft-clamp values a stronger influence on the net input than the bare 0-1 values contributed by the external input would have on their own.
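As a hypothetical sketch (the function names are illustrative, not the actual simulator API), the two clamping styles differ in where the external input enters:

```cpp
#include <cassert>

// Soft clamp: the external input is just another term in the net
// input, scaled by soft_clamp_gain so that 0-1 environment values
// can still dominate the other inputs.
float soft_clamp_net(float net, float ext, float soft_clamp_gain) {
  return net + soft_clamp_gain * ext;
}

// Hard clamp: the activation is set directly to the external value,
// bypassing the net input entirely.
float hard_clamp_act(float ext) {
  return ext;
}
```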
bool teacher_force
A modification of the RBp algorithm in which the activation values are "forced" to be those given by the teaching (target) values. Because the error is backpropagated over a long series of time steps, this allows the error on earlier time steps to be computed as if the later time steps were actually correct, which can help bootstrap representations that will be appropriate once the network actually is performing correctly.
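A hypothetical sketch of the forcing step (names are illustrative, not the simulator's actual code): when teacher forcing is on and a target is present, the unit's computed activation is replaced by the target before the next time step.

```cpp
#include <cassert>

// Replace the computed activation with the teaching value when
// teacher forcing is enabled and this unit has a target.
float forced_act(float computed_act, float targ,
                 bool has_targ, bool teacher_force) {
  if (teacher_force && has_targ)
    return targ;          // force the activation to the target value
  return computed_act;    // otherwise keep the computed activation
}
```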
bool store_states
This flag determines whether activity states are stored over time for use in a later backpropagation through them. It usually must be true, except in the Almeida-Pineda algorithm or when simply testing the network.
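The simulator uses special circular buffers for this (see section 14.2.9); the following is only a minimal sketch of the idea, with illustrative names, recording each step's activation so a backward pass can replay the trajectory in reverse:

```cpp
#include <cassert>
#include <vector>

// Illustrative only: record activations over the update trajectory
// when store_states is true, so the backward pass can walk them
// in reverse order.
struct StateStore {
  bool store_states;
  std::vector<float> acts;
  void record(float act) {
    if (store_states) acts.push_back(act);
  }
};
```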
Random initial_act
Sets the parameters for the initialization of activation states at the beginning of a sequence. This state forms the 0th element of the sequence of activations.
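As an illustration of what such an initialization amounts to (the actual Random spec supports several distributions and parameters; a uniform draw is just one assumed example, and the function name is hypothetical):

```cpp
#include <cassert>
#include <random>
#include <vector>

// Draw initial activations for step 0 of a sequence from a uniform
// distribution over [lo, hi). One value per unit.
std::vector<float> init_acts(int n, float lo, float hi, unsigned seed) {
  std::mt19937 rng(seed);
  std::uniform_real_distribution<float> dist(lo, hi);
  std::vector<float> acts(n);
  for (float& a : acts) a = dist(rng);
  return acts;
}
```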