15.8 Cs Implementational Details

In Cs, the weight change operation is viewed as the process of collecting statistics about the coproducts of activations across the weights. The CsConSpec therefore provides a new function, Aggregate_dWt, which increments the dwt_agg value of the connection by the phase (+1.0 for the plus phase, -1.0 for the minus phase) times the coproduct of the activations. When learning is taking place, this function is called on each cycle after the start_stats number of cycles has elapsed; in deterministic mode it is instead called once at the end of each settle.

The standard Compute_dWt function is then called at the end of the sample process, and it divides the aggregated weight change value by a count of the number of times it was aggregated, so that the result is an expected value measure.
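
The aggregation and normalization described above can be sketched in a few lines. This is a minimal illustration, not the actual CsConSpec code: the struct, the enum, and the function names SketchAggregate and SketchExpectedDwt are hypothetical, and keeping a single count across both phases is an assumption.

```cpp
#include <cassert>

// Hypothetical stand-in for a Cs connection; only dwt_agg is named in the text.
struct SketchCon {
  float dwt_agg = 0.0f;  // running sum of phase-signed coproducts
};

// Hypothetical phase tag; the text gives the signs (+1.0 plus, -1.0 minus).
enum class SketchPhase { Minus, Plus };

// Add one phase-signed coproduct of the sending and receiving activations.
inline void SketchAggregate(SketchCon& cn, SketchPhase phase,
                            float send_act, float recv_act) {
  const float sign = (phase == SketchPhase::Plus) ? 1.0f : -1.0f;
  cn.dwt_agg += sign * send_act * recv_act;
}

// Divide the aggregate by the number of aggregations to get an expected value.
// Assumption: n_aggs counts every call to SketchAggregate, in both phases.
inline float SketchExpectedDwt(const SketchCon& cn, int n_aggs) {
  assert(n_aggs > 0);  // nothing to average if no statistics were collected
  return cn.dwt_agg / static_cast<float>(n_aggs);
}
```

Dividing by the count keeps the size of the weight change independent of how many cycles contributed statistics, which is what makes the result an expected value rather than a raw sum.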

Also, note that the type of momentum used in Cs corresponds to the BEFORE_LRATE option of backpropagation (see section 14.1.2 Bp Connection Specifications).
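
As a rough illustration of what applying momentum before the learning rate means, the sketch below assumes that BEFORE_LRATE accumulates the raw weight changes into a momentum term first and only then scales the result by the learning rate. That reading, the MomentumCon struct, and the function name are assumptions made for this example; the referenced Bp section is the authoritative definition.

```cpp
#include <cassert>

// Hypothetical per-connection state used only for this illustration.
struct MomentumCon {
  float wt  = 0.0f;  // weight
  float dwt = 0.0f;  // current weight change, not yet scaled by lrate
  float pdw = 0.0f;  // previous (momentum-accumulated) weight change
};

// Assumed BEFORE_LRATE ordering: momentum is applied to the raw change,
// and the learning rate scales the accumulated value afterwards.
inline void UpdateWeightBeforeLrate(MomentumCon& cn, float lrate, float momentum) {
  assert(momentum >= 0.0f && momentum < 1.0f);
  cn.pdw = cn.dwt + momentum * cn.pdw;  // momentum first
  cn.wt += lrate * cn.pdw;              // learning rate second
  cn.dwt = 0.0f;                        // clear for the next accumulation
}
```

Under this ordering, changing the learning rate rescales the whole momentum-smoothed step, rather than only the newest contribution to it.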

The following is a chart describing the flow of processing in the Cs algorithm, starting with the epoch process, since higher levels do not interact with the details of particular algorithms:

EpochProcess: {
  Init: {                              // at start of epoch
    environment->InitEvents();          // init events (if dynamic)
    event_list.Add() 0 to environment->EventCount(); // get list of events
    if(order == PERMUTE) event_list.Permute();       // permute if necessary
    GetCurEvent();                      // get pointer to current event
  }
  Loop (sample): {                      // loop over samples
    CsSample: {                         // sample process (one event)
      Init: {                          // at start of sample
        cur_event = epoch_proc->cur_event; // get cur event from epoch
      }
      Loop (sample): {                 // loop over samples of event
        CsTrial: {                    // trial process (both phases)
          Init: {                      // at start of trial
            phase = MINUS_PHASE;        // start in minus phase
            if(phase_init == INIT_STATE)  network->InitState();
            cur_event = epoch_proc->cur_event;  // get cur event from epoch
            cur_event->ApplyPatterns(network);  // apply patterns to network
          }
          Loop (phase_no [0 to 1]): {  // loop over phases 
            CsSettle: {                // settle process (one settle)
              Init: {                  // at start of settle
                if(CsPhase::phase == PLUS_PHASE) {
                  network->InitState(); // init state (including ext input)
                  cur_event->ApplyPatterns(network);  // re-apply patterns
                  Targ_To_Ext();        // turn targets into external inputs
                }
              }
              Loop (cycle): {          // loop over act update cycles
                CsCycle: {             // cycle process (one cycle)
                  Loop (once): {       // only process this once per cycle
                    Compute_SyncAct() or Compute_AsyncAct(); // see below
                    if(!deterministic and cycle > start_stats)
                      Aggregate_dWt();  // aggregate wt changes
                  }
                }
              }
              Final: {                 // at end of phase
                if(deterministic) Aggregate_dWt();  // do at end
                PostSettle();           // copy act to act_p or act_m
              }
            }
            phase = PLUS_PHASE;         // change to plus phase after minus
          }
        }
      }
      Final: {                         // at end of sample (done with event)
        network->Compute_dWt();         // compute wt changes based on aggs
      }
    }
    if(wt_update == ON_LINE or (wt_update == SMALL_BATCH and trial.val % batch_n))
      network->UpdateWeights();         // update weights after sample if necessary
    GetCurEvent();                      // get next event to present
  }
  Final: {                              // at end of epoch
    if(wt_update == BATCH)  network->UpdateWeights();
  }
}
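
To make the nesting in the chart concrete, the following sketch walks the sample, phase, and cycle loops for a single event and counts how often Aggregate_dWt would be invoked. The FlowParams struct, its default values, and the 0-based cycle counter are assumptions for illustration only; the real processes carry considerably more state.

```cpp
#include <cassert>

// Hypothetical parameter bundle; field names follow the chart where possible.
struct FlowParams {
  bool deterministic = false;  // aggregate once per settle instead of per cycle
  int  n_samples     = 1;      // samples taken of one event (CsSample loop)
  int  n_cycles      = 10;     // cycles per settle (CsSettle loop)
  int  start_stats   = 4;      // cycles to skip before collecting statistics
};

// Walks the CsSample -> CsTrial -> CsSettle -> CsCycle nesting for one event
// and returns the number of times Aggregate_dWt would be called.
inline int CountAggregations(const FlowParams& p) {
  assert(p.n_samples > 0 && p.n_cycles > 0 && p.start_stats >= 0);
  int n_aggs = 0;
  for (int sample = 0; sample < p.n_samples; ++sample) {   // CsSample loop
    for (int phase_no = 0; phase_no < 2; ++phase_no) {     // minus, then plus
      for (int cycle = 0; cycle < p.n_cycles; ++cycle) {   // CsSettle loop
        // Compute_SyncAct() or Compute_AsyncAct() would run here.
        if (!p.deterministic && cycle > p.start_stats) ++n_aggs;
      }
      if (p.deterministic) ++n_aggs;                       // CsSettle Final
    }
  }
  return n_aggs;
}
```

The returned count corresponds to the number of aggregations that Compute_dWt divides by at the end of the sample process.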

The activation computing functions are broken down as follows:

Compute_SyncAct(): {           // compute synchronous activations
  network->InitDelta();         // initialize net-input terms
  network->Compute_Net();       // aggregate net inputs

  network->layers: {           // loop over layers
    units->Compute_Act(CsSettle::cycle); // compute act from net
  }                            // (cycle used for annealing, sharpening)
}
Compute_AsyncAct(): {          // compute asynchronous activations
  for(i=0...n_updates) {       // do this n_updates times per cycle
    rnd_num = Random between 0 and CsSettle::n_units;  // pick a random unit
    network->layers: {         // loop over layers
      if(layer contains rnd_num unit) { // find layer containing unit
        unit = layer->unit[rnd_num];    // get unit from layer
        unit->InitDelta();              // initialize net input terms
        unit->Compute_Net();            // aggregate net inputs
        unit->Compute_Act(CsSettle::cycle);  // compute act from net
      }                        // (cycle used for annealing, sharpening)
    }
  }
}
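
The difference between the two update schemes can be shown with a small, self-contained sketch. It is not the Cs implementation: SketchNet, NetInput, and the logistic activation are assumptions chosen for brevity, layers are flattened into a single unit list, and the cycle-dependent annealing and sharpening are omitted.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical flat network: wt[i][j] is the weight from unit j to unit i.
struct SketchNet {
  std::vector<std::vector<float>> wt;
  std::vector<float> act;
};

// Net input to unit i, computed from the current activations.
inline float NetInput(const SketchNet& net, std::size_t i) {
  float sum = 0.0f;
  for (std::size_t j = 0; j < net.act.size(); ++j) sum += net.wt[i][j] * net.act[j];
  return sum;
}

// Assumed activation function for this sketch (plain logistic).
inline float Logistic(float x) { return 1.0f / (1.0f + std::exp(-x)); }

// Synchronous: every net input is computed from the old activations
// before any activation is changed.
inline void SyncUpdate(SketchNet& net) {
  std::vector<float> nets(net.act.size());
  for (std::size_t i = 0; i < nets.size(); ++i) nets[i] = NetInput(net, i);
  for (std::size_t i = 0; i < nets.size(); ++i) net.act[i] = Logistic(nets[i]);
}

// Asynchronous: n_updates randomly chosen units are updated one at a time,
// so each update sees the results of the ones before it.
template <class Rng>
inline void AsyncUpdate(SketchNet& net, int n_updates, Rng& rng) {
  assert(!net.act.empty());
  std::uniform_int_distribution<std::size_t> pick(0, net.act.size() - 1);
  for (int k = 0; k < n_updates; ++k) {
    const std::size_t i = pick(rng);
    net.act[i] = Logistic(NetInput(net, i));
  }
}
```

In the synchronous case the separate buffer of net inputs plays the role of the InitDelta and Compute_Net pass over the whole network; in the asynchronous case the net input is recomputed for only the selected unit.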