The CsDistStat is used to measure the percentage of time (i.e., the
probability) that the units in the network which have target patterns in
the environment spend in any of the possible target patterns. This is
used when there are multiple possible target states defined for any
given event (see section 15.7 The Probability Environment and Cs), which means that a simple
squared-error comparison against any one of these would be relatively
meaningless -- one wants to know how much time is spent in each of the
possible states. The dist stat generates one column of data for each
possible target pattern, and each column represents the probability
(proportion of time) that the network's output units were within some
distance of the target pattern. The tolerance of the distance measure
is set with the tolerance parameter, which is the largest absolute
difference between target and actual activation that still results in
a zero distance measure. A network is considered to be "in" a particular
target state whenever its total distance measure is zero, so this
tolerance should be relatively generous (e.g., 0.5, so that each unit only
has to be on the right side of 0.5).
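
The computation described above can be sketched as follows. This is only an illustration in Python, not the CsDistStat source: the function names, the sampled-activation lists, and the choice to treat a difference exactly equal to the tolerance as "within tolerance" are all assumptions made for the example.

```python
def in_target_state(acts, target, tolerance):
    """True when every unit is within `tolerance` of its target value.

    Assumption: a difference exactly equal to the tolerance still counts
    as zero distance.
    """
    return all(abs(t - a) <= tolerance for a, t in zip(acts, target))


def dist_stat(samples, targets, tolerance=0.5):
    """Return one proportion per target pattern (one "column" each)."""
    if not samples:
        raise ValueError("need at least one activation sample")
    counts = [0] * len(targets)
    for acts in samples:
        for i, target in enumerate(targets):
            if in_target_state(acts, target, tolerance):
                counts[i] += 1
    return [c / len(samples) for c in counts]


# Hypothetical data: two output units, two possible target patterns,
# four activation samples collected over time.
targets = [[1.0, 0.0], [0.0, 1.0]]
samples = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7], [0.6, 0.6]]
probs = dist_stat(samples, targets, tolerance=0.5)
print(probs)  # [0.5, 0.25]
```

The last sample is in neither target state, so the two proportions sum to less than 1.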

The CsTIGstat is essentially a way of aggregating the columns of
data produced by the CsDistStat. It is automatically created by the
dist stat's CreateAggregates function (see section 12.7 The Statistic
Process) at the level of the CsSample process (note that unlike other
aggregators, it is in the final_stats group of the sample process, and
it feeds off of the aggregator of the dist stat in the loop_stats of the
same process). The TIG stat measures the total information gain (aka
cross-entropy) between the probability distribution of target states
observed in the network (as collected by the dist stat pointed to by the
dist_stat member) and the target probability distribution
specified in the probability patterns themselves (see section 15.7 The Probability Environment and Cs).
This measure is zero if the two distributions are identical, and it goes
up as the match gets worse. It essentially provides a distance metric
over probability distributions.
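
As a rough illustration of such a measure, the sketch below computes a relative-entropy style sum, the target probability times the log of the ratio of target to observed probability. The exact formula used by the CsTIGstat is not given here, so this form is an assumption, chosen because it is zero for identical distributions and grows as the match gets worse. The function name and the `eps` floor for unobserved states are also hypothetical.

```python
import math


def total_information_gain(target_probs, observed_probs, eps=1e-10):
    """Sum of p_target * log(p_target / p_observed) over target states.

    Assumed form (relative entropy); `eps` keeps the log finite when a
    target state was never observed.
    """
    tig = 0.0
    for p, q in zip(target_probs, observed_probs):
        if p > 0.0:  # states with zero target probability contribute nothing
            tig += p * math.log(p / max(q, eps))
    return tig


identical = total_information_gain([0.5, 0.5], [0.5, 0.5])
mismatch = total_information_gain([0.5, 0.5], [0.9, 0.1])
print(identical)  # 0.0
print(mismatch > identical)  # True
```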

The CsTargStat, like the TIG stat, provides a way of aggregating the
distribution information obtained by the dist stat. It should be
created in the sample final_stats group (just like the TIG stat),
with its dist_stat pointer set to the aggregator of the dist stat
in the sample process loop_stats. This stat simply records the
sum across the columns of probability data, which provides a measure of how
often the network is settling into one of the target states, as opposed
to just flailing about in other random states.
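
A minimal sketch of this summary, assuming the dist stat columns are available as a plain list of proportions (the function name and the numbers are hypothetical):

```python
def targ_stat(column_probs):
    """Total proportion of time spent in any of the target states.

    Assumes the target patterns are mutually exclusive under the chosen
    tolerance, so that no sample is counted in two columns.
    """
    return sum(column_probs)


in_target = targ_stat([0.5, 0.25])  # hypothetical dist stat columns
elsewhere = 1.0 - in_target         # time spent in non-target states
print(in_target, elsewhere)  # 0.75 0.25
```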