verboseStorage and a little prtAlgorithm plotting

What is verboseStorage?

verboseStorage is a logical flag of prtAction that specifies whether the training dataset should be stored within the action. The default value is true. Let’s look at an example.

First let’s get a toy dataset and plot it to see what we are talking about.

ds = prtDataGenUnimodal;

plot(ds);

Let’s train a prtClassMap and plot the resulting decision contours.

c = train(prtClassMap, ds);

plot(c)
title('Classifier Decision Contours with Training Data Set','FontSize',16);

You can see that even though we only plotted the trained classifier, the dataset appears in the plot. The training dataset is stored in the read-only property dataSet.

c.dataSet

ans = 
  prtDataSetClass with properties:

               nFeatures: 2
             featureInfo: []
                    data: [400x2 double]
                 targets: [400x1 double]
         observationInfo: []
           nObservations: 400
       nTargetDimensions: 1
               isLabeled: 1
                    name: 'prtDataGenUnimodal'
             description: ''
                userData: [1x1 struct]
                nClasses: 2
           uniqueClasses: [2x1 double]
    nObservationsByClass: [2x1 double]
              classNames: {2x1 cell}
                 isUnary: 0
                isBinary: 1
                  isMary: 0
               isZeroOne: 1
            hasUnlabeled: 0

Let’s try this again, this time setting verboseStorage to false.

cVerboseStorageFalse = train(prtClassMap('verboseStorage',false), ds);

plot(cVerboseStorageFalse)
title('Classifier Decision Contours without Training Data Set','FontSize',16);

Plotting the classifier contours without a data set is sometimes useful for examples. Now you can see that the classifier’s dataSet property is empty.

cVerboseStorageFalse.dataSet

ans =
     []
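The difference between the two settings can also be checked programmatically. The following is a minimal sketch, assuming the PRT is on the MATLAB path; the variable names (cStored, cCompact and so on) are ours and chosen only for illustration. It also runs both classifiers on the same data, since the flag should only control what is stored and not how the classifier behaves.

```matlab
% Minimal sketch; assumes the PRT is on the MATLAB path.
% Variable names are illustrative only.
ds = prtDataGenUnimodal;

% Train the same classifier with and without verboseStorage
cStored  = train(prtClassMap, ds);
cCompact = train(prtClassMap('verboseStorage', false), ds);

% dataSet is populated only when verboseStorage is true
hasStoredData  = ~isempty(cStored.dataSet);
hasCompactData = ~isempty(cCompact.dataSet);

% Run both classifiers on the same data and compare the outputs
outStored  = run(cStored, ds);
outCompact = run(cCompact, ds);
sameOutputs = isequal(outStored.data, outCompact.data);

disp([hasStoredData hasCompactData sameOutputs]);
```

If the outputs ever differ here, that would point to something other than verboseStorage, since both classifiers were trained on the same ds.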

Data Set Summaries

Early versions of the PRT had no verboseStorage property, and the dataSet was always saved. You can see how this might create problems when data sets get large. We originally also used the dataSet to determine plot limits and other details of the classifier plot. Now we use the dataSetSummary field to create plots. All prtDataSets must have a summarize() method that yields a structure that other actions can use when plotting. You can see that for the above examples the value of verboseStorage does not change the dataSetSummary. This is how prtClass.plot() knows what image bounds to use for plotting.

c.dataSetSummary

cVerboseStorageFalse.dataSetSummary

ans = 
          upperBounds: [4.7304 4.7950]
          lowerBounds: [-4.0730 -3.5644]
            nFeatures: 2
    nTargetDimensions: 1
        nObservations: 400
        uniqueClasses: [2x1 double]
             nClasses: 2
               isMary: 0
ans = 
          upperBounds: [4.7304 4.7950]
          lowerBounds: [-4.0730 -3.5644]
            nFeatures: 2
    nTargetDimensions: 1
        nObservations: 400
        uniqueClasses: [2x1 double]
             nClasses: 2
               isMary: 0
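Rather than comparing the two printouts by eye, the summaries can be compared directly. This is a small sketch, again assuming the PRT is on the MATLAB path. It uses only the upperBounds and lowerBounds fields shown above; the axisLimits variable is our own illustration of how such bounds could be turned into 2-D axis limits, not necessarily what prtClass.plot() does internally.

```matlab
% Minimal sketch; assumes the PRT is on the MATLAB path.
ds = prtDataGenUnimodal;
c = train(prtClassMap, ds);
cVerboseStorageFalse = train(prtClassMap('verboseStorage', false), ds);

% The summary should be the same regardless of verboseStorage
summariesMatch = isequal(c.dataSetSummary, cVerboseStorageFalse.dataSetSummary);
disp(summariesMatch);

% Illustration only: build [xmin xmax ymin ymax] from the stored bounds
s = cVerboseStorageFalse.dataSetSummary;
axisLimits = [s.lowerBounds(1) s.upperBounds(1) s.lowerBounds(2) s.upperBounds(2)];
disp(axisLimits);
```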

prtAlgorithm

Since verboseStorage is a property of prtAction, it is also a property of prtAlgorithm. When you set the verboseStorage property for an algorithm, you are actually setting the verboseStorage property for all actions within the algorithm. If verboseStorage is true for a prtAlgorithm, you can use prtAlgorithm.plot() to explore what the data coming into any stage of the algorithm (the training data) looks like. Here is a quick example. Note: plotting prtAlgorithms requires Graphviz. There may be issues with the current version of Graphviz and the PRT; if you run into one, please file an issue on GitHub.

algo = prtPreProcZmuv + prtClassRvm/prtClassPlsda + prtClassLogisticDiscriminant;
algo.verboseStorage = true; % Just for clarity, this is the default

trainedAlgo = train(algo, ds);

plot(trainedAlgo);

Boxes with bold outlines are clickable. Double-clicking one opens another figure and calls plot on the action. For example, double-clicking on the RVM plots the resulting decision contours along with the dataSet after it has been preprocessed using ZMUV (notice the X and Y labels).

Similarly you can plot the PLSDA decision contour (also with the preprocessed ZMUV data).

The fusion of the two classifiers is shown by clicking on the prtClassLogisticDiscriminant.

The total confidence provided by the output of the algorithm is shown as a function of the input dataSet by clicking on the output block. Notice here that the features are the original input features and the contours are those of the entire algorithm.

If we repeat the whole process with verboseStorage set to false, you will see that the resulting plots do not have the dataSet, just like before.

algo = prtPreProcZmuv + prtClassRvm/prtClassPlsda + prtClassLogisticDiscriminant;
algo.verboseStorage = false; % This will feed through to all actions

trainedAlgo = train(algo, ds);

plot(trainedAlgo);

As an example, here are just the final output contours of the algorithm.
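The same check used for a single classifier can be applied to the trained algorithm as a whole. This is a hedged sketch: it relies only on prtAlgorithm being a prtAction, so we assume its dataSet and dataSetSummary properties behave as they did for prtClassMap above. It does not inspect the individual actions inside the algorithm.

```matlab
% Minimal sketch; assumes the PRT is on the MATLAB path and that
% prtAlgorithm exposes dataSet and dataSetSummary like any prtAction.
ds = prtDataGenUnimodal;

algo = prtPreProcZmuv + prtClassRvm/prtClassPlsda + prtClassLogisticDiscriminant;
algo.verboseStorage = false; % This will feed through to all actions

trainedAlgo = train(algo, ds);

% With verboseStorage false, no training data should be kept ...
algoHasData = ~isempty(trainedAlgo.dataSet);

% ... but the summary used for plotting should still be available
algoHasSummary = ~isempty(trainedAlgo.dataSetSummary);

disp([algoHasData algoHasSummary]);
```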

Conclusions

Well, that’s verboseStorage. If you have a big dataset, you probably want to turn it off, but if you don’t, it can be useful for fully exploring an algorithm. Let us know what you think.

PRT Blog
