PRT Blog

MATLAB Pattern Recognition Open Free and Easy


Observation Info

One of the hidden gems of the PRT is the observationInfo property. Let’s talk about a few ways you can use it to simplify your workflow.

What is observationInfo?

Simply put, observationInfo is a structure array stored in a prtDataSetStandard, with one entry for each observation in the dataSet. This structure array has user-defined fields that store side information. When observations are removed from a dataSet, the observationInfo structure is indexed to match. Here is a quick example.

First, let’s make a simple dataSet with only 4 observations.

X = cat(1,prtRvUtilMvnDraw([0 0],eye(2),2),prtRvUtilMvnDraw([2 2],eye(2),2));
Y = [0; 0; 1; 1;];

ds = prtDataSetClass(X,Y);

plot(ds);

Now let’s create a structure of observationInfo and set it.

obsInfo = struct('fileIndex',{1,2,3,4}','timeOfDay',{'day','night','day','night'}');

ds.observationInfo = obsInfo;

Retain Observations

If we retain (or remove) observations from this dataSet, the observation info is properly indexed.

dsSub = ds.retainObservations([1; 4]);
dsSub.observationInfo.fileIndex
ans =
  1
ans =
  4
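
To make "properly indexed" concrete, here is a minimal sketch in plain MATLAB that needs no PRT at all. It only illustrates the bookkeeping that retainObservations saves you from doing by hand; it is not the PRT's internal implementation, and the variable names (keep, XSub, obsInfoSub) and the fixed data values are made up for this sketch.

```matlab
% Plain-MATLAB illustration (no PRT required); names and values are for this sketch only.
X = [0.1 0.2; -0.3 0.4; 2.1 1.9; 1.8 2.2];   % 4 observations, 2 features
obsInfo = struct('fileIndex',{1,2,3,4}','timeOfDay',{'day','night','day','night'}');

keep = [1; 4];                % observations to retain
XSub = X(keep,:);             % keep the matching rows of the data
obsInfoSub = obsInfo(keep);   % keep the matching entries of the side information

disp([obsInfoSub.fileIndex])  % the retained entries are files 1 and 4
```

The same index is applied to both the data and the structure array, so each retained row still lines up with its own side information.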

Select

A hidden method of prtDataSetClass allows us to “select” observations from a dataSet by evaluating a function on the observationInfo. select() takes a function handle that is evaluated for each entry in the dataSet and returns a logical index. A dataSet containing only the observations for which the function was true is returned.

dsDayOnly = ds.select(@(s)strcmpi(s.timeOfDay,'day'));
{dsDayOnly.observationInfo.timeOfDay}
ans = 
  'day'    'day'
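
For intuition, the sketch below does roughly the same thing by hand: it evaluates the predicate on each observationInfo entry with arrayfun (base MATLAB) and passes the matching indices to retainObservations. It assumes the PRT is on your path and rebuilds the 4-observation dataSet with fixed values instead of random draws. It is only an approximation of what select does, not a copy of its implementation, and the names isDay, dsDayManual, and dsLateNight are made up here.

```matlab
% Sketch only: assumes the PRT is on the path. Rebuilds the post's small dataSet.
X = [0 0; 0.5 -0.5; 2 2; 2.5 1.5];   % fixed values instead of random draws
Y = [0; 0; 1; 1];
ds = prtDataSetClass(X,Y);
ds.observationInfo = struct('fileIndex',{1,2,3,4}','timeOfDay',{'day','night','day','night'}');

% Evaluate the predicate once per observationInfo entry.
isDay = arrayfun(@(s) strcmpi(s.timeOfDay,'day'), ds.observationInfo);

% Keep the matching observations by numeric index, as in the retainObservations example.
dsDayManual = ds.retainObservations(find(isDay));

% A predicate can also combine fields: night-time entries with fileIndex above 2.
dsLateNight = ds.select(@(s) strcmpi(s.timeOfDay,'night') && s.fileIndex > 2);
```

Given the description of select above, dsDayManual should hold the same observations as dsDayOnly, and dsLateNight should contain only the fourth observation.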

Graphically

With complex dataSets that carry a lot of observationInfo, sorting through the data can be difficult. The PRT includes a graphical way to view observationInfo and create a function handle for select. This functionality is currently in Beta, so make sure you include those directories in your path.

prtUiDataSetStandardObservationInfoSelect(ds);

Right-clicking (Ctrl+click on OS X) in the table allows you to graphically select the observations that will be returned by select.

Conclusions

This is a simple example of what observationInfo is and how it can be used. We use it all of the time to manage our side information. Because all calls to retainObservations correctly index the observationInfo, the side information is available within actions, even during cross-validation. Making use of observationInfo is a quick way to fake a custom type of dataSet that contains other side information. Let us know how you use observationInfo and if you have any ideas to improve it.



