PRT Blog

MATLAB Pattern Recognition Open Free and Easy


prtDataGenSandP500 and prtDataGenCylinderBellFunnel

Hi everyone, a quick update this time: we added two new prtDataGen* functions to the PRT that people might find useful, prtDataGenSandP500 and prtDataGenCylinderBellFunnel.

prtDataGenSandP500

prtDataGenSandP500 generates data containing stock-price information from the S&P 500. The information dates back to January 3, 1950, and includes the index’s open, close, volume, and other features.

Check it out:

ds = prtDataGenSandP500;
ds.featureNames                     % list the available columns
spClose = ds.retainFeatures(5);     % keep only column 5, 'Close'
plot(spClose.X,'linewidth',2);
title('S&P 500 Closing Value vs. Days since 1/3/1950');
ans = 

    'Date'    'Open'    'High'    'Low'    'Close'    'Volume'    'AdjClose'

If you can do decent prediction on that data… you might be able to make some money :)
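As a starting point for that, here is a minimal sketch (ours, not something built into the PRT) that turns the closing values into daily log returns and scores a naive "tomorrow's close equals today's close" baseline. It assumes the PRT is on your MATLAB path, that column 5 is 'Close' as in the feature list above, and that the rows are in chronological order; the variable names are just for illustration.

```matlab
% Sketch only: requires the PRT on the MATLAB path.
ds = prtDataGenSandP500;
spClose = ds.retainFeatures(5);      % column 5 is 'Close' (see featureNames)
closeVals = spClose.X;               % column vector of closing values

% Daily log returns; one element shorter than closeVals.
logRet = diff(log(closeVals));

% Naive persistence baseline: predict each close as the previous close.
predicted = closeVals(1:end-1);
actual    = closeVals(2:end);
rmse = sqrt(mean((actual - predicted).^2));
fprintf('Persistence baseline RMSE: %.2f index points\n', rmse);

plot(logRet);
title('S&P 500 daily log returns');
```

Any predictor worth getting excited about should at least beat that persistence RMSE on held-out data.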

Cylinder-Bell-Funnel

prtDataGenCylinderBellFunnel generates a synthetic data set of time series, each of which contains one of three shapes: a flat plateau (cylinder), a rising slope (bell), or a falling slope (funnel).

You can find the specification we used to generate the data here: http://www.cse.unsw.edu.au/~waleed/phd/html/node119.html
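To give a feel for what such a specification looks like, here is a rough sketch of the commonly cited cylinder-bell-funnel definition: 128 samples, an event starting at a random a in [16, 32] and lasting between 32 and 96 samples, amplitude 6 plus a standard-normal perturbation, and unit-variance noise on every sample. We believe this matches the linked page, but treat the constants as assumptions, and note that this is not necessarily how prtDataGenCylinderBellFunnel is implemented internally.

```matlab
% Sketch of the commonly cited cylinder-bell-funnel definition.
% All constants below are assumptions, not read from the PRT source.
rng(0);                              % fixed seed so the sketch is repeatable
T = 128;                             % series length
t = 1:T;
a = randi([16 32]);                  % event onset
b = a + randi([32 96]);              % event end (at most 128)
eta = randn;                         % amplitude perturbation
chi = double(t >= a & t <= b);       % 1 inside [a, b], 0 elsewhere

cylinder = (6 + eta) .* chi                       + randn(1, T);
bell     = (6 + eta) .* chi .* (t - a) ./ (b - a) + randn(1, T);
funnel   = (6 + eta) .* chi .* (b - t) ./ (b - a) + randn(1, T);

plot(t, [cylinder; bell; funnel]');
legend('cylinder', 'bell', 'funnel');
```

The bell ramps up linearly over the event and drops abruptly at b, while the funnel jumps up at a and decays linearly; the cylinder stays flat in between.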

The data was also used in an important paper in the data-mining community: Keogh and Lin, "Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research." http://www.cs.ucr.edu/~eamonn/meaningless.pdf

ds = prtDataGenCylinderBellFunnel;
imagesc(ds.X);
title('Cylinders (1:266), Bells (267:532), and Funnels (533:798)');
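The image shows every series at once; if you would rather see the three shapes directly, one option is to average the rows of each class. The sketch below assumes the PRT is on your MATLAB path and uses the row ranges from the plot title above; the variable names are ours.

```matlab
% Sketch only: requires the PRT on the MATLAB path.
ds = prtDataGenCylinderBellFunnel;

% Row ranges for each class, taken from the plot title above.
classRows  = {1:266, 267:532, 533:798};
classNames = {'Cylinder', 'Bell', 'Funnel'};

for k = 1:3
    subplot(3, 1, k);
    % Average over the series in class k (dimension 1 = observations).
    plot(mean(ds.X(classRows{k}, :), 1), 'linewidth', 2);
    title(sprintf('Mean %s series', classNames{k}));
end
```

Because the onset and duration of each event are random, the averaged curves come out smoother than any single series, but the plateau, ramp-up, and ramp-down should still be easy to tell apart.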

Conclusion

That’s all for now. We hope you enjoy these new data sets. We’re always adding new data to the PRT, so let us know what you’d like to see!



