PR Links

Feature Selection

- Multidimensional Scaling:
  - Webpage by Stephen P. Borgatti, with a good description and example.
  - Matlab example using the Statistics Toolbox.
- Wrappers (Kohavi and John):
  - Basic paper describing the wrapper approach.
  - More detailed paper with a more thorough description (44 pages).
- Filters:
  - Information-theory-based approach to feature selection (RELIEF).
  - Correlation-based feature selection, with a comparison to wrappers.

Classifier Comparison and Tools

- Bootstrap and applications: an accessible, readable article by Zoubir and Boashash in IEEE Signal Processing Magazine.

Decision Trees Tutorials and Software

- Overview of decision trees, focusing on ID3, with the worked examples used in class.
- Tutorial on building ID3 and C4.5 decision trees, Temple University.
- Quinlan's university web page, with papers and code for C4.5.
- Quinlan's company, Rulequest.com, with new, improved tree-building routines (See5).
- Notes on Bagging and Boosting Classifiers:
  - Bagging: technical report by Breiman providing the theoretical development of bagging.
  - Boosting: recent tutorial article by Schapire on boosting, its history, and its variations.
  - Applying bagging and boosting to C4.5, written by Quinlan in 1996.

Support Vector Machines Background

- Burges, C.J.C., "A Tutorial on Support Vector Machines for Pattern Recognition," Data Mining and Knowledge Discovery, Volume 2, Number 2, pages 121-167, 1998.
- Hsu, C.-W., Chang, C.-C., Lin, C.-J., "A Practical Guide to Support Vector Classification."
- Kernel Machines site.
- Schölkopf NIPS 2001 tutorial.
- Schölkopf Kernel Machines review paper.
- In-class SVM demos (use LIBSVM).

Tools

- GGobi: statistical visualization package for exploring data.
- R: open-source statistical software package.
- Matlab: the statistics, neural network, and fuzzy logic toolboxes will be very useful for processing data for homework and projects.
- Classification Toolbox: written to support the textbook. This toolbox started as a course assignment in Dr. Ron Meir’s graduate course, Pattern Recognition, at the Technion – Israel Institute of Technology. The foundation of the toolbox, as well as most of the basic algorithms, was coded by Elad Yom-Tov and Hilit Serby. A year later, Igor Makienko and Victor Yosef coded the Voted Perceptron algorithms. More about the toolbox here.
- Bootstrap Toolbox by Zoubir and Iskander.

Data Sets and Repositories

- StatLib: datasets from the Statistics Department at CMU.
- UCI Machine Learning Data Repository.

If you publish material based on databases obtained from the UCI repository, then, in your acknowledgments, please note the assistance you received by using this repository. This will help others to obtain the same data sets and replicate your experiments.

Blake, C.L. & Merz, C.J. (1998). UCI Repository of machine learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, Department of Information and Computer Science.