Overview of this Lecture / week
This week and next week we will cover Classification. This is perhaps one of the most commonly used machine learning techniques in Data Mining. Two related kinds of problem are often grouped under this heading, and the terms are regularly used interchangeably. The first is the labeling of records with a binary or multi-class target value, which is classification proper. The second is the prediction of a continuous target value, which is more precisely called regression.
This week we will cover what classification is, what it is typically used for, the typical process, and some data preparation requirements for classification. We will examine decision trees and how they are created, then look at Random Forests, and finally look at how to evaluate classification models (how do you know what is good?).
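As a rough illustration of what "evaluating a classification model" can mean, the sketch below computes three commonly used metrics (accuracy, precision and recall) from actual and predicted labels. It is written in plain Python rather than SAS Enterprise Miner, and the function names and the small "yes"/"no" data set are made up for this example; they are not taken from the lecture notes or the lab.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive):
    """Count true/false positives and negatives for one positive class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["tp" if a == positive else "fp"] += 1
        else:
            counts["fn" if a == positive else "tn"] += 1
    return counts


def evaluate(actual, predicted, positive):
    """Return accuracy, precision and recall for a binary classifier."""
    c = confusion_counts(actual, predicted, positive)
    total = sum(c.values())
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        # Guard against division by zero when a class never occurs.
        "precision": c["tp"] / predicted_pos if predicted_pos else 0.0,
        "recall": c["tp"] / actual_pos if actual_pos else 0.0,
    }


# Hypothetical labels: what really happened vs. what the model predicted.
actual = ["yes", "yes", "no", "no", "yes", "no", "no", "no"]
predicted = ["yes", "no", "no", "no", "yes", "yes", "no", "no"]

metrics = evaluate(actual, predicted, positive="yes")
print(metrics)
```

On this made-up data the model gets 6 of 8 records right (accuracy 0.75), while precision and recall for the "yes" class are both 2/3. Accuracy alone can look good even when the class of interest is predicted poorly, which is one reason more than one metric is usually examined.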
Next week we will look at some additional Classification algorithms and also look at Regression. Many machine learning algorithms can be used for both classification and regression problems.
Click here to download the notes for the Classification topic. These notes cover this week and next week.
Videos of Notes
The lab for this week involves using SAS Enterprise Miner to create a Decision Tree model and to explore the various evaluation metrics.
Additional Reading Materials