DM_WK2

Overview of this Lecture / week

This week we are going to look at the typical life-cycle used for most Data Science/Data Mining projects. This is called CRISP-DM. It has been around a long time and many people and consultancy companies have adopted it. Some have created their own versions of it, but they all do and say essentially the same thing.
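As a quick reference, the six CRISP-DM phases can be written down as an ordered list. The sketch below is only an illustration in Python: the names Phase and next_phase are made up for this example and are not part of any SAS tool. It also simplifies the real process, which is iterative and often moves back to an earlier phase.

```python
from enum import IntEnum
from typing import Optional


class Phase(IntEnum):
    """The six CRISP-DM phases, in their usual order."""
    BUSINESS_UNDERSTANDING = 1
    DATA_UNDERSTANDING = 2
    DATA_PREPARATION = 3
    MODELING = 4
    EVALUATION = 5
    DEPLOYMENT = 6


def next_phase(current: Phase) -> Optional[Phase]:
    """Return the phase that normally follows `current`, or None after the last one."""
    if current is Phase.DEPLOYMENT:
        return None
    return Phase(current + 1)


if __name__ == "__main__":
    # Print the phases as a numbered list.
    for phase in Phase:
        print(phase.value, phase.name.replace("_", " ").title())
```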

The lab work will involve setting up an account on the SAS Cloud service called SAS OnDemand. You can then connect to my class group, download the SAS Enterprise Miner (EM) app, log in using this app and create a project. You can use this project for all your lab and assignment work.

SAS Enterprise Miner is hosted by SAS on their OnDemand Service. TU Dublin has no control over, or administrative access to, this software. If you encounter any issues with your account, you will need to contact SAS Support.

Because SAS Enterprise Miner is hosted outside of TU Dublin, you will be able to use the software at home, at work, on campus, etc. So you can complete the lab exercises anywhere.

Notes

Click here to download the notes for Week 2.

L2 - DM Life Cycles

Videos of Notes



Related Videos

Lab Exercises

Task 1

SAS Enterprise Miner is hosted by SAS on their OnDemand Service, external to TU Dublin. It is a service that SAS provides for universities and companies around the world.

TU Dublin has no control or administration access of this software.

Because SAS Enterprise Miner is hosted outside of TU Dublin, you will be able to use the software at home, at work, etc. This allows you to complete the lab exercises anywhere.

You need to sign up and register to use SAS OnDemand for this class group. You need to have completed all the sign-up steps before next week's class. First create a SAS Profile, then sign in to your profile on the SAS website.

After you have successfully created your Profile, follow these steps:

Sign on to the Control Center at https://odamid.oda.sas.com.

Complete the registration for SAS OnDemand. You will need to wait for another email, which can take anything from 1 minute to 24 hours to arrive.

Look for the Enrollments section and click on the Enroll in a course link to start the enrollment.

Enter the Course Code: aa28fe08-5318-463f-ba13-3572db3d990c

Submit the form.

Confirm that this is the correct course and then click the button to finish enrolling.

SAS On Demand Course Name : TU Dublin: Data Mining

Level: Graduate

Institution: TU Dublin

SAS Course code : aa28fe08-5318-463f-ba13-3572db3d990c
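The course code is long and easy to mangle when copying and pasting, for example by picking up a stray space or a neighbouring word. The following is a minimal Python sketch of a sanity check you could run before submitting the form. The function name clean_course_code is hypothetical, and the pattern is an assumption based only on the layout of the code for this class (groups of 8-4-4-4-12 hexadecimal characters); it is not an official SAS rule.

```python
import re

# Assumed layout: 8-4-4-4-12 hexadecimal characters, as in the course code above.
COURSE_CODE_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def clean_course_code(raw: str) -> str:
    """Trim whitespace, lower-case the code, and check its layout."""
    code = raw.strip().lower()
    if COURSE_CODE_PATTERN.fullmatch(code) is None:
        raise ValueError(f"Not a well-formed course code: {raw!r}")
    return code


if __name__ == "__main__":
    # Surrounding whitespace from a copy and paste is removed.
    print(clean_course_code("  aa28fe08-5318-463f-ba13-3572db3d990c \n"))
```

If extra text has been copied along with the code, the function raises a ValueError instead of returning a value.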

This video shows the steps involved in creating your SAS Profile, accessing SAS OnDemand, and accessing the software.

Follow the SAS OnDemand Student Registration Handbook and the instructions on how to register for a class.

It can take anything from 1 minute to 24 hours for your account to become active.

If you have any issues with setting up your SAS OnDemand Account or using SAS Enterprise Miner then you will need to contact SAS Support.

Task 2

NOTE: The SAS OnDemand interface has just been updated and is slightly different to what is shown in this video 🙁

Download the SAS Enterprise Miner Java application.

Log into the SAS OnDemand Control Center.

IMPORTANT:

        1. Check the version of Java required. If you have a newer version of Java than is required, you do not need to do anything.
        2. If you are prompted to update your version of Java when opening the app, don’t do it! Everything will work.
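If you are not sure which version of Java is installed, the check can be scripted. The following is a minimal Python sketch, not an official SAS tool: it runs the standard java -version command and extracts the major version. The value of REQUIRED_MAJOR is a placeholder, so replace it with the version shown in the SAS OnDemand Control Center. The function names are made up for this example.

```python
import re
import subprocess
from typing import Optional

# Placeholder value: replace with the version listed in the SAS OnDemand Control Center.
REQUIRED_MAJOR = 8


def parse_java_major(version_output: str) -> Optional[int]:
    """Extract the major Java version from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Older releases report themselves as 1.x (e.g. "1.8.0_281" is Java 8).
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major() -> Optional[int]:
    """Run `java -version` and return the major version, or None if Java is not found."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    # `java -version` normally writes to stderr, so check both streams.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    major = installed_java_major()
    if major is None:
        print("Java was not found on this machine.")
    elif major >= REQUIRED_MAJOR:
        print(f"Java {major} is installed; no action needed.")
    else:
        print(f"Java {major} is older than the required version {REQUIRED_MAJOR}.")
```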

NOTE: The SAS interface has recently been updated. This video hasn’t been updated to reflect this change, but the same steps still apply.

Click on the Enterprise Miner link. This will download a small Java application to your machine.

You should save this file (to your desktop), as you can just use this app from now on.

You do not need to download this file every time.

Task 3

Open the SAS Enterprise Miner application, login and create a project.
Create a SAS EM Project for your Lab work. Call it My Lab Work.

NOTE: The SAS interface has recently been updated. This video hasn’t been updated to reflect this change, but the same steps still apply.

When you have created the project you are ready to start using SAS Enterprise Miner.

That’s all the lab work for this week.

NOTE: If you have a high-resolution screen, the SAS Enterprise Miner Java app may appear with really tiny fonts when you open it. If this happens, check out this solution from another student.

Task 4 – Framing a Data Science and Data Mining Problem/Question – Group Discussion – in groups of 2-4 people

In the lecture we looked at framing a Data Science and Data Mining Problem.  Use this approach to frame possible Data Science/Data Mining problems for each of the following:

      • Direct Marketing Campaign to launch a new Financial product
      • Christmas Mail Order campaign, with different product categories: household items, Christmas decorations, children's toys
      • Car insurance company analyzing different groups of customers and their history
      • HR department in a company looking to analyze and predict who might be a good hire based on the CV

Think about the questions you would need to ask, and what you would need to understand, at each step of framing the problem. Put yourself in the position of being an employee in that company.

Depending on your experience, you may need to do some research on different use cases for each of these.

It would be beneficial if you could complete this task working in groups of 2-4 people. This will allow you to share ideas, knowledge, experience and be able to learn from each other.
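One way to keep a group discussion focused is to record each framing attempt in the same template. The Python sketch below is only a suggestion. The class ProblemFrame and its field names are assumptions chosen for illustration, not the framing steps from the lecture, and the example values for the direct marketing scenario are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProblemFrame:
    """Hypothetical template for writing down one framed problem."""
    scenario: str
    business_question: str
    data_mining_task: str  # e.g. classification, clustering
    data_needed: List[str] = field(default_factory=list)
    open_questions: List[str] = field(default_factory=list)

    def summary(self) -> str:
        """One-line summary to share with the group."""
        return (
            f"{self.scenario}: {self.data_mining_task} "
            f"to answer '{self.business_question}'"
        )


if __name__ == "__main__":
    # Invented example values for the first scenario in the list above.
    frame = ProblemFrame(
        scenario="Direct marketing campaign",
        business_question="Which customers should we contact about the new product?",
        data_mining_task="classification",
        data_needed=["customer demographics", "responses to past campaigns"],
        open_questions=["What counts as a successful response?"],
    )
    print(frame.summary())
```

Filling in one template per scenario makes it easier to compare the four problems and to spot which questions the group has not yet answered.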

 

What to prepare for next week

Make sure you have completed all the above steps.

If needed, go through them again so that you have the project set up for next week's class.

We will use some R examples over the next two weeks. You will need to install R (and RStudio), and you might have already completed this in another module.

Additional Reading Materials

A survey of data mining and knowledge discovery process models and methodologies

CRISP-DM

Fayyad Paper on KDD

Introduction to Data Mining – Book Chapter

Data Mining in Business – Book Chapter

Data Mining A Closer Look – Book Chapter

CRISP-DM Guide

CRISP-DM Process Model

When Algorithms Control the World