Overview of this Lecture / week

Welcome to the first week of the module, Data Mining. We will only have a lecture this week (no lab work), but the tasks below list a few things you can work on to prepare for the lab work over the coming weeks.

The lecture this week will take about 2 hours. The topic is an overview of the whole area of Data Mining/Data Science/Machine Learning/Predictive Analytics/… etc., or whatever trendy marketing word or phrase is being used this week. This will include the context of the topic, examples of how it is being used in industry, typical skills, and what languages or products are being used. Sometimes it is a case of Theory vs Reality: what we constantly hear or read about versus what companies are really doing and using.

Notes

Click here to download notes for Week 1 – Data Mining Intro.

Videos of Notes



Related videos

Lab Exercises

Task 1 – What are your experiences

Ideally work together with 2-3 other students in the class.

Have a short discussion to share your experiences. Some people will have lots of experience, some very little, and others somewhere in between. We can all learn from each other.

Your experiences can include projects you worked on, projects you are aware of, articles you have read, videos you have watched, etc. We are constantly learning about new things.

[You don’t have to give away any company confidential information. You can talk in a general way, and about your experiences.]

Use the Analytics Value Chain to guide your discussion.

Use the questions listed for each stage as a guide. You don’t have to answer each question. You are just trying to share knowledge and experiences, be they good, bad or anything in-between.

      • Strategy – what projects? what were the problems? what was the goal? what were they trying to achieve?
      • Data – what kind of data? what volumes of data? data locations? was it easy to get access to the data? were there politics involved? any data storage/location issues? GDPR?
      • Analytics – it isn’t all about machine learning. Analytics can be anything really. What kind of analytics was performed? what tools and languages were used? how were the results presented?
      • Implementation – what happened to the analytics? were they put into production? front-end apps, back-end apps, daily updates, batch processing, etc. Were there any issues with these?
      • Maintenance – are the analytics tracked over time? are there any trends identified? how do they know the analytics/models are still working? any issues integrating with the development team?

After your discussion, write up a 2-page summary documenting all the shared experiences, taking note of what others shared, how their experiences differ from yours, what was interesting, lessons learned, things to remember, etc.

Over the next 2 weeks, have similar conversations with other people in the class (who weren’t in your original group). Do you learn anything new? Is there commonality? etc.

Task 2 – [complete before next week]
You will need a network account to be able to log into the computers in the lab.
Or you can use your own laptop. If you are going to use your own laptop then you need to complete Task 3.

Task 3 – [complete before next week]
It is recommended that you use your own laptop for the Data Mining Labs.
We will be using SAS Enterprise Miner, and this is hosted by SAS on their cloud platform called SAS OnDemand.
You will need to have Java installed on your laptop. Follow the instructions at the link below to install Java. You can install a more recent version of Java.

SAS webpage detailing the Java requirements for using SAS OnDemand Enterprise Miner.

NOTE: If you have a more recent version of Java installed then you do not need to install the version listed on the SAS website.

You could use the PCs in each of the SoC computer labs, but this will mean you will need to download the SAS Enterprise Miner Java application each time you want to use it.

Task 4 – [complete before next week]

Download and install the R and Python languages on your laptop.

This will be covered in the Working With Data module.

I’ll be giving some examples in R and Python, along with some exercises over the coming weeks.

What to prepare for next week

It is preferred if you use your own laptop for all lab work. This will allow you to work on the lab exercises anywhere.

Make sure the version of Java on your machine is one supported by SAS Enterprise Miner. If not, you will need to upgrade it.
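If you already have Python installed (Task 4), a small script along the following lines is one possible way to check what is on your laptop before next week. It is only a sketch: it assumes Java and R are reachable through the command names `java` and `Rscript` on your PATH, which may not match your installation, and it only reports what it finds. It does not know which Java versions SAS supports, so compare the output against the SAS webpage yourself.

```python
import shutil
import subprocess
import sys

# Assumed command names; adjust these if your installation uses different ones.
TOOLS = {
    "Java": ["java", "-version"],
    "R": ["Rscript", "--version"],
}


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def tool_version(command):
    """Run a version command and return the first line it prints, or "" on failure."""
    try:
        done = subprocess.run(command, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return ""
    # Some tools (java, for example) print their version to stderr, so check both.
    lines = (done.stdout + done.stderr).strip().splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    # Python is clearly installed if this script runs, so just report its version.
    print("Python:", sys.version.split()[0])
    paths = find_tools(cmd[0] for cmd in TOOLS.values())
    for label, cmd in TOOLS.items():
        if paths[cmd[0]] is None:
            print(f"{label}: not found on PATH")
        else:
            print(f"{label}: {tool_version(cmd)}")
```

A "not found on PATH" line does not always mean the tool is missing; it may be installed somewhere your terminal does not look.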

Additional Reading Materials

Predictive Analytics 101

Data Science Use Cases

Why Should CIOs Consider Advanced Analytics

IT has 26 Words for Data Mining

Data Mining in Business Chapter

Fayyad Paper on KDD

Overview of Data Warehousing Paper

Corporate Data : A brief history – Bill Inmon

Getting an Edge – Irish Times

Analytics in Football

Getting big impact from big data

How to get the most from big data