In this class we will look at the basics of the Map-Reduce process, its different components, and how they all work together.
The lab exercises aim towards you building and running your first Map-Reduce process.
Follow the notes very carefully. There are some configuration settings. If you miss one of these or set one incorrectly, the code will not compile and/or will not run for you. See the sample code.
Complete Exercise 1 and Exercise 2 before the next class.
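Before starting the exercises, it may help to see the map, shuffle and reduce steps in miniature. The following is only a rough sketch in plain Java and does not use Hadoop at all: the class name `WordCountSketch` and its methods `map`, `shuffle`, `reduce` and `run` are made up for illustration, and a real Hadoop job uses its own Mapper and Reducer classes and runs these steps across many machines.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: plain Java, no Hadoop classes involved.
public class WordCountSketch {

    // Map step: turn one line of text into (word, 1) pairs.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // Shuffle step: group all values that share the same key (word).
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                   .add(pair.getValue());
        }
        return grouped;
    }

    // Reduce step: collapse the list of values for one word into a total.
    static int reduce(String word, List<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    // Run the three steps in order over a small in-memory input.
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : lines) {
            mapped.addAll(map(line));
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> group : shuffle(mapped).entrySet()) {
            result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two sample lines standing in for an input file.
        List<String> lines = Arrays.asList("to be or not to be", "To be!");
        System.out.println(run(lines));
    }
}
```

The word count in the lab exercises follows the same three-step shape, but there the framework takes care of reading the input files, grouping the keys, and writing the output.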
WARNING: Check which version of Hadoop is used in each tutorial or example. Some package and function names, and their functionality, differ slightly between versions of Hadoop. RTFM! (Read The Fabulous Manuals), i.e. the documentation!
Other tutorials – Alternative Lab Exercises
- Hadoop & Mapreduce Examples: Create First Program in Java
- Word Count Program With MapReduce and Java
- MapReduce Tutorial – Fundamentals of MapReduce with MapReduce Example
- How to Execute WordCount Program in MapReduce using Cloudera Distribution Hadoop(CDH)
- Counting words with Hadoop—running your first program
- MapReduce Word Count Example
- Word Count MapReduce Program in Hadoop
- Writing a Simple Word Counter using Hadoop MapReduce
Additional Reading and Materials
Shakespeare data set – gdrive link
Shakespeare data set – dropbox link
Shakespeare data set – website link
Hadoop in Action – Some Case Studies
10 Hadoop Tutorials