softwaremom


Compiling Cloudera WordCount Hadoop Sample Code

Since I’ve been having trouble building the sample code from Hadoop: The Definitive Guide using Eclipse, I thought I’d go back to the Cloudera site and build its sample code, following the instructions here:  http://www.cloudera.com/content/cloudera-content/cloudera-docs/HadoopTutorial/CDH4/Hadoop-Tutorial/ht_usage.html

However, I am using a different version of Hadoop (2.0.0-cdh4.5.0), so the classpath is different, too.  When I try compiling WordCount.java:

$ mkdir wordcount_classes
$ javac -cp <classpath> -d wordcount_classes WordCount.java
where <classpath> is /usr/lib/hadoop/*:/usr/lib/hadoop/client-0.20/*

I get errors: 

WordCount.java:9: package org.apache.hadoop.mapred does not exist
import org.apache.hadoop.mapred.*;
^
WordCount.java:14: cannot find symbol
symbol : class MapReduceBase
location: class org.myorg.WordCount
public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {

Obviously, more jar files need to be included in the classpath: the org.apache.hadoop.mapred package is not in any of the jars found by the classpath above.  After checking the installed directories, this works:

javac -cp /usr/lib/hadoop/*:/usr/lib/hadoop-mapreduce/* -d wordcount_classes WordCount.java

After that, I can create the jar file, run the application, and examine the results, just as described in the instructions.
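
Since the right directories seem to change between Hadoop versions, here is a small shell sketch for assembling the classpath from whichever candidate directories actually exist on the machine. The function name build_classpath is my own invention (it is not part of Hadoop or the Cloudera tutorial), and the directory names in the usage comment are just the ones that worked for me above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): print a javac classpath built
# from the given directories, skipping any that do not exist here.
build_classpath() {
    result=""
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            # "dir/*" is a Java classpath wildcard meaning "all jars in dir".
            # It stays literal here; the shell does not expand it in an assignment.
            result="${result:+$result:}$dir/*"
        fi
    done
    printf '%s\n' "$result"
}

# Assumed usage, with the directories from this post:
# javac -cp "$(build_classpath /usr/lib/hadoop /usr/lib/hadoop-mapreduce)" \
#       -d wordcount_classes WordCount.java
```

If the hadoop command is on the PATH, running hadoop classpath should also print the classpath the installation itself uses, which may be simpler than listing directories by hand.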
