
Monthly Archives: November 2013

Restarting My Blog

I’ve been thinking about shifting my technology focus for a while.  I’ve been on the Microsoft bandwagon since 1995 and have devoted all my professional energy to keeping up with the latest that Redmond has to offer.  This works well in my current position.  However, it keeps me somewhat shut off from the other camp, which is to say, the rest of the world.  With Hadoop and its derived technologies so prominent these days, major companies hiring BI talent routinely ask for Hadoop skills.  I want to be on that boat!

Three weeks ago, I set up a VirtualBox VM running Ubuntu Linux.  I installed and configured Hadoop 2.2.0, and the system is now ready as a single-node machine.  It was much harder than I thought, and I felt like a baby taking her first steps.  What I’ve learned so far:

1. how to keep my system updated.

2. how to download a package or a piece of software and install it on the machine.

3. how to create a new user as the service account, and how to change user in terminal.

4. how to use the Nano editor and what sudo means.

5. the “official” Hadoop installation doc is very hard to read because it assumes functional knowledge of Linux.  Online search results, though helpful, are often outdated.  While I was updating the config XML files, one blogger referred to a file that actually lives in a different location, and that threw me off.  What I did was keep looking instead of getting stuck in one place.  Eventually, it works.
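For anyone at the same starting point, items 1 through 4 roughly map to the commands below.  This is a sketch under assumptions, not a record of what was actually typed: it assumes Ubuntu with apt-get, `openjdk-7-jdk` is only an example package, `hduser` is a hypothetical service account name, and opening `~/.bashrc` in Nano is just an illustration.  The `run` helper prints each command instead of executing it unless DRY_RUN=0 is set, so the block is safe to try.

```shell
# Dry-run helper: prints the command by default; set DRY_RUN=0 to execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

setup_steps() {
  # 1. keep the system updated (sudo runs a command with administrator rights)
  run sudo apt-get update
  run sudo apt-get upgrade
  # 2. install a package (example package name, an assumption)
  run sudo apt-get install openjdk-7-jdk
  # 3. create a service account and switch to it (hduser is hypothetical)
  run sudo adduser hduser
  run su - hduser
  # 4. edit a file with Nano (file chosen only as an illustration)
  run nano "$HOME/.bashrc"
}
```

Calling `setup_steps` with the default settings only lists the six commands, which makes it easy to review them before running anything for real.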

I am at the point where I am trying to compile the sample WordCount Java program in Hadoop:

$ javac -classpath ${HADOOP_HOME}/hadoop-${HADOOP_VERSION}-core.jar -d wordcount_classes WordCount.java

I am getting 18 errors, starting from

import org.apache.hadoop.mapred.*;

It says that package org.apache.hadoop.mapred does not exist.  I am guessing the classpath is incorrect.
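One possible explanation, offered as a sketch rather than a confirmed fix: the `hadoop-${HADOOP_VERSION}-core.jar` path comes from the Hadoop 1.x layout, while the 2.x tarball splits its jars across subdirectories of `share/hadoop`, so that path may simply not exist and javac finds no Hadoop classes at all.  The functions below build a classpath in two ways.  `hadoop classpath` is a real subcommand of the launcher; the `jars_under` fallback and the assumption that `HADOOP_HOME` points at the unpacked tarball are mine.

```shell
# Join every .jar found under a directory into one colon-separated classpath.
jars_under() {
  find "$1" -name '*.jar' | sort | paste -sd: -
}

# Ask the hadoop launcher for its classpath when it is on PATH; otherwise
# fall back to scanning $HADOOP_HOME/share/hadoop (assumed tarball layout).
hadoop_compile_classpath() {
  if command -v hadoop >/dev/null 2>&1; then
    hadoop classpath
  else
    jars_under "${HADOOP_HOME}/share/hadoop"
  fi
}

# Usage (not executed here):
#   mkdir -p wordcount_classes
#   javac -classpath "$(hadoop_compile_classpath)" -d wordcount_classes WordCount.java
```

Quoting the `$(...)` matters: `hadoop classpath` can emit entries ending in `/*`, which javac expands itself, and the quotes keep the shell from expanding them first.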

As a separate endeavor, I am learning Java from the beginning.  I wrote a HelloWorld program today.  I just have to keep at it, even if it’s only a little bit of progress each day.