Wednesday, 17 February 2016

Building hadoop-2.6.3 on Mac OS X El Capitan 10.11.3 with native components

INSTALL PRE-REQUISITES


INSTALL PROTOBUF 2.5.0
  • svn co http://svn.macports.org/repository/macports/trunk/dports/devel/protobuf-cpp -r 105333 
  • cd protobuf-cpp/ 
  • sudo port install
To verify, run the command below; it should print libprotoc 2.5.0:
  • protoc --version
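The check above can be scripted so that a build stops early when a different protoc is on the PATH. This is a minimal sketch, assuming a POSIX shell; the function name is_expected_protoc is hypothetical and only compares the reported version string against the one this guide installs:

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given version string
# is exactly the one installed in the step above.
is_expected_protoc() {
    [ "$1" = "libprotoc 2.5.0" ]
}

# Feed the helper the output of `protoc --version`, if protoc is installed.
if command -v protoc >/dev/null 2>&1; then
    if is_expected_protoc "$(protoc --version)"; then
        echo "protoc OK"
    else
        echo "unexpected protoc version: $(protoc --version)" >&2
    fi
else
    echo "protoc not found on PATH" >&2
fi
```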
ACQUIRE SOURCES
    git clone https://github.com/apache/hadoop.git
    cd hadoop
    git checkout release-2.6.3
INSTALL ORACLE JDK 1.7
export JAVA_HOME="$(/usr/libexec/java_home -v 1.7)"
sudo mkdir -p "$JAVA_HOME/Classes"
sudo ln -s "$JAVA_HOME/lib/tools.jar" "$JAVA_HOME/Classes/classes.jar"
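Before starting the build it may help to confirm that JAVA_HOME really points at a 1.7 JDK. The sketch below assumes a POSIX shell and the usual first line of java -version output (for example: java version "1.7.0_80"); the helper name is_java_17 is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the text passed in (the output of
# `java -version`) reports a 1.7.x runtime.
is_java_17() {
    case "$1" in
        *'version "1.7.'*) return 0 ;;
        *) return 1 ;;
    esac
}

# `java -version` writes to stderr, hence the 2>&1 redirect.
if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
    if is_java_17 "$("$JAVA_HOME/bin/java" -version 2>&1)"; then
        echo "JAVA_HOME points at a 1.7 JDK"
    else
        echo "JAVA_HOME is not a 1.7 JDK: $JAVA_HOME" >&2
    fi
else
    echo "JAVA_HOME is unset or has no bin/java" >&2
fi
```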
Build Hadoop 2.6.3 from source code
mvn clean install package -Pdist,native,src -DskipTests
Verify that the folder below exists after the build:
../hadoop/hadoop-dist/target/hadoop-2.6.3/lib/native
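The folder check can also be done from a script. This is a rough sketch, assuming a POSIX shell; has_native_libs is a hypothetical helper that only tests that the directory exists and is not empty, and it does not validate the libraries themselves:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the directory exists and contains
# at least one entry.
has_native_libs() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Path taken from the step above; set NATIVE_DIR to override it.
NATIVE_DIR="${NATIVE_DIR:-../hadoop/hadoop-dist/target/hadoop-2.6.3/lib/native}"
if has_native_libs "$NATIVE_DIR"; then
    echo "native libraries present in $NATIVE_DIR"
else
    echo "no native libraries found in $NATIVE_DIR" >&2
fi
```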
Update mvn plugins
The m2eclipse plugin and the command-line mvn tool take two very different approaches to Eclipse/Maven integration.
The mvn eclipse:eclipse command reads your pom file and creates Eclipse projects with the correct metadata, so that Eclipse understands project types, relationships, classpath, etc. It does not import those projects into a workspace, because creating a workspace or importing projects into one requires running Eclipse. You have to re-run this command whenever anything in your pom changes.
Once you have run the command, importing the generated projects into your workspace is simple: start Eclipse and use the File -> Import -> Existing Projects wizard. After the initial import you do not have to repeat this step when you regenerate the metadata, unless the number of projects has changed. Just start Eclipse again, select all projects and invoke Refresh from the context menu.
cd hadoop/hadoop-maven-plugins
mvn clean install -DskipTests
mvn eclipse:clean -DskipTests
mvn eclipse:eclipse
Resources
Working with Hadoop under Eclipse
https://wiki.apache.org/hadoop/EclipseEnvironment
Hadoop Wiki
https://wiki.apache.org/hadoop/FrontPage
How to set up Eclipse environment for Pig
https://cwiki.apache.org/confluence/display/PIG/How+to+set+up+Eclipse+environment

Building Hive from Source
https://cwiki.apache.org/confluence/display/Hive/GettingStarted#GettingStarted-BuildingHivefromSource

Build Hive from source code
git clone https://github.com/apache/hive.git
cd hive
git checkout release-1.2.1
mvn clean install -DskipTests -Phadoop-2
cd itests/
mvn clean install -DskipTests -Phadoop-2
ops folder structure
├── dn
├── logs
│   ├── hadoop
│   ├── hbase
│   └── yarn
├── nn
├── pids
└── tmp
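The layout shown above can be created in one step. This is a minimal sketch, assuming a POSIX shell; make_ops_layout is a hypothetical helper, and the root directory passed to it is your choice, since the location of the ops folder is not fixed here:

```shell
#!/bin/sh
# Hypothetical helper: create the ops layout under the given root.
# mkdir -p also creates the intermediate logs/ directory and is safe
# to re-run on an existing tree.
make_ops_layout() {
    root="$1"
    for d in dn nn pids tmp logs/hadoop logs/hbase logs/yarn; do
        mkdir -p "$root/$d" || return 1
    done
}

# Example (assumed location): make_ops_layout "$HOME/ops"
```

Calling make_ops_layout with a root of your choosing produces the seven leaf folders; running it a second time leaves existing contents untouched.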

Contributors:  Sandeep Setty, Venkat Mogili