Tuesday, 2 April 2013

Hadoop implementation (setup) on Windows 7 using Cygwin

I'll show how to configure Hadoop on a single Windows machine in pseudo-distributed mode.

Versions used :

1- Windows 7 with Cygwin
2- Java (Oracle java-1.6)
3- Hadoop (Apache hadoop-1.0.3)

If you have everything in place, start following the steps shown below to configure Hadoop on your machine :

1) To install Cygwin properly, follow the link :
3) Extract Hadoop 1.0.3 inside Cygwin (c:\cygwin\usr\local\hadoop-1.0.3)
4) Now open hadoop-env.sh, found in C:\cygwin\usr\local\hadoop-1.0.3\conf, and change the Java path to match your environment (NOTE: before changing the Java path here, first set the Java path in the Windows environment variables):
  export JAVA_HOME=C:\\Java\\jre1.6.0
Then go to the Cygwin terminal and check:
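One possible way to do this check is sketched below. It only tests that the JAVA_HOME value points at a folder containing a java executable. The name check_java_home is made up for this post (it is not a Hadoop or Cygwin command), and cygpath is used only when it is available, to turn a Windows-style path into a Cygwin one.

```shell
# Sketch: confirm that a JAVA_HOME value points at a usable Java install.
# check_java_home is a hypothetical helper, not part of Hadoop or Cygwin.
check_java_home() {
  local dir="$1"
  if [ -z "$dir" ]; then
    echo "JAVA_HOME is not set"
    return 1
  fi
  # Under Cygwin, translate a Windows path such as C:\Java\jre1.6.0 into
  # its POSIX form; where cygpath is absent this step is skipped.
  if command -v cygpath >/dev/null 2>&1; then
    dir="$(cygpath -u "$dir")"
  fi
  if [ -x "$dir/bin/java" ] || [ -x "$dir/bin/java.exe" ]; then
    echo "OK: Java found under $dir"
    return 0
  fi
  echo "No java executable under $dir/bin"
  return 1
}
```

After pasting the function into the terminal, it could be called as check_java_home "$JAVA_HOME".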

5) Now we'll start the actual configuration process. Hadoop is configured using a set of configuration files in the HADOOP_HOME/conf directory. These are XML files containing properties in the form of key-value pairs. We'll modify the following 3 files for our setup.

Edit mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
 <name>mapred.job.tracker</name>
 <value>localhost:9001</value>
</property>
</configuration>

Edit hdfs-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
 <name>dfs.replication</name>
 <value>1</value>
</property>
 </configuration>

Edit core-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
 <property>
 <name>fs.default.name</name>
 <value>hdfs://localhost:9000</value>
</property>
</configuration>

Save all files and close them.
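If you would rather not edit the three files by hand, the same content can be generated from the Cygwin terminal. This is only a sketch: write_site_file and write_pseudo_distributed_conf are names I made up, and the function overwrites whatever is already in the target files, so it suits a fresh install only.

```shell
# Sketch: generate the three single-property site files shown above.
# Both function names are hypothetical helpers, not Hadoop commands.
write_site_file() {
  # $1 = target file, $2 = property name, $3 = property value
  cat > "$1" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>$2</name>
    <value>$3</value>
  </property>
</configuration>
EOF
}

write_pseudo_distributed_conf() {
  # $1 = Hadoop conf directory; existing site files are overwritten.
  local conf_dir="$1"
  write_site_file "$conf_dir/core-site.xml"   fs.default.name    hdfs://localhost:9000
  write_site_file "$conf_dir/hdfs-site.xml"   dfs.replication    1
  write_site_file "$conf_dir/mapred-site.xml" mapred.job.tracker localhost:9001
}
```

With the install path used in this post, the call would be write_pseudo_distributed_conf /usr/local/hadoop-1.0.3/conf.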

6) To format the filesystem, enter the following command in Cygwin from the Hadoop directory:
 bin/hadoop namenode -format


7) Now run the command bin/start-dfs.sh
8) and then bin/start-mapred.sh
or, in place of these two commands, we can run bin/start-all.sh (deprecated)
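To see whether the daemons actually came up, the process list can be compared against the five daemons that Hadoop 1.x starts in this mode (NameNode, DataNode and SecondaryNameNode from start-dfs.sh; JobTracker and TaskTracker from start-mapred.sh). The sketch below reads jps-style output ("pid name" per line) and prints whichever daemons are missing. missing_daemons is a made-up helper name, and jps comes with the JDK, so it may not be present if only the JRE is installed.

```shell
# Sketch: list which of the five Hadoop 1.x daemons are absent from
# jps-style output read on stdin. missing_daemons is a hypothetical helper.
missing_daemons() {
  local listing daemon
  listing="$(cat)"
  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # Compare the second column exactly, so that "NameNode" is not
    # satisfied by a "SecondaryNameNode" line.
    if ! printf '%s\n' "$listing" |
        awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
      echo "$daemon"
    fi
  done
}
```

Typical use would be jps | missing_daemons; empty output means all five daemons were found.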

9) Check the logs inside C:\cygwin\usr\local\hadoop-1.0.3\logs
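Reading every log by eye gets tedious, so here is a rough sketch that flags log files containing ERROR or FATAL lines. scan_hadoop_logs is a made-up helper name, and the choice of those two keywords is my assumption about what counts as a problem; it looks only at files ending in .log.

```shell
# Sketch: report *.log files that contain ERROR or FATAL entries.
# scan_hadoop_logs is a hypothetical helper, not a Hadoop command.
# Returns 0 when no such lines are found, 1 otherwise.
scan_hadoop_logs() {
  local log_dir="$1" found=0 f
  for f in "$log_dir"/*.log; do
    [ -f "$f" ] || continue          # skip when the glob matched nothing
    if grep -qE 'ERROR|FATAL' "$f"; then
      echo "$f: $(grep -cE 'ERROR|FATAL' "$f") problem line(s)"
      found=1
    fi
  done
  return "$found"
}
```

With the install path used in this post, the call would be scan_hadoop_logs /usr/local/hadoop-1.0.3/logs.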

10) After checking the logs, find the relevant IPs (addresses) in them, then check those; you should get output like this.
 If these screens display successfully, then we can say that Hadoop is configured properly.
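As a rough sketch of this last check, assuming the addresses in your logs are the Hadoop 1.x default web ports (50070 for the NameNode and 50030 for the JobTracker) and that curl is installed in Cygwin, the status pages can be probed from the terminal. hadoop_ui_urls and probe_hadoop_uis are made-up helper names; if your logs show different ports, use those instead.

```shell
# Sketch: probe the Hadoop 1.x web UIs on their default ports.
# Both functions are hypothetical helpers; the ports are the stock
# defaults and are an assumption about this particular setup.
hadoop_ui_urls() {
  local host="${1:-localhost}"
  echo "NameNode http://$host:50070/"
  echo "JobTracker http://$host:50030/"
}

probe_hadoop_uis() {
  # Requires curl; reports whether each UI answered within 5 seconds.
  hadoop_ui_urls "$1" | while read -r name url; do
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "$name UI reachable at $url"
    else
      echo "$name UI not reachable at $url"
    fi
  done
}
```

Running probe_hadoop_uis with no argument checks localhost.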



