STEPS TO INSTALL HADOOP-1.0.3 SINGLE NODE CLUSTER IN UBUNTU 14.04 LTS
1] First log in as the superuser from your normal user account, and only then start the installation
Ex: praveen@delllaptop]
sudo su
After you enter the password, the prompt changes to
root@delllaptop]
2] Connect to the internet and update the Ubuntu package lists with the command
sudo apt-get update
3] Install Java from the internet with the command
sudo apt-get install openjdk-7-jdk
Check that it was installed in the path /usr/lib/jvm/java-1.7.0-openjdk-amd64
4] Install the OpenSSH server, create keys and configure it
sudo apt-get install openssh-server
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
5] Create a directory named hadoop in /usr/local/
sudo mkdir /usr/local/hadoop
Copy the tar file hadoop-1.2.0.tar.gz from your home directory to this hadoop directory
6] Extract the tar file there, in /usr/local/hadoop
sudo tar -zxvf hadoop-1.2.0.tar.gz
7] Set the path variable by editing the .bashrc file with the command
sudo nano $HOME/.bashrc
Go to the end of the file and add these two lines
export HADOOP_HOME=/usr/local/hadoop/hadoop-1.2.0
export PATH=$PATH:$HADOOP_HOME/bin
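Editing .bashrc by hand more than once tends to leave duplicate export lines behind. The sketch below shows one way to append a line only when it is missing. The helper name append_line_once and the RC_FILE variable are made up for illustration, the HADOOP_HOME value assumes the tar file was extracted under /usr/local/hadoop as in the earlier steps, and RC_FILE defaults to a scratch file so that nothing real is touched.

```shell
# Append a line to a file only if that exact line is not already present.
# append_line_once is a hypothetical helper, not part of Hadoop or Ubuntu.
append_line_once() {
  touch "$1"
  grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# RC_FILE is an assumption: it defaults to a scratch file for a dry run.
RC_FILE="${RC_FILE:-$(mktemp)}"

# Single quotes keep $PATH and $HADOOP_HOME unexpanded in the file.
append_line_once "$RC_FILE" 'export HADOOP_HOME=/usr/local/hadoop/hadoop-1.2.0'
append_line_once "$RC_FILE" 'export PATH=$PATH:$HADOOP_HOME/bin'
# Calling it again with the same line leaves the file unchanged.
append_line_once "$RC_FILE" 'export PATH=$PATH:$HADOOP_HOME/bin'

cat "$RC_FILE"
```

To apply it to the real file, run it with RC_FILE set to $HOME/.bashrc, then reload the shell.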
8] Reload the bash shell from the terminal
exec bash
9] Check the path at the terminal
echo $PATH
10] Set the Hadoop configuration files by navigating to /usr/local/hadoop/hadoop-1.2.0/conf
a) hadoop-env.sh ------- sets up the Java environment, telling Hadoop which Java installation to use
sudo nano hadoop-env.sh
Remove the comment marker from the JAVA_HOME line and edit the Java installation path
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-i386 (for virtual machine)
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64 (for desktops)
b) core-site.xml ---- configures the name node and the tmp directory
sudo nano core-site.xml
Add these lines inside the <configuration> element
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:10001</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/hadoop-1.2.0/tmp</value>
</property>
c) mapred-site.xml --------- sets the jobtracker
sudo nano mapred-site.xml
Add these lines inside the <configuration> element
<property>
<name>mapred.job.tracker</name>
<value>localhost:10002</value>
</property>
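The property fragments for core-site.xml and mapred-site.xml only work when they sit inside a <configuration> element. As a rough sketch, the two complete files could be written out as shown here. The ports and the tmp path are the ones used in this step. CONF_DIR is a made-up variable that defaults to a scratch directory, so point it at the real conf folder only once the output looks right.

```shell
# CONF_DIR is an assumption: it defaults to a scratch directory for a dry run.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Quoted 'EOF' keeps the shell from expanding anything inside the XML.
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:10001</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop/hadoop-1.2.0/tmp</value>
  </property>
</configuration>
EOF

cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:10002</value>
  </property>
</configuration>
EOF

# Cheap sanity check: opening and closing <property> tags should pair up.
for f in core-site.xml mapred-site.xml; do
  opens=$(grep -c '<property>' "$CONF_DIR/$f")
  closes=$(grep -c '</property>' "$CONF_DIR/$f")
  echo "$f: $opens opening, $closes closing property tags"
done
```

The tag count is only a quick check against the kind of typo that a mistyped closing tag produces; it is not a full XML validation.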
11] Create a tmp directory under /usr/local/hadoop/hadoop-1.2.0
sudo mkdir /usr/local/hadoop/hadoop-1.2.0/tmp
Give permission to the user (replace harish with your own user name)
sudo chown harish /usr/local/hadoop/hadoop-1.2.0
sudo chmod 777 /usr/local/hadoop/hadoop-1.2.0/tmp
12] Format the namenode before starting the nodes
Go to the bin directory and run
hadoop namenode -format
13] Start all Hadoop daemons with the command
start-all.sh
To start a specific daemon, for example the namenode
hadoop-daemon.sh start namenode
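After starting the daemons it helps to confirm that all of them are actually running. On a Hadoop 1.x single node these are NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker, and the JDK tool jps lists running Java processes as "pid ClassName" lines. The sketch below reads such lines and prints whichever daemons are missing. The function name missing_daemons and the sample PIDs are made up for illustration.

```shell
# Reads jps-style lines ("<pid> <ClassName>") on stdin and prints the
# names of expected Hadoop 1.x daemons that do not appear.
# missing_daemons is a hypothetical helper, not a Hadoop command.
missing_daemons() {
  running="$(awk '{ print $2 }')"
  for d in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # -x matches the whole line, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$running" | grep -qx "$d" || echo "$d"
  done
}

# Illustration with made-up process IDs; prints the three absent daemons.
printf '101 NameNode\n102 DataNode\n103 Jps\n' | missing_daemons
```

On a real installation the input would come from jps itself, as in jps | missing_daemons, and empty output would mean all five daemons are up.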
14] Check the browser interface by opening the web page
Go to a web browser and type localhost:50070
15] To run the default word count program in Ubuntu, follow these steps
Create a file with the command sudo nano one.txt and type some contents in it.
Create an input folder in HDFS with the command hadoop fs -mkdir input
Copy one.txt into the input directory with the command hadoop fs -copyFromLocal one.txt input
Check for the copied file with the command hadoop fs -ls input
Then run the word count program from /usr/local/hadoop/hadoop-1.2.0 with the command
hadoop jar hadoop-examples-1.2.0.jar wordcount input output
Then look for the output in the output folder by typing the command
hadoop fs -ls output
Then open the part-r-00000 file by typing the command
hadoop fs -cat output/part-r-00000
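To judge whether the word count job produced sensible results, the same count can be computed locally with standard tools and compared by eye with part-r-00000. This is only a sketch for small files. It assumes words are separated by whitespace, as in the example job, and the file contents and the helper name local_wordcount are made up for illustration.

```shell
# Scratch input standing in for one.txt (contents are made up).
WORKDIR="$(mktemp -d)"
printf 'hello hadoop\nhello hdfs\n' > "$WORKDIR/one.txt"

# Put one word per line, drop empty lines, sort, count duplicates,
# then print "word<TAB>count", the layout the job output uses.
# local_wordcount is a hypothetical helper.
local_wordcount() {
  tr -s '[:space:]' '\n' < "$1" \
    | grep -v '^$' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2 "\t" $1 }'
}

local_wordcount "$WORKDIR/one.txt"
```

For the sample input this lists hadoop and hdfs with a count of 1 and hello with a count of 2.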
Installing HBase in Standalone Mode
1. Download the latest stable version of HBase from http://www.interior-dsgn.com/apache/hbase/stable/ using the "wget" command, and extract it using the tar "zxvf" command. See the following commands:
$ cd /usr/local/
$ wget http://www.interior-dsgn.com/apache/hbase/stable/hbase-0.94.8.tar.gz
$ tar -zxvf hbase-0.94.8.tar.gz
2. Configuring HBase in Standalone Mode:
hbase-env.sh:
Set the Java home for HBase by opening the hbase-env.sh file from the conf folder. Edit the JAVA_HOME environment variable and change the existing path to your current JAVA_HOME value as shown below:
cd /usr/local/hadoop/HBase/conf
gedit hbase-env.sh
This will open the env.sh file of HBase. Now replace the existing JAVA_HOME value with your current value as shown below.
export JAVA_HOME=/usr/lib/jvm/java-1.7.0
hbase-site.xml:
This is the main configuration file of HBase. Set the data directory to an appropriate location by opening the HBase home folder in /usr/local/hadoop/HBase. Inside the conf folder, you will find several files; open the hbase-site.xml file as shown below.
#cd /usr/local/hadoop/HBase/
#cd conf
# gedit hbase-site.xml
Inside the hbase-site.xml file, you will find the <configuration> and </configuration> tags. Within them, set the HBase directory under the property key with the name "hbase.rootdir" as shown below.
<configuration>
<!-- Set the path where you want HBase to store its files. -->
<property>
<name>hbase.rootdir</name>
<value>file:/home/hadoop/HBase/HFiles</value>
</property>
<!-- Set the path where you want HBase to store its built-in ZooKeeper files. -->
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/zookeeper</value>
</property>
</configuration>
With this, the HBase installation and configuration part is successfully complete.
We can start HBase by using the start-hbase.sh script provided in the bin folder of HBase. For that, open the HBase home folder and run the HBase start script as shown below:
$ cd /usr/local/hadoop/HBase/bin
$ ./start-hbase.sh
If everything goes well, running the HBase start script prints a message saying that HBase has started:
starting master, logging to /usr/local/hadoop/HBase/bin/../logs/hbase-tpmaster-localhost.localdomain.out
Starting HBase Shell
After installing HBase successfully, you can start the HBase shell. Open the terminal and log in as the superuser.
Start Hadoop File System:
Browse to the bin folder of the Hadoop home (the start scripts live there in Hadoop 1.x; Hadoop 2.x keeps them in sbin) and start the Hadoop file system as shown below:
$ cd $HADOOP_HOME/bin
$ ./start-all.sh
Start HBase:
Browse to the HBase root directory and start HBase from its bin folder.
$ cd /usr/local/hadoop/HBase
$ ./bin/start-hbase.sh
Start HBase Master Server:
This will be the same directory. Start it as shown below:
$ ./bin/local-master-backup.sh start 2 (the number signifies a specific server.)
Start Region:
Start the region server as shown below:
$ ./bin/local-regionservers.sh start 3
Start HBase Shell
You can start the HBase shell using the following commands:
$ cd bin
$ ./hbase shell
This will give you the HBase shell prompt as shown below.
2014-12-09 14:24:27,526 INFO [main] Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.98.8-hadoop2, r6cfc8d064754251365e070a10a82eb169956d5fe, Fri Nov 14 18:26:29 PST 2014
hbase(main):001:0>
HBase Web Interface:
To access the web interface of HBase, type the following URL in the browser:
http://localhost:60010
This interface lists your currently running region servers, backup masters and HBase tables.
Setting Java Environment:
We can also communicate with HBase using Java libraries, but before accessing HBase using the Java API you need to set the classpath for those libraries.
Setting the Classpath:
Before proceeding with programming, set the classpath to the HBase libraries in the .bashrc file. Open .bashrc in any editor as shown below.
$ gedit ~/.bashrc
Set the classpath for the HBase libraries (the lib folder in HBase) in it as shown below.
export CLASSPATH=$CLASSPATH:/home/hadoop/hbase/lib/*
This is to prevent the "class not found" exception while accessing HBase using the Java API.
HBase shell commands:
HBase shell commands are mainly categorized into 6 parts
1) General HBase shell commands:
status
Show cluster status. Can be 'summary', 'simple', or 'detailed'. The default is 'summary'.
hbase> status
hbase> status 'simple'
hbase> status 'summary'
hbase> status 'detailed'
version
Output this HBase version. Usage:
hbase> version
whoami
Show the current hbase user. Usage:
hbase> whoami
2) Tables Management commands:
alter
Alter column family schema; pass table name and a dictionary specifying new column family schema.
hbase> alter 't1', NAME => 'f1', VERSIONS => 5
You can operate on several column families:
hbase> alter 't1', 'f1', {NAME => 'f2', IN_MEMORY => true}, {NAME => 'f3', VERSIONS => 5}
To delete the 'f1' column family in table 't1', use one of:
hbase> alter 't1', NAME => 'f1', METHOD => 'delete'
hbase> alter 't1', 'delete' => 'f1'
You can also change table-scope attributes like MAX_FILESIZE, READONLY, MEMSTORE_FLUSHSIZE, DEFERRED_LOG_FLUSH, etc. These can be put at the end; for example, to change the max size of a region to 128MB, do:
hbase> alter 't1', MAX_FILESIZE => '134217728'
hbase> alter 't1', CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}
hbase> alter 't1', {NAME => 'f2', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
You can also remove a table-scope attribute:
hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'MAX_FILESIZE'
hbase> alter 't1', METHOD => 'table_att_unset', NAME => 'coprocessor$1'
There could be more than one alteration in one command:
hbase> alter 't1', { NAME => 'f1', VERSIONS => 3 }, { MAX_FILESIZE => '134217728' }, { METHOD => 'delete', NAME => 'f2' }, OWNER => 'johndoe', METADATA => { 'mykey' => 'myvalue' }
create
Create table; pass table name, a dictionary of specifications per column family, and optionally a dictionary of table configuration.
hbase> create 't1', {NAME => 'f1', VERSIONS => 5}
hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
hbase> # The above in shorthand would be the following:
hbase> create 't1', 'f1', 'f2', 'f3'
hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
Table configuration options can be put at the end.
describe
Describe the named table.
hbase> describe 't1'
disable
Start disable of the named table.
hbase> disable 't1'
disable_all
Disable all tables matching the given regex.
hbase> disable_all 't.*'
is_disabled
Verifies whether the named table is disabled.
hbase> is_disabled 't1'
drop
Drop the named table. The table must first be disabled.
hbase> drop 't1'
drop_all
Drop all of the tables matching the given regex.
hbase> drop_all 't.*'
enable
Start enable of the named table.
hbase> enable 't1'
enable_all
Enable all of the tables matching the given regex.
hbase> enable_all 't.*'
is_enabled
Verifies whether the named table is enabled.
hbase> is_enabled 't1'
exists
Checks whether the named table exists.
hbase> exists 't1'
list
List all tables in hbase. An optional regular expression parameter can be used to filter the output.
hbase> list
hbase> list 'abc.*'
show_filters
Show all the filters in hbase.
hbase> show_filters
alter_status
Get the status of the alter command. Indicates the number of regions of the table that have received the updated schema. Pass the table name.
hbase> alter_status 't1'
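The table management commands can also be run as a batch instead of being typed one by one. The sketch below writes a small command file that walks a table through create, disable, alter and drop, and passes it to hbase shell only if hbase is found on the PATH. The table name demo_lifecycle and the HBASE_SCRIPT variable are made up for illustration. The alter is placed after disable on the assumption that the installed version may not allow schema changes on an enabled table.

```shell
# HBASE_SCRIPT is an assumption: it defaults to a scratch file.
HBASE_SCRIPT="${HBASE_SCRIPT:-$(mktemp)}"

# demo_lifecycle is a made-up table name chosen to avoid clashing
# with real tables; the script drops it again at the end.
cat > "$HBASE_SCRIPT" <<'EOF'
create 'demo_lifecycle', 'f1'
exists 'demo_lifecycle'
describe 'demo_lifecycle'
disable 'demo_lifecycle'
is_disabled 'demo_lifecycle'
alter 'demo_lifecycle', NAME => 'f1', VERSIONS => 5
drop 'demo_lifecycle'
exit
EOF

# Run the batch only when an hbase launcher is available.
if command -v hbase >/dev/null 2>&1; then
  hbase shell "$HBASE_SCRIPT"
else
  echo "hbase is not on PATH; commands were written to $HBASE_SCRIPT"
fi
```

The order matters in one place: drop comes after disable, because a table must be disabled before it can be dropped.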