

PS: Feel free to leave a comment if you need help.

A few words up front

About this post
None of the tutorials linked here were written by me: they are all detailed, carefully written guides by experienced authors, collected one by one. Beginners, thank you for your attention!
The posts are listed in no particular order, and none is better than another; they are also only the tip of the iceberg, just a few articles I thought were suitable for beginners.
About how to use this guide
If you already have a rough idea of the process and some basic Linux skills, you can jump straight to the install commands collected at the end of this post and work through them. They are terse, but they cover a complete installation.
It also doesn't matter if you have no background: plenty of beginner-friendly posts are recommended below, so you can build up an understanding step by step.

Getting started with learning and installing Hadoop

This guide defaults to a fully distributed installation.
The test environment is three Ubuntu Server virtual machines (the Linux distributions in the linked posts may differ, but that matters little).
Why not choose the latest version of Hadoop? Tutorials for the newest release are still scarce, and I was just starting to learn, so I chose an older version with better compatibility.

First, start by installing the virtual machines:
If everything goes well, the Hadoop installation is basically complete at this point; if you still run into problems, read on.

The posts below give more installation details and cover many common pitfalls.

Illustrated step-by-step guides

Pseudo-distributed installation

The following posts cover the installation process for the newer 3.0 release:
https://www.jianshu.com/p/1d99be0d2544 <https://www.jianshu.com/p/1d99be0d2544>

Errors encountered during installation

PS: typical mistakes I ran into myself

Pitfall-avoidance notes, with solutions to many typical errors

Choosing a Hadoop version and the differences between the official downloads

https://www.jianshu.com/p/a6bfe81247b6 <https://www.jianshu.com/p/a6bfe81247b6>
Error: JAVA_HOME is not set and could not be found
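A frequent cause of this error is that the start scripts reach each node over non-interactive SSH, which does not source /etc/profile, so JAVA_HOME is unset even though `java -version` works in your own shell. A fix that usually works (the JDK path below assumes the location used later in this guide) is to hard-code JAVA_HOME in Hadoop's own environment file:

```sh
# $HADOOP_HOME/etc/hadoop/hadoop-env.sh
# replace the default "export JAVA_HOME=${JAVA_HOME}" with an absolute path:
export JAVA_HOME=/opt/jdk1.8.0_161
```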


ERROR: Cannot set priority of datanode process 32156
(when this appears, the real cause is usually in the DataNode log under $HADOOP_HOME/logs)

With that, the Hadoop installation is complete.

Installing HBase + ZooKeeper

I did not follow a single article for this part; at the time I could not find one suitable end-to-end guide. Later I found this one:
it includes the version compatibility constraints, which are very valuable.

I started the HBase installation from this article, with one difference: I did not use the ZooKeeper bundled with HBase, so one setting below does not follow that article.

On disabling HBase's bundled ZooKeeper and using a separately installed ZooKeeper

Then move on to the ZooKeeper installation.
With that, the installation is complete.


Perhaps because of all the earlier failures, the later installs were done mostly by hand from memory; everything was installed in a single evening.

What I consider important during installation:

* Passwordless SSH is only needed between the master and each slave; the slaves do not need it among themselves.
* When a configuration file references a directory, you can create the folder yourself.
* Differences between 3.0 and the 2.x versions do exist; if you are familiar with 2.x, installing 3.0 may go more smoothly.
* The root account on a CentOS virtual machine spares you permission issues during installation, but real-world use will still require proper permission management later.
* Fluency with shell commands speeds up the installation considerably.
* Understanding what each configuration entry means is genuinely worthwhile.
* You only need to configure one machine; copying to the rest will do!!!
Command reference

Initial setup
Install the SSH server on each of hadoop1, hadoop2, and hadoop3:
sudo apt install openssh-server
Switch the apt sources
sudo vi /etc/apt/sources.list
# The deb-src entries are commented out by default to speed up apt update; uncomment them if needed.
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-updates main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-backports main restricted universe multiverse
deb https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-security main restricted universe multiverse
# deb-src https://mirrors.tuna.tsinghua.edu.cn/ubuntu/ xenial-security main restricted universe multiverse
Update and install tools
sudo apt update
sudo apt install vim
# if apt is locked by another process, find and kill it:
ps -A | grep apt
sudo kill -9 <process ID>
# sudo apt-get upgrade
Change the hostname
sudo hostname hadoop1
# this change is temporary; edit /etc/hostname to make it permanent
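Beyond the hostname itself, the three machines must be able to resolve each other's names, or later ssh/scp and the Hadoop daemons cannot find their peers. This guide does not show that step; a sketch of /etc/hosts, identical on all three nodes, might look like this (the IP addresses are placeholders; substitute your VMs' actual addresses):

```sh
# /etc/hosts (same on all three nodes; example addresses only)
192.168.1.101   hadoop1
192.168.1.102   hadoop2
192.168.1.103   hadoop3
```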
Set the time zone
sudo tzselect
sudo cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
Install the JDK and verify
tar -zxf jdk-8u161-linux-x64.tar.gz
sudo mv jdk1.8.0_161/ /opt/
sudo vim /etc/profile.d/jdk1.8.sh
    export JAVA_HOME=/opt/jdk1.8.0_161
    export JRE_HOME=${JAVA_HOME}/jre
    export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
    export PATH=${JAVA_HOME}/bin:$PATH
source /etc/profile
java -version
Passwordless SSH
sudo apt install ssh
sudo apt install rsync
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop1
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop2
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop3
Hadoop environment variables
sudo vim /etc/profile.d/hadoop2.7.5.sh
    #!/bin/sh
    export HADOOP_HOME="/opt/hadoop-2.7.5"
    export PATH="$HADOOP_HOME/bin:$PATH"
    export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
    export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
Hadoop's internal environment variable
vim /opt/hadoop-2.7.5/etc/hadoop/hadoop-env.sh
    export JAVA_HOME=/opt/jdk1.8.0_161
Configure the slave nodes
vim /opt/hadoop-2.7.5/etc/hadoop/slaves
    hadoop2
    hadoop3
File configuration
---------------core-site.xml---------------
<configuration>
    <!-- URI of the HDFS NameNode -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop1:9000</value>
    </property>
    <!-- Size of read/write buffer used in SequenceFiles. -->
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <!-- Hadoop temporary directory; create the folder yourself -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/zd/hadoop/tmp</value>
    </property>
</configuration>
---------------hdfs-site.xml---------------
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop1:50090</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/home/zd/hadoop/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/home/zd/hadoop/hdfs/data</value>
    </property>
</configuration>
---------------yarn-site.xml---------------
<configuration>
    <!-- Site specific YARN configuration properties -->
    <!-- Configurations for ResourceManager -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>hadoop1:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>hadoop1:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>hadoop1:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>hadoop1:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>hadoop1:8088</value>
    </property>
</configuration>
---------------mapred-site.xml---------------
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>hadoop1:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>hadoop1:19888</value>
    </property>
</configuration>
Copy to the slave nodes
scp -r hadoop-2.7.5 hadoop2:
sudo mv hadoop-2.7.5 /opt/
Format the NameNode and start the cluster
cd /opt/hadoop-2.7.5
hdfs namenode -format
cd /opt/hadoop-2.7.5/sbin
./start-all.sh
Verify: check the daemons with jps and open the web UIs (by default the NameNode UI is at hadoop1:50070 and the ResourceManager UI at hadoop1:8088).
Configure the HBase environment variables
sudo vim /etc/profile.d/hbase1.2.6.sh
    export HBASE_HOME=/opt/hbase-1.2.6
    export PATH=$HBASE_HOME/bin:$PATH
Configure the JDK inside HBase
vim /opt/hbase-1.2.6/conf/hbase-env.sh
    # uncomment and modify:
    export JAVA_HOME=/opt/jdk1.8.0_161
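As noted earlier, this setup disables the ZooKeeper bundled with HBase in favor of a separately installed one. The switch for that lives in the same hbase-env.sh:

```sh
# /opt/hbase-1.2.6/conf/hbase-env.sh
# tell HBase not to start or stop its own ZooKeeper instance
export HBASE_MANAGES_ZK=false
```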
configuration file
---------------hbase-site.xml---------------<configuration> <property> <name>
hbase.rootdir</name> <!-- hbase Storage data directory --> <value>
hdfs://Hadoop1:9000/zd/hbase/hbase_db</value> <!--
Port to and Hadoop Of fs.defaultFS Port consistency --> </property> <property> <name>
hbase.cluster.distributed</name> <!-- Distributed deployment or not --> <value>true</value> </property
> </configuration>
This step does not follow the tutorial above:
cd /opt/hbase-1.2.6/conf
vim regionservers
    # remove the default localhost and add:
    hadoop2
    hadoop3
Copy to the slave nodes
scp -r hbase-1.2.6 hadoop2:
ZooKeeper environment variables
sudo vim /etc/profile.d/zookeeper-3.4.8.sh
    export ZOOKEEPER=/opt/zookeeper-3.4.8
    export PATH=$PATH:$ZOOKEEPER/bin
source /etc/profile
Configure
cp zoo_sample.cfg zoo.cfg
# modify:
dataDir=/opt/zookeeper-3.4.8/data
# add:
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
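For reference, the resulting zoo.cfg should look roughly like this; tickTime, initLimit, syncLimit, and clientPort are the defaults carried over from zoo_sample.cfg:

```sh
# /opt/zookeeper-3.4.8/conf/zoo.cfg
# basic time unit in milliseconds
tickTime=2000
# ticks a follower may take for its initial connection and sync
initLimit=10
# ticks a follower may lag behind the leader
syncLimit=5
# port that clients connect to
clientPort=2181
dataDir=/opt/zookeeper-3.4.8/data
server.1=hadoop1:2888:3888
server.2=hadoop2:2888:3888
server.3=hadoop3:2888:3888
```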
Send to the slave nodes
scp -r zookeeper-3.4.8 hadoop2:
Configure
On each of the three machines, create a myid file in the data directory:
cd /opt/zookeeper-3.4.8/data
vi myid
Fill in the number matching the server entry: the master is server.1, so its myid contains 1; slave 1 is server.2, so its myid contains 2; slave 2 is server.3, so its myid contains 3.
Start ZooKeeper on all three machines:
zkServer.sh start
Then check the status on each machine:
zkServer.sh status
The result looks like:
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.8/bin/../conf/zoo.cfg
Mode: follower
Check with jps; the output should include QuorumPeerMain.
Done. Thank you for reading.