hadoop的安装与配置（完全分布式）

完全分布式模式：

　　前面已经说了本地模式和伪分布模式，这两种在hadoop的应用中并不用于实际，因为几乎没人会将整个hadoop集群搭建在一台服务器上（hadoop主要是围绕：分布式计算和分布式存储，如果以一台服务器做，那就完全违背了hadoop的核心方法）。简单说，本地模式是hadoop的安装，伪分布模式是本地搭建hadoop的模拟环境。（当然实际上并不是这个样子的，小博主有机会给大家说！）

那么在hadoop的搭建，其实真正用于生产的就是完全分布式模式：

思路简介

域名解析

ssh免密登陆

java和hadoop环境

配置hadoop文件

复制主节点到其他节点

格式化主节点

hadoop搭建过程+简介

在搭建完全分布式前大家需要了解以下内容，以便于大家更好的了解hadoop环境：

1.hadoop的核心：分布式存储和分布式计算（用官方的说法就是HDFS和MapReduce）

2.集群结构：1+1+n 集群结构（主节点+备用节点+多个从节点）

3.域名解析：这里为了方便，我们选择修改/etc/hosts实现域名解析（hadoop会在…/etc/hadoop/salves下添加从节点，这里需要解析名，当然你也能直接输入ip地址，更简单）

4.hadoop的命令发放，需要从ssh接口登录到其他服务器上，所以需要配置ssh免密登陆

5.本文采取1+1+3 集群方式：域名为：s100（主）,s10（备主）,s1,s2,s3（从）

一：配置域名解析

主——s100:

[root@localhost ~]# vim /etc/hosts

1 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
2 ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
3 192.168.1.68 s100
4 192.168.1.108 s1
5 192.168.1.104 s2
6 192.168.1.198 s3
7 192.168.1.197 s10

将s100上的/etc/hosts拷贝到其他hadoop的集群服务器上。例如：

将s100的/etc/hosts拷贝到s1上

[root@localhost ~]# scp /etc/hosts root@192.168.1.108:/etc/hosts
The authenticity of host \'192.168.1.108 (192.168.1.108)\' can\'t be established.
RSA key fingerprint is dd:64:75:5f:96:11:07:39:a3:fb:aa:3c:30:ae:59:82.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added \'192.168.1.108\' (RSA) to the list of known hosts.
root@192.168.1.108\'s password: 
hosts                                                    100%  246     0.2KB/s   00:00

将所有服务器的域名解析配置完成，进行下一步

二：配置ssh免密码登录主——s100：

ssh生成相应密钥对：id_rsa私钥和id_rsa.pub公钥

[root@localhost ~]# ssh-keygen -t rsa -P \'\'
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
a4:6e:8d:31:66:e1:92:04:37:8e:1c:a5:83:5e:39:c5 root@localhost.localdomain
The key\'s randomart image is:
+--[ RSA 2048]----+
|  o.=.           |
| o BoE           |
|. =+o . .        |
|. .o.o +         |
| .  o B S        |
|     = =         |
|      + .        |
|     .           |
|                 |
+-----------------+
[root@localhost ~]# cd /root/.ssh/
[root@localhost .ssh]# ls
id_rsa  id_rsa.pub  known_hosts

默认是存在/当前user/.ssh（/root/.ssh或者/home/user/.ssh）下的！

有了密钥对：将id_rsa.pub加到授权中：

[root@localhost .ssh]# cat id_rsa.pub >> authorized_keys（/root/.ssh下）
[root@localhost .ssh]# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts

试一下是否本地免密登陆设置成功：

[root@localhost .ssh]# ssh localhost
The authenticity of host \'localhost (::1)\' can\'t be established.
RSA key fingerprint is 9e:e0:91:0f:1f:98:af:1a:83:5d:33:06:03:8a:39:93.
Are you sure you want to continue connecting (yes/no)? yes（第一次登陆需要确定）
Warning: Permanently added \'localhost\' (RSA) to the list of known hosts.
Last login: Tue Dec 26 19:09:23 2017 from 192.168.1.156
[root@localhost ~]# exit
logout
Connection to localhost closed.

ok！没有问题，那么配置其他服务器，其实只需要把本机s100的id_rsa.pub复制到其他服务器上就可以了！

这里就选择ssh-copy-id命令传送到其他服务器上

[root@localhost .ssh]# ssh-copy-id root@s1（s1是主机地址，这里提醒大家一下，因为有人因为这个问题问过我╭(╯^╰)╮）
The authenticity of host \'s1 (192.168.1.108)\' can\'t be established.
RSA key fingerprint is dd:64:75:5f:96:11:07:39:a3:fb:aa:3c:30:ae:59:82.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added \'s1\' (RSA) to the list of known hosts.
root@s1\'s password: 
Now try logging into the machine, with "ssh \'root@s1\'", and check in:
  .ssh/authorized_keys
to make sure we haven\'t added extra keys that you weren\'t expecting.

主节点

三：配置java环境和安装hadoop（hadoop环境）

备注：这里小伙伴必须要知道的是，不管hadoop的主节点还是从节点甚至说备主节点，他们的java环境和hadoop环境都一样，所以我们只需要配置完一台服务器，就完成了所有的配置过程

因为完全分布模式也是在本地模式的基础上配置的，所以我们首先配置本地模式：

完全分布式模式 = 本地模式 + 配置文件

java环境和hadoop的安装等过程就是前面所说的本地模式了，这里就不多说了：

四：配置内容：

备注：对于配置文件以后会有时间会单独写一篇相关的文档

主要修改以下五个文件：

hadoop的配置文件：/data/hadoop/etc/hadoop
[root@localhost hadoop]# cd /data/hadoop/etc/hadoop
[root@localhost hadoop]# ls
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml4
slaves

配置 core-site.xml：

主要：指定主节点

<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://s100/</value>
        </property>
        #临时文件
        <property>
                <name>hadoop.tmp.dir</name>
                <value>/root/hadoop</value>
        </property>
</configuration>

配置hdfs-site.xml：

主要：指定备份数量

<configuration>
        #默认备份数为3，如果采取默认数，那么slaves不能少于备份数
        <property>
                <name>dfs.replication</name>
                <value>2</value>#备份数
        </property>
        #备主
        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>s10:50000</value>
        </property>
 
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:///${hadoop.tmp.dir}/dfs/name</value>
        </property>
</configuration>

配置mapred-site.xml：

主要：指定资源管理yran方法

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
</configuration>

配置yarn-site.xml：

<configuration>
        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>s100</value>
        </property>
        
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
</configuration>

配置slaves：

s1
s2
s3

五：scp-java环境和hadoop配置文件（hadoop环境）,java环境直接安装

做到这里，基本就完成了，现在就把主节点的所以配置都放到从节点上！

scp -r /data/hadoop root@s1:/data/

复制hadoop

[root@localhost ~]# scp /etc/profile root@s1:/etc/profile

复制环境变量

登录到s1中执行source

[root@localhost ~]# ssh s1
Last login: Wed Dec 27 23:18:48 2017 from s100
[root@localhost ~]# source /etc/profile

s1配置完成，其他的服务器一样！

六：格式化主节点

[root@localhost ~]# hadoop namenode -format

启动hadoop：

start-all.sh

关闭hadoop：

stop-all.sh

jps查询进程信息

主节点：

[root@localhost ~]# jps
 30996 Jps
 30645 NameNode
 30917 ResourceManager

2主节点：

[root@localhost ~]# jps
 33571 Jps
 33533 SecondaryNameNode

从节点：

[root@localhost ~]# jps
 33720 Jps
 33691 NodeManager
 33630 DataNode

本文链接：https://www.cnblogs.com/xieys-1993/articles/11983132.html

hadoop的安装与配置（完全分布式）

hadoop搭建过程+简介

hadoop的安装与配置（完全分布式）的更多相关文章

随机推荐

热门专题

目录导航