Today we build a Hadoop ecosystem from open-source software that meets enterprise needs, creating a basic big data analysis platform.
We prepare three machines for a fully distributed Hadoop cluster: one machine serves as the master node, and the other two serve as slave nodes with hostnames slave1 and slave2.
## Preparation
### Resource preparation
| Resource name | Storage directory |
|-----------|-----------------------|
| Hadoop installation package | /opt/package/software |
- Check the experiment environment (firewall, hosts configuration, passwordless SSH)
- Deploy the Hadoop cluster (install Hadoop, create the HDFS data directories, edit the configuration files, sync the master to the slave nodes)
- Test the Hadoop cluster (start the cluster, verify the cluster)
### Experiment architecture
Set the hostname of each machine and map IP addresses to hostnames as follows:
| IP address | Hostname | Role |
|----------------|--------|--------------------------|
| 192.168.80.101 | master | NameNode, ResourceManager |
| 192.168.80.102 | slave1 | DataNode, NodeManager |
| 192.168.80.103 | slave2 | DataNode, NodeManager |
### Environment
- Hadoop 2.7.5
- VMware Workstation 15.1.0 Pro for Windows
- Virtual machine images
## Experiment steps
### (1) Check the environment
#Command to stop the firewall
[root@slave1 ~]# systemctl stop firewalld.service
#### 1. Check that the firewall is stopped
[root@slave1 ~]# firewall-cmd --state
not running
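Stopping firewalld only lasts until the next reboot; to keep it disabled permanently (an extra step, not shown above), disable the service as well:
[root@slave1 ~]# systemctl disable firewalld.service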
#### 2. Check the hosts file on all three virtual machines
[root@master ~]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.80.101 master
192.168.80.102 slave1
192.168.80.103 slave2
#### 3. Check the SSH setup
[root@master ~]# ssh slave1 date
Mon Nov 19 10:23:43 CST 2018
[root@master ~]# ssh slave2 date
Mon Nov 19 10:23:52 CST 2018
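If passwordless SSH were not yet in place, a typical setup from the master would look like this (a sketch; run ssh-copy-id once per slave):
[root@master ~]# ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
[root@master ~]# ssh-copy-id root@slave1
[root@master ~]# ssh-copy-id root@slave2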
### (2) Deploy the Hadoop cluster
#### 1. Install Hadoop
**#Extract the installation package**
[root@master ~]# tar zxvf /opt/package/software/hadoop-2.7.5.tar.gz -C /usr/local
**#Rename the Hadoop installation directory**
[root@master ~]# mv /usr/local/hadoop-2.7.5 /usr/local/hadoop
#### 2. Create the HDFS data directories
**#Remove any previous data directory and recreate it**
[root@master ~]# rm -rf /home/hadoopdir
[root@master ~]# mkdir /home/hadoopdir
**#Create the temporary file directory**
[root@master ~]# mkdir /home/hadoopdir/tmp
**#Create the NameNode data directory**
[root@master ~]# mkdir -p /home/hadoopdir/dfs/name
**#Create the DataNode data directory**
[root@master ~]# mkdir /home/hadoopdir/dfs/data
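Equivalently, the whole layout can be created with a single command using bash brace expansion:
[root@master ~]# mkdir -p /home/hadoopdir/{tmp,dfs/name,dfs/data}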
#### 3. Edit the configuration files
**1) Configure environment variables**
**#Edit /etc/profile and append the following lines**
[root@master ~]# vi /etc/profile
**export HADOOP_INSTALL=/usr/local/hadoop**
**export PATH=${HADOOP_INSTALL}/bin:${HADOOP_INSTALL}/sbin:${PATH}**
**#Make /etc/profile take effect**
[root@master ~]# source /etc/profile
**#Set JAVA_HOME in hadoop-env.sh**
[root@master ~]# vi /usr/local/hadoop/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk/jre
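A quick sanity check that this JAVA_HOME path is valid (the path is taken from the line above):
[root@master ~]# /usr/local/jdk/jre/bin/java -version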
**#Verify the Hadoop version**
[root@master ~]# hadoop version
Hadoop 2.7.5
**2) Edit core-site.xml as follows**
[root@master ~]# vim /usr/local/hadoop/etc/hadoop/core-site.xml
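The XML content itself was not preserved here. A minimal core-site.xml sketch, assuming the default HDFS port 9000 on the master and the tmp directory created in "2. Create the HDFS data directories":

```xml
<configuration>
  <!-- Assumption: NameNode endpoint on the master host, default port -->
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9000</value>
  </property>
  <!-- Reuses the temporary directory created earlier -->
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoopdir/tmp</value>
  </property>
</configuration>
```

The remaining configuration edits were not preserved either. Given the name and data directories created earlier, hdfs-site.xml plausibly contained at least the following (the replication factor of 2 is an assumption matching the two DataNodes):

```xml
<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/hadoopdir/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/hadoopdir/dfs/data</value>
  </property>
  <!-- Assumption: one replica per DataNode -->
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
</configuration>
```

The startup log below shows DataNodes and NodeManagers being started on slave1 and slave2, so the slaves file must have listed them:

[root@master ~]# cat /usr/local/hadoop/etc/hadoop/slaves
slave1
slave2

### (3) Sync the /usr/local/hadoop directory to the slave nodes
The original commands for this step were not preserved; a sketch using scp over the passwordless SSH verified in step (1):
[root@master ~]# scp -r /usr/local/hadoop slave1:/usr/local/
[root@master ~]# scp -r /usr/local/hadoop slave2:/usr/local/
Also sync the environment variables so the hadoop command resolves on the slaves:
[root@master ~]# scp /etc/profile slave1:/etc/profile
[root@master ~]# scp /etc/profile slave2:/etc/profile
### (4) Sync the /home/hadoopdir directory to the slave nodes
Likewise a sketch:
[root@master ~]# scp -r /home/hadoopdir slave1:/home/
[root@master ~]# scp -r /home/hadoopdir slave2:/home/
### (5) Start the Hadoop cluster
Before the first start the NameNode has to be formatted; the command does not appear in the preserved text but is implied:
[root@master ~]# hdfs namenode -format
The log that follows matches the output of start-all.sh run on the master:
[root@master ~]# start-all.sh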
master: starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-master.out
slave1: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-slave1.out
slave2: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-slave2.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:lrhnnND23cf0F9Azp4qUwS+Ek6+LscJ28CRce/NofA0.
ECDSA key fingerprint is MD5:56:6b:86:5e:df:6f:4f:70:af:fc:3f:d2:81:c8:a8:e6.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-root-secondarynamenode-master.out
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-root-resourcemanager-master.out
slave1: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-slave1.out
slave2: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-slave2.out
### (6) Verify the Hadoop cluster
#### 1. Check the Java processes with jps
#master
[root@master ~]# jps
7779 Jps
7349 SecondaryNameNode
7499 ResourceManager
7134 NameNode
#slave1
[root@slave1 ~]# jps
3169 DataNode
3445 Jps
3277 NodeManager
#slave2
[root@slave2 ~]# jps
3270 NodeManager
3162 DataNode
3391 Jps
#### 2. Check the web UI
Open a browser and visit http://master:50070 to view the HDFS status page.
Open a browser and visit http://master:8088 to check the YARN environment.
With the fully distributed installation, the JDK environment and SSH key authentication have to be deployed in advance. Once installed and started, you can visit the web interface at http://localhost:50070 to view NameNode and DataNode information and browse the files in HDFS online.
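As a final smoke test, a small HDFS round trip confirms the cluster accepts writes and reads (the /test path is only an example):
[root@master ~]# hdfs dfs -mkdir /test
[root@master ~]# hdfs dfs -put /etc/hosts /test/
[root@master ~]# hdfs dfs -ls /test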