Hadoop Development in Practice: https://www.borimooc.com/course/1004.htm
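Hadoop is a framework for distributed storage and distributed computation over massive data sets. Its core components are:

- MapReduce: distributed computation over massive data, split into a map phase, a shuffle phase, and a reduce phase
- HDFS: a distributed file system for storing massive data
- YARN: the resource scheduler

Before using Hadoop, you first need to understand its installation modes:

- Standalone mode
- Pseudo-distributed mode
- Fully distributed mode (cluster mode)

Standalone and pseudo-distributed mode work as follows:

- Standalone mode: Hadoop's default mode. Knowing nothing about the hardware it is installed on, Hadoop conservatively chooses a minimal configuration when the package is first unpacked. It runs entirely on the local machine, so it never needs to talk to other nodes, does not use HDFS, and does not start any Hadoop daemons. Standalone mode works without starting any services and is generally used only for debugging.
- Pseudo-distributed mode: a special case of fully distributed mode in which all Hadoop daemons run on a single node. It is used to debug the code of distributed Hadoop programs and to verify that they execute correctly.
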
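## 1 User permissions
Every component of the Hadoop cluster is installed as the hd user (uploading, unpacking, configuring, starting, stopping, and so on).

root is the Linux superuser and should generally be avoided. During the cluster installation, root is needed only for system-level changes such as editing /etc/profile.

Before each step of the installation, double-check which user you are currently logged in as:

\[root@localhost \~\] means the current user is root, localhost is the machine name, and \~ is the user's home directory (/root)  
\[hd@localhost \~\] means the current user is hd, localhost is the machine name, and \~ is the user's home directory (/home/hd)

- root: super administrator (modifies system files)
- hd: ordinary user (performs all create, delete, modify, and query operations under /home/hd/\*)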
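
The "confirm who you are" check can also be scripted. The following is a minimal sketch, not part of the original tutorial: `require_user` is a hypothetical helper name, and it only compares the login name reported by `id -un` with the name you pass in.

```shell
# Hypothetical helper: succeed only when the current login name matches "$1".
require_user() {
    expected="$1"
    current="$(id -un)"
    if [ "$current" != "$expected" ]; then
        echo "expected user '$expected' but running as '$current'" >&2
        return 1
    fi
}
```

An install script could begin with `require_user hd || exit 1` so that it stops early when started as root by mistake.
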
## 2 Get the machine's IP address
\[root@localhost \~\]# ifconfig  
eno16777736: flags=4163\<UP,BROADCAST,RUNNING,MULTICAST\> mtu 1500  
inet 192.168.126.128 netmask 255.255.255.0 broadcast 192.168.126.255

If the ifconfig command is not available, install net-tools:

\[hd@localhost root\]$ su root  
Password:  
\[root@localhost \~\]# yum install -y net-tools
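
When the address is needed in a script, it can be pulled out of the ifconfig output. This is a sketch under one assumption: the output uses the `inet <address> netmask ...` layout shown above (older net-tools versions print `inet addr:<address>` instead, which this does not handle). `extract_ipv4` is a hypothetical helper name.

```shell
# Hypothetical helper: read ifconfig-style text on stdin and print every
# non-loopback IPv4 address found on an "inet" line.
extract_ipv4() {
    awk '$1 == "inet" && $2 !~ /^127\./ { print $2 }'
}
```

Typical use would be `ifconfig | extract_ipv4`.
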
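## 3 Set a static address for the network interface
Method one:

#Switch to the root user  
\[hd@bogon Desktop\]$ su root  
Password:  
#Edit the network interface configuration  
\[root@bogon Desktop\]# vi /etc/sysconfig/network-scripts/ifcfg-eth0  
#Change only the following settings:  
BOOTPROTO="static" #modify  
ONBOOT="yes" #modify  
IPADDR=192.168.245.20 #modify  
#Restart the network service  
\[root@bogon Desktop\]# service network restart  
#Log in again and check the IP information  
\[root@bogon Desktop\]# ifconfig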
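
A static address usually also needs a netmask and a gateway. The sketch below prints the relevant lines so they can be reviewed before being pasted into the ifcfg file. `render_static_ifcfg` is a hypothetical helper, `NETMASK` and `GATEWAY` are standard ifcfg keys that method one above does not set, and the gateway value in the demo call is an assumption, so substitute the gateway of your own network. The file name also follows the interface name (`ifcfg-<interface>`), so it may not be `ifcfg-eth0` on your machine.

```shell
# Hypothetical helper: print the static-addressing lines for an ifcfg file.
# Arguments: IP address, netmask, gateway (the last two are assumptions;
# method one in the tutorial only sets BOOTPROTO, ONBOOT and IPADDR).
render_static_ifcfg() {
    printf 'BOOTPROTO="static"\n'
    printf 'ONBOOT="yes"\n'
    printf 'IPADDR=%s\n' "$1"
    printf 'NETMASK=%s\n' "$2"
    printf 'GATEWAY=%s\n' "$3"
}

# Demo values: the gateway 192.168.245.2 is a placeholder, not from the tutorial.
render_static_ifcfg 192.168.245.20 255.255.255.0 192.168.245.2
```
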

Method two:

Set the network connection mode to NAT.

Look up your gateway address.

In a terminal, run the nmtui command to open the NetworkManager text interface:

\[root@local \~\]# nmtui

Enter the gateway you looked up in the Gateway field. Address is the machine's IP address. Set DNS to 8.8.8.8.

After setting the IP address, save the settings and restart the network service.

Ping the machine from your local computer to make sure the network is reachable.

Then connect remotely with a shell tool.
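## 4 Install the Java environment on Linux
Hadoop is written in Java, and running a Hadoop cluster also depends on a Java environment, so the JDK must be installed and configured before the cluster is installed.
### 4.1 Remove the Java environment bundled with Linux
\[hd@localhost \~\]$ su root  
Password:  
\[root@localhost hd\]# yum remove -y java\*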
### 4.2 Upload the Java package
\[root@localhost hd\]# su hd  
\[hd@localhost \~\]$  
\[hd@localhost \~\]$  
\[hd@localhost \~\]$ pwd  
/home/hd  
\[hd@localhost \~\]$ mkdir apps #upload the package into this directory  
\[hd@localhost \~\]$ cd apps/  
\[hd@localhost apps\]$  
#Upload the package at this point  
\[hd@localhost apps\]$ ll  
total 178952  
-rw-rw-r--. 1 hd hd 183246769 Apr 26 2018 jdk-8u121-linux-x64.tar.gz
### 4.3 Unpack the Java package
#Unpack  
\[hd@localhost apps\]$ tar -zxvf jdk-8u121-linux-x64.tar.gz   
\[hd@localhost apps\]$ ll  
total 178956  
drwxr-xr-x. 8 hd hd 4096 Dec 12 2016 jdk1.8.0_121  
-rw-rw-r--. 1 hd hd 183246769 Apr 26 2018 jdk-8u121-linux-x64.tar.gz  
\[hd@localhost apps\]$  
#Rename the directory  
\[hd@localhost apps\]$ mv jdk1.8.0_121/ java   
\[hd@localhost apps\]$ ll  
total 178956  
drwxr-xr-x. 8 hd hd 4096 Dec 12 2016 java  
-rw-rw-r--. 1 hd hd 183246769 Apr 26 2018 jdk-8u121-linux-x64.tar.gz
### 4.4 Configure the Java environment
\[hd@localhost apps\]$ su root  
Password:  
\[root@localhost apps\]# cd java/  
\[root@localhost java\]# pwd  
/home/hd/apps/java  
\[root@localhost java\]#  
\[root@localhost java\]# vi /etc/profile

Use the vi editor to add the Java environment variables to /etc/profile:

export JAVA_HOME=/home/hd/apps/java  
export PATH=$PATH:$JAVA_HOME/bin

Reload the system environment and verify the installation:

\[root@localhost java\]# source /etc/profile  
\[root@localhost java\]# java -version  
java version "1.8.0_121"  
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)  
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
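
If the profile is edited from a script rather than in vi, it helps to avoid adding the same line twice when the script is re-run. The sketch below shows one way to do that; `append_once` is a hypothetical helper, and the demo writes to a scratch file created with `mktemp` instead of the real /etc/profile.

```shell
# Hypothetical helper: append "$1" to file "$2" unless an identical line exists.
append_once() {
    line="$1"
    file="$2"
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration against a scratch file rather than the real /etc/profile.
demo_profile="$(mktemp)"
append_once 'export JAVA_HOME=/home/hd/apps/java' "$demo_profile"
append_once 'export PATH=$PATH:$JAVA_HOME/bin' "$demo_profile"
append_once 'export JAVA_HOME=/home/hd/apps/java' "$demo_profile"   # already present, nothing added
cat "$demo_profile"
rm -f "$demo_profile"
```

Against the real file this would be run as root, followed by `source /etc/profile` as shown above.
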
### 4.5 Configure the Java environment on the second and third machines
Use the scp remote copy command:

scp file1 \[\[user@\]host2:\]file2

1. Copy the java directory from the first machine to the second machine

\[root@localhost apps\]# su hd  
\[hd@localhost apps\]$  
\[hd@localhost apps\]$ scp -r java hd@192.168.126.129:/home/hd/apps/

2. Copy the profile file from the first machine to the second machine

\[hd@localhost apps\]$ su root  
Password:  
\[root@localhost apps\]# scp /etc/profile root@192.168.126.129:/etc/  
The authenticity of host '192.168.126.129 (192.168.126.129)' can't be established.  
ECDSA key fingerprint is fb:0a:7a:9f:9a:bc:4f:ff:66:29:1d:1d:b9:a0:35:d1.  
Are you sure you want to continue connecting (yes/no)? yes  
Warning: Permanently added '192.168.126.129' (ECDSA) to the list of known hosts.  
root@192.168.126.129's password:   
profile 100% 1820 1.8KB/s 00:00   
\[root@localhost apps\]#

3. Load the profile on the second machine

\[hd@localhost apps\]$ source /etc/profile  
\[hd@localhost apps\]$  
\[hd@localhost apps\]$ java -version  
java version "1.8.0_121"  
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)  
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)  
\[hd@localhost apps\]$

Repeat the steps above on the third machine.
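
Because the same two copies are repeated for every additional machine, the commands can be generated in a loop. The sketch below only prints them, so nothing is copied until you run the printed lines yourself. `print_java_sync_cmds` is a hypothetical helper, and the second address in the demo call is taken from the /etc/hosts listing in section 5.2.

```shell
# Hypothetical helper: print (not run) the copy commands for each worker.
# Worker addresses are passed as arguments.
print_java_sync_cmds() {
    for host in "$@"; do
        echo "scp -r /home/hd/apps/java hd@${host}:/home/hd/apps/"
        echo "scp /etc/profile root@${host}:/etc/"
    done
}

print_java_sync_cmds 192.168.126.129 192.168.126.130
```

As in the manual steps, the first command of each pair is meant to be run as hd and the second as root.
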
## 5 Preparation before installing Hadoop
### 5.1 Change the hostnames
1. First machine: master
2. Second machine: slave01
3. Third machine: slave02

#First machine  
\[hd@localhost \~\]$ hostnamectl set-hostname master  
==== AUTHENTICATING FOR org.freedesktop.hostname1.set-static-hostname ===  
Authentication is required to set the statically configured local host name, as well as the pretty host name.  
Authenticating as: root  
Password:  
==== AUTHENTICATION COMPLETE ===  
#Second machine  
\[hd@localhost \~\]$ hostnamectl set-hostname slave01  
#Third machine  
\[hd@localhost \~\]$ hostnamectl set-hostname slave02

Method two: set the hostname through nmtui

\[root@localhost \~\]# nmtui
### 5.2 Modify the /etc/hosts file
\[hd@master \~\]$ su root  
Password:  
\[root@master hd\]# vi /etc/hosts  
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4  
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6  
192.168.126.128 master  
192.168.126.129 slave01  
192.168.126.130 slave02

Sync the file to the second and third machines:

#Second machine  
\[root@master hd\]# scp /etc/hosts root@slave01:/etc/  
#Third machine  
\[root@master hd\]# scp /etc/hosts root@slave02:/etc/
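
After syncing, it is worth confirming that each machine maps the three names to the expected addresses. The following sketch reads a hosts-format file directly; `lookup_host` is a hypothetical helper, and on a live system `getent hosts slave01` answers the same question through the system resolver.

```shell
# Hypothetical helper: print the address mapped to hostname "$2" in hosts file "$1".
lookup_host() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }              # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$1"
}
```

For example, `lookup_host /etc/hosts slave01` prints the address configured for slave01, or nothing if the name is missing.
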
### 5.3 Turn off the firewall
To turn the firewall off, stop it and then disable it so that it does not start again at boot. The firewalld management commands are:

Start: service firewalld start  
systemctl start firewalld  
Check status: service firewalld status  
systemctl status firewalld  
Stop: service firewalld stop  
systemctl stop firewalld  
Disable at boot: systemctl disable firewalld  
Restart: service firewalld restart  
systemctl restart firewalld
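
A script that continues with the Hadoop setup may want to confirm that the firewall really is stopped. This is a small sketch using `systemctl is-active --quiet`, which exits with status 0 only when the unit is active. `firewall_is_stopped` and the `SYSTEMCTL` override variable are hypothetical names introduced here so the check can be exercised without a real systemd.

```shell
# Hypothetical helper: succeed when firewalld is NOT active.
# SYSTEMCTL can be overridden for testing; it defaults to the real systemctl.
firewall_is_stopped() {
    if "${SYSTEMCTL:-systemctl}" is-active --quiet firewalld; then
        return 1
    fi
    return 0
}
```

A possible use is `firewall_is_stopped || echo "firewalld is still running"`.
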
### 5.4 Passwordless login
Passwordless login must be set up between the following machines:

machine ----\> machine (passwordless login)  
master ----\> slave01  
master ----\> slave02  
master ----\> master

#### 5.4.1 Generate the key pair
\[hd@master \~\]$ ssh-keygen  
Generating public/private rsa key pair.  
Enter file in which to save the key (/home/hd/.ssh/id_rsa):  
Enter passphrase (empty for no passphrase):  
Enter same passphrase again:  
Your identification has been saved in /home/hd/.ssh/id_rsa.  
Your public key has been saved in /home/hd/.ssh/id_rsa.pub.  
The key fingerprint is:  
ef:ff:98:6c:a4:66:ca:66:a0:cd:a4:da:75:9c:c0:9f hd@slave02  
The key's randomart image is:  
+--\[ RSA 2048\]----+  
\| \|  
\| \|  
\| \|  
\| . \|  
\| o S \|  
\| o+ + . \|  
\| \*..E .o \|  
\| .o.ooo.+..o \|  
\| ... oo+.o=.. \|  
+-----------------+
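#### 5.4.2 Copy the key to each machine you want to log in to without a password
The example below copies the key to slave02; run the same command for slave01 and master.

\[hd@master \~\]$ ssh-copy-id slave02  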
The authenticity of host 'slave02 (192.168.126.130)' can't be established.  
ECDSA key fingerprint is 09:57:a3:56:3b:5f:f0:01:55:0e:42:f3:4c:43:3d:d5.  
Are you sure you want to continue connecting (yes/no)? yes  
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed  
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys  
hd@slave02's password:  
  
Number of key(s) added: 1  
  
Now try logging into the machine, with: "ssh 'slave02'"  
and check to make sure that only the key(s) you wanted were added.
#### 5.4.3 Test passwordless login
\[hd@master \~\]$ ssh slave02
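
The same test can be run for all three targets without typing anything interactively. The sketch below relies on the OpenSSH option `BatchMode=yes`, which makes ssh fail instead of prompting for a password. `check_passwordless` and the `SSH_CMD` override are hypothetical names added for illustration and testing.

```shell
# Hypothetical helper: report whether each host accepts key-based login.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# SSH_CMD is an assumption added for testability; it defaults to ssh.
check_passwordless() {
    failed=0
    for host in "$@"; do
        if "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: ok"
        else
            echo "$host: password still required or host unreachable"
            failed=1
        fi
    done
    return "$failed"
}
```

On master, as the hd user, this would be called as `check_passwordless master slave01 slave02`.
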
## 6 Install Hadoop
### 6.1 Upload the Hadoop installation package
### 6.2 Unpack the installation package
\[hd@master apps\]$ su hd  
Password:  
\[hd@master apps\]$ pwd  
/home/hd/apps  
\[hd@master apps\]$ tar -zxvf hadoop-2.8.1.tar.gz
### 6.3 Rename the directory
\[hd@master apps\]$ mv hadoop-2.8.1 hadoop  
\[hd@master apps\]$ ll
### 6.4 Modify the Hadoop configuration files
#### 6.4.1 Modify hadoop-env.sh
The file is in /hadoop/etc/hadoop/ under /home/hd/apps.

#At the end of the file (in vi, press "G" to jump to the end), add  
export JAVA_HOME=/home/hd/apps/java
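
A wrong JAVA_HOME in hadoop-env.sh typically only shows up when the daemons fail to start, so a quick check before moving on can save time. This is a minimal sketch; `is_java_home` is a hypothetical helper that only tests for an executable `bin/java` under the given directory.

```shell
# Hypothetical helper: succeed when "$1" looks like a JDK home,
# i.e. it contains an executable bin/java.
is_java_home() {
    [ -x "$1/bin/java" ]
}
```

With the layout used in this tutorial, the check would be `is_java_home /home/hd/apps/java || echo "JAVA_HOME looks wrong"`.
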
#### 6.4.2 Modify core-site.xml
\<configuration\>  
*\