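Hadoop cluster setup (step by step): connecting Impala and Hive as the data foundation for large AI models

Cloudera CDP 7.3: guide to installing CMP v7.13 (a domestic Chinese CDH/CDP) on UnionTech UOS uel20-1070-e (Hygon CPU)

(File downloads included)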

Contents

I. Preparation

II. Procedure

Step 1 Operating system settings (identical on all nodes)

Step 2 Install Python 3.9.14

Step 3 Install PostgreSQL (Hue needs the server)

Step 4 Install Java from openjdk8-8.0+372_1-cloudera.x86_64.rpm (all nodes)

Step 5 Deploy your own YUM repository with httpd

Step 6 Install libffi and the os-release package

Step 7 Add users

Step 8 Install MySQL 8.0.39 and create the databases

Step 9 Install daemons, agent, server, and (optionally) support-cdh6 offline; a self-hosted httpd YUM repository can be used instead

Step 10 Verify the Python installation

Step 11 Configure the server and agents

Step 12 Start, stop, and status commands for the server and agents

Step 13 Log in to the web UI

Step 14 The "Host Inspector" check during CDH installation

Step 15 Install CDH from the web UI

III. Cluster setup wizard

1. Select the services to install

2. Click "Continue" to assign cluster roles: one machine as the management node and the other three as DataNodes

3. Click "Continue" to test the database connections

IV. Enable high availability for CMP services

3.1 HDFS high availability

3.2 YARN high availability

V. Use HAProxy to load-balance CMP services

VI. Functional tests of the CDP cluster components

6.1 HDFS availability test

6.2 Hive availability test

6.3 MapReduce availability test

6.4 Spark availability test

Download:

Link: https://pan.baidu.com/s/1VZTI__mUL6LIu3HlsmtScg Extraction code: gkey1

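I. Preparation

On openEuler and UnionTech UOS uel20-1070-ede, the root partition must be larger than 200 GB at installation time.

/dev/mapper/uos-root is the partition mounted at "/".

1.1 The management and worker nodes in the cluster must trust each other (passwordless SSH) and have their network identity (host name) set; turn off iptables and any other firewall.

1.2 Every node in the cluster must have the base prerequisites installed and tested.

1.3 The platform uses MySQL.

1.4 Kerberos must be installed and enabled, with forwarding enabled.

1.5 Java uses build 372. The shared MySQL driver is mysql-connector-j-8.0.33-1.el8.noarch.rpm; the shared PostgreSQL driver is at /opt/cloudera/cm/lib/postgresql-42.7.2.jar.

1.6 Kafka requires Kerberos to be running, with forwarding and automatic renewal enabled.

1.7 Installing and enabling chrony or ntp is mandatory.

1.8 The MySQL databases and the operating system users must be created in advance.

1.9 Once the checks above pass, continue with 1.10.

1.10 Install daemons and agent offline on every node, and server on one node only.

1.11 Add and create the cluster through the web UI, which allows flexible configuration.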
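Some of the prerequisites above can be checked before the installation starts. The following is a minimal sketch and not part of the original procedure: `root_size_ok` and `root_size_gb` are hypothetical helper names, and the 200 GB threshold is the one stated above. It only reads system state and changes nothing.

```shell
#!/bin/sh
# Preflight sketch for the prerequisites above (helper names are illustrative).

# Succeeds only if the given size in GB is strictly greater than 200.
root_size_ok() {
    [ "$1" -gt 200 ]
}

# Prints the size of the filesystem mounted at "/" in whole GB.
root_size_gb() {
    df -P -k / | awk 'NR==2 { printf "%d\n", $2 / 1024 / 1024 }'
}

size=$(root_size_gb)
if root_size_ok "$size"; then
    echo "root partition: ${size}G (ok)"
else
    echo "root partition: ${size}G (must be larger than 200G)"
fi

# Time synchronization is mandatory (1.7); firewalld should be inactive (1.1).
if command -v systemctl >/dev/null 2>&1; then
    for svc in chronyd ntpd firewalld; do
        state=$(systemctl is-active "$svc" 2>/dev/null || true)
        echo "$svc: ${state:-unknown}"
    done
fi
```

Run it on every node; either chronyd or ntpd should report `active`, and firewalld should not.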

II. Procedure


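Step 1 Operating system settings (identical on all nodes)

Set the host name on each node (the example node has IP 192.168.200.210):

hostnamectl set-hostname master10.cmp.cn

......

Add every node to /etc/hosts:

#vim /etc/hosts

192.168.13.152 master10.cmp.cn master10

......

vim /etc/sysconfig/network

HOSTNAME=master10.cmp.cn

Set up mutual SSH trust between the nodes:

ssh-keygen -t rsa

ssh-copy-id 192.168.200.210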
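With more than a few nodes, running `ssh-copy-id` by hand gets repetitive. The sketch below prints one command per node so the list can be reviewed before anything runs. The host names other than master10.cmp.cn and the helper name `copy_id_cmds` are placeholders, not values from this guide.

```shell
#!/bin/sh
# Sketch: push the public key to every node. Host names are placeholders.
NODES="master10.cmp.cn worker1.cmp.cn worker2.cmp.cn"

# Prints one ssh-copy-id command per node; nothing is executed here.
copy_id_cmds() {
    for node in "$@"; do
        echo "ssh-copy-id root@${node}"
    done
}

# Review the output first, then pipe it to sh to run it for real:
#   copy_id_cmds $NODES | sh
copy_id_cmds $NODES
```

Repeat this on each node that needs to reach the others, after running `ssh-keygen` there.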

Install the basic tools:

#yum install vim tmux lrzsz rsync unzip wget -y

#yum install mod_ssl -y

Stop and disable the firewall:

# stop firewalld
systemctl stop firewalld.service

# keep firewalld from starting at boot
systemctl disable firewalld.service

# confirm that it is off
firewall-cmd --state

Disable SELinux:

# temporary, until the next reboot
setenforce 0

# permanent; /etc/sysconfig/selinux is a symlink to /etc/selinux/config
sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

Stop and disable tuned:

systemctl stop tuned

systemctl disable tuned

Disable transparent huge pages (THP):

# temporary
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

# permanent
echo "echo never > /sys/kernel/mm/transparent_hugepage/enabled" >> /etc/rc.d/rc.local && echo "echo never > /sys/kernel/mm/transparent_hugepage/defrag" >>/etc/rc.d/rc.local && chmod +x /etc/rc.d/rc.local

# confirm: the active THP mode is the one shown in brackets and should be [never]
cat /sys/kernel/mm/transparent_hugepage/enabled

# static huge pages should also be 0
grep -i HugePages_Total /proc/meminfo
cat /proc/sys/vm/nr_hugepages
sysctl vm.nr_hugepages

Lower swappiness:

# for runtime effect
sudo sysctl -w vm.swappiness=1

# for permanent effect
echo vm.swappiness=1 >> /etc/sysctl.conf

# check
cat /proc/sys/vm/swappiness

Edit /etc/security/limits.conf (# vim /etc/security/limits.conf) and add:

*  soft nofile 128000
*  hard nofile 128000
*  soft nproc 128000
*  hard nproc 128000

Fix the umask. Kylin sometimes sets the umask to 0077, which causes many problems.

# temporary
umask 0022

Permanent: delete the last line of /etc/bashrc, umask 0077.
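After a reboot it is worth confirming that the Step 1 settings are still in effect. The following sketch only reads state and prints it; `thp_is_never` is a hypothetical helper name, and the expected values are the ones configured above.

```shell
#!/bin/sh
# Sketch: report whether the Step 1 settings took effect on this node.

# Succeeds if a THP status line selects "never", e.g. "always madvise [never]".
thp_is_never() {
    case "$1" in
        *"[never]"*) return 0 ;;
        *) return 1 ;;
    esac
}

for f in enabled defrag; do
    path="/sys/kernel/mm/transparent_hugepage/$f"
    if [ -r "$path" ]; then
        if thp_is_never "$(cat "$path")"; then
            echo "THP $f: never (ok)"
        else
            echo "THP $f: $(cat "$path") (expected [never])"
        fi
    fi
done

if [ -r /proc/sys/vm/swappiness ]; then
    echo "vm.swappiness: $(cat /proc/sys/vm/swappiness) (expected 1)"
fi
if command -v getenforce >/dev/null 2>&1; then
    echo "SELinux: $(getenforce)"
fi
echo "umask: $(umask) (expected 0022)"
echo "open files limit: $(ulimit -n) (expected 128000)"
```

The limits from /etc/security/limits.conf apply to new login sessions, so run the check from a fresh login.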

Step 2 Install Python 3.9.14

#yum install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gcc make libffi-devel -y

#yum install wget -y

#cd /opt

#wget https://mirrors.aliyun.com/python-release/source/Python-3.9.14.tgz

tar -zxvf Python-3.9.14.tgz

cd Python-3.9.14

# no --prefix is given, so the files are installed under /usr/local
./configure --enable-shared

make && make altinstall

#python3.9 --version

pip3.9 list
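Because the build uses `--enable-shared`, the `python3.9` binary depends on `libpython3.9.so.1.0`, which the default prefix places in /usr/local/lib. If the dynamic loader does not search that directory, `python3.9` stops with an "error while loading shared libraries" message. The sketch below checks the loader cache and prints a suggested fix without applying it. `has_libpython39` is a hypothetical helper, and the file name /etc/ld.so.conf.d/python3.9.conf is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: detect whether the loader cache already knows libpython3.9.

# Reads `ldconfig -p` style output on stdin; succeeds if the library is listed.
has_libpython39() {
    grep -q 'libpython3\.9\.so'
}

if command -v ldconfig >/dev/null 2>&1; then
    if ldconfig -p 2>/dev/null | has_libpython39; then
        echo "libpython3.9 is registered with the dynamic loader"
    else
        echo "libpython3.9 not found in the loader cache; as root, run:"
        echo "  echo /usr/local/lib > /etc/ld.so.conf.d/python3.9.conf && ldconfig"
    fi
fi
```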

Verification

```
python3.9

Python 3.9.14 (main, Feb 7 2025, 11:19:13)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ssl
>>> ssl.OPENSSL_VERSION
'OpenSSL 1.1.1f 31 Mar 2020'
>>> import hashlib
>>> hashlib.sha256('the _hashlib module didnt build!'.encode('ascii')).hexdigest()
'4f61f9875aaac7e19e4a08f6bae49128cbdb8c9586ff22d41c5c6c9916fa6a97'
>>>
```

### **Step 3 Install PostgreSQL (Hue needs the server)**

```
yum install postgresql-server postgresql-devel python3-devel -y

# Hue needs psycopg2
pip3.9 install -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com psycopg2
```

Expected output:

```
Successfully installed psycopg2-2.9.10
```

Configure the yarn database and make it reachable from other hosts.

......

Initialize the database and modify the configuration. Note that `rm -rf /var/lib/pgsql/` deletes any existing PostgreSQL data. pg_hba.conf is read top to bottom and the first matching line wins, so the `peer` rule shadows the two `local` rules below it; reorder them if password logins over the Unix socket are needed.

```
rm -rf /var/lib/pgsql/
#chmod -R 777 /var/lib/pgsql/
yum install postgresql-server postgresql-devel python3-devel -y
postgresql-setup initdb

cat - > /var/lib/pgsql/data/pg_hba.conf << EOF
# "local" is for Unix domain socket connections only
local all all peer
local all postgres trust
local all all md5
# IPv4 local connections:
host all all 127.0.0.1/32 trust
host all all 0.0.0.0/0 md5
# IPv6 local connections:
host all all ::1/128 ident
# Allow replication connections from localhost, by a user with the
# replication privilege.
local replication all peer
host replication all 127.0.0.1/32 ident
host replication all ::1/128 ident
EOF

sed -i 's/max_connections = 100/max_connections=1000/' /var/lib/pgsql/data/postgresql.conf
echo "listen_addresses = '*'" >> /var/lib/pgsql/data/postgresql.conf
systemctl enable postgresql
systemctl start postgresql
```

Create the database and users. Log in to the database as the administrator:

```
sudo -u postgres psql
ALTER USER postgres PASSWORD 'Redhat_ARM64';
\q
```

Switch to the root user and restart the postgresql service:

```
systemctl restart postgresql
```

Change the password of the default postgres user (this is the database user, not the operating system user of the same name used above):

```
su - postgres
psql -U postgres
alter user postgres with encrypted password 'Redhat_ARM64';
```

Create the users and databases and grant privileges:

```
create user das with password 'Redhat_ARM64';   -- create the user
create database das owner das;                  -- create the database
grant all privileges on database das to das;    -- grant privileges
```

**Result:** Host: 192.168.13.157, Port: 5432, User: postgres, Password: Redhat_ARM64

Root/Redhat_ARM64

```
create user root with password 'Redhat_ARM64';
create database yarn owner root;                -- create the database
grant all privileges on database yarn to root;  -- grant privileges
```

### **Step 4 Install Java from openjdk8-8.0+372_1-cloudera.x86_64.rpm (all nodes)**

```
#yum localinstall openjdk8-8.0+372_1-cloudera.x86_64.rpm -y
# yum localinstall /opt/mysql-connector-j-8.0.33-1.el8.noarch.rpm
# put mysql-connector-java-8.0.25.jar in /usr/share/java/
#ln -s /usr/share/java/mysql-connector-java-8.0.25.jar /usr/share/java/mysql-connector-java.jar
```

Path after installation: /usr/java/jdk1.8u372-b07-cloudera

Add the following to /etc/profile:

```
vim /etc/profile

export JAVA_HOME=/usr/java/jdk1.8u372-b07-cloudera
export PATH=$JAVA_HOME/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
TMOUT=0 # 0 disables the idle-timeout automatic logout

#source /etc/profile
```

Verify Java:

```
# java -version
openjdk version "1.8.0_372"
OpenJDK Runtime Environment (Temurin)(build 1.8.0_372-b07)
OpenJDK 64-Bit Server VM (Temurin)(build 25.372-b07, mixed mode)
```
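Since every node needs the same Java build, the version check can be scripted. The sketch below parses the first line of `java -version` and compares it with the version shown in this guide; `java_version` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: compare the installed Java version with the one this guide expects.
EXPECTED="1.8.0_372"

# Extracts the quoted version from the first line of `java -version` output (stdin).
java_version() {
    sed -n '1s/.*version "\([^"]*\)".*/\1/p'
}

if command -v java >/dev/null 2>&1; then
    # java -version writes to stderr, hence the 2>&1
    found=$(java -version 2>&1 | java_version)
    if [ "$found" = "$EXPECTED" ]; then
        echo "java $found (ok)"
    else
        echo "java $found (expected $EXPECTED)"
    fi
else
    echo "java not found on PATH; check JAVA_HOME in /etc/profile"
fi
```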
On UnionTech UOS x86, also run:

```
yum install java-11-openjdk-devel

#ll /usr/lib/jvm/java-11-openjdk-11.0.21.9-1.up1.uel20.01.x86_64/bin/jar
#ll /usr/lib/jvm/java-11-openjdk-11.0.21.9-1.up1.uel20.01.x86_64/bin/java
#ll /usr/lib/jvm/java-11-openjdk-11.0.13.9-6.ky10.x86_64/bin/jar
#ll /usr/lib/jvm/java-11-openjdk-11.0.13.9-6.ky10.x86_64/bin/java
```

Confirm:

```
# ll /usr/java/jdk1.8u372-b07-cloudera/bin/jar
# ll /usr/java/jdk1.8u372-b07-cloudera/bin/java
```

### **Step 5 Deploy your own YUM repository with httpd**

Install the httpd web service. It provides an installation source on the internal network: the operating system repository, and later the CM and CMP repositories.

Configure it on node 10.111.15.50 (bigdata50) only.

Note: first move the repo files under /etc/yum.repos.d/ into a bak directory. Otherwise, when the host is online, mismatched packages may be installed and cause problems such as httpd failing to start. Then refresh the repositories:

```
yum clean all
yum makecache
yum repolist
```

Install:

```
[root@bigdata50 ~]$ yum -y install httpd
```

Configure: in /etc/httpd/conf/httpd.conf, line 284, change

```
AddType application/x-gzip .gz .tgz
```

to

```
AddType application/x-gzip .gz .tgz .parcel
```

Restart the service and enable it at boot:

```
systemctl status httpd
systemctl start httpd
systemctl enable httpd
```

Install in order; the first two packages are dependencies of the third, so they must be installed first:

```
yum -y install yum-utils
```

Create the local operating system repository path (the system installs dependent packages as prompted):

```
createrepo /var/www/html/***
```

Move the default repo files away:

```
mkdir /etc/yum.repos.d/bak
mv /etc/yum.repos.d/* /etc/yum.repos.d/bak
```

Edit the local operating system repo file:

```
vi /etc/yum.repos.d/os.repo

[CenOS8-Base]
name=os_repo
baseurl=http://192.168.0.211/iso/BaseOS
enabled=true
gpgcheck=false
```

Refresh the repositories:

```
yum clean all
yum makecache
yum repolist
```

Note: the local operating system YUM repository is now configured. Other hosts that want to use this machine's repository can get os.repo synchronized into their /etc/yum.repos.d directory with a script.
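The synchronization script mentioned above is not included in this article. As a rough sketch of what such a script could look like, the following prints one `scp` command per target host; the host names and the helper name `sync_repo_cmds` are placeholders, and passwordless SSH from Step 1 is assumed.

```shell
#!/bin/sh
# Sketch: copy os.repo to the other nodes. Host names are placeholders.
REPO_FILE="/etc/yum.repos.d/os.repo"

# Prints the scp command for each target host; nothing is executed here.
sync_repo_cmds() {
    for host in "$@"; do
        echo "scp ${REPO_FILE} root@${host}:${REPO_FILE}"
    done
}

# Inspect the commands, then append "| sh" to run them.
sync_repo_cmds worker1.cmp.cn worker2.cmp.cn
```

On each target host, run `yum clean all && yum makecache` afterwards so the new repository is picked up.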

Step 6 Install libffi and the os-release package

yum install bind-utils cyrus-sasl-gssapi mod_ssl openssl-devel portmap iperf3 -y

ll /usr/lib64/libffi.so.6

Do the same on all nodes:

echo "Red Hat Enterprise Linux release 8.8 (Ootpa)" > /etc/redhat-release

chmod 644 /etc/redhat-release

Step 7 Add users

Step 8 Install MySQL 8.0.39 and create the databases

yum localinstall mysql-community-common-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

yum localinstall mysql-community-client-plugins-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

yum localinstall mysql-community-libs-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

yum localinstall mysql-community-client-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

yum localinstall mysql-community-icu-data-files-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

yum localinstall mysql-community-server-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

yum localinstall mysql-community-devel-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y
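The seven packages above have to be installed in dependency order. A small loop keeps that order in one place; this is only a sketch, with `mysql_rpms` as a hypothetical helper. It prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: emit the MySQL RPM names in the dependency order used above.
MYSQL_VERSION="8.0.39-1.el8.x86_64"

# Usage: mysql_rpms <version-release.arch>
mysql_rpms() {
    for part in common client-plugins libs client icu-data-files server devel; do
        echo "mysql-community-${part}-$1.rpm"
    done
}

# Print the install commands; run them from the directory holding the RPMs.
for rpm in $(mysql_rpms "$MYSQL_VERSION"); do
    echo "yum localinstall ${rpm} --nogpgcheck -y"
done
```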

Allow access from any host ("%").

Special handling for Hue

pip3.9 install mysqlclient

Install the MySQL client on the Hue node

yum install -y mysql-devel xmlsec1 xmlsec1-openssl --nogpgcheck

#yum install mysql-community-client-8.0.39 --nogpgcheck

Or

yum localinstall mysql-community-common-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

yum localinstall mysql-community-client-plugins-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

yum localinstall mysql-community-libs-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

yum localinstall mysql-community-client-8.0.39-1.el8.x86_64.rpm --nogpgcheck -y

Use the MySQL server installation as the reference for the exact versions.

Create the databases that may be needed

CREATE DATABASE scm DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

FLUSH PRIVILEGES;

create user 'scm'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'scm'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on scm.* to 'scm'@'%' with grant option;

FLUSH PRIVILEGES;


CREATE DATABASE amon DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

FLUSH PRIVILEGES;

create user 'amon'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'amon'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on amon.* to 'amon'@'%' with grant option;

FLUSH PRIVILEGES;

CREATE DATABASE rman DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; FLUSH PRIVILEGES;

create user 'rman'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'rman'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on rman.* to 'rman'@'%' with grant option;

FLUSH PRIVILEGES;

CREATE DATABASE hue DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; FLUSH PRIVILEGES;

create user 'hue'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'hue'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on hue.* to 'hue'@'%' with grant option;

FLUSH PRIVILEGES;

CREATE DATABASE hive DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

FLUSH PRIVILEGES;

create user 'hive'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'hive'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on hive.* to 'hive'@'%' with grant option;

FLUSH PRIVILEGES;

CREATE DATABASE ranger DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;FLUSH PRIVILEGES;

create user 'rangeradmin'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'rangeradmin'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on ranger.* to 'rangeradmin'@'%' with grant option;

FLUSH PRIVILEGES;

CREATE DATABASE rangerkms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;FLUSH PRIVILEGES;

create user 'rangerkms'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'rangerkms'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on rangerkms.* to 'rangerkms'@'%' with grant option;

FLUSH PRIVILEGES;

CREATE DATABASE nav DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

FLUSH PRIVILEGES;

create user 'nav'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'nav'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on nav.* to 'nav'@'%' with grant option;

FLUSH PRIVILEGES;

CREATE DATABASE navms DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; FLUSH PRIVILEGES;

create user 'navms'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'navms'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on navms.* to 'navms'@'%' with grant option;

FLUSH PRIVILEGES;

CREATE DATABASE oozie DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; FLUSH PRIVILEGES;

create user 'oozie'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'oozie'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on oozie.* to 'oozie'@'%' with grant option;

FLUSH PRIVILEGES;


Create the database for the Knox installation

CREATE DATABASE knox DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

FLUSH PRIVILEGES;

create user 'knox'@'%' identified by 'Redhat_ARM64';

FLUSH PRIVILEGES;

alter user 'knox'@'%' IDENTIFIED WITH mysql_native_password BY 'Redhat_ARM64';

FLUSH PRIVILEGES;

grant all privileges on knox.* to 'knox'@'%' with grant option;

FLUSH PRIVILEGES;
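The statements above repeat one pattern per service, which makes copy-and-paste slips easy. As a minimal sketch, the same SQL can be generated from one template. `service_sql` is a hypothetical helper; the database and user names and the example password are the ones used above. It merges the separate `create user` and `alter user` statements into a single `CREATE USER ... IDENTIFIED WITH mysql_native_password`, which is an assumption about equivalence on MySQL 8.0 rather than the author's exact sequence.

```shell
#!/bin/sh
# Sketch: generate the per-service SQL from one template.
# The password is the example value used throughout this guide.
DB_PASSWORD="Redhat_ARM64"

# Usage: service_sql <database> <user>
service_sql() {
    db=$1
    user=$2
    cat <<EOF
CREATE DATABASE ${db} DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
CREATE USER '${user}'@'%' IDENTIFIED WITH mysql_native_password BY '${DB_PASSWORD}';
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
EOF
}

# database:user pairs taken from the statements above.
for pair in scm:scm amon:amon rman:rman hue:hue hive:hive ranger:rangeradmin \
            rangerkms:rangerkms nav:nav navms:navms oozie:oozie knox:knox; do
    service_sql "${pair%%:*}" "${pair#*:}"
done
```

Redirect the output to a file, review it, and then feed it to the `mysql` client as root.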

Step 9 Install daemons, agent, server, and (optionally) support-cdh6 offline; a self-hosted httpd YUM repository can be used instead

#install on all nodes

yum localinstall /opt/cloudera-manager-daemons-7.13.1.100-63338448.uel20.x86_64.rpm -y

yum localinstall /opt/cloudera-manager-agent-7.13.1.100-63338448.uel20.x86_64.rpm -y

#install additionally on the main management node

yum localinstall /opt/cloudera-manager-server-7.13.1.100-63338448.uel20.x86_64.rpm -y

Optionally install support-cdh6

yum localinstall cloudera-manager-support-cdh6-7.13.1.100-63338448.uel20.x86_64.rpm

Step 10 Verify the Python installation

`cd /opt/cloudera/cm-agent/bin`

`sh python3.9`

Output: Python 3.9.14 (main, May 17 2025, 20:44:07) [GCC 7.3.0] on linux

`sh python3.8`

Output: Python 3.9.14 (main, May 17 2025, 20:44:07) [GCC 7.3.0] on linux

`sh /opt/cloudera/cm-agent/bin/python`

Output: Python 3.9.14 (main, May 17 2025, 20:44:07) [GCC 7.3.0] on linux

### **Step 11 Configure the server and agents**

MySQL must be reachable on port 3306 from machines outside the cluster.

Use IP addresses throughout here; local host names from the hosts file must not be used.

`/opt/cloudera/cm/schema/scm_prepare_database.sh -h 192.168.200.220 mysql scm root Redhat_ARM64 --force`

It is enough to see a success message.

For manifest.json, keeping only the latest one is enough.

Restart the server or wait about 3 minutes for it to appear.

![](https://i-blog.csdnimg.cn/direct/6c16224a6a7a44209f9903199d27e050.png)

On all agent nodes, including the server node, the local host name from the hosts file is normally used:

`vim /etc/cloudera-scm-agent/config.ini`

`server_host=master30.cmp.cn`

### **Step 12 Start, stop, and status commands for the server and agents**

`systemctl status cloudera-scm-server`

`systemctl stop cloudera-scm-server`

`systemctl enable cloudera-scm-server`

`systemctl restart cloudera-scm-server`

`systemctl status cloudera-scm-agent`

`systemctl stop cloudera-scm-agent`

`systemctl enable cloudera-scm-agent`

`systemctl restart cloudera-scm-agent`

### **Step 13 Log in to the web UI**

Check the startup state (wait about 8 minutes until port 7180 is available):

`netstat -anp | grep 7180`

On the computer you browse from, add the server to its hosts file, then open:

[http://master20.cmp.cn:7180/](http://master20.cmp.cn:7180/ "http://master20.cmp.cn:7180/")

The default user name is admin and the default password is admin.

Select "Try \*\*\*\* for 60 days"; contact us when the trial expires.

### **Step 14 The "Host Inspector" check during CDH installation**

![](https://i-blog.csdnimg.cn/direct/451a9def412d47edb320e8d2ef0f5b8d.png)

This can be ignored; just tick "I understand the risks of not running the inspections or the detected issues, let me continue with cluster setup."

### **Step 15 Install CDH from the web UI**

Open the web UI to configure CDH, and check that every node has the Kerberos client installed.

Add the cluster and follow the steps in order. (The screenshots below are for reference; what you see during installation takes precedence.)

![](https://i-blog.csdnimg.cn/direct/cea7970e7dd7463eb1246151ef5094b5.png)

![](https://i-blog.csdnimg.cn/direct/e90052c218ec4c9f98e08435278ae427.png)

![](https://i-blog.csdnimg.cn/direct/ee57a34145e44cfbbeb6b0ba9cb3f4a0.png)

![](https://i-blog.csdnimg.cn/direct/b8eef996a9aa44fc8ce7259aa12f3bec.png)
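![](https://i-blog.csdnimg.cn/direct/efdaabfaa4694ef2b51348f8a1fdb9fa.png)

The wizard then moves on automatically to the host and network checks; make sure every check passes:

![](https://i-blog.csdnimg.cn/direct/a5514113765a4ba0bae3aaf98a030eaa.png)

The network performance and host inspections have to be started manually:

![](https://i-blog.csdnimg.cn/direct/7aba9ca9c1734724ac9a425e7981babc.png)

This item can be ignored; just tick the box and click "Finish".

If there are errors or yellow warnings, open "Show Inspector Results", resolve the items one by one, and then "Rerun" the inspection until every check passes; otherwise you cannot click Continue to go to the next step:

![](https://i-blog.csdnimg.cn/direct/591b5ad75d0b4b9382ba69afbf8274a2.png)

Click Finish to enter the service installation wizard.

**III. Cluster setup wizard**

### **1. Select the services to install:**

![](https://i-blog.csdnimg.cn/direct/25c9e47d76c3466b86db36e35a182e5a.png)

The custom services option lists all components; choose according to your needs:

![](https://i-blog.csdnimg.cn/direct/66ddfb885d2e461d98c772dfd1574dd5.png)

### **2. Click "Continue" to assign cluster roles: one machine as the management node and the other three as DataNodes:**

![](https://i-blog.csdnimg.cn/direct/7de9107dc0364c0a8ea186afbe60dd80.png)

Note: the Activity Monitor in Cloudera Management Service is essentially no longer used and does not need to be installed. Telemetry Publisher is a telemetry service used to communicate with Workload XM; if you do not plan to use Workload XM, it does not need to be installed. ZooKeeper needs at least 3 nodes and an odd number of nodes; this project installs 5.

### **3. Click "Continue" to test the database connections:**

![](https://i-blog.csdnimg.cn/direct/ee7931db6de8444bb0a2a4365c415b39.png)

You can click Continue only after all tests succeed.

4. Click "Continue" to enter the parameter settings. The defaults are used here; change the directories to suit your environment:

![](https://i-blog.csdnimg.cn/direct/38a1311dd6f74c55bca5ddb38cc5429d.png)

![](https://i-blog.csdnimg.cn/direct/def9b59e498e49dab0b43ec5ed330483.png)

5. Click "Continue" to start the services:

![](https://i-blog.csdnimg.cn/direct/019f94179c2e445ba65fa9301c739636.png)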
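6. When the installation succeeds, click Continue:

![](https://i-blog.csdnimg.cn/direct/065266c2558048f385336cd3438e59b9.png)

7. After a successful installation you land on the home management page, and the system settles into an error-free state on its own:

![](https://i-blog.csdnimg.cn/direct/8acd805f67854cd988c2197cf8171866.png)

## **IV. Enable high availability for CMP services**

Most services become highly available when multiple instances are selected during installation, but HDFS and YARN need high availability to be enabled manually.

**3.1 HDFS high availability**

In the Cloudera Manager UI, click the HDFS service.

Select Actions > Enable High Availability.

Enter the NameService Name (the default is nameservice1) and click "Continue".

Select the second NameNode and the JournalNode machines (an odd number) and click "Continue".

Enter the JournalNode Edits Directory and click "Continue".

Wait for the configuration to finish and click "Continue".

![](https://i-blog.csdnimg.cn/direct/b68e1c4c77414bb2996234edc9b08c64.png)

**3.2 YARN high availability**

In the Cloudera Manager UI, click the YARN service.

Select Actions > Enable High Availability.

Wait for the configuration to finish.

## **V. Use HAProxy to load-balance CMP services**

Some services in the cluster, such as HiveServer2, Impalad, and Solr, need an additional load balancer. The open-source HAProxy is used as the example here; other load balancers such as nginx work as well.

## **VI. Functional tests of the CDP cluster components**

**6.1 HDFS availability test**

Run the following commands in order to confirm that HDFS works:

`hdfs dfs -ls /`

`echo aaa > /tmp/test.txt`

`hdfs dfs -put /tmp/test.txt /tmp`

`hdfs dfs -get /tmp/test.txt`

**6.2 Hive availability test**

Prepare the test data:

`echo "zhangsan,25" > file1`

`hdfs dfs -mkdir /tmp/test`

`hdfs dfs -put file1 /tmp/test`

Create a file named test.hql with the following content:

`create external table test ( name string, age int ) row format delimited fields terminated by ',' location '/tmp/test';`

Run the following commands to make sure Hive works:

`hive -f test.hql`

`hive -e "select * from test" 2> /dev/null`

`hive -e "select count(*) from test" 2> /dev/null`

Delete the test data. If Impala is to be tested as well, delete it after the Impala test:

`hive -e "drop table test"`

`sudo -u hdfs hdfs dfs -rm -r /tmp/test/`

**6.3 MapReduce availability test**

Run the following command to confirm that MapReduce jobs run; replace the jar path to match your environment:

`hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 10000`

**6.4 Spark availability test**

Run the following command to confirm that Spark jobs run; replace the jar path to match your environment. The `yarn-client` master value is not accepted by Spark 3, so `--master yarn --deploy-mode client` is used:

`spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode client /opt/cloudera/parcels/CDH/jars/spark-examples_2.12-3.4.1.7.3.1.100-57.jar`

If Kerberos is not enabled, Kudu cannot use Ranger; clear the option shown below:

![](https://i-blog.csdnimg.cn/direct/90138979b14a4cba836e9593456797f0.png)

Install Ranger KMS. **Expect a long wait.**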
![](https://i-blog.csdnimg.cn/direct/3882e8a7cbda41b2ace4ec93e1b1c29c.png)

Log in as keyadmin (password Redhat_ARM64) and modify the KMS URL: kms://http@worker3.cmp.cn:9292/kms

![](https://i-blog.csdnimg.cn/direct/db78268750ef49118010b8465cf9a940.png)
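The HDFS and MapReduce checks from section VI can be collected into one script that reports a result per step. This is a minimal sketch: `check` is a hypothetical helper, the local copy name /tmp/test.copy.txt is arbitrary, and on a Kerberos-enabled cluster a valid ticket (`kinit`) is assumed before running it.

```shell
#!/bin/sh
# Sketch: run some of the section VI checks in order and report each result.

# Runs a command quietly and prints PASS or FAIL with the given label.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS ${label}"
    else
        echo "FAIL ${label}"
    fi
}

if command -v hdfs >/dev/null 2>&1; then
    echo aaa > /tmp/test.txt
    check "hdfs ls"  hdfs dfs -ls /
    check "hdfs put" hdfs dfs -put -f /tmp/test.txt /tmp
    check "hdfs get" hdfs dfs -get -f /tmp/test.txt /tmp/test.copy.txt
    check "mapreduce pi" hadoop jar \
        /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 10 10000
else
    echo "hdfs client not found; run this on a cluster gateway node"
fi
```

The `-f` flags overwrite existing files so the script can be rerun; the Hive and Spark checks can be added with further `check` lines in the same way.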
