References
https://cloud.tencent.com/developer/article/1078314
https://mp.weixin.qq.com/s?__biz=MzI4OTY3MTUyNg==&mid=2247485861&idx=1&sn=bb930a497f63ac5e63ed20c64643eec5
Machine preparation
Kerberos master
ip-172-31-22-86.ap-southeast-1.compute.internal
Kerberos slave
ip-172-31-21-45.ap-southeast-1.compute.internal
Configuration
Server side
ll /var/kerberos/krb5kdc/
vim /etc/krb5.conf
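What matters in /var/kerberos/krb5kdc/ on the master is roughly the following; the file names here are assumptions based on RHEL/CentOS defaults and the realm LEXIN.COM, and they match the files copied to the slave later:
.k5.LEXIN.COM   - master key stash file (created by kdb5_util create -s)
kadm5.acl       - kadmind access control list
kdc.conf        - KDC configuration for the realm
principal*      - the principal database and its companion files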
Client side
vim /etc/krb5.conf
ll /etc/krb5.keytab
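For failover to work, /etc/krb5.conf on the KDCs and on every client should list both KDCs for the realm. A minimal sketch, assuming the realm LEXIN.COM, with 7.common2.hadoop.fql.com as the master and 6.common2.hadoop.fql.com as the slave (inferred from the kprop target used later); everything else is illustrative:
[libdefaults]
    default_realm = LEXIN.COM

[realms]
    LEXIN.COM = {
        # master KDC
        kdc = 7.common2.hadoop.fql.com
        # slave KDC, tried when the master is unreachable
        kdc = 6.common2.hadoop.fql.com
        # kadmind runs on the master only
        admin_server = 7.common2.hadoop.fql.com
    }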
Other configuration steps (the addprinc/ktadd commands below are run in a kadmin.local session on the master; see the kpropd.acl note after this command list)
addprinc -randkey host/7.common2.hadoop.fql.com
addprinc -randkey host/6.common2.hadoop.fql.com
ktadd host/7.common2.hadoop.fql.com
ktadd host/6.common2.hadoop.fql.com
scp -P 39000 /etc/krb5.conf 10.14.XX.248:/etc/
scp -P 39000 /etc/krb5.keytab 10.14.XX.248:/etc/
scp -P 39000 /var/kerberos/krb5kdc/.k5.LEXIN.COM 10.14.XX.248:/etc/
scp -P 39000 /var/kerberos/krb5kdc/kadm5.acl 10.14.XX.248:/etc/
scp -P 39000 /var/kerberos/krb5kdc/kdc.conf 10.14.XX.248:/etc/
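kpropd on the slave (started by the kprop service below) only accepts pushes from principals listed in /var/kerberos/krb5kdc/kpropd.acl, which is what the host principals created above are for. A minimal sketch, assuming the copied kdc.conf, kadm5.acl and stash file are also placed under /var/kerberos/krb5kdc/ on the slave:
# /var/kerberos/krb5kdc/kpropd.acl on the slave
host/7.common2.hadoop.fql.com@LEXIN.COM
host/6.common2.hadoop.fql.com@LEXIN.COM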
Commands
shell
1. krb5kdc and kadmin are services that must be started on both the master and the slave nodes
systemctl enable krb5kdc
systemctl start krb5kdc
systemctl status krb5kdc
`krb5kdc is the Kerberos version 5 Authentication Service and Key Distribution Center (AS/KDC).`
systemctl enable kadmin
systemctl start kadmin
systemctl status kadmin
2. kprop is the service started on the slave node
systemctl enable kprop
systemctl start kprop
systemctl status kprop
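Once the services are up, one way to confirm the slave is ready to receive pushes is to check that kpropd is listening on the krb5_prop port (754/tcp, the same port the kprop commands below use); this check is an addition, not part of the original steps:
ss -lntp | grep ':754'
systemctl status kprop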
Master/slave synchronization
shell
1. One-time synchronization
On the master node, use the kprop command to push the master.dump file to the slave node:
kprop -f /var/kerberos/krb5kdc/master.dump -d -P 754 6.common2.hadoop.fql.com
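Before the first push, the dump file referenced above has to exist; it is produced on the master with kdb5_util (the same command the cron script below runs):
/sbin/kdb5_util dump /var/kerberos/krb5kdc/master.dump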
2. Scheduled synchronization
Configure crontab on the master node (the */6 schedule runs the sync every 6 minutes):
# Kerberos sync
*/6 * * * * sh /var/kerberos/krb5kdc/kprop_sync.sh >> /var/kerberos/krb5kdc/kprop_sync.log 2>&1
vim /var/kerberos/krb5kdc/kprop_sync.sh
#!/bin/bash
# Dump the KDC database on the master and push it to the slave with kprop.
DUMP=/var/kerberos/krb5kdc/master.dump
PORT=754
SLAVE="6.common2.hadoop.fql.com"
TIMESTAMP=$(date)
echo "begin at $TIMESTAMP"
# Export the full principal database to the dump file.
/sbin/kdb5_util dump "$DUMP"
# Propagate the dump to the slave's kpropd (-d prints debug output).
/sbin/kprop -f "$DUMP" -d -P "$PORT" "$SLAVE"
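To confirm that a push actually reached the slave, one option (an addition, not from the original notes) is to list principals from the slave's local database or check the database file's modification time:
# on the slave node
kadmin.local -q listprincs | head
ls -l /var/kerberos/krb5kdc/principal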
Verification
Conclusions:
1. A principal must exist on both the Kerberos master and the slave KDC before it can be used for authentication.
2. After the krb5kdc service has been stopped on both the master and the slave, kinit fails with the following error:
shell
[hive@1 ~]$ kinit -kt /etc/security/keytab/hive.service.keytab hive/1.common2.hadoop.fql.com@LEXIN.COM
kinit: Cannot contact any KDC for realm 'LEXIN.COM' while getting initial credentials
[root@1 ~]# kinit test_2023
kinit: Cannot contact any KDC for realm 'LEXIN.COM' while getting initial credentials
3. After the krb5kdc service has been stopped on both the master and the slave, HDFS becomes abnormal and the NameNode logs the following exception:
shell
Caused by: java.io.IOException: Couldn't setup connection for nn/2.common2.hadoop.fql.com@LEXIN.COM to 1.common2.hadoop.fql.com/10.14.37.243:8485
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:772)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1938)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:743)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:846)
at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:423)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1621)
at org.apache.hadoop.ipc.Client.call(Client.java:1450)
... 11 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: ICMP Port Unreachable)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:211)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:407)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:629)
at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:423)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:833)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:829)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1938)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:828)
4. After both krb5kdc services have been stopped, starting the service again on either the master or the slave restores normal Kerberos operation.
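A quick way to exercise this failover behaviour (an illustrative sequence, not part of the original verification) is to stop krb5kdc on the master only and confirm that kinit still succeeds against the slave KDC; the test principal is reused from the example above:
shell
# on the master node
systemctl stop krb5kdc
# on any client: the ticket should still be issued by the slave KDC
kinit test_2023
klist
# restore the master afterwards
systemctl start krb5kdc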