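Note
The environment is unusual, so this write-up is brief. If you hit a similar environment problem that you cannot resolve, leave a comment or email zhuyifei.ruichuang@gmail.com.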
Symptom
Port 9000 of Hadoop cannot be reached.
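Environment
A Hadoop test environment is deployed on k8s as a single all-in-one pod. It serves other pods, so it must be reachable from both inside and outside the namespace that Hadoop runs in.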
Troubleshooting log
List the pods
bash
root@master:~# kubectl get pod -n hive
NAME                                  READY   STATUS    RESTARTS      AGE
hive-metastore-595d9d4b6b-vrhbh       1/1     Running   2 (83d ago)   83d
hive312-deployment-656c8d98f8-fv676   1/1     Running   0             7d2h
ubuntu-deployment-69845fb6d5-bl9sm    1/1     Running   0             8d
Enter the container in the pod and check the process. This confirms that the port is open and the process is running.
bash
root@master:~# kubectl exec -it -n hive hive312-deployment-656c8d98f8-fv676 -- bash
root@hive312-deployment-656c8d98f8-fv676:/# lsof -i :9000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 3396 hadoop 270u IPv4 2124636242 0t0 TCP localhost:9000 (LISTEN)
...
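The NAME column of the LISTEN row is the part worth reading closely. As a rough sketch, the helper below sorts such a value into loopback-only, all-interfaces, or a specific address. The function name classify_listener is hypothetical (it is not part of lsof or Hadoop), and it assumes the host:port form that lsof -i prints.

```shell
#!/usr/bin/env bash
# classify_listener NAME: classify the NAME column of an `lsof -i` LISTEN row.
# Hypothetical helper for illustration only.
classify_listener() {
  local name="$1"
  local host="${name%:*}"   # drop the trailing :port
  case "$host" in
    localhost|127.*|"[::1]") echo "loopback-only" ;;    # only reachable from inside the container
    "*"|0.0.0.0|"[::]")      echo "all-interfaces" ;;   # reachable from other pods
    *)                       echo "specific-address" ;; # bound to one named address
  esac
}

classify_listener "localhost:9000"   # prints loopback-only
classify_listener "*:9000"           # prints all-interfaces
```

A loopback-only result means that connections arriving on the pod's network interface are refused even though the process is up.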
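The problem is the localhost:9000 binding: the process listens only on the loopback address and does not accept connections from any other source, so cross-pod access is refused.
Stop Hadoop. Always stop the service first and change the configuration afterwards; otherwise the service can only be stopped with kill -9.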
bash
root@hive312-deployment-656c8d98f8-fv676:/opt/hadoop-3.1.1# su - hadoop
hadoop@hive312-deployment-656c8d98f8-fv676:~$ cd /opt/hadoop-3.1.1/sbin/
hadoop@hive312-deployment-656c8d98f8-fv676:/opt/hadoop-3.1.1/sbin$ ./stop-all.sh
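(Key step) Edit the configuration file. Only the relevant property is shown here; if your file does not contain this block, append it. The 0.0.0.0 address makes the NameNode RPC port listen for connections from any source.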
bash
root@hive312-deployment-656c8d98f8-fv676:/opt/hadoop-3.1.1/etc/hadoop# cat hdfs-site.xml
<property>
<name>dfs.namenode.rpc-address</name>
<value>0.0.0.0:9000</value>
</property>
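Before restarting, it can help to confirm that the file really carries the new value. The sketch below reads one property out of a Hadoop-style XML file with awk. The helper name get_prop is hypothetical, and it assumes that <name> and <value> each sit on their own line as in the snippet above; it is not a general XML parser. The temporary file only stands in for hdfs-site.xml so that the example runs anywhere.

```shell
#!/usr/bin/env bash
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Hypothetical helper; assumes one tag per line.
get_prop() {
  local file="$1" name="$2"
  awk -v n="$name" '
    index($0, "<name>" n "</name>") { found = 1; next }
    found && /<value>/ {
      gsub(/.*<value>|<\/value>.*/, "")   # keep only the text between the tags
      print
      exit
    }
  ' "$file"
}

# Stand-in for /opt/hadoop-3.1.1/etc/hadoop/hdfs-site.xml
tmp="$(mktemp)"
cat > "$tmp" <<'EOF'
<configuration>
<property>
<name>dfs.namenode.rpc-address</name>
<value>0.0.0.0:9000</value>
</property>
</configuration>
EOF

get_prop "$tmp" dfs.namenode.rpc-address   # prints 0.0.0.0:9000
rm -f "$tmp"
```

Inside the Hadoop container, `hdfs getconf -confKey dfs.namenode.rpc-address` should report the value that Hadoop itself resolves, which is a more direct check when the hdfs command is on the PATH.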
Start Hadoop
bash
root@hive312-deployment-656c8d98f8-fv676:/opt/hadoop-3.1.1/etc/hadoop# su - hadoop
hadoop@hive312-deployment-656c8d98f8-fv676:~$ cd /opt/hadoop-3.1.1/sbin/
hadoop@hive312-deployment-656c8d98f8-fv676:/opt/hadoop-3.1.1/sbin$ ./start-all.sh
Check the process again. The listener now shows *:9000, which means connections from any source are accepted.
bash
root@hive312-deployment-656c8d98f8-fv676:/opt/hadoop-3.1.1/etc/hadoop# lsof -i :9000
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
java 3396 hadoop 270u IPv4 2124636242 0t0 TCP *:9000 (LISTEN)
java 3396 hadoop 280u IPv4 2124655660 0t0 TCP hive312-deployment-656c8d98f8-fv676:9000->hive312-deployment-656c8d98f8-fv676:37284 (ESTABLISHED)
java 3491 hadoop 295u IPv4 2124641848 0t0 TCP hive312-deployment-656c8d98f8-fv676:37284->hive312-deployment-656c8d98f8-fv676:9000 (ESTABLISHED)
Access test
From a container in a pod in a different namespace, connect to port 9000 of this container. The test succeeds.
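The original test command is not recorded, so the following is only one possible way to run such a check from the client pod. It relies on bash's /dev/tcp redirection and coreutils timeout, both of which are assumed to exist in the client container. The helper name port_open and the TARGET_HOST variable are hypothetical; set TARGET_HOST to the Hadoop pod's IP or to whatever name resolves to it in your cluster.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection opens within 3 seconds.
# Hypothetical helper; needs bash and coreutils `timeout`.
port_open() {
  local host="$1" port="$2"
  # The inner bash receives host as $0 and port as $1.
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# TARGET_HOST is a placeholder; it defaults to loopback so the script runs anywhere.
target="${TARGET_HOST:-127.0.0.1}"
if port_open "$target" 9000; then
  echo "port 9000 on $target is reachable"
else
  echo "port 9000 on $target is NOT reachable"
fi
```

This only proves that the TCP handshake completes. It does not exercise the HDFS RPC protocol, so a real client call is still the stronger test.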