Background: the NameNode was reformatted, which generated a new cluster ID.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /work1/home/hadoop/dfs/data/current/BP-1873526852-172.16.21.30-1692769875005 is in an inconsistent state: namespaceID is incompatible with others.
at org.apache.hadoop.hdfs.server.common.StorageInfo.setNamespaceID(StorageInfo.java:189)
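Solution:
1. On the NameNode, find the clusterID in ${dfs.namenode.name.dir}/current/VERSION (${dfs.namenode.name.dir} is defined in hdfs-site.xml).
2. On the failing DataNode, find the clusterID in ${dfs.datanode.data.dir}/current/VERSION and overwrite it with the clusterID obtained in step 1.
3. Restart the DataNode on the problem node. In my case the DataNode still failed to start and reported: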
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /work1/home/hadoop/dfs/data/current/BP-1873526852-172.16.21.30-1692769875005 is in an inconsistent state: namespaceID is incompatible with others.
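Steps 1 and 2 can be sketched in Python as below. This is only a minimal illustration, assuming the VERSION files are plain key=value text; the helper names read_field and set_field and the placeholder DataNode clusterID are hypothetical, and the strings stand in for the real files under ${dfs.namenode.name.dir}/current/VERSION and ${dfs.datanode.data.dir}/current/VERSION. Back up the real file before overwriting it.

```python
def read_field(version_text, key):
    """Return the value of `key` in a VERSION file's text, or None if absent."""
    for line in version_text.splitlines():
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == key:
            return value.strip()
    return None


def set_field(version_text, key, value):
    """Return the text with `key` set to `value`; all other lines are kept."""
    out = []
    for line in version_text.splitlines():
        if not line.startswith("#") and line.split("=", 1)[0].strip() == key:
            line = f"{key}={value}"
        out.append(line)
    return "\n".join(out) + "\n"


# Stand-in contents (assumption): the DataNode clusterID here is a placeholder.
namenode_version = (
    "clusterID=CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a\n"
    "storageType=NAME_NODE\n"
)
datanode_version = (
    "clusterID=CID-00000000-0000-0000-0000-000000000000\n"
    "storageType=DATA_NODE\n"
)

# Step 1: read the NameNode's clusterID. Step 2: write it into the DataNode file.
fixed = set_field(datanode_version, "clusterID",
                  read_field(namenode_version, "clusterID"))
print(fixed)
```

Working on the text rather than editing in place makes it easy to inspect the result before writing it back.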
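Delete the newly generated block pool directory. The main cause of the inconsistency is that the namespaceID was not changed back to the original value.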
rm -rf /work1/home/hadoop/dfs/data/current/BP-1873526852-172.16.21.30-1692769875005
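Before deleting anything, it may help to confirm which block pool directory is the newly generated one. The sketch below assumes the block pool ID has the form BP-<random number>-<NameNode IP>-<creation time in milliseconds>, which matches the two IDs seen in this post; the function name parse_block_pool_id is hypothetical, and IPv6 addresses are not handled.

```python
from datetime import datetime, timezone


def parse_block_pool_id(bp_id):
    """Split 'BP-<random>-<ip>-<millis>' into (random, ip, creation time in UTC)."""
    prefix, rand, ip, millis = bp_id.split("-")
    if prefix != "BP":
        raise ValueError(f"not a block pool ID: {bp_id}")
    created = datetime.fromtimestamp(int(millis) / 1000, tz=timezone.utc)
    return int(rand), ip, created


# The two block pool IDs that appear in the logs of this post.
old_bp = "BP-1073838461-172.16.20.24-1691128065381"
new_bp = "BP-1873526852-172.16.21.30-1692769875005"

for bp in (old_bp, new_bp):
    _, ip, created = parse_block_pool_id(bp)
    print(bp, "created on", ip, "at", created.isoformat())

# The ID with the latest creation time is the one produced by the reformat.
newest = max((old_bp, new_bp), key=lambda bp: parse_block_pool_id(bp)[2])
print("newest:", newest)
```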
After the deletion, file blocks went missing:
The number of live datanodes 1 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached.
So delete the affected files:
hdfs fsck /
hdfs dfsadmin -safemode leave
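hdfs dfs -rm -r /user/hadoop/input
In my case these were the affected files. If you hit the same problem, delete whichever files are actually affected on your cluster. If the data matters, export it and back it up first.
A new problem then appeared: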
java.io.IOException: Inconsistent checkpoint fields.
LV = -63 namespaceID = 648161912 cTime = 0 ; clusterId = CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a ; blockpoolId = BP-1873526852-172.16.21.30-1692769875005.
Expecting respectively: -63; 648161912; 0; CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a; BP-1073838461-172.16.20.24-1691128065381.
at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:134)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:531)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:395)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:361)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:357)
at java.lang.Thread.run(Thread.java:750)
Comparing the two lines field by field, only the blockpoolId differs:
Actual:   -63; 648161912; 0; CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a; BP-1873526852-172.16.21.30-1692769875005
Expected: -63; 648161912; 0; CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a; BP-1073838461-172.16.20.24-1691128065381
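The comparison can also be done mechanically. The following is a rough sketch that parses the two lines of the "Inconsistent checkpoint fields" message exactly as they appear in the log above and reports the fields that differ; the function name diff_checkpoint_fields is hypothetical and the parsing assumes this exact message layout.

```python
import re

# Field order used by the "Expecting respectively" line in the log above.
FIELDS = ["LV", "namespaceID", "cTime", "clusterId", "blockpoolId"]


def diff_checkpoint_fields(actual_line, expected_line):
    """Return {field: (actual, expected)} for every field that differs."""
    # Actual line looks like "LV = -63 namespaceID = ... ; blockpoolId = BP-...."
    actual = {k: v.rstrip(".")
              for k, v in re.findall(r"(\w+) = (\S+)", actual_line)}
    # Expected line looks like "Expecting respectively: -63; ...; BP-...."
    values = expected_line.split(":", 1)[1].split(";")
    expected = dict(zip(FIELDS, (v.strip().rstrip(".") for v in values)))
    return {f: (actual.get(f), expected.get(f))
            for f in FIELDS if actual.get(f) != expected.get(f)}


actual_line = (
    "LV = -63 namespaceID = 648161912 cTime = 0 ; "
    "clusterId = CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a ; "
    "blockpoolId = BP-1873526852-172.16.21.30-1692769875005."
)
expected_line = (
    "Expecting respectively: -63; 648161912; 0; "
    "CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a; "
    "BP-1073838461-172.16.20.24-1691128065381."
)

mismatches = diff_checkpoint_fields(actual_line, expected_line)
for field, (got, want) in mismatches.items():
    print(f"{field}: got {got}, expected {want}")
```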
Edit the name directory:
in /home/hadoop/dfs/name/current/VERSION, look at these entries
blockpoolID=BP-1873526852-172.16.21.30-1692769875005
layoutVersion=-63
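Do not change this -63 arbitrarily. The main idea is to keep all the VERSION files consistent with each other.
The final VERSION file in each directory is as follows.
name directory: contents of /home/hadoop/dfs/name/current/VERSION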
#Thu Aug 24 11:23:17 CST 2023
namespaceID=648161912
clusterID=CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a
cTime=0
storageType=NAME_NODE
blockpoolID=BP-1073838461-172.16.20.24-1691128065381
layoutVersion=-63
data directory:
/home/hadoop/dfs/data/current/VERSION
#Thu Aug 24 11:23:22 CST 2023
storageID=DS-11772fa3-8b9e-4b22-aa52-19946db147de
clusterID=CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a
cTime=0
datanodeUuid=1620ce70-27e6-4aff-9279-4c90fa703dbd
storageType=DATA_NODE
layoutVersion=-56
/home/hadoop/dfs/data/current/BP-1073838461-172.16.20.24-1691128065381/current/VERSION
#Thu Aug 24 11:23:22 CST 2023
namespaceID=648161912
cTime=0
blockpoolID=BP-1073838461-172.16.20.24-1691128065381
layoutVersion=-56
The fields to watch are blockpoolID, clusterID, and namespaceID. After formatting, the ID under the name directory changed, and it has to be set back to the old value.
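A consistency check over these three fields could look like the sketch below. It is an illustration only: the helper names parse_version and find_mismatches are hypothetical, the file contents are abbreviated copies of the VERSION files listed above, and a field is compared only among the files that actually define it (the DataNode's top-level VERSION has no namespaceID or blockpoolID).

```python
def parse_version(text):
    """Parse VERSION file text (key=value lines, '#' comments) into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def find_mismatches(files, keys=("clusterID", "namespaceID", "blockpoolID")):
    """Return {key: {label: value}} for keys whose values differ across files."""
    mismatches = {}
    for key in keys:
        seen = {label: p[key] for label, p in files.items() if key in p}
        if len(set(seen.values())) > 1:
            mismatches[key] = seen
    return mismatches


CID = "CID-46a67bdd-c7b0-4056-9c54-d82d5d84964a"
OLD_BP = "BP-1073838461-172.16.20.24-1691128065381"
NEW_BP = "BP-1873526852-172.16.21.30-1692769875005"

# Abbreviated copies of the final VERSION files shown in this post.
files = {
    "name": parse_version(
        f"namespaceID=648161912\nclusterID={CID}\nblockpoolID={OLD_BP}\n"),
    "data": parse_version(f"clusterID={CID}\nstorageType=DATA_NODE\n"),
    "data/BP": parse_version(
        f"namespaceID=648161912\nblockpoolID={OLD_BP}\n"),
}
print(find_mismatches(files))

# The state right after the reformat: the name directory carries the new ID.
broken = dict(files, name=dict(files["name"], blockpoolID=NEW_BP))
print(find_mismatches(broken))
```

Running a check like this before restarting the daemons shows which file still carries the post-format ID.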