Hadoop集群找不到native-hadoop

1.问题描述

bash 复制代码
========hive 运行中的问题,需要把把native复制进去 /usr/lib
2023-02-15 19:59:42,165 WARN scheduler.TaskSetManager: Lost task 11.0 in stage 1.0 (TID 3, common4, executor 2): java.lang.RuntimeException: Hive Runtime Error while closing operators: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: SequenceFile doesn't work with GzipCodec without native-hadoop code!
at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:626)
at org.apache.hadoop.hive.ql.exec.spark.HiveReduceFunctionResultList.closeRecordProcessor(HiveReduceFunctionResultList.java:67)
at org.apache.hadoop.hive.ql.exec.spark.HiveBaseFunctionResultList.hasNext(HiveBaseFunctionResultList.java:96)
at scala.collection.convert.Wrappers$JIteratorWrapper.hasNext(Wrappers.scala:43)
at scala.collection.Iterator.foreach(Iterator.scala:941)
at scala.collection.Iterator.foreach$(Iterator.scala:941)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
at org.apache.spark.rdd.AsyncRDDActions.$anonfun$foreachAsync$2(AsyncRDDActions.scala:127)
at org.apache.spark.rdd.AsyncRDDActions.$anonfun$foreachAsync$2$adapted(AsyncRDDActions.scala:127)
at org.apache.spark.SparkContext.$anonfun$submitJob$1(SparkContext.scala:2242)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:127)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:444)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:447)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: SequenceFile doesn't work with GzipCodec without native-hadoop code!
at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1112)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:733)
at org.apache.hadoop.hive.ql.exec.spark.SparkReduceRecordHandler.close(SparkReduceRecordHandler.java:610)
... 17 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: SequenceFile doesn't work with GzipCodec without native-hadoop code!
at org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1086)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1109)
... 19 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.IllegalArgumentException: SequenceFile doesn't work with GzipCodec without native-hadoop code!
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:742)
at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:897)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:995)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:941)
at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:928)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:1050)
at org.apache.hadoop.hive.ql.exec.GroupByOperator.flush(GroupByOperator.java:1076)
... 20 more

2.原因分析

找不到压缩类型支持的包,hadoop的依赖没有找到

3.问题解决

bash 复制代码
#cp -d 表示带软连接复制
sudo cp -d /data/module/hadoop-3.3.4/lib/native/lib* /usr/lib/
sudo chown hadoop:hadoop /usr/lib/lib*
#有必要查看libhadoop.so.1.0.0是否是空的,很重要。。。。

测试

bash 复制代码
(base) [hadoop@hadoop1 native]$ hadoop checknative
2023-12-25 14:20:21,615 INFO bzip2.Bzip2Factory: Successfully loaded & initialized native-bzip2 library system-native
2023-12-25 14:20:21,618 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2023-12-25 14:20:21,623 WARN erasurecode.ErasureCodeNative: Loading ISA-L failed: Failed to load libisal.so.2 (libisal.so.2: cannot open shared object file: No such file or directory)
2023-12-25 14:20:21,623 WARN erasurecode.ErasureCodeNative: ISA-L support is not available in your platform... using builtin-java codec where applicable
2023-12-25 14:20:21,658 INFO nativeio.NativeIO: The native code was built without PMDK support.
Native library checking:
hadoop:  true /data/module/hadoop-3.3.4/lib/native/libhadoop.so.1.0.0
zlib:    true /lib64/libz.so.1
zstd  :  true /lib64/libzstd.so.1
bzip2:   true /lib64/libbz2.so.1
openssl: false Cannot load libcrypto.so (libcrypto.so: cannot open shared object file: No such file or directory)!
ISA-L:   false Loading ISA-L failed: Failed to load libisal.so.2 (libisal.so.2: cannot open shared object file: No such file or directory)
PMDK:    false The native code was built without PMDK support.
相关推荐
hqyjzsb1 小时前
2025文职转行AI管理岗:衔接型认证成为关键路径
大数据·c语言·人工智能·信息可视化·媒体·caie
sniper_fandc2 小时前
Elasticsearch从入门到进阶——分布式特性
大数据·分布式·elasticsearch
YangYang9YangYan3 小时前
大专计算机技术专业就业方向:解读、规划与提升指南
大数据·人工智能·数据分析
扫地的小何尚3 小时前
AI创新的火花:NVIDIA DGX Spark开箱与深度解析
大数据·人工智能·spark·llm·gpu·nvidia·dgx
B站_计算机毕业设计之家3 小时前
spark实战:python股票数据分析可视化系统 Flask框架 金融数据分析 Echarts可视化 大数据技术 ✅
大数据·爬虫·python·金融·数据分析·spark·股票
hzp6663 小时前
spark动态分区参数spark.sql.sources.partitionOverwriteMode
大数据·hive·分布式·spark·etl·partitionover
yumgpkpm7 小时前
CMP(类ClouderaCDP7.3(404次编译) )完全支持华为鲲鹏Aarch64(ARM),粉丝数超过200就开源下载
hive·hadoop·redis·mongodb·elasticsearch·hbase·big data
0和1的舞者8 小时前
《Git:从入门到精通(八)——企业级git开发相关内容》
大数据·开发语言·git·搜索引擎·全文检索·软件工程·初学者
运维行者_9 小时前
AWS云服务故障复盘——从故障中汲取的 IT 运维经验
大数据·linux·运维·服务器·人工智能·云计算·aws
TDengine (老段)9 小时前
TDengine 配置参数作用范围对比
大数据·数据库·物联网·时序数据库·tdengine·涛思数据