spark-thrift-server "Wrong FS" error

### Table of Contents

  • Error details
  • Root cause
  • Inspect the Hive metadata
  • Fix the spark-thrift-server configuration
  • Fix the Hive metadata

### Error details

Running a DROP TABLE statement through spark-thrift-server fails with the following error:

```
Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Wrong FS: hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db/dt_segment, expected: hdfs://hadoopmaster) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) 
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) 
    at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) 
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Wrong FS: hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db/dt_segment, expected: hdfs://hadoopmaster)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
    at org.apache.spark.sql.hive.HiveExternalCatalog.dropTable(HiveExternalCatalog.scala:517) 
    at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.dropTable(ExternalCatalogWithListener.scala:104) 
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.dropTable(SessionCatalog.scala:778) 
    at org.apache.spark.sql.execution.command.DropTableCommand.run(ddl.scala:248) 
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) 
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) 
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) 
    at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
    at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3687)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103) 
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163) 
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772) 
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) 
    at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3685) 
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:228) 
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99) 
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772) 
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96) 
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:615) 
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772) 
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:610) 
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:650) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:325)
    ... 16 more
```

### Root cause

  • Hadoop runs in HA mode with two NameNodes, and the --conf spark.sql.warehouse.dir address configured for spark-thrift-server points at one specific NameNode; it has to be changed to the nameservice address (see the sketch after this list for how to confirm that address).
  • The hive-metastore itself is configured with the nameservice address, so the Hive metadata written through spark-thrift-server is inconsistent with it: creating databases and tables and querying them still works, but dropping a table fails with "Wrong FS".
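
In an HA cluster, clients should address HDFS through the nameservice URI rather than an individual NameNode. A minimal sketch for confirming what the cluster expects, assuming the Hadoop client and its configuration are available on the node (the output shown is illustrative):

```bash
# Logical filesystem URI the cluster expects (the HA nameservice).
hdfs getconf -confKey fs.defaultFS
# e.g. hdfs://hadoopmaster

# NameNode hosts behind the nameservice.
hdfs getconf -namenodes
```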

### Inspect the Hive metadata

  • hive.dbs - Hive database-level metadata
  • hive.sds - Hive table-level (storage descriptor) metadata, including table locations

Check the default HDFS path:

```sql
select * from hive.dbs where NAME='default';
```

The default HDFS location uses the nameservice address:

```
+-------+-----------------------+-----------------------------------------+---------+------------+------------+ 
| DB_ID | DESC                  | DB_LOCATION_URI                         | NAME    | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+-----------------------------------------+---------+------------+------------+ 
| 1     | Default Hive database | hdfs://hadoopmaster/user/hive/warehouse | default | public     | ROLE       | 
+-------+-----------------------+-----------------------------------------+---------+------------+------------+
```

Check the incorrect HDFS location:

```sql
select * from hive.dbs where DB_LOCATION_URI like '%RMSS02ETL%';
```

The incorrect HDFS location points directly at a NameNode address:

```
+-------+------+--------------------------------------------------------+-----------+------------+------------+ 
| DB_ID | DESC | DB_LOCATION_URI                                        | NAME      | OWNER_NAME | OWNER_TYPE |
+-------+------+--------------------------------------------------------+-----------+------------+------------+ 
|    12 |      | hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db | meta_data | hive       | USER       |
+-------+------+--------------------------------------------------------+-----------+------------+------------+
```

Check the Hive table metadata:

```sql
select * from hive.sds where LOCATION like '%RMSS02ETL%' \G
```

The LOCATION field also points at the NameNode address:

```
                    SD_ID: 3768
                    CD_ID: 378
             INPUT_FORMAT: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
            IS_COMPRESSED:
IS_STOREDASSUBDIRECTORIES:
                 LOCATION: hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db/dt_segment
              NUM_BUCKETS: -1
            OUTPUT_FORMAT: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
                 SERDE_ID: 3768
```

### Fix the spark-thrift-server configuration

Change the --conf spark.sql.warehouse.dir parameter to the nameservice address (hdfs://hadoopmaster/user/hive/warehouse here), then restart spark-thrift-server for the change to take effect. For example:
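
A minimal sketch of the restart with the corrected flag, assuming the server is started via Spark's bundled start-thriftserver.sh; the script location and any other flags are deployment-specific assumptions, only the warehouse URI comes from this cluster:

```bash
# Point the warehouse at the HA nameservice instead of a single NameNode.
$SPARK_HOME/sbin/stop-thriftserver.sh
$SPARK_HOME/sbin/start-thriftserver.sh \
  --conf spark.sql.warehouse.dir=hdfs://hadoopmaster/user/hive/warehouse
```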

### Fix the Hive metadata

Fix the Hive database metadata:

```sql
update hive.dbs set DB_LOCATION_URI=REPLACE(DB_LOCATION_URI,'RMSS02ETL:9000','hadoopmaster');
```

Fix the Hive table metadata:

```sql
update hive.sds set LOCATION=REPLACE(LOCATION,'RMSS02ETL:9000','hadoopmaster');
```
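
Before retrying anything, it is worth confirming that no metastore rows still reference the old NameNode address. A small sketch, assuming the metastore lives in a MySQL database reachable with the mysql client (user and connection details are placeholders):

```bash
# Both counts should be 0 after the two UPDATE statements.
mysql -u root -p -e "SELECT COUNT(*) FROM hive.dbs WHERE DB_LOCATION_URI LIKE '%RMSS02ETL%';"
mysql -u root -p -e "SELECT COUNT(*) FROM hive.sds WHERE LOCATION LIKE '%RMSS02ETL%';"
```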

Finally, retry the DROP TABLE statement; it now succeeds.
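
For an end-to-end check, the DROP TABLE can be issued through beeline against the Thrift server; the JDBC URL, port 10000, and user name below are assumptions about this deployment:

```bash
beeline -u jdbc:hive2://localhost:10000 -n hive \
  -e "DROP TABLE IF EXISTS meta_data.dt_segment;"
```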
