spark-thrift-server error: Wrong FS

The error

Running a DROP TABLE statement through spark-thrift-server fails with the following error:

Error: org.apache.hive.service.cli.HiveSQLException: Error running query: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Wrong FS: hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db/dt_segment, expected: hdfs://hadoopmaster) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:361) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.$anonfun$run$2(SparkExecuteStatementOperation.scala:263) 
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties(SparkOperation.scala:78) 
    at org.apache.spark.sql.hive.thriftserver.SparkOperation.withLocalProperties$(SparkOperation.scala:62) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.withLocalProperties(SparkExecuteStatementOperation.scala:43) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:263) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2$$anon$3.run(SparkExecuteStatementOperation.scala:258) 
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$2.run(SparkExecuteStatementOperation.scala:272) 
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Wrong FS: hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db/dt_segment, expected: hdfs://hadoopmaster)
    at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
    at org.apache.spark.sql.hive.HiveExternalCatalog.dropTable(HiveExternalCatalog.scala:517) 
    at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.dropTable(ExternalCatalogWithListener.scala:104) 
    at org.apache.spark.sql.catalyst.catalog.SessionCatalog.dropTable(SessionCatalog.scala:778) 
    at org.apache.spark.sql.execution.command.DropTableCommand.run(ddl.scala:248) 
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70) 
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68) 
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79) 
    at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
    at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3687)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103) 
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163) 
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772) 
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64) 
    at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3685) 
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:228) 
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99) 
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772) 
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96) 
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:615) 
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772) 
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:610) 
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:650) 
    at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:325)
    ... 16 more

Root cause

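  • Hadoop runs in HA mode with two NameNodes. The --conf spark.sql.warehouse.dir setting of spark-thrift-server pointed at the address of one of those NameNodes; it needs to be changed to the nameservice address.
  • hive-metastore, by contrast, is configured with the nameservice address, so the Hive metadata written through spark-thrift-server is inconsistent with it. Creating databases and tables and running queries all work, but dropping a table fails.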
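
To make the failure more concrete, here is a minimal Python sketch of the kind of check that produces this message. It is not Hadoop's actual implementation: it assumes the check boils down to comparing the scheme and authority of a path with those of the filesystem the path is handed to, and it ignores details such as default ports. The function name check_path is hypothetical; the URIs are the ones from the error above.

```python
from urllib.parse import urlparse


def check_path(path, fs_uri):
    """Raise ValueError if `path` does not belong to the filesystem `fs_uri`.

    Simplified assumption: only scheme and authority (host[:port]) are compared.
    """
    p, fs = urlparse(path), urlparse(fs_uri)
    same_scheme = p.scheme.lower() == fs.scheme.lower()
    same_authority = p.netloc.lower() == fs.netloc.lower()
    if not (same_scheme and same_authority):
        raise ValueError(f"Wrong FS: {path}, expected: {fs_uri}")


# The metastore works against the nameservice, so this path passes the check.
check_path("hdfs://hadoopmaster/user/hive/warehouse", "hdfs://hadoopmaster")

# The table location recorded with a single NameNode address does not.
try:
    check_path(
        "hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db/dt_segment",
        "hdfs://hadoopmaster",
    )
except ValueError as err:
    print(err)
```

Under this reading, a table can be created and queried because those operations do not trip the metastore-side comparison, while DROP TABLE makes the metastore handle the stored location and the mismatch surfaces.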

Inspecting the Hive metadata

  • hive.dbs: Hive database metadata
  • hive.sds: Hive table metadata

Check the default HDFS path (these queries run in the MySQL metastore database):

 select * from hive.dbs where NAME='default'; 

The default database location goes through the nameservice:

+-------+-----------------------+-----------------------------------------+---------+------------+------------+ 
| DB_ID | DESC                  | DB_LOCATION_URI                         | NAME    | OWNER_NAME | OWNER_TYPE |
+-------+-----------------------+-----------------------------------------+---------+------------+------------+ 
| 1     | Default Hive database | hdfs://hadoopmaster/user/hive/warehouse | default | public     | ROLE       | 
+-------+-----------------------+-----------------------------------------+---------+------------+------------+ 

Look for the wrong HDFS addresses:

select * from hive.dbs where DB_LOCATION_URI like '%RMSS02ETL%'; 

The wrong location uses a single NameNode address:

+-------+------+--------------------------------------------------------+-----------+------------+------------+ 
| DB_ID | DESC | DB_LOCATION_URI                                        | NAME      | OWNER_NAME | OWNER_TYPE |
+-------+------+--------------------------------------------------------+-----------+------------+------------+ 
|    12 |      | hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db | meta_data | hive       |     USER   |
+-------+------+--------------------------------------------------------+-----------+------------+------------+

Check the Hive table metadata:

select * from hive.sds where LOCATION like '%RMSS02ETL%' \G

The LOCATION field also holds the NameNode address:

                    SD_ID: 3768
                    CD_ID: 378
             INPUT_FORMAT: org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
            IS_COMPRESSED:
IS_STOREDASSUBDIRECTORIES:
                 LOCATION: hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db/dt_segment
              NUM_BUCKETS: -1
            OUTPUT_FORMAT: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
                 SERDE_ID: 3768

Fixing the spark-thrift-server configuration

Change the --conf spark.sql.warehouse.dir parameter to the nameservice address, then restart spark-thrift-server so the setting takes effect.
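
As a rough illustration, the Python sketch below builds the --conf argument and rejects an address that carries an explicit port. It rests on two assumptions: that the warehouse path matches the DB_LOCATION_URI of the default database shown earlier (hdfs://hadoopmaster/user/hive/warehouse), and that a logical nameservice URI carries no port while a single NameNode address does. The helper name warehouse_conf_args is hypothetical.

```python
from urllib.parse import urlparse

# Assumption: same path as the default database's DB_LOCATION_URI.
WAREHOUSE_DIR = "hdfs://hadoopmaster/user/hive/warehouse"


def warehouse_conf_args(warehouse_dir):
    """Return the --conf arguments for spark.sql.warehouse.dir.

    Heuristic (an assumption, not a Spark feature): a URI with an explicit
    port is treated as a single NameNode address and rejected.
    """
    parsed = urlparse(warehouse_dir)
    if parsed.port is not None:
        raise ValueError(
            f"{warehouse_dir} looks like a single NameNode address "
            f"(port {parsed.port}); use the nameservice instead"
        )
    return ["--conf", f"spark.sql.warehouse.dir={warehouse_dir}"]


# Arguments to append to whatever command starts spark-thrift-server.
print(" ".join(warehouse_conf_args(WAREHOUSE_DIR)))
```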

Fixing the Hive metadata

Update the Hive database metadata:

update hive.dbs set DB_LOCATION_URI=REPLACE(DB_LOCATION_URI,'RMSS02ETL:9000','hadoopmaster'); 

Update the Hive table metadata:

update hive.sds set LOCATION=REPLACE(LOCATION,'RMSS02ETL:9000','hadoopmaster');
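
Before running updates like these against a live metastore, it can be worth rehearsing them on a copy. The Python sketch below does so in an in-memory SQLite database with simplified stand-ins for the two tables: the real tables have more columns, and SQLite is used only because it is self-contained, whereas the metastore in this post is MySQL. It also adds a WHERE ... LIKE filter, which is not part of the original statements, so that the reported row counts cover only the rows that contain the old address. The function name rehearse_fix is hypothetical; the sample rows are taken from the query results above.

```python
import sqlite3

OLD_AUTHORITY = "RMSS02ETL:9000"   # single NameNode address found in the metadata
NEW_AUTHORITY = "hadoopmaster"     # nameservice expected by hive-metastore


def rehearse_fix():
    """Apply both REPLACE updates to toy copies of dbs/sds and report the outcome."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-ins: only the columns relevant to the fix are modelled.
    conn.execute("CREATE TABLE dbs (DB_ID INTEGER, NAME TEXT, DB_LOCATION_URI TEXT)")
    conn.execute("CREATE TABLE sds (SD_ID INTEGER, LOCATION TEXT)")
    conn.executemany(
        "INSERT INTO dbs VALUES (?, ?, ?)",
        [
            (1, "default", "hdfs://hadoopmaster/user/hive/warehouse"),
            (12, "meta_data", "hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db"),
        ],
    )
    conn.execute(
        "INSERT INTO sds VALUES (?, ?)",
        (3768, "hdfs://RMSS02ETL:9000/user/hive/warehouse/meta_data.db/dt_segment"),
    )

    pattern = "%" + OLD_AUTHORITY + "%"
    dbs_changed = conn.execute(
        "UPDATE dbs SET DB_LOCATION_URI = REPLACE(DB_LOCATION_URI, ?, ?) "
        "WHERE DB_LOCATION_URI LIKE ?",
        (OLD_AUTHORITY, NEW_AUTHORITY, pattern),
    ).rowcount
    sds_changed = conn.execute(
        "UPDATE sds SET LOCATION = REPLACE(LOCATION, ?, ?) WHERE LOCATION LIKE ?",
        (OLD_AUTHORITY, NEW_AUTHORITY, pattern),
    ).rowcount

    result = {
        "dbs_changed": dbs_changed,
        "sds_changed": sds_changed,
        "dbs": conn.execute(
            "SELECT NAME, DB_LOCATION_URI FROM dbs ORDER BY DB_ID"
        ).fetchall(),
        "sds": conn.execute("SELECT SD_ID, LOCATION FROM sds").fetchall(),
    }
    conn.close()
    return result


if __name__ == "__main__":
    print(rehearse_fix())
```

Comparing the changed-row counts with the number of rows returned by the earlier LIKE '%RMSS02ETL%' queries is a cheap sanity check before touching the real tables.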

Finally, retry the DROP TABLE statement; it now succeeds.