Integrating Kyuubi with Spark on YARN

Overview

Goals:

  • 1. Run Kyuubi with its Spark SQL engine on YARN
  • 2. Enable dynamic resource allocation for the Spark engine on YARN

Note: versions used here are Kyuubi 1.8.0, Spark 3.4.2, and Hadoop 3.3.6.

For the prerequisite setup, see the following articles:

Article Link
Hadoop installation (one master, three workers) link
Spark on YARN link

Walkthrough

Download

Official download page

Configuration

Official documentation

Edit the configuration files

bash
# configuration files that need to be modified
[root@hadoop01 conf]# ls
kyuubi-defaults.conf.template  kyuubi-env.sh.template  log4j2.xml.template
[root@hadoop01 conf]# pwd
/data/hadoop/soft/apache-kyuubi-1.8.0/conf

# Step 1: enable the logging configuration
[root@hadoop01 conf]# mv log4j2.xml.template log4j2.xml
# Step 2: kyuubi-env.sh -- add the line below to it
#   export SPARK_HOME=/data/hadoop/soft/spark-3.4.2
[root@hadoop01 conf]# ls
kyuubi-defaults.conf.template  kyuubi-env.sh.template  log4j2.xml
[root@hadoop01 conf]# mv kyuubi-env.sh.template kyuubi-env.sh
[root@hadoop01 conf]# vi kyuubi-env.sh
# Step 3: kyuubi-defaults.conf
[root@hadoop01 conf]# mv kyuubi-defaults.conf.template kyuubi-defaults.conf
[root@hadoop01 conf]# vi kyuubi-defaults.conf

kyuubi.engine.type=SPARK_SQL
kyuubi.engine.share.level=USER
# ============ Spark Config ============
spark.master=yarn
spark.submit.deployMode=cluster
spark.driver.memory=4g
spark.executor.memory=4g
spark.executor.cores=2
# Dynamic allocation
spark.dynamicAllocation.enabled=true
# set to false if you prefer shuffle tracking over the external shuffle service (ESS)
spark.shuffle.service.enabled=true
# Spark raises initialExecutors to minExecutors when it is lower;
# with minExecutors == maxExecutors the executor count is pinned at 6
spark.dynamicAllocation.initialExecutors=2
spark.dynamicAllocation.minExecutors=6
spark.dynamicAllocation.maxExecutors=6
spark.dynamicAllocation.executorAllocationRatio=0.5
spark.dynamicAllocation.executorIdleTimeout=200s
spark.dynamicAllocation.cachedExecutorIdleTimeout=30min
# set to true if you prefer shuffle tracking over ESS
spark.dynamicAllocation.shuffleTracking.enabled=false
spark.dynamicAllocation.shuffleTracking.timeout=30min
spark.dynamicAllocation.schedulerBacklogTimeout=1s
spark.dynamicAllocation.sustainedSchedulerBacklogTimeout=1s
spark.cleaner.periodicGC.interval=5min
# AQE (adaptive query execution)
spark.sql.adaptive.enabled=true
spark.sql.adaptive.forceApply=false
spark.sql.adaptive.logLevel=info
spark.sql.adaptive.advisoryPartitionSizeInBytes=128m
spark.sql.adaptive.coalescePartitions.enabled=true
spark.sql.adaptive.coalescePartitions.minPartitionNum=30
spark.sql.adaptive.coalescePartitions.initialPartitionNum=5120
spark.sql.adaptive.fetchShuffleBlocksInBatch=true
spark.sql.adaptive.localShuffleReader.enabled=true
spark.sql.adaptive.skewJoin.enabled=true
spark.sql.adaptive.skewJoin.skewedPartitionFactor=3
spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes=256m
spark.sql.adaptive.nonEmptyPartitionRatioForBroadcastJoin=0.2
spark.sql.adaptive.optimizer.excludedRules
spark.sql.autoBroadcastJoinThreshold=-1
# limits on result data returned to the driver
spark.kryoserializer.buffer.max=2047m
spark.driver.maxResultSize=4096m
# Paimon
#spark.sql.catalog.hive.warehouse=hdfs:///data/paimon
spark.sql.catalog.paimon=org.apache.paimon.spark.SparkCatalog
spark.sql.catalog.paimon.warehouse=hdfs:///data/paimon
spark.sql.extensions=org.apache.paimon.spark.PaimonSparkSessionExtension


# Spark history server event logging
spark.eventLog.enabled=true
spark.eventLog.dir=hdfs://hadoop01:9000/spark-eventlog
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.eventLog.compress=true
# integrate with the YARN history server; event logs are written to HDFS
spark.yarn.historyServer.address=hadoop01:18080

# Spark UI retention settings
spark.ui.retainedJobs=50
spark.ui.retainedStages=300
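Since spark.shuffle.service.enabled=true is used above, every YARN NodeManager must also run Spark's external shuffle service, or executors will fail to register. A minimal sketch of the extra yarn-site.xml entries (the jar path below assumes this article's Spark 3.4.2 layout; adjust for your cluster):

```xml
<!-- yarn-site.xml on every NodeManager; keep any aux-services you already have -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
  <value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
```

Then copy $SPARK_HOME/yarn/spark-3.4.2-yarn-shuffle.jar onto each NodeManager's classpath (for example $HADOOP_HOME/share/hadoop/yarn/lib/) and restart the NodeManagers.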

Start

bash
[root@hadoop01 apache-kyuubi-1.8.0]# ./bin/kyuubi start
# connect with beeline (10009 is Kyuubi's default frontend port)
bin/beeline -u 'jdbc:hive2://10.32.36.142:10009/' -n root
24/02/26 15:47:30 INFO Client: Application report for application_1708505130791_0007 (state: RUNNING)
24/02/26 15:47:31 INFO Client: Application report for application_1708505130791_0007 (state: RUNNING)
24/02/26 15:47:32 INFO Client: Application report for application_1708505130791_0007 (state: RUNNING)
24/02/26 15:47:33 INFO Client: Application report for application_1708505130791_0007 (state: RUNNING)
2024-02-26 15:47:35.793 INFO KyuubiSessionManager-exec-pool: Thread-58 org.apache.kyuubi.ha.client.zookeeper.ZookeeperDiscoveryClient: Get service instance:hadoop04:39670 engine id:application_1708505130791_0007 and version:1.8.0 under /kyuubi_1.8.0_USER_SPARK_SQL/root/default
24/02/26 15:47:35 INFO Client: Application report for application_1708505130791_0007 (state: RUNNING)
Connected to: Spark SQL (version 3.4.2)
Driver: Kyuubi Project Hive JDBC Client (version 1.8.0)
Beeline version 1.8.0 by Apache Kyuubi
0: jdbc:hive2://10.32.36.142:10009/> 
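With the settings above (minExecutors = maxExecutors = 6, 2 cores and 4g per executor, plus a 4g driver in cluster mode), the engine's steady-state YARN footprint is easy to estimate. A back-of-the-envelope sketch (it ignores the per-container memory overhead YARN adds on top):

```shell
# steady-state resource estimate for one engine (container overhead not included)
executors=6      # minExecutors == maxExecutors pins the count
exec_cores=2     # spark.executor.cores
exec_mem_g=4     # spark.executor.memory
driver_mem_g=4   # spark.driver.memory (runs in the AM container in cluster mode)

total_cores=$(( executors * exec_cores ))
total_mem_g=$(( executors * exec_mem_g + driver_mem_g ))
echo "cores=${total_cores} memory=${total_mem_g}g"   # cores=12 memory=28g
```

With share level USER, each connecting user gets one such engine, so multiply accordingly when sizing the YARN queue.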
bash
0: jdbc:hive2://10.32.36.142:10009/> show databases;
2024-02-26 15:48:03.130 INFO KyuubiSessionManager-exec-pool: Thread-66 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[81c88029-474d-4ed2-98fc-21013bf10cc3]: PENDING_STATE -> RUNNING_STATE, statement:
show databases
2024-02-26 15:48:03.261 INFO KyuubiSessionManager-exec-pool: Thread-66 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[81c88029-474d-4ed2-98fc-21013bf10cc3]: RUNNING_STATE -> FINISHED_STATE, time taken: 0.131 seconds
+------------+
| namespace  |
+------------+
| default    |
+------------+
1 row selected (0.34 seconds)
0: jdbc:hive2://10.32.36.142:10009/> 


bash
set spark.sql.catalog.paimon=org.apache.paimon.spark.SparkCatalog;
set spark.sql.catalog.paimon.warehouse=hdfs:///data/paimon;


24/02/26 20:30:08 INFO ExecuteStatement: Execute in full collect mode
24/02/26 20:30:08 INFO V2ScanRelationPushDown: 
Pushing operators to trace_log_refdes_hive_ro
Pushed Filters: IsNotNull(id), EqualTo(id,11C0928D5A29E048E063AA2C200ABEF3)
Post-Scan Filters: isnotnull(id#354),(id#354 = 11C0928D5A29E048E063AA2C200ABEF3)
         
24/02/26 20:30:08 INFO V2ScanRelationPushDown: 
Output: pcbid#345, rid#346, refdes#347, bm_circuit_no#348, timestamp#349, pickupstatus#350, serial_number#351, flag#352, kitid#353, id#354, createdate#355, etl#356, opt1#357, opt2#358, opt3#359, opt4#360, opt5#361, nozzleid#362, laneno#363, componentbarcode#364, pn#365, lotcode#366, datecode#367, verdor#368, workorder#369, dt#370
         
24/02/26 20:30:08 INFO SparkContext: Starting job: collect at ExecuteStatement.scala:72
24/02/26 20:30:08 INFO SQLOperationListener: Query [4a52f7cf-f446-40b0-b16f-6b895ab84224]: Job 5 started with 1 stages, 1 active jobs running
24/02/26 20:30:08 INFO SQLOperationListener: Query [4a52f7cf-f446-40b0-b16f-6b895ab84224]: Stage 6.0 started with 1 tasks, 1 active stages running

35 million rows

Single-row lookup by primary key id

bash
0: jdbc:hive2://10.32.36.142:10009/> select count(*) from trace_log_refdes_hive_ro where id='11C0928D5A26E048E063AA2C200ABEF3';
24/02/27 08:43:33 INFO AdaptiveSparkPlanExec: Final plan:
*(1) HashAggregate(keys=[], functions=[count(1)], output=[count(1)#523L])
+- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#527L])
   +- *(1) Project
      +- *(1) Filter (isnotnull(id#505) AND (id#505 = 11C0928D5A26E048E063AA2C200ABEF3))
         +- BatchScan trace_log_refdes_hive_ro[id#505] PaimonScan: [trace_log_refdes_hive_ro], PushedFilters: [IsNotNull(id),Equal(id, 11C0928D5A26E048E063AA2C200ABEF3)] RuntimeFilters: []

24/02/27 08:43:33 INFO ExecuteStatement: Processing root's query[b4889a64-d943-41ae-819e-d8a0491c60c6]: RUNNING_STATE -> FINISHED_STATE, time taken: 1.871 seconds
24/02/27 08:43:33 INFO SQLOperationListener: Query [b4889a64-d943-41ae-819e-d8a0491c60c6]: Job 7 succeeded, 0 active jobs running
2024-02-27 08:43:33.005 INFO KyuubiSessionManager-exec-pool: Thread-496 org.apache.kyuubi.operation.ExecuteStatement: Query[b4889a64-d943-41ae-819e-d8a0491c60c6] in FINISHED_STATE
2024-02-27 08:43:33.006 INFO KyuubiSessionManager-exec-pool: Thread-496 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[b4889a64-d943-41ae-819e-d8a0491c60c6]: RUNNING_STATE -> FINISHED_STATE, time taken: 1.874 seconds
+-----------+
| count(1)  |
+-----------+
| 1         |
+-----------+
1 row selected (1.883 seconds)

count test on a non-primary-key column

bash
0: jdbc:hive2://10.32.36.142:10009/> select count(*) from trace_log_refdes_hive_ro where pcbid='E23MPM42201540';
24/02/27 08:41:56 INFO AdaptiveSparkPlanExec: Final plan:
*(2) HashAggregate(keys=[], functions=[count(1)], output=[count(1)#489L])
+- ShuffleQueryStage 0
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=109]
      +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#493L])
         +- *(1) Project
            +- *(1) Filter (isnotnull(pcbid#462) AND (pcbid#462 = E23MPM42201540))
               +- BatchScan trace_log_refdes_hive_ro[pcbid#462] PaimonScan: [trace_log_refdes_hive_ro], PushedFilters: [IsNotNull(pcbid),Equal(pcbid, E23MPM42201540)] RuntimeFilters: []

2024-02-27 08:41:56.274 INFO KyuubiSessionManager-exec-pool: Thread-495 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[f473dde1-ba80-4d3e-8dc7-fa0a3e32df1f]: RUNNING_STATE -> FINISHED_STATE, time taken: 5.368 seconds
+-----------+
| count(1)  |
+-----------+
| 108       |
+-----------+
1 row selected (5.393 seconds)

Roughly 60 million rows

bash
24/02/27 16:54:54 INFO SQLOperationListener: Query [96e1d2f4-8230-4cc7-9b84-ce888387bb7d]: Job 2 succeeded, 0 active jobs running
24/02/27 16:54:54 INFO AdaptiveSparkPlanExec: Final plan:
*(2) HashAggregate(keys=[], functions=[count(1)], output=[count(1)#61L])
+- ShuffleQueryStage 0
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=52]
      +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#64L])
         +- *(1) Project
            +- BatchScan trace_log_refdes_hive_ro[] PaimonScan: [trace_log_refdes_hive_ro] RuntimeFilters: []

24/02/27 16:54:54 INFO CodeGenerator: Code generated in 7.970386 ms
24/02/27 16:54:54 INFO ExecuteStatement: Processing root's query[96e1d2f4-8230-4cc7-9b84-ce888387bb7d]: RUNNING_STATE -> FINISHED_STATE, time taken: 21.921 seconds
2024-02-27 16:54:54.490 INFO KyuubiSessionManager-exec-pool: Thread-806 org.apache.kyuubi.operation.ExecuteStatement: Query[96e1d2f4-8230-4cc7-9b84-ce888387bb7d] in FINISHED_STATE
2024-02-27 16:54:54.490 INFO KyuubiSessionManager-exec-pool: Thread-806 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[96e1d2f4-8230-4cc7-9b84-ce888387bb7d]: RUNNING_STATE -> FINISHED_STATE, time taken: 21.924 seconds
+-----------+
| count(1)  |
+-----------+
| 59480487  |
+-----------+
1 row selected (21.942 seconds)
0: jdbc:hive2://10.32.36.142:10009/> select *  from trace_log_refdes_hive_ro limit 10;
24/02/27 16:55:27 INFO ExecuteStatement: Execute in full collect mode
24/02/27 16:55:27 INFO V2ScanRelationPushDown: 
Output: pcbid#67, rid#68, refdes#69, bm_circuit_no#70, timestamp#71, pickupstatus#72, serial_number#73, flag#74, kitid#75, id#76, createdate#77, etl#78, opt1#79, opt2#80, opt3#81, opt4#82, opt5#83, nozzleid#84, laneno#85, componentbarcode#86, pn#87, lotcode#88, datecode#89, verdor#90, workorder#91, dt#92
         
2024-02-27 16:55:28.414 INFO KyuubiSessionManager-exec-pool: Thread-807 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[cadc193d-1343-4973-828c-23f23250b3d3]: RUNNING_STATE -> FINISHED_STATE, time taken: 0.735 seconds
+-----------------+------------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
|      pcbid      |             xxx              | yyyyff  | zzzzzxxxxxxxx  |      timestamp       | gggggggfffff  |   ddddddddddddd    | cccc  | aaaaa  |                id                 |      createdate      | etl  | opt1  | opt2  | opt3  | opt4  | opt5  | nozzleid  | laneno  | componentbarcode  |  pn   | xyzzzzx  | xzzzzzzz  | txxxxx  | zxxxxxxxx  |     dt      |
+-----------------+------------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
| E23MPM42203175  | 514S00292-11420240109000060  | J0200   | 5              | 2024-02-23 18:16:42  | 0             | DLC4084004RPQVLAG  | 0     | NXT    | 11C0928D5A26E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203175  | 514S00292-11420240109000060  | J0200   | 4              | 2024-02-23 18:16:42  | 0             | DLC4084004SPQVLAF  | 0     | NXT    | 11C0928D5A27E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203175  | 514S00292-11420240117000057  | J0200   | 7              | 2024-02-23 18:16:42  | 0             | DLC4084004PPQVLAJ  | 0     | NXT    | 11C0928D5A28E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203175  | 514S00292-11420240117000057  | J0200   | 9              | 2024-02-23 18:16:42  | 0             | DLC4084004MPQVLAL  | 0     | NXT    | 11C0928D5A29E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203175  | 514S00292-11420240117000056  | J0200   | 12             | 2024-02-23 18:16:42  | 0             | DLC4084004VPQVLAC  | 0     | NXT    | 11C0928D5A2CE048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203176  | 514S00292-11420240117000056  | J0200   | 1              | 2024-02-23 18:16:42  | 0             | DLC4084005NPQVLAG  | 0     | NXT    | 11C0928D5A31E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203176  | 514S00292-11420240117000059  | J0200   | 4              | 2024-02-23 18:16:42  | 0             | DLC4084005GPQVLAN  | 0     | NXT    | 11C0928D5A34E048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42203176  | 514S00292-11420240116000073  | J0200   | 10             | 2024-02-23 18:16:42  | 0             | DLC4084005MPQVLAH  | 0     | NXT    | 11C0928D5A3AE048E063AA2C200ABEF3  | 2024-02-23 18:14:50  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42201540  | 117S0158-A420240115001804    | R0514   | 7              | 2024-02-23 18:16:42  | 0             | DLC40860DTVPQVLAW  | 0     | NXT    | 11C0928D5A59E048E063AA2C200ABEF3  | 2024-02-23 18:14:51  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
| E23MPM42201540  | 117S0158-A420240115001804    | R0401   | 1              | 2024-02-23 18:16:42  | 0             | DLC40860DU4PQVLAJ  | 0     | NXT    | 11C0928D5A5DE048E063AA2C200ABEF3  | 2024-02-23 18:14:51  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
+-----------------+------------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
10 rows selected (0.777 seconds)
0: jdbc:hive2://10.32.36.142:10009/> select *  from trace_log_refdes_hive_ro where id ='11C0928D5A5DE048E063AA2C200ABEF3';
24/02/27 16:55:49 INFO V2ScanRelationPushDown: 
Pushing operators to trace_log_refdes_hive_ro
Pushed Filters: IsNotNull(id), EqualTo(id,11C0928D5A5DE048E063AA2C200ABEF3)
Post-Scan Filters: isnotnull(id#181),(id#181 = 11C0928D5A5DE048E063AA2C200ABEF3)
         
24/02/27 16:55:49 INFO V2ScanRelationPushDown: 
Output: pcbid#172, rid#173, refdes#174, bm_circuit_no#175, timestamp#176, pickupstatus#177, serial_number#178, flag#179, kitid#180, id#181, createdate#182, etl#183, opt1#184, opt2#185, opt3#186, opt4#187, opt5#188, nozzleid#189, laneno#190, componentbarcode#191, pn#192, lotcode#193, datecode#194, verdor#195, workorder#196, dt#197
         
2024-02-27 16:55:52.191 INFO KyuubiSessionManager-exec-pool: Thread-808 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[1dd09511-efc5-4e64-ad2a-15d4b59fb553]: RUNNING_STATE -> FINISHED_STATE, time taken: 2.859 seconds
+-----------------+----------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
|      pcbid      |             xxx              | yyyyff  | zzzzzxxxxxxxx  |      timestamp       | gggggggfffff  |   ddddddddddddd    | cccc  | aaaaa  |                id                 |      createdate      | etl  | opt1  | opt2  | opt3  | opt4  | opt5  | nozzleid  | laneno  | componentbarcode  |  pn   | xyzzzzx  | xzzzzzzz  | txxxxx  | zxxxxxxxx  |     dt      |
+-----------------+----------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
| E23MPM42201540  | 117S0158-A420240115001804  | R0401   | 1              | 2024-02-23 18:16:42  | 0             | DLC40860DU4PQVLAJ  | 0     | NXT    | 11C0928D5A5DE048E063AA2C200ABEF3  | 2024-02-23 18:14:51  | N    | NULL  | NULL  | NULL  | NULL  | NULL  | NULL      | NULL    | NULL              | NULL  | NULL     | NULL      | NULL    | NULL       | 2024-02-23  |
+-----------------+----------------------------+---------+----------------+----------------------+---------------+--------------------+-------+--------+-----------------------------------+----------------------+------+-------+-------+-------+-------+-------+-----------+---------+-------------------+-------+----------+-----------+---------+------------+-------------+
1 row selected (2.875 seconds)
0: jdbc:hive2://10.32.36.142:10009/> select count(*)  from trace_log_refdes_hive_ro where pcbid ='E23MPM42203176';
2024-02-27 16:57:01.984 INFO KyuubiSessionManager-exec-pool: Thread-809 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[92377a4f-2dad-4039-be2d-4762946eaea8]: PENDING_STATE -> RUNNING_STATE, statement:
select count(*)  from trace_log_refdes_hive_ro where pcbid ='E23MPM42203176'
24/02/27 16:57:01 INFO ExecuteStatement: Processing root's query[92377a4f-2dad-
24/02/27 16:57:10 INFO SparkContext: Starting job: collect at ExecuteStatement.scala:72
24/02/27 16:57:10 INFO SQLOperationListener: Query [92377a4f-2dad-4039-be2d-4762946eaea8]: Job 6 started with 2 stages, 1 active jobs running
24/02/27 16:57:10 INFO SQLOperationListener: Query [92377a4f-2dad-4039-be2d-4762946eaea8]: Stage 8.0 started with 1 tasks, 1 active stages running
24/02/27 16:57:10 INFO SQLOperationListener: Finished stage: Stage(8, 0); Name: 'collect at ExecuteStatement.scala:72'; Status: succeeded; numTasks: 1; Took: 47 msec
24/02/27 16:57:10 INFO AdaptiveSparkPlanExec: Final plan:
*(2) HashAggregate(keys=[], functions=[count(1)], output=[count(1)#304L])
+- ShuffleQueryStage 0
   +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [plan_id=115]
      +- *(1) HashAggregate(keys=[], functions=[partial_count(1)], output=[count#308L])
         +- *(1) Project
            +- *(1) Filter (isnotnull(pcbid#277) AND (pcbid#277 = E23MPM42203176))
               +- BatchScan trace_log_refdes_hive_ro[pcbid#277] PaimonScan: [trace_log_refdes_hive_ro], PushedFilters: [IsNotNull(pcbid),Equal(pcbid, E23MPM42203176)] RuntimeFilters: []

2024-02-27 16:57:10.578 INFO KyuubiSessionManager-exec-pool: Thread-809 org.apache.kyuubi.operation.ExecuteStatement: Query[92377a4f-2dad-4039-be2d-4762946eaea8] in FINISHED_STATE
2024-02-27 16:57:10.578 INFO KyuubiSessionManager-exec-pool: Thread-809 org.apache.kyuubi.operation.ExecuteStatement: Processing root's query[92377a4f-2dad-4039-be2d-4762946eaea8]: RUNNING_STATE -> FINISHED_STATE, time taken: 8.594 seconds
+-----------+
| count(1)  |
+-----------+
| 3         |
+-----------+
1 row selected (8.604 seconds)
0: jdbc:hive2://10.32.36.142:10009/> 
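As a sanity check on the full-table count above (59,480,487 rows in roughly 22 seconds), the effective scan throughput can be derived from the numbers beeline reports (simple arithmetic, nothing cluster-specific):

```shell
# rows and wall time taken from the count(*) run above
rows=59480487
secs=21.942
awk -v r="$rows" -v s="$secs" 'BEGIN { printf "%.1f million rows/s\n", r / s / 1e6 }'
# prints: 2.7 million rows/s
```

Spread across the 12 executor cores configured above, that is a few hundred thousand rows per second per core, which is a reasonable baseline for a full Paimon table scan without filters.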

Wrap-up

That concludes integrating Kyuubi with Spark on YARN.
