Spark Catalog

#iceberg catalog

https://iceberg.apache.org/docs/latest/spark-configuration/
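
As a rough illustration of what the page linked above covers, the sketch below builds the Spark properties that register an Iceberg catalog. The `spark.sql.catalog.<name>` key layout and the `org.apache.iceberg.spark.SparkCatalog` class come from the Iceberg documentation; the object name `IcebergCatalogConf`, the catalog name `demo`, and the example URI and warehouse path are assumptions made up for this example.

```scala
// Hypothetical helper: builds Spark properties for an Iceberg catalog.
// Nothing here talks to Spark; it only assembles key/value pairs.
object IcebergCatalogConf {

  // Catalog backed by a Hive metastore (type = hive).
  def hiveCatalog(name: String, metastoreUri: String): Map[String, String] = {
    val prefix = s"spark.sql.catalog.$name"
    Map(
      prefix -> "org.apache.iceberg.spark.SparkCatalog",
      s"$prefix.type" -> "hive",
      s"$prefix.uri" -> metastoreUri
    )
  }

  // Catalog backed by a directory on a file system (type = hadoop).
  def hadoopCatalog(name: String, warehouse: String): Map[String, String] = {
    val prefix = s"spark.sql.catalog.$name"
    Map(
      prefix -> "org.apache.iceberg.spark.SparkCatalog",
      s"$prefix.type" -> "hadoop",
      s"$prefix.warehouse" -> warehouse
    )
  }
}
```

Each pair can then be passed to `SparkSession.builder().config(key, value)` or to `spark-submit --conf key=value`, provided the Iceberg Spark runtime jar is on the classpath.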

Related interfaces

  /**
   * (Scala-specific)
   * Create a table from the given path based on a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.0.0
   */
  @deprecated("use createTable instead.", "2.2.0")
  def createExternalTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame = {
    createTable(tableName, source, schema, options)
  }

  /**
   * (Scala-specific)
   * Create a table based on the dataset in a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.2.0
   */
  def createTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame
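
The `createTable` signature above can be exercised roughly as follows. This is a minimal sketch, assuming a local Spark session without Hive support; the object name `CreateTableSketch`, the table name `events`, the two-column schema, and the temporary directory are all made up for the example.

```scala
import java.nio.file.Files

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object CreateTableSketch {

  // Returns whether the table was registered, plus its column names.
  def run(): (Boolean, Seq[String]) = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("create-table-sketch")
      .getOrCreate()
    import spark.implicits._

    try {
      // Write a tiny Parquet dataset so the table has something to point at.
      val dataDir = Files.createTempDirectory("events_").resolve("data").toString
      Seq((1, "a"), (2, "b")).toDF("id", "name").write.parquet(dataDir)

      val schema = StructType(Seq(
        StructField("id", IntegerType),
        StructField("name", StringType)))

      // Supplying a "path" option registers the table over the existing files.
      val df = spark.catalog.createTable(
        "events", "parquet", schema, Map("path" -> dataDir))

      (spark.catalog.tableExists("events"), df.columns.toSeq)
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The deprecated `createExternalTable` overload takes the same arguments and simply forwards to `createTable`, so the call above is the replacement for both.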

Hive metastore

By default, the Hive metastore in Apache Spark uses an embedded Apache Derby database for persistence. It works with no configuration, but only one Spark session at a time can use it for metadata storage. That makes it unsuitable for multi-user environments, such as a metastore shared by a development team or used in production.
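
One common way around the single-session limit is to point Spark at a standalone Hive metastore service instead of the embedded Derby database. The sketch below assumes such a service is already running and that the `spark-hive` module is on the classpath; the object name, the `METASTORE_URI` environment variable, and the host `metastore-host:9083` are placeholders, not values from any real deployment.

```scala
import org.apache.spark.sql.SparkSession

object SharedMetastoreSketch {
  def main(args: Array[String]): Unit = {
    // Placeholder address of a remote Hive metastore (Thrift endpoint).
    val metastoreUri =
      sys.env.getOrElse("METASTORE_URI", "thrift://metastore-host:9083")

    val spark = SparkSession.builder()
      .appName("shared-metastore-sketch")
      .master("local[1]")
      // Use the remote metastore instead of the embedded Derby database.
      .config("hive.metastore.uris", metastoreUri)
      .enableHiveSupport()
      .getOrCreate()

    // Databases listed here come from the shared metastore, so several
    // sessions can read and write the same metadata concurrently.
    spark.catalog.listDatabases().show()

    spark.stop()
  }
}
```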
