Spark Catalog

#iceberg catalog

https://iceberg.apache.org/docs/latest/spark-configuration/

相关接口

复制代码
  /**
   * (Scala-specific)
   * Create a table from the given path based on a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.0.0
   */
  @deprecated("use createTable instead.", "2.2.0")
  def createExternalTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame = {
    createTable(tableName, source, schema, options)
  }

  /**
   * (Scala-specific)
   * Create a table based on the dataset in a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.2.0
   */
  def createTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame

hive metastore

The default implementation of the Hive metastore in Apache Spark uses Apache Derby for its database persistence. This is available with no configuration required but is limited to only one Spark session at any time for the purposes of metadata storage. This obviously makes it unsuitable for use in multi-user environments, such as when shared on a development team or used in Production.

相关推荐
为儿打call4 小时前
SparkSQL 广播超时排查:小表但是多分区 = BroadcastTimeout
大数据·spark
计算机毕业编程指导师10 小时前
【Python大数据项目推荐】基于Hadoop+Django脑卒中风险分析系统源码解析 毕业设计 选题推荐 毕设选题 数据分析 机器学习 数据挖掘
大数据·hadoop·python·计算机·spark·毕业设计·脑卒中
计算机毕业编程指导师10 小时前
【大数据毕设推荐】Hadoop+Spark电影票房分析系统,Python+Django全栈实现 毕业设计 选题推荐 毕设选题 数据分析 机器学习 数据挖掘
大数据·hadoop·python·计算机·spark·毕业设计·电影票房
计算机毕业编程指导师1 天前
【计算机毕设推荐】Python+Spark卵巢癌风险数据可视化系统完整实现 毕业设计 选题推荐 毕设选题 数据分析 机器学习 数据挖掘
hadoop·python·计算机·数据挖掘·spark·毕业设计·卵巢癌
极光代码工作室1 天前
基于大数据的校园消费行为分析系统
大数据·hadoop·python·数据分析·spark
whuang0943 天前
腾讯云 emr 无法以cosn 写入云存储
spark
howard20054 天前
2.4.3 集群模式运行Spark项目
spark·项目打包·提交运行
孤雪心殇4 天前
快速上手数仓基础知识
数据仓库·hive·spark
渣渣盟4 天前
Spark 性能调优实战:从开发到生产落地
javascript·ajax·spark
渣渣盟5 天前
大数据技术栈全景图:从零到一的入门路线(深度实战版)
大数据·hadoop·python·flink·spark