Spark Catalog

#iceberg catalog

https://iceberg.apache.org/docs/latest/spark-configuration/
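
As a rough illustration of what that configuration page covers, the sketch below registers an Iceberg catalog on a `SparkSession` through `spark.sql.catalog.*` properties. The catalog name `local`, the `hadoop` catalog type, and the warehouse path are assumptions made for illustration, and the Iceberg Spark runtime jar is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object IcebergCatalogSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("iceberg-catalog-sketch")
      .master("local[*]")
      // Enable Iceberg's SQL extensions.
      .config("spark.sql.extensions",
        "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
      // Register a catalog named "local" (hypothetical name) backed by Iceberg.
      .config("spark.sql.catalog.local", "org.apache.iceberg.spark.SparkCatalog")
      // "hadoop" keeps table metadata on the file system; "hive" would use a Hive metastore.
      .config("spark.sql.catalog.local.type", "hadoop")
      // Hypothetical warehouse location.
      .config("spark.sql.catalog.local.warehouse", "/tmp/iceberg-warehouse")
      .getOrCreate()

    // Tables in this catalog are then addressed as local.<namespace>.<table>.
    spark.sql("CREATE TABLE IF NOT EXISTS local.db.events (id BIGINT, name STRING) USING iceberg")

    spark.stop()
  }
}
```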

Related interfaces

  /**
   * (Scala-specific)
   * Create a table from the given path based on a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.0.0
   */
  @deprecated("use createTable instead.", "2.2.0")
  def createExternalTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame = {
    createTable(tableName, source, schema, options)
  }

  /**
   * (Scala-specific)
   * Create a table based on the dataset in a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.2.0
   */
  def createTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame
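
A minimal sketch of calling the `createTable` overload shown above through `spark.catalog`. The table name `demo_events`, the `parquet` source, and the `/tmp/demo_events` path are hypothetical values chosen for illustration; a local Spark installation is assumed.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object CreateTableSketch {
  // Registers a table in the current database and returns it as a DataFrame.
  def createDemoTable(spark: SparkSession): DataFrame = {
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = true),
      StructField("name", StringType, nullable = true)
    ))

    spark.catalog.createTable(
      "demo_events",                     // unqualified name: resolved against the current database
      "parquet",                         // data source
      schema,
      Map("path" -> "/tmp/demo_events")  // hypothetical location of the table data
    )
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("create-table-sketch")
      .master("local[*]")
      .getOrCreate()

    val df = createDemoTable(spark)
    df.printSchema()

    spark.stop()
  }
}
```

The deprecated `createExternalTable` takes the same arguments and simply forwards to `createTable`, so existing callers can switch by renaming the method.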

Hive metastore

By default, the Hive metastore in Apache Spark uses an embedded Apache Derby database for persistence. It works with no configuration, but only one Spark session at a time can use it to store metadata. That makes it unsuitable for multi-user environments, such as a metastore shared across a development team or used in production.
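
One common way around the single-session limit is to point Spark at a standalone Hive metastore service instead of the embedded Derby database. The following is a sketch under assumptions: the thrift URI is a hypothetical host, a metastore service is assumed to be running there, and the `spark-hive` module is assumed to be on the classpath.

```scala
import org.apache.spark.sql.SparkSession

object ExternalMetastoreSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("external-metastore-sketch")
      // Hypothetical address of a shared Hive metastore service.
      .config("hive.metastore.uris", "thrift://metastore.example.com:9083")
      // Use the Hive catalog implementation instead of the in-memory one.
      .enableHiveSupport()
      .getOrCreate()

    // Every session configured this way sees the same databases and tables.
    spark.catalog.listDatabases().show()

    spark.stop()
  }
}
```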
