Spark Catalog

#iceberg catalog

https://iceberg.apache.org/docs/latest/spark-configuration/
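As a rough illustration of the kind of settings the page above documents, the sketch below registers an Iceberg catalog through Spark properties. The catalog name `local`, the `hadoop` catalog type, and the warehouse path are assumptions chosen for illustration, and the Iceberg Spark runtime jar is assumed to be on the classpath once a session is built from this configuration.

```scala
import org.apache.spark.SparkConf

object IcebergCatalogConf {
  // Hypothetical catalog name; tables would be addressed as local.<db>.<table>.
  val catalogName = "local"

  // Builds only the configuration; no SparkSession is started here.
  def build(warehousePath: String): SparkConf =
    new SparkConf(false) // false: do not load spark.* system properties
      .set(s"spark.sql.catalog.$catalogName", "org.apache.iceberg.spark.SparkCatalog")
      .set(s"spark.sql.catalog.$catalogName.type", "hadoop")
      .set(s"spark.sql.catalog.$catalogName.warehouse", warehousePath)
}
```

The resulting `SparkConf` can be passed to `SparkSession.builder().config(conf)`.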

Related interfaces

  /**
   * (Scala-specific)
   * Create a table from the given path based on a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.0.0
   */
  @deprecated("use createTable instead.", "2.2.0")
  def createExternalTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame = {
    createTable(tableName, source, schema, options)
  }

  /**
   * (Scala-specific)
   * Create a table based on the dataset in a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.2.0
   */
  def createTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame
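
A minimal sketch of calling the `createTable` overload shown above through `spark.catalog`. The table name `events`, the two-column schema, and the use of the `parquet` source with a `path` option are assumptions made for illustration; they do not come from the Spark source.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

object CreateTableExample {
  // Hypothetical schema, used only for illustration.
  val eventSchema: StructType = StructType(Seq(
    StructField("id", LongType),
    StructField("name", StringType)
  ))

  // Registers a table named "events" (hypothetical) in the current database,
  // backed by Parquet files under `path`, and returns it as a DataFrame.
  def createEventsTable(spark: SparkSession, path: String): DataFrame =
    spark.catalog.createTable(
      "events",
      "parquet",
      eventSchema,
      Map("path" -> path)
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("create-table-example")
      .getOrCreate()
    try {
      // An empty temporary directory stands in for a real data location.
      val dir = java.nio.file.Files.createTempDirectory("events").toString
      val df = createEventsTable(spark, dir)
      df.printSchema()
    } finally spark.stop()
  }
}
```

Because `createExternalTable` simply delegates to `createTable`, the same call covers the deprecated method as well.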

Hive metastore

The default Hive metastore implementation in Apache Spark uses an embedded Apache Derby database for persistence. It works with no configuration, but only one Spark session at a time can use it for metadata storage. This makes it unsuitable for multi-user environments, such as a metastore shared across a development team or used in production.
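
One common way around the single-session limit is to point Spark at a standalone Hive metastore service instead of the embedded Derby database. The sketch below assumes such a service is already running and reachable over Thrift; the URI passed in is a placeholder, and the `spark-hive` module is assumed to be on the classpath.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object ExternalMetastore {
  // Builds only the configuration; no connection is attempted here.
  def conf(metastoreUri: String): SparkConf =
    new SparkConf(false)
      // spark.hadoop.* properties are copied into the Hadoop configuration,
      // where the Hive client looks up hive.metastore.uris.
      .set("spark.hadoop.hive.metastore.uris", metastoreUri)

  // Creates a session that uses the external metastore for table metadata.
  def session(metastoreUri: String): SparkSession =
    SparkSession.builder()
      .appName("shared-metastore-example")
      .config(conf(metastoreUri))
      .enableHiveSupport() // requires the spark-hive module on the classpath
      .getOrCreate()
}
```

A call such as `ExternalMetastore.session("thrift://metastore-host:9083")` uses a hypothetical host name; the master URL is left to `spark-submit` or the environment.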
