Spark Catalog

Iceberg catalog

https://iceberg.apache.org/docs/latest/spark-configuration/
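
The Iceberg documentation linked above registers catalogs purely through Spark SQL properties. A minimal sketch, assuming a catalog named my_catalog backed by a Hadoop warehouse (the catalog name and warehouse path are placeholders, and the iceberg-spark-runtime jar is assumed to be on the classpath):

import org.apache.spark.sql.SparkSession

// Register an Iceberg catalog named "my_catalog" (placeholder name) backed by
// a Hadoop warehouse directory.
val spark = SparkSession.builder()
  .appName("iceberg-catalog-demo")
  .config("spark.sql.extensions",
    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
  .config("spark.sql.catalog.my_catalog", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.my_catalog.type", "hadoop")
  .config("spark.sql.catalog.my_catalog.warehouse", "hdfs://namenode:8020/warehouse")
  .getOrCreate()

// Tables are then addressed as <catalog>.<namespace>.<table> in Spark SQL.
spark.sql("CREATE TABLE my_catalog.db.sample (id BIGINT, data STRING) USING iceberg")

Setting type = hive instead would have Iceberg track table metadata through the Hive metastore discussed further below.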

Related interfaces

  /**
   * (Scala-specific)
   * Create a table from the given path based on a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.0.0
   */
  @deprecated("use createTable instead.", "2.2.0")
  def createExternalTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame = {
    createTable(tableName, source, schema, options)
  }

  /**
   * (Scala-specific)
   * Create a table based on the dataset in a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.2.0
   */
  def createTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame
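
A minimal usage sketch for the createTable overload shown above; the table name, source format, path, and schema here are illustrative only, not from the original post:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

val spark = SparkSession.builder().appName("catalog-demo").getOrCreate()

// Illustrative schema for the external table.
val schema = StructType(Seq(
  StructField("id", LongType, nullable = false),
  StructField("name", StringType, nullable = true)))

// Register a Parquet table over an existing path in the current database and
// get back the corresponding DataFrame.
val people = spark.catalog.createTable(
  "people",
  "parquet",
  schema,
  Map("path" -> "/data/people"))

people.printSchema()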

Hive metastore

By default, Apache Spark backs its Hive metastore with an embedded Apache Derby database. This works with no configuration at all, but Derby allows only one Spark session to use the metadata store at a time, which makes it unsuitable for multi-user environments such as a shared development team or a production deployment.
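
The usual fix is to point Spark at an external Hive metastore service. A minimal sketch, assuming a metastore reachable over thrift (the URI and warehouse directory below are placeholders):

import org.apache.spark.sql.SparkSession

// Point Spark at an external Hive metastore service instead of embedded Derby;
// the thrift URI and warehouse directory are placeholders.
val spark = SparkSession.builder()
  .appName("external-metastore-demo")
  .config("hive.metastore.uris", "thrift://metastore-host:9083")
  .config("spark.sql.warehouse.dir", "hdfs://namenode:8020/user/hive/warehouse")
  .enableHiveSupport()  // use the Hive catalog implementation
  .getOrCreate()

spark.sql("SHOW DATABASES").show()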
