Spark Catalog

#iceberg catalog

https://iceberg.apache.org/docs/latest/spark-configuration/

相关接口

复制代码
  /**
   * (Scala-specific)
   * Create a table from the given path based on a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.0.0
   */
  @deprecated("use createTable instead.", "2.2.0")
  def createExternalTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame = {
    createTable(tableName, source, schema, options)
  }

  /**
   * (Scala-specific)
   * Create a table based on the dataset in a data source, a schema and a set of options.
   * Then, returns the corresponding DataFrame.
   *
   * @param tableName is either a qualified or unqualified name that designates a table.
   *                  If no database identifier is provided, it refers to a table in
   *                  the current database.
   * @since 2.2.0
   */
  def createTable(
      tableName: String,
      source: String,
      schema: StructType,
      options: Map[String, String]): DataFrame

hive metastore

The default implementation of the Hive metastore in Apache Spark uses Apache Derby for its database persistence. This is available with no configuration required but is limited to only one Spark session at any time for the purposes of metadata storage. This obviously makes it unsuitable for use in multi-user environments, such as when shared on a development team or used in Production.

相关推荐
Leo.yuan7 分钟前
实时数据仓库是什么?数据仓库设计怎么做?
大数据·数据库·数据仓库·数据分析·spark
£菜鸟也有梦5 小时前
从0到1,带你走进Flink的世界
大数据·hadoop·flink·spark
小伍_Five19 小时前
Spark实战能力测评模拟题精析【模拟考】
java·大数据·spark·scala·intellij-idea
不吃饭的猪19 小时前
记一次运行spark报错
大数据·分布式·spark
qq_4639448619 小时前
【Spark征服之路-2.1-安装部署Spark(一)】
大数据·分布式·spark
后端码匠1 天前
Kafka 单机部署启动教程(适用于 Spark + Hadoop 环境)
hadoop·spark·kafka
技术吧3 天前
Spark-TTS: AI语音合成的“变声大师“
大数据·人工智能·spark
MyikJ6 天前
Java互联网大厂面试:从Spring Boot到Kafka的技术深度探索
java·spring boot·微服务·面试·spark·kafka·spring security
向哆哆6 天前
Java 大数据处理:使用 Hadoop 和 Spark 进行大规模数据处理
java·hadoop·spark
阿里云大数据AI技术6 天前
Fusion引擎赋能:流利说如何用阿里云Serverless Spark实现数仓计算加速
大数据·人工智能·阿里云·spark·serverless·云计算