Links
#iceberg catalog
https://iceberg.apache.org/docs/latest/spark-configuration/
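The linked page documents the Spark catalog properties Iceberg reads. A minimal sketch of registering an Iceberg catalog backed by a Hive metastore, assuming the Iceberg Spark runtime jar is on the classpath; the catalog name my_catalog and the metastore URI are placeholders:

import org.apache.spark.sql.SparkSession

// Register an Iceberg catalog named "my_catalog" backed by a Hive metastore.
// Property keys follow the Iceberg Spark configuration page linked above.
val spark = SparkSession.builder()
  .appName("iceberg-catalog-demo")
  .config("spark.sql.catalog.my_catalog", "org.apache.iceberg.spark.SparkCatalog")
  .config("spark.sql.catalog.my_catalog.type", "hive")
  .config("spark.sql.catalog.my_catalog.uri", "thrift://metastore-host:9083")
  .getOrCreate()

// Tables under this catalog are addressed as my_catalog.db.table.
spark.sql("SELECT * FROM my_catalog.db.table").show()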
Related interfaces
/**
 * (Scala-specific)
 * Create a table from the given path based on a data source, a schema and a set of options.
 * Then, returns the corresponding DataFrame.
 *
 * @param tableName is either a qualified or unqualified name that designates a table.
 *                  If no database identifier is provided, it refers to a table in
 *                  the current database.
 * @since 2.0.0
 */
@deprecated("use createTable instead.", "2.2.0")
def createExternalTable(
    tableName: String,
    source: String,
    schema: StructType,
    options: Map[String, String]): DataFrame = {
  createTable(tableName, source, schema, options)
}
/**
 * (Scala-specific)
 * Create a table based on the dataset in a data source, a schema and a set of options.
 * Then, returns the corresponding DataFrame.
 *
 * @param tableName is either a qualified or unqualified name that designates a table.
 *                  If no database identifier is provided, it refers to a table in
 *                  the current database.
 * @since 2.2.0
 */
def createTable(
    tableName: String,
    source: String,
    schema: StructType,
    options: Map[String, String]): DataFrame
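For context, a short sketch of calling this interface through spark.catalog; the table name, path, and schema here are placeholders, not anything from the Spark source:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

val spark = SparkSession.builder().appName("create-table-demo").getOrCreate()

val schema = StructType(Seq(
  StructField("id", IntegerType, nullable = false),
  StructField("name", StringType, nullable = true)))

// Registers "demo_table" in the current database, backed by Parquet files
// at the given path, and returns the corresponding DataFrame.
val df = spark.catalog.createTable(
  "demo_table",
  source = "parquet",
  schema = schema,
  options = Map("path" -> "/tmp/demo_table"))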
hive metastore
Apache Spark's default Hive metastore implementation uses Apache Derby for database persistence. It works with no configuration, but Derby accepts only a single connection, so only one Spark session can use the metastore at a time. This makes it unsuitable for multi-user environments, such as a shared development team or production deployments.
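The usual remedy is to point Spark at an external Hive metastore instead of the embedded Derby one. A minimal sketch, assuming a standalone metastore service is already running; the host and port are placeholders:

import org.apache.spark.sql.SparkSession

// Connect to a shared, external Hive metastore so multiple sessions
// (and multiple users) can read and write the same table metadata.
val spark = SparkSession.builder()
  .appName("shared-metastore-demo")
  .config("hive.metastore.uris", "thrift://metastore-host:9083")
  .enableHiveSupport()
  .getOrCreate()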