Reading MySQL Tables with Spark

Official documentation: JDBC To Other Databases - Spark 4.0.0 Documentation

Official example:

// Note: JDBC loading and saving can be achieved via either the load/save or jdbc methods
// Loading data from a JDBC source
val jdbcDF = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("dbtable", "schema.tablename")
  .option("user", "username")
  .option("password", "password")
  .load()

import java.util.Properties

val connectionProperties = new Properties()
connectionProperties.put("user", "username")
connectionProperties.put("password", "password")
val jdbcDF2 = spark.read
  .jdbc("jdbc:postgresql:dbserver", "schema.tablename", connectionProperties)
// Specifying the custom data types of the read schema
connectionProperties.put("customSchema", "id DECIMAL(38, 0), name STRING")
val jdbcDF3 = spark.read
  .jdbc("jdbc:postgresql:dbserver", "schema.tablename", connectionProperties)

// Saving data to a JDBC source
jdbcDF.write
  .format("jdbc")
  .option("url", "jdbc:postgresql:dbserver")
  .option("dbtable", "schema.tablename")
  .option("user", "username")
  .option("password", "password")
  .save()

jdbcDF2.write
  .jdbc("jdbc:postgresql:dbserver", "schema.tablename", connectionProperties)

// Specifying create table column data types on write
jdbcDF.write
  .option("createTableColumnTypes", "name CHAR(64), comments VARCHAR(1024)")
  .jdbc("jdbc:postgresql:dbserver", "schema.tablename", connectionProperties)

Verification (Maven dependencies and a minimal read from MySQL):

<dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>2.13.16</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.13</artifactId>
      <version>4.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.13</artifactId>
      <version>4.0.0</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>

    <dependency>
      <groupId>mysql</groupId>
      <artifactId>mysql-connector-java</artifactId>
      <version>8.0.15</version>
    </dependency>

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.13</artifactId>
      <version>4.0.0</version>
    </dependency>

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-10_2.13</artifactId>
      <version>4.0.0</version>
    </dependency>
  </dependencies>

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object SparkCase04 {

  def main(args: Array[String]): Unit = {

    val conf = new SparkConf().setMaster("local[1]").setAppName("WC")

    val spark = SparkSession.builder().config(conf).getOrCreate()

    // Read the `user` table from the `wjobs` database on node11 via JDBC
    val df = spark.read.format("jdbc")
      .option("url", "jdbc:mysql://node11:3306/wjobs?useSSL=false")
      .option("dbtable", "user")
      .option("user", "root")
      .option("password", "root123")
      .load()

    df.show()

    spark.stop()
  }
}
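
To round out the verification, the DataFrame read above can be written back through the same JDBC interface. The sketch below is an assumption-based example: user_copy is a hypothetical target table that does not appear in the original post, and mode("append") controls whether rows are added or the table is replaced.

// Hedged sketch: write the DataFrame back to MySQL over JDBC.
// `user_copy` is a hypothetical target table, not part of the original example.
df.write
  .format("jdbc")
  .option("url", "jdbc:mysql://node11:3306/wjobs?useSSL=false")
  .option("dbtable", "user_copy")
  .option("user", "root")
  .option("password", "root123")
  .mode("append")   // use "overwrite" to drop and recreate the table
  .save()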