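Two approaches are compared
- The default JPA insert methods (the project uses the BaseJpaRepository from Hypersistence Utils by default)
- JdbcTemplate executing a multi-row SQL statement
Enabling JPA batching in yml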
yaml
jpa:
  properties:
    hibernate:
      # Format SQL statements
      format_sql: false
      # Enable batch inserts
      jdbc:
        batch_size: 1000
        batch_versioned_data: true
      order_inserts: true
      order_updates: true
# rewriteBatchedStatements=true was suggested online, but testing showed no difference
url: jdbc:p6spy:mysql://127.0.0.1:3306/xxxx?useSSL=false&useUnicode=true&characterEncoding=utf-8&zeroDateTimeBehavior=convertToNull&transformedBitIsBoolean=true&serverTimezone=GMT%2B8&nullCatalogMeansCurrent=true&allowPublicKeyRetrieval=true&rewriteBatchedStatements=true
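Because the nesting matters in the YAML above, it can help to see the same settings as flat Hibernate property keys. The sketch below is only an illustration: the class name HibernateBatchSettings is hypothetical, and only the keys and values come from the configuration shown. Note that order_inserts and order_updates sit directly under hibernate, not under hibernate.jdbc.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class; only the property keys and values come from the YAML above.
class HibernateBatchSettings {

    /** Returns the batching settings as flat Hibernate property keys. */
    static Map<String, String> asFlatProperties() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hibernate.format_sql", "false");
        // Maximum number of statements Hibernate groups into one JDBC batch.
        props.put("hibernate.jdbc.batch_size", "1000");
        // Allow batching for entities that carry a version column.
        props.put("hibernate.jdbc.batch_versioned_data", "true");
        // Order statements by entity so consecutive statements can share a batch.
        props.put("hibernate.order_inserts", "true");
        props.put("hibernate.order_updates", "true");
        return props;
    }

    public static void main(String[] args) {
        asFlatProperties().forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```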
Example: notices
Timing analysis for inserting 5,000 rows (auto-increment database IDs)
java
@Override
public boolean save(Notice notice) {
    StopWatch sw = new StopWatch();
    sw.start("Single insert");
    noticeDao.persist(notice);
    sw.stop();
    // Build 5,000 copies with a null ID so the database generates it
    List<Notice> noticeList = new ArrayList<>(5000);
    notice.setId(null);
    for (int i = 0; i < 5000; i++) {
        Notice copy = notice.copy();
        copy.setTitle("Notice " + i);
        noticeList.add(copy);
    }
    sw.start("JPA insert");
    // persistAll
    noticeDao.persistAllAndFlush(noticeList);
    sw.stop();
    sw.start("JDBC insert, single 5k statement");
    // Use Snowflake IDs
    Snowflake snowflake = new Snowflake();
    StringBuilder sb = new StringBuilder("INSERT INTO `smiletest`.`biz_notice`(`id`, `title`, `parent_id`, `json_str`, `notice_time`, `create_time`, `update_time`, `create_by`, `update_by`) VALUES");
    sb.append(" (").append(snowflake.nextId()).append(", 'Notice', NULL, '[]', '2023-11-08 09:42:26', '2023-11-08 09:42:41', '2023-11-08 09:42:41', 2927901808918528, 2927901808918528)");
    // The first row is appended above, so add the remaining 4,999 for 5,000 in total
    for (int i = 1; i < 5000; i++) {
        sb.append(" ,(").append(snowflake.nextId()).append(", 'Notice', NULL, '[]', '2023-11-08 09:42:26', '2023-11-08 09:42:41', '2023-11-08 09:42:41', 2927901808918528, 2927901808918528)");
    }
    sb.append(";");
    jdbcTemplate.execute(sb.toString());
    sw.stop();
    sw.start("JDBC insert, 2k per statement x 100 (200k)");
    // No id column here: the database assigns auto-increment IDs
    sb = new StringBuilder("INSERT INTO `smiletest`.`biz_notice`( `title`, `parent_id`, `json_str`, `notice_time`, `create_time`, `update_time`, `create_by`, `update_by`) VALUES");
    sb.append(" ('Notice', NULL, '[]', '2023-11-08 09:42:26', '2023-11-08 09:42:41', '2023-11-08 09:42:41', 2927901808918528, 2927901808918528)");
    // The first row is appended above, so repeat 1,999 times for 2,000 rows per statement
    sb.append(" ,('Notice', NULL, '[]', '2023-11-08 09:42:26', '2023-11-08 09:42:41', '2023-11-08 09:42:41', 2927901808918528, 2927901808918528)"
            .repeat(1999));
    sb.append(";");
    for (int i = 0; i < 100; i++) {
        jdbcTemplate.execute(sb.toString());
    }
    sw.stop();
    System.out.println(sw.prettyPrint(TimeUnit.SECONDS));
    System.out.println(sw);
    return true;
}
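The benchmark above concatenates literal values straight into the SQL string, which is fine for a timing test with fixed data but is not safe for user-supplied values. As a minimal sketch, assuming the same multi-row INSERT shape is wanted, the statement can be built with ? placeholders and the row values passed separately. The class MultiRowInsert and its methods are hypothetical helpers, not part of the original project.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the original project.
class MultiRowInsert {

    /** Builds "INSERT INTO t (a, b) VALUES (?, ?), (?, ?)" for the given number of rows. */
    static String buildSql(String table, List<String> columns, int rowCount) {
        if (columns.isEmpty() || rowCount < 1) {
            throw new IllegalArgumentException("need at least one column and one row");
        }
        // One "(?, ?, ...)" group per row, with one placeholder per column.
        String oneRow = "(" + String.join(", ", Collections.nCopies(columns.size(), "?")) + ")";
        return "INSERT INTO " + table + " (" + String.join(", ", columns) + ") VALUES "
                + String.join(", ", Collections.nCopies(rowCount, oneRow));
    }

    /** Flattens the rows into a single argument array, in placeholder order. */
    static Object[] flatten(List<Object[]> rows) {
        List<Object> args = new ArrayList<>();
        for (Object[] row : rows) {
            Collections.addAll(args, row);
        }
        return args.toArray();
    }

    public static void main(String[] args) {
        List<String> columns = List.of("title", "json_str");
        List<Object[]> rows = List.of(
                new Object[] {"Notice 0", "[]"},
                new Object[] {"Notice 1", "[]"});
        String sql = buildSql("biz_notice", columns, rows.size());
        System.out.println(sql);
        System.out.println(flatten(rows).length + " bound values");
    }
}
```

With Spring on the classpath, the resulting pair could be handed to jdbcTemplate.update(sql, flatten(rows)). Splitting a large list into chunks, as the 2k-per-statement loop does, keeps each statement to a bounded size.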
Log output
log
StopWatch '': running time = 22 s
---------------------------------------------
s % Task name
---------------------------------------------
000000000 00% Single insert
000000012 57% JPA insert
000000000 02% JDBC insert, single 5k statement
000000009 41% JDBC insert, 2k per statement x 100 (200k)
StopWatch '': running time = 22338138700 ns;
[Single insert]
17840900 ns = 0%;
[JPA insert]
12799729300 ns = 57%;
[JDBC insert, single 5k statement]
345819100 ns = 2%;
[JDBC insert, 2k per statement x 100 (200k)]
9174749400 ns = 41%
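The nanosecond totals in the log above are easier to compare as throughput. The sketch below converts them to rows per second, using the nominal row counts from the task names (5k, 5k, and 200k). The class name InsertThroughput is hypothetical.

```java
// Hypothetical helper for reading the StopWatch numbers; the row counts are the
// nominal values from the task names, and the durations are copied from the log.
class InsertThroughput {

    /** Rows per second for a task that inserted `rows` rows in `nanos` nanoseconds. */
    static double rowsPerSecond(long rows, long nanos) {
        return rows / (nanos / 1_000_000_000.0);
    }

    public static void main(String[] args) {
        System.out.printf("JPA insert, 5k rows:       %.0f rows/s%n",
                rowsPerSecond(5_000, 12_799_729_300L));
        System.out.printf("JDBC, single 5k statement: %.0f rows/s%n",
                rowsPerSecond(5_000, 345_819_100L));
        System.out.printf("JDBC, 2k x 100 (200k):     %.0f rows/s%n",
                rowsPerSecond(200_000, 9_174_749_400L));
    }
}
```

This works out to roughly 390 rows per second for JPA, against about 14,500 for the single JDBC statement and about 21,800 for the chunked JDBC run.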
Without the JPA batching configuration (the BaseJpaRepository from Hypersistence Utils has some optimizations of its own by default):
log
StopWatch '': running time = 23 s
---------------------------------------------
s % Task name
---------------------------------------------
000000000 01% Single insert
000000013 59% JPA insert
000000000 02% JDBC insert, single 5k statement
000000009 39% JDBC insert, 2k per statement x 100 (200k)
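Summary
With JPA batching enabled, inserting 5k rows takes about 19 seconds at batch_size 500 and about 13 seconds at batch_size 1000. JdbcTemplate inserts 5k rows in a few hundred milliseconds and 200k rows in about 9 seconds, far ahead of JPA.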
One possible factor behind the JPA numbers, assuming the Notice entity maps its auto-increment ID with GenerationType.IDENTITY (the mapping is not shown here): Hibernate disables JDBC insert batching for identity-generated IDs, so the batch settings would have little effect on these inserts.
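The choice comes down to a trade-off between development efficiency and runtime performance. JPA is an ORM framework, so switching databases does not affect any SQL, whereas with JdbcTemplate you need to confirm that the SQL statements are compatible with the target database.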
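One concrete compatibility point is identifier quoting: the hand-written SQL in the benchmark uses backticks, which is MySQL syntax, while standard SQL (and PostgreSQL, for example) quotes identifiers with double quotes. The sketch below shows the difference. The IdentifierQuoting class and its Dialect enum are hypothetical and cover only this one aspect of portability.

```java
// Hypothetical illustration of one dialect difference; not part of the original project.
class IdentifierQuoting {

    enum Dialect {
        MYSQL('`'),   // MySQL default identifier quote
        ANSI('"');    // standard SQL identifier quote

        private final char quoteChar;

        Dialect(char quoteChar) {
            this.quoteChar = quoteChar;
        }

        /** Wraps an identifier in the dialect's quote character, doubling embedded quotes. */
        String quote(String identifier) {
            String q = String.valueOf(quoteChar);
            return q + identifier.replace(q, q + q) + q;
        }
    }

    /** Builds the "INSERT INTO table (columns) VALUES" prefix for the given dialect. */
    static String insertPrefix(Dialect dialect, String table, String... columns) {
        StringBuilder sb = new StringBuilder("INSERT INTO ")
                .append(dialect.quote(table)).append(" (");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(dialect.quote(columns[i]));
        }
        return sb.append(") VALUES").toString();
    }

    public static void main(String[] args) {
        System.out.println(insertPrefix(Dialect.MYSQL, "biz_notice", "title", "json_str"));
        System.out.println(insertPrefix(Dialect.ANSI, "biz_notice", "title", "json_str"));
    }
}
```

Multi-row VALUES lists, date literals, and NULL handling may also differ between databases, so each hand-written statement is worth checking against the target.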