Apache Commons CSV
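1. Overview
Apache Commons CSV is a lightweight Java library for processing CSV data. It supports:
- Reading CSV (from files, streams, or URLs)
- Writing CSV
- Multiple "dialects" (Excel, MySQL, RFC 4180, and others)
- Header mapping (access by column name)
- Streaming (suitable for large files)
Maven dependency: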
```xml
<!-- Source: https://mvnrepository.com/artifact/org.apache.commons/commons-csv -->
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-csv</artifactId>
    <version>1.9.0</version>
    <scope>compile</scope>
</dependency>
```
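2. Core design
| Concept | Description |
|---|---|
| CSVFormat | CSV syntax rules (delimiter, quote character, record separator, etc.) |
| CSVParser | Reads and parses CSV |
| CSVPrinter | Writes CSV |
| CSVRecord | One row of data |
| Immutability | A CSVFormat cannot be modified once it has been created |
Builder pattern (recommended; available since 1.9.0):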
```java
// Derive a new format from DEFAULT; DEFAULT itself is left unchanged.
CSVFormat format = CSVFormat.DEFAULT
        .builder()
        .setHeader("id", "name")
        .setSkipHeaderRecord(true)
        .build();
```
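To make the immutability point concrete, here is a minimal sketch, assuming commons-csv 1.9.0 or later on the classpath. The class name `ImmutableFormatDemo` and the helper `semicolonFormat` are illustrative only. It derives a semicolon-delimited format from `CSVFormat.DEFAULT` and shows that the preset keeps its own delimiter.

```java
import org.apache.commons.csv.CSVFormat;

// Illustrative class name; not part of Commons CSV.
public class ImmutableFormatDemo {

    // Build a new format from DEFAULT; the builder works on a copy of its settings.
    static CSVFormat semicolonFormat() {
        return CSVFormat.DEFAULT
                .builder()
                .setDelimiter(';')
                .build();
    }

    public static void main(String[] args) {
        CSVFormat derived = semicolonFormat();
        // The preset is untouched; only the derived format uses ';'.
        System.out.println("DEFAULT delimiter: " + CSVFormat.DEFAULT.getDelimiterString());
        System.out.println("derived delimiter: " + derived.getDelimiterString());
        System.out.println(derived.format("a", "b"));
    }
}
```

Because every change produces a new object, a format can be stored in a constant and shared across threads without defensive copies.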
3. Common CSVFormat presets
| Constant | Typical use |
|---|---|
| CSVFormat.DEFAULT | General purpose |
| CSVFormat.EXCEL | Excel exports |
| CSVFormat.RFC4180 | Standard CSV (RFC 4180) |
| CSVFormat.MYSQL | MySQL `SELECT INTO OUTFILE` |
| CSVFormat.TDF | Tab-delimited |
| CSVFormat.POSTGRESQL_CSV | PostgreSQL COPY |
Example:
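```java
// withHeader() is deprecated since 1.9.0; the builder is the recommended equivalent
CSVFormat format = CSVFormat.EXCEL.builder().setHeader().build();
```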
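The differences between presets are easiest to see by formatting the same record with two of them. The following is a small sketch under the assumption that commons-csv 1.9.0 or later is available. `PresetDemo` and `render` are made-up names, and the sample values are arbitrary.

```java
import org.apache.commons.csv.CSVFormat;

// Illustrative class name; not part of Commons CSV.
public class PresetDemo {

    // Format a single record with the given preset (no trailing record separator).
    static String render(CSVFormat format) {
        return format.format("tom", "likes a,b", 18);
    }

    public static void main(String[] args) {
        // DEFAULT separates with commas, so the value containing a comma is quoted.
        System.out.println(render(CSVFormat.DEFAULT));
        // TDF separates with tabs, so the embedded comma needs no quoting.
        System.out.println(render(CSVFormat.TDF));
    }
}
```

Choosing the preset that matches the producing or consuming tool usually avoids having to tune the delimiter and quoting rules by hand.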
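4. Reading CSV
4.1 Without a header
Input (data.csv):
```
tom,18
jerry,20
mike,33
```
```java
import java.io.*;
import org.apache.commons.csv.*;

// try-with-resources closes both the reader and the parser
try (Reader reader = new FileReader("data.csv");
     CSVParser parser = CSVFormat.DEFAULT.parse(reader)) {
    for (CSVRecord record : parser) {
        String col0 = record.get(0); // first column, e.g. "tom"
        String col1 = record.get(1); // second column, e.g. "18"
    }
}
```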
4.2 Using the header row
Input (./data/data.csv):
```
name,age
tom,18
jerry,20
mike,33
```
```java
CSVFormat format = CSVFormat.DEFAULT
        .builder()
        .setHeader()                // no arguments: take the column names from the first record
        .setSkipHeaderRecord(true)  // keep the header row out of the data records
        .build();
try (Reader reader = new FileReader("./data/data.csv");
     CSVParser parser = format.parse(reader)) {
    for (CSVRecord record : parser) {
        String name = record.get("name");
        int age = Integer.parseInt(record.get("age"));
        System.out.println(name + " " + age);
    }
}
```
4.3 Custom header
When the file has no header row, you can define the column names yourself:
```
tom,18
jerry,20
mike,33
```
```java
CSVFormat format = CSVFormat.DEFAULT
        .builder()
        .setHeader("name", "age")    // column names supplied in code
        .setSkipHeaderRecord(false)  // the first line is data, so do not skip it
        .build();
try (Reader reader = new FileReader("./data/data.csv");
     CSVParser parser = format.parse(reader)) {
    for (CSVRecord record : parser) {
        String name = record.get("name");
        int age = Integer.parseInt(record.get("age"));
        System.out.println(name + " " + age);
    }
}
```
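4.4 Streaming large files
Iterating over the parser reads one record at a time, so the whole file is never loaded into memory (avoid getRecords() here, since it collects every record into a list):
```java
try (CSVParser parser = CSVFormat.EXCEL
        .builder()
        .setHeader()
        .build()
        .parse(new FileReader("big.csv"))) {
    for (CSVRecord r : parser) {
        // process one record at a time
    }
}
```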
4.5 Reading records as a Map
```java
CSVFormat format = CSVFormat.DEFAULT
        .builder()
        .setHeader("name", "age")
        .setSkipHeaderRecord(false)
        .build();
try (Reader reader = new FileReader("./data/data.csv");
     CSVParser parser = format.parse(reader)) {
    for (CSVRecord record : parser) {
        // If header names are duplicated, later columns overwrite earlier ones in the map
        Map<String, String> map = record.toMap();
        System.out.println(map);
    }
}
```
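5. Writing CSV
5.1 Basic writing
```java
import java.io.*;
import org.apache.commons.csv.*;

// try-with-resources closes the printer and the underlying writer, flushing pending output
try (CSVPrinter printer = new CSVPrinter(new FileWriter("./data/a.csv"), CSVFormat.EXCEL)) {
    printer.printRecord("name", "age");
    printer.printRecord("tom", 16);
    printer.printRecord("jerry", 19);
}
```
Output:
```
name,age
tom,16
jerry,19
```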
5.2 Writing with a header
```java
// The header configured on the format is written when the printer is created
try (CSVPrinter printer = CSVFormat.EXCEL
        .builder()
        .setHeader("name", "age")
        .build()
        .print(new FileWriter("./data/a.csv"))) {
    printer.printRecord("tom", 16);
    printer.printRecord("jerry", 19);
}
```
Output:
```
name,age
tom,16
jerry,19
```
5.3 Writing a List or array
```java
try (CSVPrinter printer = CSVFormat.EXCEL
        .builder()
        .setHeader("name", "age", "score")
        .build()
        .print(new FileWriter("./data/a.csv"))) {
    // printRecord accepts an Iterable such as a List, as well as an array or varargs
    List<Object> list = Arrays.asList("tom", 18, 99.2);
    printer.printRecord(list);
}
```
Output:
```
name,age,score
tom,18,99.2
```
5.4 Writing a Map
```java
try (CSVPrinter printer = CSVFormat.EXCEL
        .builder()
        .setHeader("name", "age", "score")
        .build()
        .print(new FileWriter("./data/a.csv"))) {
    // 1. Use a LinkedHashMap so that insertion order is preserved
    Map<String, Object> map = new LinkedHashMap<>();
    // 2. The entries must be put in exactly the header order: name, age, score
    map.put("name", "tom");
    map.put("age", 19);
    map.put("score", 99.9);
    // 3. Pass the values collection as one record
    printer.printRecord(map.values());
}
```
Output:
```
name,age,score
tom,19,99.9
```
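Relying on insertion order is fragile when the map comes from elsewhere. One alternative is to look each value up by header name, so the column order always follows the header. The sketch below uses only the JDK. `MapRowDemo` and `valuesInHeaderOrder` are hypothetical names, not Commons CSV API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative class name; not part of Commons CSV.
public class MapRowDemo {

    // Hypothetical helper: collect the map's values in the order given by the header.
    // A column missing from the map yields null for that position.
    static List<Object> valuesInHeaderOrder(String[] header, Map<String, ?> row) {
        List<Object> values = new ArrayList<>(header.length);
        for (String column : header) {
            values.add(row.get(column));
        }
        return values;
    }

    public static void main(String[] args) {
        String[] header = {"name", "age", "score"};
        // A plain HashMap filled in a different order than the header.
        Map<String, Object> row = new HashMap<>();
        row.put("score", 99.9);
        row.put("name", "tom");
        row.put("age", 19);
        System.out.println(valuesInHeaderOrder(header, row)); // prints [tom, 19, 99.9]
    }
}
```

The resulting list can be passed straight to `printer.printRecord(...)`, which accepts any `Iterable`.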
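6. Common configuration options
| Method | Purpose |
|---|---|
| setDelimiter(',') | Field delimiter |
| setQuote('"') | Quote character |
| setEscape('\\') | Escape character |
| setCommentMarker('#') | Marker that starts a comment line |
| setIgnoreSurroundingSpaces(true) | Ignore spaces around values when parsing |
| setNullString("NULL") | String that represents null |
| setSkipHeaderRecord(true) | Skip the header record |
| setTrim(true) | Trim leading and trailing whitespace from each field |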
```java
// Excel-style format, but semicolon-delimited and with trimmed fields
CSVFormat format = CSVFormat.EXCEL
        .builder()
        .setDelimiter(';')
        .setTrim(true)
        .build();
```
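To show how several of these options interact during parsing, here is a hedged sketch that parses a small in-memory string. It assumes commons-csv 1.9.0 or later. `FormatOptionsDemo`, `FORMAT`, and `parse` are illustrative names, and the sample data is invented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// Illustrative class name; not part of Commons CSV.
public class FormatOptionsDemo {

    static final CSVFormat FORMAT = CSVFormat.DEFAULT
            .builder()
            .setDelimiter(';')       // fields are separated by ';'
            .setCommentMarker('#')   // lines starting with '#' are comments
            .setTrim(true)           // strip whitespace around each field
            .setNullString("NULL")   // the text NULL is read back as Java null
            .build();

    // Parse an in-memory string and return all records (fine for small inputs).
    static List<CSVRecord> parse(String data) {
        try (CSVParser parser = CSVParser.parse(data, FORMAT)) {
            return parser.getRecords();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<CSVRecord> records = parse("# people\n tom ;NULL\n");
        CSVRecord first = records.get(0);
        System.out.println("records: " + records.size());
        System.out.println("name: [" + first.get(0) + "]");
        System.out.println("second field is null: " + (first.get(1) == null));
    }
}
```

The comment line does not become a record, the padded name is trimmed, and the `NULL` token comes back as a Java `null`.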
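7. Common problems
7.1 Garbled Chinese characters
Specify the charset explicitly when opening the file instead of relying on the platform default:
```java
// Decode the file as UTF-8 regardless of the platform default charset
Reader reader = new InputStreamReader(
        new FileInputStream("data.csv"), StandardCharsets.UTF_8);
```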
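7.2 Garbled text when opening the file in Excel
Excel may misread a UTF-8 file that has no byte order mark (BOM), so write one at the start of the file:
```java
FileOutputStream fos = new FileOutputStream("data.csv");
Writer writer = new OutputStreamWriter(fos, StandardCharsets.UTF_8);
// Write the BOM through the writer so it is encoded as EF BB BF;
// fos.write('\ufeff') would emit only the single byte 0xFF.
writer.write('\ufeff');
CSVPrinter printer = CSVFormat.EXCEL.print(writer);
```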
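The BOM only helps if it reaches the file as the three UTF-8 bytes EF BB BF. `OutputStream.write(int)` keeps just the low-order byte of its argument, so the character has to pass through the encoding writer. The JDK-only sketch below compares the two paths. `BomDemo`, `viaWriter`, and `viaStream` are illustrative names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Illustrative class name; uses only the JDK.
public class BomDemo {

    // Write U+FEFF through a UTF-8 writer: the encoder emits EF BB BF.
    static byte[] viaWriter() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8)) {
            writer.write('\uFEFF');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Write U+FEFF directly to the stream: only the low-order byte 0xFF survives.
    static byte[] viaStream() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        bytes.write('\uFEFF');
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        for (byte b : viaWriter()) {
            System.out.printf("%02X ", b & 0xFF); // prints EF BB BF
        }
        System.out.println();
        for (byte b : viaStream()) {
            System.out.printf("%02X ", b & 0xFF); // prints FF
        }
        System.out.println();
    }
}
```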
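7.3 Errors caused by empty lines
```java
// DEFAULT already ignores empty lines; EXCEL and RFC4180 keep them unless told otherwise
CSVFormat format = CSVFormat.EXCEL
        .builder()
        .setIgnoreEmptyLines(true)
        .build();
```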
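The effect of `setIgnoreEmptyLines` can be checked by counting records for an input that contains a blank line. This is a sketch, assuming commons-csv 1.9.0 or later. `EmptyLinesDemo` and `countRecords` are made-up names. With the setting off, the blank line comes back as a record of its own, which is what later breaks access by index or column name.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

// Illustrative class name; not part of Commons CSV.
public class EmptyLinesDemo {

    // Count the records the given format produces for an in-memory input.
    static int countRecords(CSVFormat format, String data) {
        try (CSVParser parser = CSVParser.parse(data, format)) {
            return parser.getRecords().size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String data = "tom,18\n\njerry,20\n"; // blank line between the two rows

        // RFC4180 keeps empty lines, so the blank line becomes its own record.
        CSVFormat keepsEmpty = CSVFormat.RFC4180;
        // Same preset, but with empty lines skipped.
        CSVFormat skipsEmpty = CSVFormat.RFC4180.builder().setIgnoreEmptyLines(true).build();

        System.out.println("RFC4180: " + countRecords(keepsEmpty, data));
        System.out.println("RFC4180 + ignoreEmptyLines: " + countRecords(skipsEmpty, data));
    }
}
```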