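Problem description
When reading an .xls file with apache.poi, I got the error "The content of an excel record cannot exceed 8224 bytes". The file being read had itself been written by apache.poi. My update procedure deletes a sheet and then writes a sheet with the same name. A single update produces exactly the result I expect, but on one run the program failed to read the file with the error below, and the file could no longer be opened manually either.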
java
Exception in thread "main" org.apache.poi.util.RecordFormatException: The content of an excel record cannot exceed 8224 bytes
at org.apache.poi.hssf.record.RecordInputStream.nextRecord(RecordInputStream.java:222)
at org.apache.poi.hssf.record.RecordFactoryInputStream.nextRecord(RecordFactoryInputStream.java:253)
at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:494)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:356)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:413)
at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:394)
at com.mark.learning.bug.excel.ExcelXls.addSheet(ExcelXls.java:28)
at com.mark.learning.bug.excel.ExcelXls.main(ExcelXls.java:84)
Version
xml
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>3.17</version>
</dependency>
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi-ooxml</artifactId>
<version>3.17</version>
</dependency>
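Diagnosis: print the largest Record
Since the error says that some Record exceeded the limit, I printed the records to take a look. I eventually traced the problem to the _list field of the ExternSheetRecord class.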
java
Size of record #267: 6066
[EXTERNSHEET]
numOfRefs = 1010
refrec #0: extBook=0 firstSheet=-1 lastSheet=-1
refrec #1: extBook=0 firstSheet=-1 lastSheet=-1
refrec #2: extBook=0 firstSheet=-1 lastSheet=-1
refrec #3: extBook=0 firstSheet=-1 lastSheet=-1
refrec #4: extBook=0 firstSheet=-1 lastSheet=-1
refrec #5: extBook=0 firstSheet=-1 lastSheet=-1
java
public class ExternSheetRecord extends StandardRecord {
    public final static short sid = 0x0017;
    private final List<RefSubRecord> _list; // this list holds a very large number of entries
    // ... rest of the class omitted
}
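The size in the dump above can be checked by hand. The sketch below assumes the EXTERNSHEET layout as I understand it: a 4-byte record header, a 2-byte entry count, and 6 bytes per entry (three 16-bit fields). It also assumes the 8224-byte limit applies to the record body without the header. The class and method names are made up for illustration and are not part of POI.

```java
public class ExternSheetSizeEstimate {
    // Assumed layout constants (not read from POI at runtime)
    static final int HEADER_SIZE = 4;            // record id + length
    static final int COUNT_FIELD_SIZE = 2;       // numOfRefs
    static final int REF_ENTRY_SIZE = 6;         // extBook, firstSheet, lastSheet
    static final int MAX_RECORD_DATA_SIZE = 8224;

    // Total serialized size for a given number of entries
    static int recordSize(int numOfRefs) {
        return HEADER_SIZE + COUNT_FIELD_SIZE + REF_ENTRY_SIZE * numOfRefs;
    }

    // Largest entry count whose body still fits within the limit
    static int maxRefs() {
        return (MAX_RECORD_DATA_SIZE - COUNT_FIELD_SIZE) / REF_ENTRY_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(recordSize(1010)); // prints 6066
        System.out.println(maxRefs());        // prints 1370
    }
}
```

With 1010 entries this gives 6066, which matches the dump. On these assumptions the record overflows once the list reaches 1371 entries.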
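Diagnosis: RefSubRecord
What information does a RefSubRecord hold, and when is it initialized? I set a breakpoint in its constructor and found that a RefSubRecord is created every time a sheet is deleted or added.
The interesting part is that deleting a sheet sets the firstSheetIndex and lastSheetIndex of the matching entry to -1, but when a sheet is added afterwards, the code tries to find the matching entry by those same two fields. An entry marked -1 can no longer match, so a new one is appended instead.
As a result, the _list field of ExternSheetRecord keeps growing for as long as the program runs!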
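The growth described above can be reproduced with a small model that does not use POI at all. This is only a sketch of the behaviour I observed in the debugger, not POI's actual code: the class RefListGrowthModel and its methods are hypothetical, each entry is reduced to the two index fields, and the re-indexing of other sheets is left out.

```java
import java.util.ArrayList;
import java.util.List;

public class RefListGrowthModel {
    // Simplified stand-in for a RefSubRecord: {firstSheetIndex, lastSheetIndex}
    private final List<int[]> refs = new ArrayList<>();

    // Deleting a sheet marks the matching entries with -1 instead of removing them
    void removeSheet(int sheetIdx) {
        for (int[] ref : refs) {
            if (ref[0] == sheetIdx && ref[1] == sheetIdx) {
                ref[0] = -1;
                ref[1] = -1;
            }
        }
    }

    // Adding a sheet reuses an entry with the same indexes, otherwise appends one
    void addSheet(int sheetIdx) {
        for (int[] ref : refs) {
            if (ref[0] == sheetIdx && ref[1] == sheetIdx) {
                return;
            }
        }
        refs.add(new int[] {sheetIdx, sheetIdx});
    }

    int size() {
        return refs.size();
    }

    // Runs the delete-then-create cycle and returns the resulting list size
    static int sizeAfterCycles(int cycles) {
        RefListGrowthModel model = new RefListGrowthModel();
        for (int i = 0; i < cycles; i++) {
            model.removeSheet(0);
            model.addSheet(0);
        }
        return model.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterCycles(1010)); // prints 1010
    }
}
```

In this model the list gains one entry per delete/create cycle, because the entry that was just marked -1 is never found again by the lookup.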
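Solution
1. Change the file type: switch from .xls to the newer .xlsx format.
2. Change the apache.poi version. When I switched to 3.8, the problem went away (note that 3.8 is an older release than 3.17, so this is a downgrade rather than an upgrade). The reason is that in 3.8, deleting a sheet does not modify the RefSubRecord information.
Deletion logic in 3.17
Deletion logic in 3.8
(the part highlighted in the red box in the 3.17 version is absent)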
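For the first option, the same delete-then-create step can be written against the .xlsx API. This is a minimal sketch assuming POI 3.17 with poi-ooxml on the classpath and an existing .xlsx file; the class name XlsxSheetReplace and the method replaceSheet are hypothetical names chosen for illustration.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class XlsxSheetReplace {
    // Deletes the sheet with the given name if it exists, then creates it again
    static void replaceSheet(String path, String sheetName) throws IOException {
        XSSFWorkbook workbook;
        // Load the workbook into memory, then release the input stream
        try (FileInputStream in = new FileInputStream(path)) {
            workbook = new XSSFWorkbook(in);
        }
        try {
            int sheetIndex = workbook.getSheetIndex(sheetName);
            if (sheetIndex >= 0) {
                workbook.removeSheetAt(sheetIndex);
            }
            workbook.createSheet(sheetName);
            // Write the result back to the same file
            try (FileOutputStream out = new FileOutputStream(path)) {
                workbook.write(out);
            }
        } finally {
            workbook.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0] is assumed to be the path of an existing .xlsx file
        replaceSheet(args[0], "test");
    }
}
```

The .xlsx format is XML-based and has no EXTERNSHEET record, so the 8224-byte record limit does not come into play on this path.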
Code
Code that reproduces the "...exceed 8224 bytes" error
java
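import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.junit.Test; // JUnit 4 is assumed here

public class ExcelTest {
    private static int createSheetCnt = 0;
    private static final String path = "C:\\Users\\Desktop\\test2.xls";

    public void addSheet() {
        try {
            HSSFWorkbook workbook;
            // Load the workbook into memory, then release the input stream
            try (FileInputStream in = new FileInputStream(new File(path))) {
                workbook = new HSSFWorkbook(in);
            }
            String sheetName = "test";
            int sheetIndex = workbook.getSheetIndex(sheetName);
            if (sheetIndex >= 0) {
                // The sheet exists, so delete it
                workbook.removeSheetAt(sheetIndex);
            }
            // Create a new sheet with the same name and write the file back
            workbook.createSheet(sheetName);
            try (FileOutputStream fileOut = new FileOutputStream(path)) {
                workbook.write(fileOut);
            }
            System.out.println("Sheets created so far: " + ++createSheetCnt);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    @Test
    public void test() {
        // Repeat the delete/create cycle until reading the file fails
        for (int i = 0; i < 10000; i++) {
            addSheet();
        }
    }
}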
Method that prints the record information
java
// Needs java.util.List, org.apache.poi.hssf.model.InternalWorkbook
// and org.apache.poi.hssf.record.Record in addition to the imports above.
public void printlnRecords() {
    try {
        HSSFWorkbook workbook;
        try (FileInputStream in = new FileInputStream(new File(path))) {
            workbook = new HSSFWorkbook(in);
        }
        InternalWorkbook internalWorkbook = workbook.getInternalWorkbook();
        List<Record> records = internalWorkbook.getRecords();
        System.out.println("records size:" + records.size());
        // Track the index and size of the largest record seen so far
        int maxIndex = 0;
        int maxRecordSize = 0;
        for (int i = 0; i < records.size(); i++) {
            Record record = records.get(i);
            int recordSize = record.getRecordSize();
            System.out.println("Size of record #" + i + ": " + recordSize);
            System.out.println(record);
            System.out.println();
            if (recordSize > maxRecordSize) {
                maxRecordSize = recordSize;
                maxIndex = i;
            }
        }
        System.out.println("Record #" + maxIndex + " has the largest size: " + maxRecordSize);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}