Following the documentation, call the API to test it.
Set the API Key and Secret Key you registered.

The calling class (provided in the official documentation)

Change the file path passed in here.
Problems found during testing
1.{"error_code":110,"error_msg":"Access token invalid or no longer valid"}
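I looked the error up.
It turned out I had misread the first step, the method that obtains the AccessToken: the returned result is a collection of fields, and the AccessToken is only one of them.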


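The result has to be converted first and the token read out of it (of everything in there, only one item is actually needed):
A self-built class, for reference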
public class AccessTokenInfo
{
public string refresh_token { get; set; }
public string expires_in { get; set; }
public string session_key { get; set; }
public string access_token { get; set; }
public string scope { get; set; }
public string session_secret { get; set; }
}
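As a minimal sketch of "convert first, then take the one field", the snippet below reads access_token straight out of the raw JSON string. It assumes System.Text.Json is available (.NET Core 3.0 or later); the class name TokenParser and the method name ExtractAccessToken are made up for illustration and are not part of the official sample.

```csharp
using System;
using System.Text.Json;

public static class TokenParser
{
    // Hypothetical helper: pulls the access_token field out of the raw JSON
    // returned by the token request, instead of passing the whole string on.
    public static string ExtractAccessToken(string tokenResponseJson)
    {
        using (JsonDocument doc = JsonDocument.Parse(tokenResponseJson))
        {
            JsonElement root = doc.RootElement;
            if (root.TryGetProperty("access_token", out JsonElement token)
                && token.ValueKind == JsonValueKind.String)
            {
                return token.GetString();
            }
        }
        // No usable access_token field: surface the raw response for debugging.
        throw new InvalidOperationException(
            "No access_token in response: " + tokenResponseJson);
    }
}
```

Passing the value returned here as the access_token query parameter, rather than the whole response string, is what avoids error 110 above.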
2. Passing a PDF through the same class as before is not recognized
{"log_id":1901887988395845459,"error_msg":"image format error","error_code":216201}
Cause: the sample provided only supports images; for a PDF you have to adjust it yourself:
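One way the adjustment might look is sketched below: the file is sent in a pdf_file field instead of image. The parameter names pdf_file and pdf_file_num are my understanding of the invoice recognition API and should be checked against the official documentation; PdfRequestBuilder and BuildBody are hypothetical names.

```csharp
using System;
using System.Net;

public static class PdfRequestBuilder
{
    // Hypothetical helper: builds an application/x-www-form-urlencoded body
    // that carries the PDF as URL-encoded Base64 in pdf_file (assumed name)
    // and the page to recognize in pdf_file_num (assumed name).
    public static string BuildBody(byte[] pdfBytes, int pageNumber = 1)
    {
        string base64 = Convert.ToBase64String(pdfBytes);
        // WebUtility.UrlEncode escapes '+', '/' and '=' in the Base64 text.
        return "pdf_file=" + WebUtility.UrlEncode(base64)
             + "&pdf_file_num=" + pageNumber;
    }
}
```

The body would be built with something like PdfRequestBuilder.BuildBody(File.ReadAllBytes(path)) and posted the same way the image sample posts its image=... body.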

With that, the call succeeded.
3. Parsing the response string
Self-built classes
public class OcrData
{
public string log_id { get; set; }
public string pdf_file_size { get; set; }
public string words_result_num { get; set; }
public InvoiceData words_result { get; set; }
}
public class InvoiceData
{
/// <summary>
/// Invoice type, e.g. electronic invoice (ordinary invoice)
/// </summary>
public string InvoiceTypeOrg { get; set; }
/// <summary>
/// Invoice number
/// </summary>
public string InvoiceNum { get; set; }
/// <summary>
/// Invoice date
/// </summary>
public string InvoiceDate { get; set; }
/// <summary>
/// Purchaser name
/// </summary>
public string PurchaserName { get; set; }
/// <summary>
/// Purchaser unified social credit code / taxpayer identification number
/// </summary>
public string PurchaserRegisterNum { get; set; }
/// <summary>
/// Seller name
/// </summary>
public string SellerName { get; set; }
/// <summary>
/// Seller unified social credit code / taxpayer identification number
/// </summary>
public string SellerRegisterNum { get; set; }
/// <summary>
/// Total of amount and tax (in figures)
/// </summary>
public string AmountInFiguers { get; set; }
/// <summary>
/// Tax rate (list, one entry per line item)
/// </summary>
public List<CommodityData> CommodityTaxRate { get; set; }
/// <summary>
/// Tax amount (list, one entry per line item)
/// </summary>
public List<CommodityData> CommodityTax { get; set; }
/// <summary>
/// Total tax
/// </summary>
public string TotalTax { get; set; }
/// <summary>
/// Remarks
/// </summary>
public string Remarks { get; set; }
/// <summary>
/// Issuer of the invoice
/// </summary>
public string NoteDrawer { get; set; }
/// <summary>
/// Total amount
/// </summary>
public string TotalAmount { get; set; }
}
public class CommodityData
{
public string row { get; set; }
public string word { get; set; }
}
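A minimal sketch of deserializing the response into classes of this shape follows. It assumes System.Text.Json; the trimmed classes OcrSketch, InvoiceSketch and RowWord are hypothetical stand-ins that keep the snippet self-contained, and any sample JSON fed to it is illustrative rather than real API output.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Trimmed, hypothetical stand-ins for OcrData / InvoiceData / CommodityData.
public class RowWord
{
    public string row { get; set; }
    public string word { get; set; }
}

public class InvoiceSketch
{
    public string InvoiceNum { get; set; }
    public string TotalTax { get; set; }
    public List<RowWord> CommodityTax { get; set; }
}

public class OcrSketch
{
    public InvoiceSketch words_result { get; set; }
}

public static class OcrParser
{
    // Property names must match the JSON keys exactly (the default matching
    // is case-sensitive); JSON keys with no matching property are ignored.
    public static OcrSketch Parse(string json)
    {
        return JsonSerializer.Deserialize<OcrSketch>(json);
    }
}
```

One caveat: by default System.Text.Json does not convert a JSON number into a string property, so a numeric log_id like the one in the error response above would need a long property or a custom converter. Newtonsoft.Json does perform that conversion, so all-string classes work there.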
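4. One PDF containing several invoices
I could not find an interface that reads several invoices in one call. The clumsy workaround is to split the file into several PDFs and read each one separately. Below is the method for splitting the PDF: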
using System.IO;
using iText.Kernel.Pdf; // iText 7

string inputPdfPath = "path/to/your/input.pdf";
string outputDir = "path/to/output/directory";
// Make sure the output directory exists
Directory.CreateDirectory(outputDir);
using (PdfReader reader = new PdfReader(inputPdfPath))
{
    using (PdfDocument pdfDoc = new PdfDocument(reader))
    {
        int numberOfPages = pdfDoc.GetNumberOfPages();
        // Walk through every page (page numbers start at 1)
        for (int i = 1; i <= numberOfPages; i++)
        {
            // Path of the new single-page file
            string outputPath = Path.Combine(outputDir, $"page_{i}.pdf");
            // Create a new PDF document that contains only the current page
            PdfDocument singlePageDoc = new PdfDocument(new PdfWriter(outputPath));
            pdfDoc.CopyPagesTo(i, i, singlePageDoc);
            // Close to flush the file to disk before it is read
            singlePageDoc.Close();
            // Run the OCR call on the file at outputPath here
        }
    }
}