解析邮件文本内容; Mime文本解析; MimeStreamParser; multipart解析

原始文本

bash 复制代码
------=_Part_46705_715015081.1699589700255
Content-Type: text/html;charset=UTF-8
Content-Transfer-Encoding: base64

PGh0bWw+CiAgICA8aGVhZD4KICAgICAgICA8bWV0YSBodHRwLW
VxdWl2PSJDb250ZW50LVR5cGUiIGNvbnRlbnQ9InRleHQvaHRt
bDsgY2hhcnNldD1VVEYtOCI+CiAgICAgICAgPHRpdGxlPkpTUC
BQYWdlPC90aXRsZT4KICAgIDwvaGVhZD4KICAgIDxib2R5Pgog
ICAgICAgIDxoMT5IZWxsbyBXb3JsZCE8L2gxPgogICAgPC9ib2
R5Pgo8L2h0bWw+
------=_Part_46705_715015081.1699589700255--

Maven

xml 复制代码
 <dependency>
     <groupId>org.apache.james</groupId>
     <artifactId>apache-mime4j-core</artifactId>
     <version>0.8.9</version>
 </dependency>

解析方法

java 复制代码
String data = "------=_Part_46705_715015081.1699589700255\n" +
        "Content-Type: text/html;charset=UTF-8\n" +
        "Content-Transfer-Encoding: base64\n" +
        "\n" +
        "PGh0bWw+CiAgICA8aGVhZD4KICAgICAgICA8bWV0YSBodHRwLW\n" +
        "VxdWl2PSJDb250ZW50LVR5cGUiIGNvbnRlbnQ9InRleHQvaHRt\n" +
        "bDsgY2hhcnNldD1VVEYtOCI+CiAgICAgICAgPHRpdGxlPkpTUC\n" +
        "BQYWdlPC90aXRsZT4KICAgIDwvaGVhZD4KICAgIDxib2R5Pgog\n" +
        "ICAgICAgIDxoMT5IZWxsbyBXb3JsZCE8L2gxPgogICAgPC9ib2\n" +
        "R5Pgo8L2h0bWw+\n" +
        "------=_Part_46705_715015081.1699589700255--";
System.out.println(data);
HtmContentHandler contentHandler = new HtmContentHandler();
MimeConfig mime4jParserConfig = MimeConfig.DEFAULT;
BodyDescriptorBuilder bodyDescriptorBuilder = new DefaultBodyDescriptorBuilder();
MimeStreamParser mime4jParser = new MimeStreamParser(mime4jParserConfig, DecodeMonitor.SILENT, bodyDescriptorBuilder);
mime4jParser.setContentDecoding(true);
mime4jParser.setContentHandler(contentHandler);
mime4jParser.parse(new ByteArrayInputStream(data.getBytes(UTF_8)));
System.out.println(contentHandler.getData());

HtmContentHandler

java 复制代码
import org.apache.commons.io.IOUtils;
import org.apache.james.mime4j.MimeException;
import org.apache.james.mime4j.dom.Header;
import org.apache.james.mime4j.field.ContentTypeFieldImpl;
import org.apache.james.mime4j.message.SimpleContentHandler;
import org.apache.james.mime4j.stream.BodyDescriptor;
import org.apache.james.mime4j.stream.Field;

import java.io.IOException;
import java.io.InputStream;
import java.util.Optional;

/**
 * @author zengrenyuan
 * @date 2023/11/10
 **/
public class HtmContentHandler extends SimpleContentHandler {
    private String data;
    private String charset;
    private String contentType;

    @Override
    public void body(BodyDescriptor bd, InputStream is) throws MimeException, IOException {
        this.data = IOUtils.toString(is, Optional.ofNullable(charset).orElse("UTF-8"));
        //这里可以处理文本内容
    }

    @Override
    public void headers(Header header) {
         //在这里解析头信息
        Field contentType = header.getField("Content-Type");
        if (contentType != null) {
            if (contentType instanceof ContentTypeFieldImpl) {
                this.contentType = ((ContentTypeFieldImpl) contentType).getMimeType();
                charset = ((ContentTypeFieldImpl) contentType).getParameter("charset");
            }
        }
    }
    public String getData() {
        return data;
    }

    public String getCharset() {
        return charset;
    }

    public String getContentType() {
        return contentType;
    }
}

参考资料

https://james.apache.org/mime4j/index.html

https://github.com/apache/james-mime4j

如果想解析一段Email数据也可以参考

https://github.com/ram-sharma-6453/email-mime-parser

相关推荐
Lionel_SSL1 小时前
《深入理解Java虚拟机》第三章读书笔记:垃圾回收机制与内存管理
java·开发语言·jvm
记得开心一点嘛1 小时前
手搓Springboot
java·spring boot·spring
老华带你飞2 小时前
租房平台|租房管理平台小程序系统|基于java的租房系统 设计与实现(源码+数据库+文档)
java·数据库·小程序·vue·论文·毕设·租房系统管理平台
独行soc2 小时前
2025年渗透测试面试题总结-66(题目+回答)
java·网络·python·安全·web安全·adb·渗透测试
脑子慢且灵2 小时前
[JavaWeb]模拟一个简易的Tomcat服务(Servlet注解)
java·后端·servlet·tomcat·intellij-idea·web
华仔啊3 小时前
SpringBoot 中 6 种数据脱敏方案,第 5 种太强了,支持深度递归!
java·后端
异常驯兽师4 小时前
Spring 中处理 HTTP 请求参数注解全解析
java·spring·http
连合机器人5 小时前
晨曦中的守望者:当科技为景区赋予温度
java·前端·科技
AD钙奶-lalala5 小时前
idea新建的项目new 没有java class选项
java·ide·intellij-idea
sheji34165 小时前
【开题答辩全过程】以 12306候补购票服务系统为例,包含答辩的问题和答案
java·eclipse