FFmpeg--FlvPaser源码解析

文章目录

- FLV文件解析器实现分析
- FLV文件头结构
- FLV文件整体结构
- - 解析全部过程
  - [FLV Header解析 (9字节)](#FLV Header解析 (9字节))
  - Tag头部解析 (11字节)
  - 视频Tag解析 (类型0x09)
  - - AVC序列头解析 (AVCPacketType=0)
    - [AVC NALU解析 (AVCPacketType=1)](#AVC NALU解析 (AVCPacketType=1))
  - 音频Tag解析 (类型0x08)
  - - AAC序列头解析 (AACPacketType=0)
    - AAC原始数据解析 (AACPacketType=1)
- 完整代码

FLV文件解析器实现分析

整体架构

该解析器采用C++实现，针对Flash Video(FLV)格式进行解析。核心架构围绕主解析器类和标签类体系构建，提供完整的FLV文件解析能力。

运行：解析flv 出 aac 和 h264格式，并可以封装为flv格式

核心组件

CFlvParser - 主解析器类

承担FLV文件的整体解析流程控制
管理所有标签对象的生命周期和内存分配
提供文件统计信息和数据导出接口
处理FLV文件头解析和标签分派

标签类体系

采用面向对象设计实现多类型标签处理：

Tag: 基础标签抽象类，定义公共接口
CVideoTag : 处理视频标签(0x09)
- 支持H.264/AVC视频编码解析
- 解析关键帧和非关键帧数据
- 提取视频尺寸、帧率等参数
CAudioTag : 处理音频标签(0x08)
- 支持AAC音频编码解析
- 提取采样率、声道数等音频参数
- 处理音频特定配置信息
CMetaDataTag : 处理元数据标签(0x12)
- 解析onMetaData信息
- 提取视频时长、文件大小等元数据
- 处理自定义元数据字段

FLV文件解析流程

cpp 复制代码

int CFlvParser::Parse(uint8_t *pBuf, int nBufSize, int &nUsedLen)

解析FLV文件头(9字节)

循环解析各个Tag

读取前一个Tag的大小(4字节)

解析当前Tag头部(11字节)

根据类型创建具体Tag对象

存储到Tag向量中

视频标签处理(H.264)

cpp 复制代码

int CFlvParser::CVideoTag::ParseH264Tag(CFlvParser *pParser)

处理两种类型的视频包：

AVC序列头(nAVCPacketType=0): 包含SPS/PPS等解码参数
AVC NALU(nAVCPacketType=1): 实际的视频帧数据

关键特性：

支持不同长度的NALU长度字段(1/2/3/4字节)
为每个NALU添加H.264起始码(0x00000001)
处理时间戳信息

音频标签处理(AAC)

cpp 复制代码

int CFlvParser::CAudioTag::ParseAACTag(CFlvParser *pParser)

处理两种类型的音频包：

AAC序列头: 解析音频配置信息(profile, sample rate, channels)
AAC原始数据: 为AAC数据添加ADTS头部，生成标准AAC文件

元数据处理

cpp 复制代码

int CFlvParser::CMetaDataTag::parseMeta(CFlvParser *pParser)

解析onMetaData信息，包括：

视频参数: 时长、宽度、高度、帧率、码率
音频参数: 采样率、声道数、码率
文件信息: 文件大小、编码器信息

数据导出功能

导出原始H.264流

cpp 复制代码

DumpH264("parser.264")

将视频标签中的H.264数据提取出来，保存为原始H.264文件

导出原始AAC流

cpp 复制代码

DumpAAC("parser.aac")

将音频标签中的AAC数据添加ADTS头后，保存为标准AAC文件

重新封装FLV文件

cpp 复制代码

DumpFlv(filename)

将解析后的数据重新封装为FLV文件，包含去重处理

FLV文件头结构

FLV文件头共9个字节：

字节偏移	字段名	长度	说明
0-2	Signature	3字节	"FLV"(0x46 0x4C 0x56)
3	Version	1字节	FLV版本号(通常为1)
4	TypeFlags	1字节	音视频标志位
5-8	DataOffset	4字节	FLV头部长度(通常为9)

TypeFlags字节(第5个字节，pBuf[4])的位结构：

从右向左编号(最低位为0)：

复制代码

位位置: 7 6 5 4 3 2 1 0
        ┌─┬─┬─┬─┬─┬─┬─┬─┐
        │保留位(5位)│A│R│V│
        └─┴─┴─┴─┴─┴─┴─┴─┘
                      ↑ ↑ ↑
                      音 保 视
                      频 留 频

位7-3: 保留位(必须为0)
位2: 音频标志(Audio flag)
位1: 保留位(必须为0)
位0: 视频标志(Video flag)

c 复制代码

CFlvParser::FlvHeader *CFlvParser::CreateFlvHeader(uint8_t *pBuf)
{
    FlvHeader *pHeader = new FlvHeader;
    pHeader->nVersion = pBuf[3];        // 版本号
    pHeader->bHaveAudio = (pBuf[4] >> 2) & 0x01;    // 是否有音频
    pHeader->bHaveVideo = (pBuf[4] >> 0) & 0x01;    // 是否有视频
    pHeader->nHeadSize = ShowU32(pBuf + 5);         // 头部长度

    pHeader->pFlvHeader = new uint8_t[pHeader->nHeadSize];
    memcpy(pHeader->pFlvHeader, pBuf, pHeader->nHeadSize);

    return pHeader;
}

音频标志提取

cpp 复制代码

pHeader->bHaveAudio = (pBuf[4] >> 2) & 0x01;

操作步骤：

pBuf[4] >> 2 - 右移2位，将音频标志(位2)移动到最低位
& 0x01 - 取最低位，忽略其他位

视频标志提取

cpp 复制代码

pHeader->bHaveVideo = (pBuf[4] >> 0) & 0x01;

操作步骤：

pBuf[4] >> 0 - 不移位，视频标志已在最低位
& 0x01 - 取最低位

假设 pBuf[4] = 0b00000101：

复制代码

原始字节: 00000101
位位置:   76543210
                 ↑ 视频标志(位0) = 1
               ↑   音频标志(位2) = 1

音频提取: 00000101 >> 2 = 00000001 → & 0x01 = 1 (有音频)
视频提取: 00000101 >> 0 = 00000101 → & 0x01 = 1 (有视频)

FLV文件整体结构

FLV文件由以下部分组成：

复制代码

FLV Header | PreviousTagSize0 | Tag1 | PreviousTagSize1 | Tag2 | ...

解析全部过程

FLV文件结构解析与数据处理

FLV文件由文件头（FlvHeader）和一系列Tag组成，每个Tag前有一个4字节的PreviousTagSize字段。

FLV文件头解析

FlvHeader结构：

前3字节：固定签名"FLV"
第4字节：版本号（如0x01）
第5字节：类型标志（第0位表示音频，第2位表示视频）
第6-9字节：头部长度（通常为9）

Tag解析流程

Tag由头部（11字节）和数据区组成：

第1字节：类型（0x08音频，0x09视频，0x12脚本数据）
第2-4字节：数据区大小（大端序）
第5-7字节：时间戳低24位（大端序）
第8字节：时间戳高8位
第9-11字节：流ID（通常为0）

时间戳计算方式：

\\text{timestamp} = (\\text{nTSEx} \\ll 24) + \\text{nTimeStamp}

视频Tag处理（0x09）

数据区结构：

第1字节：帧类型（高4位）与编码类型（低4位）
第2字节：AVCPacketType（0表示AVC序列头，1表示NALU）
第3-5字节：Composition Time（大端序）

AVCPacketType为0时：

解析AVCDecoderConfigurationRecord获取SPS/PPS，添加起始码0x00000001保存。

AVCPacketType为1时：

提取NALU数据，根据lengthSizeMinusOne确定长度字段字节数，添加起始码保存。

音频Tag处理（0x08）

数据区结构：

第1字节：声音格式（高4位）、采样率（2位）、采样精度（1位）、声道数（1位）
第2字节：AACPacketType（0表示AAC序列头，1表示原始数据）

AACPacketType为0时：

解析AudioSpecificConfig获取音频参数（profile、采样率索引、通道配置）。

AACPacketType为1时：

为原始AAC数据添加ADTS头后保存。

元数据Tag处理（0x12）

解析onMetaData脚本数据，提取视频元信息（时长、宽度、高度等）。

数据导出与封装

导出H264：

遍历视频Tag，将已添加起始码的NALU数据写入文件。

导出AAC：

遍历音频Tag，将带ADTS头的AAC帧数据写入文件。

FLV重新封装：

写入FLV文件头
遍历所有Tag，按原始结构写入PreviousTagSize、Tag头及数据
视频Tag处理时跳过重复的SPS/PPS数据

cpp 复制代码

CFlvParser::FlvHeader *CFlvParser::CreateFlvHeader(uint8_t *pBuf) {
    FlvHeader *pHeader = new FlvHeader;
    pHeader->nVersion = pBuf[3];        // 版本号
    pHeader->bHaveAudio = (pBuf[4] >> 2) & 0x01;    // 是否有音频
    pHeader->bHaveVideo = (pBuf[4] >> 0) & 0x01;    // 是否有视频
    pHeader->nHeadSize = ShowU32(pBuf + 5);         // 头部长度
}

字段解析

字节0-2：签名 FLV (0x46, 0x4C, 0x56)
字节3：版本号（通常为1）
字节4：类型标志
- Bit 0：是否有视频
- Bit 2：是否有音频
- 其他位保留
字节5-8：头部长度（通常为9）

Tag头部解析 (11字节)

cpp 复制代码

CFlvParser::Tag *CFlvParser::CreateTag(uint8_t *pBuf, int nLeftLen) {
    TagHeader header;
    header.nType = ShowU8(pBuf+0);          // 标签类型
    header.nDataSize = ShowU24(pBuf + 1);   // 数据体大小
    header.nTimeStamp = ShowU24(pBuf + 4);  // 时间戳低24位
    header.nTSEx = ShowU8(pBuf + 7);        // 时间戳扩展高8位
    header.nStreamID = ShowU24(pBuf + 8);   // 流ID
    header.nTotalTS = (uint32_t)((header.nTSEx << 24)) + header.nTimeStamp;
}

字段详解

偏移	长度	字段	说明
0	1字节	nType	标签类型: 0x08=音频, 0x09=视频, 0x12=脚本
1-3	3字节	nDataSize	数据体大小(不含11字节头部)
4-6	3字节	nTimeStamp	时间戳低24位
7	1字节	nTSEx	时间戳扩展高8位
8-10	3字节	nStreamID	流ID(通常为0)

时间戳计算

cpp 复制代码

nTotalTS = (nTSEx << 24) + nTimeStamp

例如：nTSEx=0x01, nTimeStamp=0x00A5B3 → nTotalTS=0x0100A5B3

视频Tag解析 (类型0x09)

数据结构

复制代码

FrameType+CodecID(1) | AVCPacketType(1) | CompositionTime(3) | 数据

第一个字节解析

cpp 复制代码

_nFrameType = (pd[0] & 0xf0) >> 4;  // 帧类型
_nCodecID = pd[0] & 0x0f;           // 编码类型

FrameType: 1=关键帧, 2=普通帧, 3=H.263离散帧, 4=生成关键帧, 5=视频信息帧
CodecID: 2=Sorenson H.263, 3=Screen video, 4=On2 VP6, 7=AVC

AVC序列头解析 (AVCPacketType=0)

AVCDecoderConfigurationRecord结构

cpp 复制代码

int CFlvParser::CVideoTag::ParseH264Configuration(CFlvParser *pParser, uint8_t *pTagData) {
    pParser->_nNalUnitLength = (pd[9] & 0x03) + 1;  // NALU长度字段字节数
    int sps_size = CFlvParser::ShowU16(pd + 11);     // SPS长度
    int pps_size = CFlvParser::ShowU16(pd + 11 + (2 + sps_size) + 1); // PPS长度
    
    // 构建H.264元数据
    _nMediaLen = 4 + sps_size + 4 + pps_size;
    _pMedia = new uint8_t[_nMediaLen];
    memcpy(_pMedia, &nH264StartCode, 4);
    memcpy(_pMedia + 4, pd + 11 + 2, sps_size);
    memcpy(_pMedia + 4 + sps_size, &nH264StartCode, 4);
    memcpy(_pMedia + 4 + sps_size + 4, pd + 11 + 2 + sps_size + 2 + 1, pps_size);
}

详细结构

复制代码

字节0: configurationVersion = 1  
字节1: AVCProfileIndication  
字节2: profile_compatibility  
字节3: AVCLevelIndication  
字节4:  
  - bit6-7: reserved = 0x3  
  - bit4-5: lengthSizeMinusOne (NALU长度字段大小-1)  
字节5:  
  - bit5-7: reserved = 0x7  
  - bit0-4: numOfSequenceParameterSets (SPS个数)  
随后:  
  - 2字节: SPS长度  
  - SPS数据  
  - 1字节: numOfPictureParameterSets (PPS个数)  
  - 2字节: PPS长度  
  - PPS数据

AVC NALU解析 (AVCPacketType=1)

cpp 复制代码

int CFlvParser::CVideoTag::ParseNalu(CFlvParser *pParser, uint8_t *pTagData) {
    while (1) {
        // 读取NALU长度
        int nNaluLen;
        switch (pParser->_nNalUnitLength) {
        case 4: nNaluLen = ShowU32(pd + nOffset); break;
        case 3: nNaluLen = ShowU24(pd + nOffset); break; 
        case 2: nNaluLen = ShowU16(pd + nOffset); break;
        default: nNaluLen = ShowU8(pd + nOffset);
        }
        
        // 添加H.264起始码并复制数据
        memcpy(_pMedia + _nMediaLen, &nH264StartCode, 4);
        memcpy(_pMedia + _nMediaLen + 4, pd + nOffset + pParser->_nNalUnitLength, nNaluLen);
        _nMediaLen += (4 + nNaluLen);
        nOffset += (pParser->_nNalUnitLength + nNaluLen);
    }
}

H.264文件生成流程

从AVC序列头提取SPS和PPS，添加起始码 0x00000001
从视频数据包提取NALU，每个NALU前添加起始码
将所有带起始码的NALU按顺序写入 .264 文件

音频Tag解析 (类型0x08)

数据结构

复制代码

SoundFormat+Rate+Size+Type(1) | AACPacketType(1) | 数据

第一个字节解析

cpp 复制代码

_nSoundFormat = (pd[0] & 0xf0) >> 4;    // 音频格式
_nSoundRate = (pd[0] & 0x0c) >> 2;      // 采样率  
_nSoundSize = (pd[0] & 0x02) >> 1;      // 采样精度
_nSoundType = (pd[0] & 0x01);           // 声道数

音频参数详解

SoundFormat: 10=AAC
SoundRate: 0=5.5KHz, 1=11KHz, 2=22KHz, 3=44KHz
SoundSize: 0=8bit, 1=16bit
SoundType: 0=单声道, 1=立体声

AAC序列头解析 (AACPacketType=0)

cpp 复制代码

int CFlvParser::CAudioTag::ParseAudioSpecificConfig(CFlvParser *pParser, uint8_t *pTagData) {
    _aacProfile = ((pd[2]&0xf8)>>3);                    // 5bit AAC编码级别
    _sampleRateIndex = ((pd[2]&0x07)<<1) | (pd[3]>>7);  // 4bit 采样率索引
    _channelConfig = (pd[3]>>3) & 0x0f;                 // 4bit 通道配置
}

AudioSpecificConfig结构

5bit: audioObjectType (AAC编码级别)
4bit: samplingFrequencyIndex (采样率索引)
4bit: channelConfiguration (声道配置)

AAC原始数据解析 (AACPacketType=1)

cpp 复制代码

int CFlvParser::CAudioTag::ParseRawAAC(CFlvParser *pParser, uint8_t *pTagData) {
    int dataSize = _header.nDataSize - 2;   // 减去音频Tag头部
    
    // 构建ADTS头部(7字节)
    uint64_t b

完整代码

FlvParser.h

c 复制代码

#ifndef FLVPARSER_H
#define FLVPARSER_H

#include <iostream>
#include <vector>
#include <stdint.h>
#include "Videojj.h"

using namespace std;

typedef unsigned long long uint64_t;

class CFlvParser
{
public:
    CFlvParser();
    virtual ~CFlvParser();

    int Parse(uint8_t *pBuf, int nBufSize, int &nUsedLen);
    int PrintInfo();
    int DumpH264(const std::string &path);
    int DumpAAC(const std::string &path);
    int DumpFlv(const std::string &path);

private:
    // FLV头
    typedef struct FlvHeader_s
    {
        int nVersion; // 版本
        int bHaveVideo; // 是否包含视频
        int bHaveAudio; // 是否包含音频
        int nHeadSize;  // FLV头部长度
        /*
        ** 指向存放FLV头部的buffer
        ** 上面的三个成员指明了FLV头部的信息，是从FLV的头部中"翻译"得到的，
        ** 真实的FLV头部是一个二进制比特串，放在一个buffer中，由pFlvHeader成员指明
        */
        uint8_t *pFlvHeader;
    } FlvHeader;

    // Tag头部
    struct TagHeader
    {
        int nType;      // 类型
        int nDataSize;  // 标签body的大小
        int nTimeStamp; // 时间戳
        int nTSEx;      // 时间戳的扩展字节
        int nStreamID;  // 流的ID，总是0

        uint32_t nTotalTS;  // 完整的时间戳nTimeStamp和nTSEx拼装

        TagHeader() : nType(0), nDataSize(0), nTimeStamp(0), nTSEx(0), nStreamID(0), nTotalTS(0) {}
        ~TagHeader() {}
    };

    class Tag
    {
    public:
        Tag() : _pTagHeader(NULL), _pTagData(NULL), _pMedia(NULL), _nMediaLen(0) {}
        void Init(TagHeader *pHeader, uint8_t *pBuf, int nLeftLen);

        TagHeader _header;
        uint8_t *_pTagHeader;   // 指向标签头部
        uint8_t *_pTagData;     // 指向标签body，原始的tag data数据
        uint8_t *_pMedia;       // 指向标签的元数据，改造后的数据
        int _nMediaLen;         // 数据长度
    };

    class CVideoTag : public Tag
    {
    public:
        /**
         * @brief CVideoTag
         * @param pHeader
         * @param pBuf 整个tag的起始地址
         * @param nLeftLen
         * @param pParser
         */
        CVideoTag(TagHeader *pHeader, uint8_t *pBuf, int nLeftLen, CFlvParser *pParser);

        int _nFrameType;    // 帧类型
        int _nCodecID;      // 视频编解码类型
        int ParseH264Tag(CFlvParser *pParser);
        int ParseH264Configuration(CFlvParser *pParser, uint8_t *pTagData);
        int ParseNalu(CFlvParser *pParser, uint8_t *pTagData);
    };

    class CAudioTag : public Tag
    {
    public:
        CAudioTag(TagHeader *pHeader, uint8_t *pBuf, int nLeftLen, CFlvParser *pParser);

        int _nSoundFormat;  // 音频编码类型
        int _nSoundRate;    // 采样率
        int _nSoundSize;    // 精度
        int _nSoundType;    // 类型

        // aac
        static int _aacProfile;     // 对应AAC profile
        static int _sampleRateIndex;    // 采样率索引
        static int _channelConfig;      // 通道设置

        int ParseAACTag(CFlvParser *pParser);
        int ParseAudioSpecificConfig(CFlvParser *pParser, uint8_t *pTagData);
        int ParseRawAAC(CFlvParser *pParser, uint8_t *pTagData);
    };

    class CMetaDataTag : public Tag
    {
    public:
        CMetaDataTag(TagHeader *pHeader, uint8_t *pBuf, int nLeftLen, CFlvParser *pParser);

        double hexStr2double(const unsigned char* hex, const unsigned int length);
        int parseMeta(CFlvParser *pParser);
        void printMeta();

        uint8_t m_amf1_type;
        uint32_t m_amf1_size;
        uint8_t m_amf2_type;
        unsigned char *m_meta;
        unsigned int m_length;

        double m_duration;
        double m_width;
        double m_height;
        double m_videodatarate;
        double m_framerate;
        double m_videocodecid;

        double m_audiodatarate;
        double m_audiosamplerate;
        double m_audiosamplesize;
        bool m_stereo;
        double m_audiocodecid;

        string m_major_brand;
        string m_minor_version;
        string m_compatible_brands;
        string m_encoder;

        double m_filesize;
    };


    struct FlvStat
    {
        int nMetaNum, nVideoNum, nAudioNum;
        int nMaxTimeStamp;
        int nLengthSize;

        FlvStat() : nMetaNum(0), nVideoNum(0), nAudioNum(0), nMaxTimeStamp(0), nLengthSize(0){}
        ~FlvStat() {}
    };



    static uint32_t ShowU32(uint8_t *pBuf) { return (pBuf[0] << 24) | (pBuf[1] << 16) | (pBuf[2] << 8) | pBuf[3]; }
    static uint32_t ShowU24(uint8_t *pBuf) { return (pBuf[0] << 16) | (pBuf[1] << 8) | (pBuf[2]); }
    static uint32_t ShowU16(uint8_t *pBuf) { return (pBuf[0] << 8) | (pBuf[1]); }
    static uint32_t ShowU8(uint8_t *pBuf) { return (pBuf[0]); }
    static void WriteU64(uint64_t & x, int length, int value)
    {
        uint64_t mask = 0xFFFFFFFFFFFFFFFF >> (64 - length);
        x = (x << length) | ((uint64_t)value & mask);
    }
    static uint32_t WriteU32(uint32_t n)
    {
        uint32_t nn = 0;
        uint8_t *p = (uint8_t *)&n;
        uint8_t *pp = (uint8_t *)&nn;
        pp[0] = p[3];
        pp[1] = p[2];
        pp[2] = p[1];
        pp[3] = p[0];
        return nn;
    }

    friend class Tag;

private:

    FlvHeader *CreateFlvHeader(uint8_t *pBuf);
    int DestroyFlvHeader(FlvHeader *pHeader);
    Tag *CreateTag(uint8_t *pBuf, int nLeftLen);
    int DestroyTag(Tag *pTag);
    int Stat();
    int StatVideo(Tag *pTag);
    int IsUserDataTag(Tag *pTag);

private:

    FlvHeader* _pFlvHeader;
    vector<Tag *> _vpTag;
    FlvStat _sStat;
    CVideojj *_vjj;

    // H.264
    int _nNalUnitLength;

};

#endif // FLVPARSER_H

FlvParser.cpp

c 复制代码

#include <stdlib.h>
#include <string.h>
#include <assert.h>

#include <iostream>
#include <fstream>

#include "FlvParser.h"

using namespace std;

#define CheckBuffer(x) { if ((nBufSize-nOffset)<(x)) { nUsedLen = nOffset; return 0;} }

int CFlvParser::CAudioTag::_aacProfile;
int CFlvParser::CAudioTag::_sampleRateIndex;
int CFlvParser::CAudioTag::_channelConfig;

static const uint32_t nH264StartCode = 0x01000000;

CFlvParser::CFlvParser()
{
    _pFlvHeader = NULL;
    _vjj = new CVideojj();
}

CFlvParser::~CFlvParser()
{
    for (int i = 0; i < _vpTag.size(); i++)
    {
        DestroyTag(_vpTag[i]);
        delete _vpTag[i];
    }
    if (_vjj != NULL)
        delete _vjj;
}

int CFlvParser::Parse(uint8_t *pBuf, int nBufSize, int &nUsedLen)
{
    int nOffset = 0;

    if (_pFlvHeader == 0)
    {
        CheckBuffer(9);
        _pFlvHeader = CreateFlvHeader(pBuf+nOffset);
        nOffset += _pFlvHeader->nHeadSize;
    }

    while (1)
    {
        CheckBuffer(15); // nPrevSize(4字节) + Tag header(11字节)
        int nPrevSize = ShowU32(pBuf + nOffset);
        nOffset += 4;

        Tag *pTag = CreateTag(pBuf + nOffset, nBufSize-nOffset);
        if (pTag == NULL)
        {
            nOffset -= 4;
            break;
        }
        nOffset += (11 + pTag->_header.nDataSize);

        _vpTag.push_back(pTag);
    }

    nUsedLen = nOffset;
    return 0;
}

int CFlvParser::PrintInfo()
{
    Stat();

    cout << "vnum: " << _sStat.nVideoNum << " , anum: " << _sStat.nAudioNum << " , mnum: " << _sStat.nMetaNum << endl;
    cout << "maxTimeStamp: " << _sStat.nMaxTimeStamp << " ,nLengthSize: " << _sStat.nLengthSize << endl;
    cout << "Vjj SEI num: " << _vjj->_vVjjSEI.size() << endl;
    for (int i = 0; i < _vjj->_vVjjSEI.size(); i++)
        cout << "SEI time : " << _vjj->_vVjjSEI[i].nTimeStamp << endl;
    return 1;
}

int CFlvParser::DumpH264(const std::string &path)
{
    fstream f;
    f.open(path.c_str(), ios_base::out|ios_base::binary);

    vector<Tag *>::iterator it_tag;
    for (it_tag = _vpTag.begin(); it_tag != _vpTag.end(); it_tag++)
    {
        if ((*it_tag)->_header.nType != 0x09)
            continue;

        f.write((char *)(*it_tag)->_pMedia, (*it_tag)->_nMediaLen);
    }
    f.close();

    return 1;
}

int CFlvParser::DumpAAC(const std::string &path)
{
    fstream f;
    f.open(path.c_str(), ios_base::out | ios_base::binary);

    vector<Tag *>::iterator it_tag;
    for (it_tag = _vpTag.begin(); it_tag != _vpTag.end(); it_tag++)
    {
        if ((*it_tag)->_header.nType != 0x08)
            continue;

        CAudioTag *pAudioTag = (CAudioTag *)(*it_tag);
        if (pAudioTag->_nSoundFormat != 10)
            continue;

        if (pAudioTag->_nMediaLen!=0)
            f.write((char *)(*it_tag)->_pMedia, (*it_tag)->_nMediaLen);
    }
    f.close();

    return 1;
}

int CFlvParser::DumpFlv(const std::string &path)
{
    fstream f;
    f.open(path.c_str(), ios_base::out | ios_base::binary);

    // write flv-header
    f.write((char *)_pFlvHeader->pFlvHeader, _pFlvHeader->nHeadSize);
    uint32_t nLastTagSize = 0;


    // write flv-tag
    vector<Tag *>::iterator it_tag;
    for (it_tag = _vpTag.begin(); it_tag < _vpTag.end(); it_tag++)
    {
        uint32_t nn = WriteU32(nLastTagSize);
        f.write((char *)&nn, 4);

        //check duplicate start code
        if ((*it_tag)->_header.nType == 0x09 && *((*it_tag)->_pTagData + 1) == 0x01) {
            bool duplicate = false;
            uint8_t *pStartCode = (*it_tag)->_pTagData + 5 + _nNalUnitLength;
            //printf("tagsize=%d\n",(*it_tag)->_header.nDataSize);
            unsigned nalu_len = 0;
            uint8_t *p_nalu_len=(uint8_t *)&nalu_len;
            switch (_nNalUnitLength) {
            case 4:
                nalu_len = ShowU32((*it_tag)->_pTagData + 5);
                break;
            case 3:
                nalu_len = ShowU24((*it_tag)->_pTagData + 5);
                break;
            case 2:
                nalu_len = ShowU16((*it_tag)->_pTagData + 5);
                break;
            default:
                nalu_len = ShowU8((*it_tag)->_pTagData + 5);
                break;
            }
            /*
            printf("nalu_len=%u\n",nalu_len);
            printf("%x,%x,%x,%x,%x,%x,%x,%x,%x\n",(*it_tag)->_pTagData[5],(*it_tag)->_pTagData[6],
                    (*it_tag)->_pTagData[7],(*it_tag)->_pTagData[8],(*it_tag)->_pTagData[9],
                    (*it_tag)->_pTagData[10],(*it_tag)->_pTagData[11],(*it_tag)->_pTagData[12],
                    (*it_tag)->_pTagData[13]);
            */

            uint8_t *pStartCodeRecord = pStartCode;
            int i;
            for (i = 0; i < (*it_tag)->_header.nDataSize - 5 - _nNalUnitLength - 4; ++i) {
                if (pStartCode[i] == 0x00 && pStartCode[i+1] == 0x00 && pStartCode[i+2] == 0x00 &&
                        pStartCode[i+3] == 0x01) {
                    if (pStartCode[i+4] == 0x67) {
                        //printf("duplicate sps found!\n");
                        i += 4;
                        continue;
                    }
                    else if (pStartCode[i+4] == 0x68) {
                        //printf("duplicate pps found!\n");
                        i += 4;
                        continue;
                    }
                    else if (pStartCode[i+4] == 0x06) {
                        //printf("duplicate sei found!\n");
                        i += 4;
                        continue;
                    }
                    else {
                        i += 4;
                        //printf("offset=%d\n",i);
                        duplicate = true;
                        break;
                    }
                }
            }

            if (duplicate) {
                nalu_len -= i;
                (*it_tag)->_header.nDataSize -= i;
                uint8_t *p = (uint8_t *)&((*it_tag)->_header.nDataSize);
                (*it_tag)->_pTagHeader[1] = p[2];
                (*it_tag)->_pTagHeader[2] = p[1];
                (*it_tag)->_pTagHeader[3] = p[0];
                //printf("after,tagsize=%d\n",(int)ShowU24((*it_tag)->_pTagHeader + 1));
                //printf("%x,%x,%x\n",(*it_tag)->_pTagHeader[1],(*it_tag)->_pTagHeader[2],(*it_tag)->_pTagHeader[3]);

                f.write((char *)(*it_tag)->_pTagHeader, 11);
                switch (_nNalUnitLength) {
                case 4:
                    *((*it_tag)->_pTagData + 5) = p_nalu_len[3];
                    *((*it_tag)->_pTagData + 6) = p_nalu_len[2];
                    *((*it_tag)->_pTagData + 7) = p_nalu_len[1];
                    *((*it_tag)->_pTagData + 8) = p_nalu_len[0];
                    break;
                case 3:
                    *((*it_tag)->_pTagData + 5) = p_nalu_len[2];
                    *((*it_tag)->_pTagData + 6) = p_nalu_len[1];
                    *((*it_tag)->_pTagData + 7) = p_nalu_len[0];
                    break;
                case 2:
                    *((*it_tag)->_pTagData + 5) = p_nalu_len[1];
                    *((*it_tag)->_pTagData + 6) = p_nalu_len[0];
                    break;
                default:
                    *((*it_tag)->_pTagData + 5) = p_nalu_len[0];
                    break;
                }
                //printf("after,nalu_len=%d\n",(int)ShowU32((*it_tag)->_pTagData + 5));
                f.write((char *)(*it_tag)->_pTagData, pStartCode - (*it_tag)->_pTagData);
                /*
                printf("%x,%x,%x,%x,%x,%x,%x,%x,%x\n",(*it_tag)->_pTagData[0],(*it_tag)->_pTagData[1],(*it_tag)->_pTagData[2],
                        (*it_tag)->_pTagData[3],(*it_tag)->_pTagData[4],(*it_tag)->_pTagData[5],(*it_tag)->_pTagData[6],
                        (*it_tag)->_pTagData[7],(*it_tag)->_pTagData[8]);
                */
                f.write((char *)pStartCode + i, (*it_tag)->_header.nDataSize - (pStartCode - (*it_tag)->_pTagData));
                /*
                printf("write size:%d\n", (pStartCode - (*it_tag)->_pTagData) +
                        ((*it_tag)->_header.nDataSize - (pStartCode - (*it_tag)->_pTagData)));
                */
            } else {
                f.write((char *)(*it_tag)->_pTagHeader, 11);
                f.write((char *)(*it_tag)->_pTagData, (*it_tag)->_header.nDataSize);
            }
        } else {
            f.write((char *)(*it_tag)->_pTagHeader, 11);
            f.write((char *)(*it_tag)->_pTagData, (*it_tag)->_header.nDataSize);
        }

        nLastTagSize = 11 + (*it_tag)->_header.nDataSize;
    }
    uint32_t nn = WriteU32(nLastTagSize);
    f.write((char *)&nn, 4);

    f.close();

    return 1;
}

int CFlvParser::Stat()
{
    for (int i = 0; i < _vpTag.size(); i++)
    {
        switch (_vpTag[i]->_header.nType)
        {
        case 0x08:
            _sStat.nAudioNum++;
            break;
        case 0x09:
            StatVideo(_vpTag[i]);
            break;
        case 0x12:
            _sStat.nMetaNum++;
            break;
        default:
            ;
        }
    }

    return 1;
}

int CFlvParser::StatVideo(Tag *pTag)
{
    _sStat.nVideoNum++;
    _sStat.nMaxTimeStamp = pTag->_header.nTimeStamp;

    if (pTag->_pTagData[0] == 0x17 && pTag->_pTagData[1] == 0x00)
    {
        _sStat.nLengthSize = (pTag->_pTagData[9] & 0x03) + 1;
    }

    return 1;
}

CFlvParser::FlvHeader *CFlvParser::CreateFlvHeader(uint8_t *pBuf)
{
    FlvHeader *pHeader = new FlvHeader;
    pHeader->nVersion = pBuf[3];        // 版本号
    pHeader->bHaveAudio = (pBuf[4] >> 2) & 0x01;    // 是否有音频
    pHeader->bHaveVideo = (pBuf[4] >> 0) & 0x01;    // 是否有视频
    pHeader->nHeadSize = ShowU32(pBuf + 5);         // 头部长度

    pHeader->pFlvHeader = new uint8_t[pHeader->nHeadSize];
    memcpy(pHeader->pFlvHeader, pBuf, pHeader->nHeadSize);

    return pHeader;
}

int CFlvParser::DestroyFlvHeader(FlvHeader *pHeader)
{
    if (pHeader == NULL)
        return 0;

    delete pHeader->pFlvHeader;
    return 1;
}

/**
 * @brief 复制header + body
 * @param pHeader
 * @param pBuf
 * @param nLeftLen
 */
void CFlvParser::Tag::Init(TagHeader *pHeader, uint8_t *pBuf, int nLeftLen)
{
    memcpy(&_header, pHeader, sizeof(TagHeader));
    // 复制标签头部信息 header
    _pTagHeader = new uint8_t[11];
    memcpy(_pTagHeader, pBuf, 11);      // 头部
    // 复制标签 body
    _pTagData = new uint8_t[_header.nDataSize];
    memcpy(_pTagData, pBuf + 11, _header.nDataSize);
}

CFlvParser::CVideoTag::CVideoTag(TagHeader *pHeader, uint8_t *pBuf, int nLeftLen, CFlvParser *pParser)
{
    // 初始化
    Init(pHeader, pBuf, nLeftLen);

    uint8_t *pd = _pTagData;
    _nFrameType = (pd[0] & 0xf0) >> 4;  // 帧类型
    _nCodecID = pd[0] & 0x0f;           // 视频编码类型
    // 开始解析
    if (_header.nType == 0x09 && _nCodecID == 7)
    {
        ParseH264Tag(pParser);
    }
}

/**
 * @brief  CAudioTag 音频Tag Data区域开始的第一个字节包含了音频数据的参数信息，
 * 从第二个字节开始为音频流数据，但第二个字节对于AAC也要判断是AAC sequence header还是AAC raw。
 * 第一个字节：SoundFormat 4bit 音频格式 0 = Linear PCM, platform endian
                        1 =ADPCM; 2 = MP3; 3 = Linear PCM, little endian
                        4 = Nellymoser 16-kHz mono ; 5 = Nellymoser 8-kHz mono
                        6 = Nellymoser;  7 = G.711 A-law logarithmic PCM
                        8 = G.711 mu-law logarithmic PCM; 9 = reserved
                        10 = AAC ; 11  Speex 14 = MP3 8-Khz
                        15 = Device-specific sound
              SoundRate 2bit 采样率 0 = 5.5-kHz; 1 = 11-kHz; 2 = 22-kHz; 3 = 44-kHz
                        对于AAC总是3。但实际上AAC是可以支持到48khz以上的频率。
              SoundSize 1bit 采样精度  0 = snd8Bit; 1 = snd16Bit
                        此参数仅适用于未压缩的格式，压缩后的格式都是将其设为1
              SoundType 1bit  0 = sndMono 单声道; 1 = sndStereo 立体声，双声道
                        对于AAC总是1
If the SoundFormat indicates AAC, the SoundType should be set to 1 (stereo) and the
SoundRate should be set to 3 (44 kHz). However, this does not mean that AAC audio in FLV
is always stereo, 44 kHz data. Instead, the Flash Player ignores these values and
extracts the channel and sample rate data is encoded in the AAC bitstream.
 * @param pHeader
 * @param pBuf
 * @param nLeftLen
 * @param pParser
 */
CFlvParser::CAudioTag::CAudioTag(TagHeader *pHeader, uint8_t *pBuf, int nLeftLen, CFlvParser *pParser)
{
    Init(pHeader, pBuf, nLeftLen);

    uint8_t *pd = _pTagData;
    _nSoundFormat = (pd[0] & 0xf0) >> 4;    // 音频格式
    _nSoundRate = (pd[0] & 0x0c) >> 2;      // 采样率
    _nSoundSize = (pd[0] & 0x02) >> 1;      // 采样精度
    _nSoundType = (pd[0] & 0x01);           // 是否立体声
    if (_nSoundFormat == 10)                // AAC
    {
        ParseAACTag(pParser);
    }
}

int CFlvParser::CAudioTag::ParseAACTag(CFlvParser *pParser)
{
    uint8_t *pd = _pTagData;

    // 数据包的类型：音频配置信息，音频数据
    int nAACPacketType = pd[1];

    // 如果是音频配置信息
    if (nAACPacketType == 0)    // AAC sequence header
    {
        // 解析配置信息
        ParseAudioSpecificConfig(pParser, pd); // 解析AudioSpecificConfig
    }
    // 如果是音频数据
    else if (nAACPacketType == 1)   // AAC RAW
    {
        // 解析音频数据
        ParseRawAAC(pParser, pd);
    }
    else
    {

    }

    return 1;
}

int CFlvParser::CAudioTag::ParseAudioSpecificConfig(CFlvParser *pParser, uint8_t *pTagData)
{
    uint8_t *pd = _pTagData;

    _aacProfile = ((pd[2]&0xf8)>>3);    // 5bit AAC编码级别
    _sampleRateIndex = ((pd[2]&0x07)<<1) | (pd[3]>>7);  // 4bit 真正的采样率索引
    _channelConfig = (pd[3]>>3) & 0x0f;                 // 4bit 通道数量
    printf("----- AAC ------\n");
    printf("profile:%d\n", _aacProfile);
    printf("sample rate index:%d\n", _sampleRateIndex);
    printf("channel config:%d\n", _channelConfig);

    _pMedia = NULL;
    _nMediaLen = 0;

    return 1;
}

int CFlvParser::CAudioTag::ParseRawAAC(CFlvParser *pParser, uint8_t *pTagData)
{
    uint64_t bits = 0;  // 占用8字节
    // 数据长度 跳过tag data的第一个第二字节
    int dataSize = _header.nDataSize - 2;   // 减去两字节的 audio tag data信息部分

    // 制作元数据
    WriteU64(bits, 12, 0xFFF);
    WriteU64(bits, 1, 0);
    WriteU64(bits, 2, 0);
    WriteU64(bits, 1, 1);
    WriteU64(bits, 2, _aacProfile - 1);
    WriteU64(bits, 4, _sampleRateIndex);
    WriteU64(bits, 1, 0);
    WriteU64(bits, 3, _channelConfig);
    WriteU64(bits, 1, 0);
    WriteU64(bits, 1, 0);
    WriteU64(bits, 1, 0);
    WriteU64(bits, 1, 0);
    WriteU64(bits, 13, 7 + dataSize);
    WriteU64(bits, 11, 0x7FF);
    WriteU64(bits, 2, 0);
    // WriteU64执行为上述的操作，最高的8bit还没有被移位到，实际是使用7个字节
    _nMediaLen = 7 + dataSize;
    _pMedia = new uint8_t[_nMediaLen];
    uint8_t p64[8];
    p64[0] = (uint8_t)(bits >> 56); // 是bits的最高8bit，实际为0
    p64[1] = (uint8_t)(bits >> 48); // 才是ADTS起始头 0xfff的高8bit
    p64[2] = (uint8_t)(bits >> 40);
    p64[3] = (uint8_t)(bits >> 32);
    p64[4] = (uint8_t)(bits >> 24);
    p64[5] = (uint8_t)(bits >> 16);
    p64[6] = (uint8_t)(bits >> 8);
    p64[7] = (uint8_t)(bits);

    memcpy(_pMedia, p64+1, 7);  // ADTS header, p64+1是从ADTS起始头开始
    memcpy(_pMedia + 7, pTagData + 2, dataSize); // AAC body

    return 1;
}

CFlvParser::CMetaDataTag::CMetaDataTag(TagHeader *pHeader, uint8_t *pBuf, int nLeftLen, CFlvParser *pParser)
{
    Init(pHeader, pBuf, nLeftLen);

    uint8_t *pd = _pTagData;
    m_amf1_type = ShowU8(pd+0);
    m_amf1_size = ShowU16(pd+1);

    if(m_amf1_type != 2)
    {
        printf("no metadata\n");
        return;
    }
    // 解析script
    if(strncmp((const char *)"onMetaData", (const char *)(pd + 3), 10) == 0)
        parseMeta(pParser);
}
double CFlvParser::CMetaDataTag::hexStr2double(const uint8_t* hex,
                                               const uint32_t length) {

    double ret = 0;
    char hexstr[length * 2];
    memset(hexstr, 0, sizeof(hexstr));

    for(uint32_t i = 0; i < length; i++)
    {
        sprintf(hexstr + i * 2, "%02x", hex[i]);
    }

    sscanf(hexstr, "%llx", (unsigned long long*)&ret);

    return ret;
}
int CFlvParser::CMetaDataTag::parseMeta(CFlvParser *pParser)
{
    uint8_t *pd = _pTagData;
    int dataSize = _header.nDataSize;

    uint32_t arrayLen = 0;
    uint32_t offset = 13; // Type + Value_Size + Value占用13字节

    uint32_t nameLen = 0;
    double doubleValue = 0;
    string strValue = "";
    bool boolValue = false;
    uint32_t valueLen = 0;
    uint8_t u8Value = 0;
    if(pd[offset++] == 0x08)    // 0x8 onMetaData
    {
        arrayLen = ShowU32(pd + offset);
        offset += 4;    //跳过 [ECMAArrayLength]占用的字节
        printf("ArrayLen = %d\n", arrayLen);
    }
    else
    {
        printf("metadata format error!!!");
        return -1;
    }

    for(uint32_t i = 0; i < arrayLen; i++)
    {

        doubleValue = 0;
        boolValue = false;
        strValue = "";
        // 读取字段长度
        nameLen = ShowU16(pd + offset);
        offset += 2;            // 跳过2字节字段长度

        char name[nameLen + 1]; // 获取字段名称

        memset(name, 0, sizeof(name));
        memcpy(name, &pd[offset], nameLen);
        name[nameLen + 1] = '\0';
        offset += nameLen;      // 跳过字段名占用的长度

        uint8_t amfType = pd[offset++];
        switch(amfType)    // 判别值的类型
        {
        case 0x0: //Number type, 就是double类型, 占用8字节
            doubleValue = hexStr2double(&pd[offset], 8);
            offset += 8;        // 跳过8字节
            break;

        case 0x1: //Boolean type, 占用1字节
            u8Value = ShowU8(pd+offset);
            offset += 1;        // 跳过1字节
            if(u8Value != 0x00)
                 boolValue = true;
            else
                 boolValue = false;
            break;

        case 0x2: //String type
            valueLen = ShowU16(pd + offset);
            offset += 2;    // 跳过2字节 length
            strValue.append(pd + offset, pd + offset + valueLen);
            strValue.append("");
            offset += valueLen; // 跳过字段的值的长度
            break;

        default:
            printf("un handle amfType:%d\n", amfType);
            break;
        }

        if(strncmp(name, "duration", 8)	== 0)
        {
            m_duration = doubleValue;
        }
        else if(strncmp(name, "width", 5)	== 0)
        {
            m_width = doubleValue;
        }
        else if(strncmp(name, "height", 6) == 0)
        {
            m_height = doubleValue;
        }
        else if(strncmp(name, "videodatarate", 13) == 0)
        {
            m_videodatarate = doubleValue;
        }
        else if(strncmp(name, "framerate", 9) == 0)
        {
            m_framerate = doubleValue;
        }
        else if(strncmp(name, "videocodecid", 12) == 0)
        {
            m_videocodecid = doubleValue;
        }
        else if(strncmp(name, "audiodatarate", 13) == 0)
        {
            m_audiodatarate = doubleValue;
        }
        else if(strncmp(name, "audiosamplerate", 15) == 0)
        {
            m_audiosamplerate = doubleValue;
        }
        else if(strncmp(name, "audiosamplesize", 15) == 0)
        {
            m_audiosamplesize = doubleValue;
        }
        else if(strncmp(name, "stereo", 6) == 0)
        {
            m_stereo = boolValue;
        }
        else if(strncmp(name, "audiocodecid", 12) == 0)
        {
            m_audiocodecid = doubleValue;
        }
        else if(strncmp(name, "major_brand", 11) == 0)
        {
            m_major_brand = strValue;
        }
        else if(strncmp(name, "minor_version", 13) == 0)
        {
            m_minor_version = strValue;
        }
        else if(strncmp(name, "compatible_brands", 17) == 0)
        {
            m_compatible_brands = strValue;
        }
        else if(strncmp(name, "encoder", 7) == 0)
        {
            m_encoder = strValue;
        }
        else if(strncmp(name, "filesize", 8) == 0)
        {
            m_filesize = doubleValue;
        }
    }


    printMeta();
    return 1;
}

void CFlvParser::CMetaDataTag::printMeta()
{
    printf("\nduration: %0.2lfs, filesize: %.0lfbytes\n", m_duration, m_filesize);

    printf("width: %0.0lf, height: %0.0lf\n", m_width, m_height);
    printf("videodatarate: %0.2lfkbps, framerate: %0.0lffps\n", m_videodatarate, m_framerate);
    printf("videocodecid: %0.0lf\n", m_videocodecid);

    printf("audiodatarate: %0.2lfkbps, audiosamplerate: %0.0lfKhz\n",
           m_audiodatarate, m_audiosamplerate);
    printf("audiosamplesize: %0.0lfbit, stereo: %d\n", m_audiosamplesize, m_stereo);
    printf("audiocodecid: %0.0lf\n", m_audiocodecid);

    printf("major_brand: %s, minor_version: %s\n", m_major_brand.c_str(), m_minor_version.c_str());
    printf("compatible_brands: %s, encoder: %s\n\n", m_compatible_brands.c_str(), m_encoder.c_str());
}

CFlvParser::Tag *CFlvParser::CreateTag(uint8_t *pBuf, int nLeftLen)
{
    // 开始解析标签头部
    TagHeader header;
    header.nType = ShowU8(pBuf+0);  // 类型
    header.nDataSize = ShowU24(pBuf + 1);   // 标签body的长度
    header.nTimeStamp = ShowU24(pBuf + 4);  // 时间戳 低24bit
    header.nTSEx = ShowU8(pBuf + 7);        // 时间戳的扩展字段, 高8bit
    header.nStreamID = ShowU24(pBuf + 8);   // 流的id
    header.nTotalTS = (uint32_t)((header.nTSEx << 24)) + header.nTimeStamp;
    // 标签头部解析结束

//    cout << "total TS : " << header.nTotalTS << endl;
//    cout << "nLeftLen : " << nLeftLen << " , nDataSize : " << header.nDataSize << endl;
    if ((header.nDataSize + 11) > nLeftLen)
    {
        return NULL;
    }

    Tag *pTag;
    switch (header.nType) {
    case 0x09:  // 视频类型的Tag
        pTag = new CVideoTag(&header, pBuf, nLeftLen, this);
        break;
    case 0x08:  // 音频类型的Tag
        pTag = new CAudioTag(&header, pBuf, nLeftLen, this);
        break;
    case 0x12:  // script Tag
        pTag = new CMetaDataTag(&header, pBuf, nLeftLen, this);
        break;
    default:    // script类型的Tag
        pTag = new Tag();
        pTag->Init(&header, pBuf, nLeftLen);
    }

    return pTag;
}

int CFlvParser::DestroyTag(Tag *pTag)
{
    if (pTag->_pMedia != NULL)
        delete []pTag->_pMedia;
    if (pTag->_pTagData!=NULL)
        delete []pTag->_pTagData;
    if (pTag->_pTagHeader != NULL)
        delete []pTag->_pTagHeader;

    return 1;
}

int CFlvParser::CVideoTag::ParseH264Tag(CFlvParser *pParser)
{
    uint8_t *pd = _pTagData;
    /*
    ** 数据包的类型
    ** 视频数据被压缩之后被打包成数据包在网上传输
    ** 有两种类型的数据包：视频信息包（sps、pps等）和视频数据包（视频的压缩数据）
    */
    int nAVCPacketType = pd[1];
    int nCompositionTime = CFlvParser::ShowU24(pd + 2);

    // 如果是视频配置信息
    if (nAVCPacketType == 0)    // AVC sequence header
    {
        ParseH264Configuration(pParser, pd);
    }
    // 如果是视频数据
    else if (nAVCPacketType == 1) // AVC NALU
    {
        ParseNalu(pParser, pd);
    }
    else
    {

    }
    return 1;
}
/**
 * @brief
AVCDecoderConfigurationRecord {
    uint32_t(8) configurationVersion = 1;  [0]
    uint32_t(8) AVCProfileIndication;       [1]
    uint32_t(8) profile_compatibility;      [2]
    uint32_t(8) AVCLevelIndication;         [3]
    bit(6) reserved = '111111'b;            [4]
    uint32_t(2) lengthSizeMinusOne;         [4] 计算方法是 1 + (lengthSizeMinusOne & 3)，实际计算结果一直是4
    bit(3) reserved = '111'b;                   [5]
    uint32_t(5) numOfSequenceParameterSets; [5] SPS 的个数，计算方法是 numOfSequenceParameterSets & 0x1F，实际计算结果一直为1
    for (i=0; i< numOfSequenceParameterSets; i++) {
        uint32_t(16) sequenceParameterSetLength ;   [6,7]
        bit(8*sequenceParameterSetLength) sequenceParameterSetNALUnit;
    }
    uint32_t(8) numOfPictureParameterSets;      PPS 的个数，一直为1
    for (i=0; i< numOfPictureParameterSets; i++) {
        uint32_t(16) pictureParameterSetLength;
        bit(8*pictureParameterSetLength) pictureParameterSetNALUnit;
    }
}

_nNalUnitLength 这个变量告诉我们用几个字节来存储NALU的长度，如果NALULengthSizeMinusOne是0，
那么每个NALU使用一个字节的前缀来指定长度，那么每个NALU包的最大长度是255字节，
这个明显太小了，使用2个字节的前缀来指定长度，那么每个NALU包的最大长度是64K字节，
也不一定够，一般分辨率达到1280*720 的图像编码出的I帧，可能大于64K；3字节是比较完美的，
但是因为一些原因（例如对齐）没有被广泛支持；因此4字节长度的前缀是目前使用最多的方式
 * @param pParser
 * @param pTagData
 * @return
 */
int CFlvParser::CVideoTag::ParseH264Configuration(CFlvParser *pParser, uint8_t *pTagData)
{
    uint8_t *pd = pTagData;
    // 跨过 Tag Data的VIDEODATA(1字节) AVCVIDEOPACKET(AVCPacketType(1字节) 和CompositionTime(3字节) 4字节)
    // 总共跨过5个字节

    // NalUnit长度表示占用的字节
    pParser->_nNalUnitLength = (pd[9] & 0x03) + 1;  // lengthSizeMinusOne 9 = 5 + 4

    int sps_size, pps_size;
    // sps（序列参数集）的长度
    sps_size = CFlvParser::ShowU16(pd + 11);        // sequenceParameterSetLength 11 = 5 + 6
    // pps（图像参数集）的长度
    pps_size = CFlvParser::ShowU16(pd + 11 + (2 + sps_size) + 1);   // pictureParameterSetLength

    // 元数据的长度
    _nMediaLen = 4 + sps_size + 4 + pps_size;   // 添加start code
    _pMedia = new uint8_t[_nMediaLen];
    // 保存元数据
    memcpy(_pMedia, &nH264StartCode, 4);
    memcpy(_pMedia + 4, pd + 11 + 2, sps_size);
    memcpy(_pMedia + 4 + sps_size, &nH264StartCode, 4);
    memcpy(_pMedia + 4 + sps_size + 4, pd + 11 + 2 + sps_size + 2 + 1, pps_size);

    return 1;
}

int CFlvParser::CVideoTag::ParseNalu(CFlvParser *pParser, uint8_t *pTagData)
{
    uint8_t *pd = pTagData;
    int nOffset = 0;

    _pMedia = new uint8_t[_header.nDataSize+10];
    _nMediaLen = 0;
    // 跨过 Tag Data的VIDEODATA(1字节) AVCVIDEOPACKET(AVCPacketType和CompositionTime 4字节)
    nOffset = 5; // 总共跨过5个字节 132 - 5 = 127 = _nNalUnitLength(4字节)  + NALU(123字节)
    //                                           startcode(4字节)  + NALU(123字节) = 127
    while (1)
    {
        // 如果解析完了一个Tag，那么就跳出循环
        if (nOffset >= _header.nDataSize)
            break;
        // 计算NALU（视频数据被包装成NALU在网上传输）的长度,
        // 一个tag可能包含多个nalu, 所以每个nalu前面有NalUnitLength字节表示每个nalu的长度
        int nNaluLen;
        switch (pParser->_nNalUnitLength)
        {
        case 4:
            nNaluLen = CFlvParser::ShowU32(pd + nOffset);
            break;
        case 3:
            nNaluLen = CFlvParser::ShowU24(pd + nOffset);
            break;
        case 2:
            nNaluLen = CFlvParser::ShowU16(pd + nOffset);
            break;
        default:
            nNaluLen = CFlvParser::ShowU8(pd + nOffset);
        }
        // 获取NALU的起始码
        memcpy(_pMedia + _nMediaLen, &nH264StartCode, 4);
        // 复制NALU的数据
        memcpy(_pMedia + _nMediaLen + 4, pd + nOffset + pParser->_nNalUnitLength, nNaluLen);
        // 解析NALU
//        pParser->_vjj->Process(_pMedia+_nMediaLen, 4+nNaluLen, _header.nTotalTS);
        _nMediaLen += (4 + nNaluLen);
        nOffset += (pParser->_nNalUnitLength + nNaluLen);
    }

    return 1;
}

main.cpp

c 复制代码

#include <stdlib.h>
#include <string.h>

#include <iostream>
#include <fstream>

#include "FlvParser.h"

using namespace std;

void Process(fstream &fin, const char *filename);

int main(int argc, char *argv[])
{
    cout << "Hi, this is FLV parser test program!\n";

    if (argc != 3)
    {
        cout << "FlvParser.exe [input flv] [output flv]" << endl;
        return 0;
    }

    fstream fin;
    fin.open(argv[1], ios_base::in | ios_base::binary);
    if (!fin)
        return 0;

    Process(fin, argv[2]);

    fin.close();

    return 1;
}

void Process(fstream &fin, const char *filename)
{
    CFlvParser parser;

    int nBufSize = 2*1024 * 1024;
    int nFlvPos = 0;
    uint8_t *pBuf, *pBak;
    pBuf = new uint8_t[nBufSize];
    pBak = new uint8_t[nBufSize];

    while (1)
    {
        int nReadNum = 0;
        int nUsedLen = 0;
        fin.read((char *)pBuf + nFlvPos, nBufSize - nFlvPos);
        nReadNum = fin.gcount();
        if (nReadNum == 0)
            break;
        nFlvPos += nReadNum;

        parser.Parse(pBuf, nFlvPos, nUsedLen);
        if (nFlvPos != nUsedLen)
        {
            memcpy(pBak, pBuf + nUsedLen, nFlvPos - nUsedLen);
            memcpy(pBuf, pBak, nFlvPos - nUsedLen);
        }
        nFlvPos -= nUsedLen;
    }
    parser.PrintInfo();
    parser.DumpH264("parser.264");
    parser.DumpAAC("parser.aac");

    //dump into flv
    parser.DumpFlv(filename);

    delete []pBak;
    delete []pBuf;
}

FFmpeg--FlvPaser源码解析

文章目录

FLV文件解析器实现分析

整体架构

核心组件

FLV文件解析流程

视频标签处理(H.264)

音频标签处理(AAC)

元数据处理

数据导出功能

FLV文件头结构

FLV文件整体结构

解析全部过程

FLV文件结构解析与数据处理

FLV文件头解析

Tag解析流程

视频Tag处理（0x09）

音频Tag处理（0x08）

元数据Tag处理（0x12）

数据导出与封装

FLV Header解析 (9字节)

Tag头部解析 (11字节)

视频Tag解析 (类型0x09)

AVC序列头解析 (AVCPacketType=0)

AVC NALU解析 (AVCPacketType=1)

音频Tag解析 (类型0x08)

AAC序列头解析 (AACPacketType=0)

AAC原始数据解析 (AACPacketType=1)

完整代码

FlvParser.h

FlvParser.cpp

main.cpp