Troubleshooting partial skin swapping in Creator native from the OpenGL rendering side

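Symptoms

I built the simplest possible DragonBones animation with nested armatures, where the parent armature's timeline switches the child armature's animation. The normal behavior looks like this:

After the skin swap, the animation stops moving entirely.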

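Tracking it down

The code was ported from JS, so all of my experience came from the JS engine. The native C++ engine is quite similar to the JS engine, but the code is structured differently in places, and I had a vague feeling that the problem was somewhere in those differences.

The problem was hard to pin down, which is why I built this minimal demo. Even with the demo there was no obvious place to start, so the only option was to work backwards from the vertices that are finally submitted.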

draw

The natural place to start is the OpenGL draw function.

void DeviceGraphics::draw(size_t base, GLsizei count)
{
    commitVertexBuffer(); // the vertex data is committed here
    if (nextIndexBuffer)
    {
        // draw is issued here, but the data lives in the bound buffers; if OpenGL is unfamiliar, brush up on the basics first
        GL_CHECK(glDrawElements(ENUM_CLASS_TO_GLENUM(_nextState->primitiveType),
           count,
           ENUM_CLASS_TO_GLENUM(nextIndexBuffer->getFormat()),
           (GLvoid *)(base * nextIndexBuffer->getBytesPerIndex())));
    }
}

Binding the vertex buffer data

void DeviceGraphics::commitVertexBuffer()
{
    if (attrsDirty)
    {
        for (int i = 0; i < _caps.maxVertexAttributes; ++i)
            _newAttributes[i] = 0;

        for (int i = 0; i < _nextState->maxStream + 1; ++i)
        {
            auto vb = _nextState->getVertexBuffer(i);
            if (!vb)
                continue;
            // only glBindBuffer is performed here
            // vb->getHandle() is just the buffer id
            GL_CHECK(ccBindBuffer(GL_ARRAY_BUFFER, vb->getHandle()));

            auto vboffset = _nextState->getVertexBufferOffset(i);
            const auto& attributes = _nextState->getProgram()->getAttributes();
            auto usedAttriLen = attributes.size();
            for (int j = 0; j < usedAttriLen; ++j)
            {
                const auto& attr = attributes[j];
                const auto* el = vb->getFormat().getElement(attr.hashName);
                if (!el || !el->isValid())
                {
                    RENDERER_LOGW("Can not find vertex attribute: %s", attr.name.c_str());
                    continue;
                }

                if (0 == _enabledAtrributes[attr.location])
                {
                    // enable the attribute used by the shader
                    GL_CHECK(ccEnableVertexAttribArray(attr.location));
                    _enabledAtrributes[attr.location] = 1;
                }
                _newAttributes[attr.location] = 1;

                // glVertexAttribPointer (GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const GLvoid *pointer);
                GL_CHECK(ccVertexAttribPointer(attr.location,
                                      el->num,
                                      ENUM_CLASS_TO_GLENUM(el->type),
                                      el->normalize,
                                      el->stride,
                                      (GLvoid*)(el->offset + vboffset * el->stride)));
            }
        }
    }
}

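At this point the next thing to trace is the glGenBuffers logic, but it is clearly not in this function. The glGenBuffers call must happen somewhere else, and this code only uses a buffer that already exists.

Some background is needed here: Creator's architecture keeps one very large buffer, and all vertex data is stored in it. The benefit is that memory does not have to be allocated frequently.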

glBufferData

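This is the next function to trace, because it is how vertex data gets associated with a buffer (basic OpenGL). This part of the logic is also triggered from jsb.

A global search for glBufferData turns up very few uses, and it is easy to see that VertexBuffer is one of them. This is in fact where the real vertex data is bound.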

// MiddlewareMacro.h defines the macros for the maximum capacities
// index buffer init capacity
#define INIT_INDEX_BUFFER_SIZE 1024000
// max vertex buffer size
#define MAX_VERTEX_BUFFER_SIZE 65535

// fill debug data max capacity
#define MAX_DEBUG_BUFFER_SIZE 409600
// type array pool min size
#define MIN_TYPE_ARRAY_SIZE 1024


bool VertexBuffer::init(DeviceGraphics* device, VertexFormat* format, Usage usage, const void* data, size_t dataByteLength, uint32_t numVertices)
{
    _bytes = _format->_bytes * numVertices; // 20 * 65535 = 1310700, so a very large block of memory is reserved up front
    // this is the logic that creates the buffer
    glGenBuffers(1, &_glID);
}
void VertexBuffer::update(uint32_t offset, const void* data, size_t dataByteLength)
{
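    // upload the data to the VBO; the buffer itself was created in init
    glBufferData(GL_ARRAY_BUFFER, _bytes, (const GLvoid*)data, glUsage);
    // once the data is bound, holding the buffer handle is enough to operate on the corresponding memory
    // this data is not cached here because it comes from the MeshBuffer, so the investigation moves on to MeshBuffer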
}

MeshBuffer.h

  IOBuffer _vb;
void MeshBuffer::uploadVB()
{
    auto length = _vb.length();
    if (length == 0) return;

    auto glVB = _glVBArr[_bufferPos];
    // the data comes from _vb; the buffer's element type is uint8_t, probably to stay in sync with the JS ArrayBuffer
    glVB->update(0, _vb.getBuffer(), length);
}

IOBuffer.h

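class IOBuffer
{
public:
    uint8_t* _buffer = nullptr;
    std::size_t _bufferSize = 0; // declared here so that this excerpt is self-consistent
    IOBuffer (std::size_t defaultSize)
    {
        _bufferSize = defaultSize;
        _buffer = new uint8_t[_bufferSize]; // assigned in the constructor
    }
    inline uint8_t* getBuffer () const
    {
        return _buffer; // the real source of the bound data
    }
};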

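Locating where the vertex data is stored

Tracing the data argument of glBufferData backwards shows that it comes from the _buffer above. Looking at where _buffer is written then reveals the source of the vertex data: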

auto mgr = middleware::MiddlewareManager::getInstance();
middleware::MeshBuffer* mb = mgr->getMeshBuffer(VF_XYUVC);
auto glvb = mb->getGLVB();
auto vb = mb->getVB().getBuffer();
auto ib = mb->getIB().getBuffer();

Next, look at ccVertexAttribPointer to determine the vertex data layout

a_position: 8 bytes (x, y), offset 0
a_uv0: 8 bytes (u, v), offset 8
a_color: 4 bytes (rgba), offset 16

So the data layout is:

a_position	a_uv0	a_color
8	8	4

Each vertex takes 20 bytes in total.
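The layout above can be checked at compile time. The following is a minimal sketch, not the engine's own type: VertexXYUVC and attributeByteOffset are hypothetical names introduced for illustration, and the sketch assumes a 4-byte float. The helper repeats the same arithmetic that commitVertexBuffer passes to ccVertexAttribPointer (el->offset + vboffset * el->stride).

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of the 20-byte vertex described above (not the engine's struct).
struct VertexXYUVC
{
    float x, y;               // a_position: 8 bytes, offset 0
    float u, v;               // a_uv0:      8 bytes, offset 8
    std::uint8_t r, g, b, a;  // a_color:    4 bytes, offset 16
};

// Compile-time checks of the layout, assuming sizeof(float) == 4.
static_assert(sizeof(VertexXYUVC) == 20, "one vertex should take 20 bytes");
static_assert(offsetof(VertexXYUVC, x) == 0, "a_position should start at offset 0");
static_assert(offsetof(VertexXYUVC, u) == 8, "a_uv0 should start at offset 8");
static_assert(offsetof(VertexXYUVC, r) == 16, "a_color should start at offset 16");

// Byte offset of one attribute of a given vertex inside the buffer:
// attribute offset within the vertex plus vertex index times stride.
constexpr std::size_t attributeByteOffset(std::size_t attrOffset,
                                          std::size_t vertexIndex,
                                          std::size_t stride = sizeof(VertexXYUVC))
{
    return attrOffset + vertexIndex * stride;
}
```

With this layout, the color of the third vertex (index 2) starts at byte 16 + 2 * 20 = 56, and 65535 such vertices occupy the 1310700 bytes mentioned in VertexBuffer::init.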

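Looking again at the number of submitted vertices revealed a detail in the in-memory vertex data

Normally the draw always submits 6, i.e. 2 triangles. After the skin swap it became 12.

The previous step gave us the memory address of the vertex data, so I inspected the data at that address

There were 4 extra vertices. I did not know whether this data had anything to do with the bug, but it caught my attention because my skin-swap vertex handling contains this logic: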

memcpy(worldVerts, triangles.verts, triangles.vertCount * sizeof(middleware::V2F_T2F_C4B));

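My suspicion was that worldVerts might be changing, and that I might need to locate the corresponding data in the big buffer and modify it directly.

Further investigation showed that this was unrelated. worldVerts is not the big render buffer, so this copy does not corrupt any other data. worldVerts is only an intermediate staging address whose contents are eventually merged into the MeshBuffer, so the problem is not here.

Some extra data may be copied here, but whether it is actually used is still controlled by the indices.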
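To illustrate why extra vertices in the buffer are harmless on their own, here is a small sketch that counts which vertices an indexed draw actually touches. countReferencedVertices is a hypothetical helper written for this post, not an engine function.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Returns how many distinct vertices are referenced by the first
// `indexCount` indices. Vertices that sit in the vertex buffer but are
// never referenced by an index are not drawn.
std::size_t countReferencedVertices(const std::vector<std::uint16_t>& indices,
                                    std::size_t indexCount)
{
    std::set<std::uint16_t> used;
    for (std::size_t i = 0; i < indexCount && i < indices.size(); ++i)
    {
        used.insert(indices[i]);
    }
    return used.size();
}
```

With the index list {0, 1, 2, 0, 2, 3, 4, 5, 6, 4, 6, 7}, a draw of 6 indices touches only 4 vertices (one quad), even if 8 vertices were copied into the buffer. Only when the index count grows to 12 does the second quad get drawn.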

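Why is an extra Sprite submitted

The problem still lies in the number of submitted vertices: why are there more of them?

The count comes from the following code, which belongs to InputAssembler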

uint32_t InputAssembler::getPrimitiveCount() const
{
    if (-1 != _count)
        return _count;
    
    if (_indexBuffer)
        return _indexBuffer->getCount();
    
    assert(_vertexBuffer);
    return _vertexBuffer->getCount();
}

It turned out that the length of the InputAssembler's indices had gone wrong

void CCArmatureDisplay::dbRender()
{
    _curISegLen = 0; // this is the value that changed

    traverseArmature(_armature);
    if (_preISegWritePos != -1)
    {
        // this updates the length of the InputAssembler's indices
        _assembler->updateIARange(_materialLen - 1, _preISegWritePos, _curISegLen);
    }
}

And traverseArmature does indeed update _curISegLen

void CCArmatureDisplay::traverseArmature(Armature* armature, float parentOpacity)
{
    for (std::size_t i = 0, len = slots.size(); i < len; i++)
    {
        texture = slot->getTexture(); // the problem was my hacked implementation of getTexture
        if (!texture) continue; // a slot without a texture is skipped, but my hack always returned a texture, so the indices kept accumulating
    
        // Record this turn index segmentation count,it will store in material buffer in the end.
        _curISegLen += triangles.indexCount;
    }
        
}

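With the cause known, the fix is simple: whether a slot has a texture should be decided by the DragonBones data, and the hacked texture should be returned only when a texture exists. After the fix, skin swapping works correctly.

Wrap-up

I now know the engine a little better.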
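The bug and the fix can be reproduced in a small standalone sketch. Everything here is a simplified stand-in and an assumption for illustration: Texture, Slot, getTextureBuggy, getTextureFixed and accumulateIndexCount are hypothetical and do not match the real DragonBones or engine types. Only the accumulation loop follows the shape of traverseArmature shown above.

```cpp
#include <cstddef>
#include <vector>

struct Texture { int id; };

// Hypothetical, simplified slot: `original` is what the DragonBones data
// says (nullptr when the slot currently has nothing to display).
struct Slot
{
    Texture* original;
    std::size_t indexCount;
};

// Buggy hack: always returns the replacement skin, even for empty slots.
Texture* getTextureBuggy(const Slot& /*slot*/, Texture* replacement)
{
    return replacement;
}

// Fixed hack: the DragonBones data decides whether a texture exists;
// the replacement is only substituted when it does.
Texture* getTextureFixed(const Slot& slot, Texture* replacement)
{
    if (!slot.original) return nullptr;
    return replacement ? replacement : slot.original;
}

using GetTexture = Texture* (*)(const Slot&, Texture*);

// Same shape as the loop in traverseArmature: slots without a texture are
// skipped, all others add their index count to the current segment length.
std::size_t accumulateIndexCount(const std::vector<Slot>& slots,
                                 Texture* replacement,
                                 GetTexture getTexture)
{
    std::size_t curISegLen = 0;
    for (const Slot& slot : slots)
    {
        if (!getTexture(slot, replacement)) continue;
        curISegLen += slot.indexCount;
    }
    return curISegLen;
}
```

For two slots of 6 indices each, where only the first has a texture in the DragonBones data, the buggy version accumulates 12 indices and the fixed version accumulates 6, which matches the 12 versus 6 observed while debugging.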