FIN and opcode in WebSocket frames

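The first two bytes of every WebSocket frame are critical: they carry the basic information about the frame. RFC 6455 specifies the frame header format as follows: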


      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-------+-+-------------+-------------------------------+
     |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
     |I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
     |N|V|V|V|       |S|             |   (if payload len==126/127)   |
     | |1|2|3|       |K|             |                               |
     +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
     |     Extended payload length continued, if payload len == 127  |
     + - - - - - - - - - - - - - - - +-------------------------------+
     |                               |Masking-key, if MASK set to 1  |
     +-------------------------------+-------------------------------+
     | Masking-key (continued)       |          Payload Data         |
     +-------------------------------- - - - - - - - - - - - - - - - +
     :                     Payload Data continued ...                :
     + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
     |                     Payload Data continued ...                |
     +---------------------------------------------------------------+
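As a rough illustration of the layout above, the first two bytes can be decoded with plain shifts and masks, which does not depend on host byte order or on how a compiler lays out bit fields. This is only a sketch: `FrameHeaderPrefix`, `parse_prefix` and `make_first_byte` are hypothetical names invented for this example, not part of any library.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the fields carried by the first two header bytes.
struct FrameHeaderPrefix {
    bool fin;
    bool rsv1;
    bool rsv2;
    bool rsv3;
    std::uint8_t opcode;        // 4-bit opcode
    bool masked;                // MASK bit
    std::uint8_t payload_len7;  // 0-125 is the length itself; 126/127 announce an extended length
};

// Decode with masks. Bit 0 in the RFC diagram is the most significant bit of the byte.
inline FrameHeaderPrefix parse_prefix(std::uint8_t b0, std::uint8_t b1) {
    FrameHeaderPrefix h;
    h.fin          = (b0 & 0x80) != 0;
    h.rsv1         = (b0 & 0x40) != 0;
    h.rsv2         = (b0 & 0x20) != 0;
    h.rsv3         = (b0 & 0x10) != 0;
    h.opcode       = static_cast<std::uint8_t>(b0 & 0x0F);
    h.masked       = (b1 & 0x80) != 0;
    h.payload_len7 = static_cast<std::uint8_t>(b1 & 0x7F);
    return h;
}

// Build the first byte of a frame that uses no extensions (RSV1-3 left at 0).
inline std::uint8_t make_first_byte(bool fin, std::uint8_t opcode) {
    assert(opcode <= 0x0F && "opcode is a 4-bit field");
    return static_cast<std::uint8_t>((fin ? 0x80 : 0x00) | opcode);
}
```

For instance, the bytes `0x81 0x85` decode to FIN = 1, opcode = 1 (text), MASK = 1 and a payload length of 5.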

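The most important parts of the first byte are the FIN bit and the opcode. A C++ struct with bit fields, combined with a union, makes it convenient to read and write them. Here is an example: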


#ifndef TAGWEBSCOKETFRAMEOPCODE_H
#define TAGWEBSCOKETFRAMEOPCODE_H

// Generated by "UML to C++ (com.ibm.xtools.transform.uml2.cpp.CPPTransformation)".
// Common.h is presumably where U08 and the BYTE_ORDER macros come from.
#include "../../Common.h"

namespace boson
{

    // First byte of a WebSocket frame: FIN, RSV1-3 and the 4-bit opcode.
    // The byte can be accessed as a whole (data) or field by field (bits).
    union tagWebScoketFrameOpCode
    {
        public:

            struct tagWebScoketFrameOpCodeBits
            {
                public:

                    // Bit-field allocation order is compiler-specific, so the
                    // fields are declared in opposite order for each byte order.
                    #if BYTE_ORDER == LITTLE_ENDIAN

                    U08 code : 4;   // opcode (low four bits of the byte)
                    U08 rsv3 : 1;
                    U08 rsv2 : 1;
                    U08 rsv1 : 1;
                    U08 fin0 : 1;   // FIN (most significant bit of the byte)

                    #elif BYTE_ORDER == BIG_ENDIAN

                    U08 fin0 : 1;   // FIN (most significant bit of the byte)
                    U08 rsv1 : 1;
                    U08 rsv2 : 1;
                    U08 rsv3 : 1;
                    U08 code : 4;   // opcode (low four bits of the byte)

                    #endif

            };  //end struct tagWebScoketFrameOpCodeBits

        public:

            U08 data;                           // the raw byte
            tagWebScoketFrameOpCodeBits bits;   // the same byte, split into fields

            tagWebScoketFrameOpCode();
            tagWebScoketFrameOpCode(const tagWebScoketFrameOpCode & value);
            ~tagWebScoketFrameOpCode();
            tagWebScoketFrameOpCode & operator=(const tagWebScoketFrameOpCode & value);

            void Clear();

    };  //end union tagWebScoketFrameOpCode

} //end namespace boson

#endif

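When defining the struct, you need to account for the difference between big-endian and little-endian layouts. What this article is really about, though, is the meaning of the data in this byte.

The byte is made up of several parts: fin, rsv1, rsv2, rsv3 and opcode. FIN is defined as follows: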

FIN: 1 bit

Indicates that this is the final fragment in a message. The first fragment MAY also be the final fragment.

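The meaning of FIN is simple: it says whether this frame is the last piece of a message. WebSocket allows a long message to be split across several frames, so that one oversized frame does not block control frames. An implementation therefore has to consider splitting long data so that no single frame exceeds some length limit, for example 64K bytes.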
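The splitting described above might look like the following sketch. `OutFrame` and `fragment_message` are hypothetical names, and the chunk size is a caller-chosen parameter rather than a value taken from the RFC. The first frame carries the data opcode, every later frame is a continuation (opcode 0), and only the last frame has FIN set.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one outgoing frame: header fields plus a payload slice.
struct OutFrame {
    bool fin;
    std::uint8_t opcode;  // 0x1 or 0x2 on the first frame, 0x0 on the following ones
    std::vector<std::uint8_t> payload;
};

// Split one message into frames carrying at most max_chunk payload bytes each.
inline std::vector<OutFrame> fragment_message(const std::vector<std::uint8_t>& msg,
                                              std::uint8_t data_opcode,
                                              std::size_t max_chunk) {
    assert(max_chunk > 0);
    assert(data_opcode == 0x1 || data_opcode == 0x2);
    std::vector<OutFrame> frames;
    std::size_t offset = 0;
    do {
        const std::size_t n = std::min(max_chunk, msg.size() - offset);
        OutFrame f;
        f.opcode = (offset == 0) ? data_opcode : static_cast<std::uint8_t>(0x0);
        f.payload.assign(msg.data() + offset, msg.data() + offset + n);
        offset += n;
        f.fin = (offset == msg.size());  // only the last fragment sets FIN
        frames.push_back(f);
    } while (offset < msg.size());
    return frames;
}
```

An empty message still produces exactly one frame, with FIN set and an empty payload.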

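rsv1 - 3 are reserved bits and need no attention here. They must be 0 unless an extension that gives them a meaning has been negotiated.

The opcode is defined as follows: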

Opcode: 4 bits

Defines the interpretation of the "Payload data". If an unknown opcode is received, the receiving endpoint MUST Fail the WebSocket Connection. The following values are defined.

* %x0 denotes a continuation frame
* %x1 denotes a text frame
* %x2 denotes a binary frame
* %x3-7 are reserved for further non-control frames
* %x8 denotes a connection close
* %x9 denotes a ping
* %xA denotes a pong
* %xB-F are reserved for further control frames

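The opcode defines what the frame is used for. A brief explanation:

0 is a continuation frame, 1 means the payload is text, 2 means the payload is a binary stream,

3 - 7 are reserved,

8 is connection close, 9 is a heartbeat ping, A is a heartbeat pong,

B - F are reserved.

Of these, 0, 1 and 2 are the codes used for normal data transfer, while 8, 9 and A are the standard control frames.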
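The opcode values listed above can be captured in a small enum with a couple of helpers. The names below (`Opcode`, `is_control`, `is_known`, `to_opcode`) are hypothetical and chosen only for illustration. The sketch relies on the fact that control opcodes are exactly those whose most significant bit is set (0x8 - 0xF).

```cpp
#include <cassert>
#include <cstdint>

// The opcodes defined by RFC 6455; every other value is reserved.
enum class Opcode : std::uint8_t {
    Continuation = 0x0,
    Text         = 0x1,
    Binary       = 0x2,
    Close        = 0x8,
    Ping         = 0x9,
    Pong         = 0xA
};

// Control frames are the opcodes with the high bit of the 4-bit field set.
constexpr bool is_control(std::uint8_t opcode) {
    return (opcode & 0x08) != 0;
}

// True for the six defined opcodes, false for the reserved ranges 3-7 and B-F.
constexpr bool is_known(std::uint8_t opcode) {
    return opcode <= 0x2 || (opcode >= 0x8 && opcode <= 0xA);
}

// Convert a raw 4-bit value once it has been validated with is_known().
inline Opcode to_opcode(std::uint8_t raw) {
    assert(is_known(raw) && "reserved opcode: the connection should be failed instead");
    return static_cast<Opcode>(raw);
}
```

A receiver would typically call `is_known` first and fail the connection on a reserved value, as the RFC text above requires.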

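In WebSocket, control frames come with a restriction: they must not be fragmented (RFC 6455, 5.4. Fragmentation). In other words, a control frame can only be sent in one piece, as a single complete frame, and never in fragments the way data frames can. Since it has to fit in one frame, its payload is also capped at 125 bytes (section 5.5). This is why, for example, the WebSocket class implemented by Microsoft lets a close frame carry at most 125 bytes of data, and that includes the two-byte status code.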
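The two control-frame rules can be checked in a few lines. This is a minimal sketch with hypothetical helper names (`control_frame_ok`, `make_close_payload`); it builds only the payload of a close frame, not the frame header or the masking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Upper bound on the payload of any control frame.
constexpr std::size_t kMaxControlPayload = 125;

// A control frame is acceptable only if it is unfragmented (FIN = 1) and short enough.
inline bool control_frame_ok(bool fin, std::size_t payload_len) {
    return fin && payload_len <= kMaxControlPayload;
}

// Close payload: a 2-byte status code in network byte order, then an optional reason text.
inline std::vector<std::uint8_t> make_close_payload(std::uint16_t status,
                                                    const std::string& reason) {
    // The status code uses 2 of the 125 bytes, leaving at most 123 for the reason.
    assert(reason.size() <= kMaxControlPayload - 2);
    std::vector<std::uint8_t> out;
    out.push_back(static_cast<std::uint8_t>(status >> 8));
    out.push_back(static_cast<std::uint8_t>(status & 0xFF));
    out.insert(out.end(), reason.begin(), reason.end());
    return out;
}
```

With status 1000 (normal closure) the first two payload bytes are `0x03 0xE8`.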

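The purpose of this restriction is to let control frames be sent in between the fragments of a message. That is also the reason why 0, 1, 2 and 8, 9, A are kept in separate ranges.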

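When data is delivered in fragments, the receiver first gets a frame with a Text / Binary opcode whose FIN bit is 0 (More Fragment). The rest of the data follows as Continuation Frames, ending with a final Continuation Frame whose FIN bit is 1 (Last Fragment). No other data frames are interleaved in between; only control frames such as Close, Ping and Pong may appear.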

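So when handling WebSocket frames: if the FIN bit is 0, do not process the frame yet and put it in a buffer. Once a frame with FIN set to 1 arrives, it can be processed according to its opcode. A control frame is handled directly. A continuation frame means the buffered frames are joined with it into one complete payload. A Text / Binary frame with FIN set to 1 is already a complete message on its own.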
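The receive-side handling described above could be sketched as a small state machine. `MessageAssembler` and its `Result` values are hypothetical names for this example. It assumes the frame has already been parsed and unmasked, and it leaves the actual ping/pong/close handling to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-connection reassembler for fragmented messages.
class MessageAssembler {
public:
    enum class Result { NeedMore, MessageReady, ControlFrame, ProtocolError };

    Result on_frame(bool fin, std::uint8_t opcode,
                    const std::vector<std::uint8_t>& payload) {
        if (opcode & 0x08) {
            // Control frames may arrive between fragments; they never touch the buffer.
            if (opcode > 0xA || !fin || payload.size() > 125) return Result::ProtocolError;
            return Result::ControlFrame;
        }
        if (opcode == 0x0) {
            // A continuation is only valid while a message is in progress.
            if (!in_progress_) return Result::ProtocolError;
        } else if (opcode == 0x1 || opcode == 0x2) {
            // A new data frame must not start before the previous message ended.
            if (in_progress_) return Result::ProtocolError;
            buffer_.clear();
            message_opcode_ = opcode;
            in_progress_ = true;
        } else {
            return Result::ProtocolError;  // reserved opcodes 3-7
        }
        buffer_.insert(buffer_.end(), payload.begin(), payload.end());
        if (!fin) return Result::NeedMore;
        in_progress_ = false;
        return Result::MessageReady;
    }

    // Complete payload; meaningful only right after MessageReady.
    const std::vector<std::uint8_t>& message() const {
        assert(!in_progress_);
        return buffer_;
    }

    // Opcode of the first fragment (0x1 text or 0x2 binary).
    std::uint8_t message_opcode() const { return message_opcode_; }

private:
    std::vector<std::uint8_t> buffer_;
    std::uint8_t message_opcode_ = 0;
    bool in_progress_ = false;
};
```

Instead of storing whole frames, this version appends each payload to one buffer as it arrives, which yields the same concatenated result with less bookkeeping.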