你真的了解 WebSocket 吗?

WebSocket协议是基于TCP的一种新的协议。WebSocket最初在HTML5规范中被引用为TCP连接,作为基于TCP的套接字API的占位符。它实现了浏览器与服务器全双工(full-duplex)通信。其本质是保持TCP连接,在浏览器和服务端通过Socket进行通信。

本文将使用Python编写Socket服务端,一步一步分析请求过程!!!

1. 启动服务端

启动Socket服务器后,等待用户【连接】,然后进行收发数据。

2. 客户端连接

当客户端向服务端发送连接请求时,不仅连接还会发送【握手】信息,并等待服务端响应,至此连接才创建成功!

3. 建立连接【握手】

请求和响应的【握手】信息需要遵循规则:

  • 从请求【握手】信息中提取 Sec-WebSocket-Key
  • 利用magic_string 和 Sec-WebSocket-Key 进行hmac1加密,再进行base64加密
  • 将加密结果响应给客户端

注:magic string为:258EAFA5-E914-47DA-95CA-C5AB0DC85B11

请求【握手】信息为:

提取Sec-WebSocket-Key值并加密:

4.客户端和服务端收发数据

客户端和服务端传输数据时,需要对数据进行【封包】和【解包】。客户端的JavaScript类库已经封装【封包】和【解包】过程,但Socket服务端需要手动实现。

第一步:获取客户端发送的数据【解包】

解包详细过程:

The MASK bit simply tells whether the message is encoded. Messages from the client must be masked, so your server should expect this to be 1. (In fact, section 5.1 of the spec says that your server must disconnect from a client if that client sends an unmasked message.) When sending a frame back to the client, do not mask it and do not set the mask bit. We’ll explain masking later. Note: You have to mask messages even when using a secure socket.RSV1-3 can be ignored, they are for extensions.

The opcode field defines how to interpret the payload data: 0x0 for continuation, 0x1 for text (which is always encoded in UTF-8), 0x2 for binary, and other so-called “control codes” that will be discussed later. In this version of WebSockets, 0x3 to 0x7 and 0xB to 0xF have no meaning.

The FIN bit tells whether this is the last message in a series. If it’s 0, then the server will keep listening for more parts of the message; otherwise, the server should consider the message delivered. More on this later.

Decoding Payload Length

To read the payload data, you must know when to stop reading. That’s why the payload length is important to know. Unfortunately, this is somewhat complicated. To read it, follow these steps:

  1. Read bits 9-15 (inclusive) and interpret that as an unsigned integer. If it’s 125 or less, then that’s the length; you’re done. If it’s 126, go to step 2. If it’s 127, go to step 3.
  2. Read the next 16 bits and interpret those as an unsigned integer. You’re done.
  3. Read the next 64 bits and interpret those as an unsigned integer (The most significant bit MUST be 0). You’re done.

Reading and Unmasking the Data

If the MASK bit was set (and it should be, for client-to-server messages), read the next 4 octets (32 bits); this is the masking key. Once the payload length and masking key is decoded, you can go ahead and read that number of bytes from the socket. Let’s call the data ENCODED, and the key MASK. To get DECODED, loop through the octets (bytes a.k.a. characters for text data) of ENCODED and XOR the octet with the (i modulo 4)th octet of MASK. In pseudo-code (that happens to be valid JavaScript):

 

var DECODED = “”;
for (var i = 0; i < ENCODED.length; i++) {
DECODED[i] = ENCODED[i] ^ MASK[i % 4];
}

 

Now you can figure out what DECODED means depending on your application.

第二步:向客户端发送数据【封包】

5. 基于Python实现简单示例

a. 基于Python socket实现的WebSocket服务端:

Tornado是一个支持WebSocket的优秀框架,其内部原理正如1~5步骤描述,当然Tornado内部封装功能更加完整。 以下是基于Tornado实现的聊天室示例:

示例源码下载

参考文献:https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_servers

1 2 收藏 评论

可能感兴趣的话题



直接登录
跳到底部
返回顶部