A Brief Look at Implementing an HTTP Proxy in Node.js

Preface

This post grew out of the API Proxy example in Volume 1 of 狼书 (the "Wolf Book"). The code is as follows:

const http = require('http')

const app = http.createServer((req, res) => {
    if ('/remote' === req.url) {
        // Target endpoint: answer directly
        res.writeHead(200, { 'Content-Type': 'text/plain' })
        return res.end('Hello remote page!\n')
    } else {
        // Every other URL is forwarded to /remote
        proxy(req, res)
    }
})


function proxy(req, res) {
    const options = {
        // Node's http.IncomingMessage has no `host` property, so req.host
        // would be undefined; the target host is written out explicitly
        hostname: '127.0.0.1',
        port: 3000,
        path: '/remote', // a request path must start with a slash
        method: req.method, // the original example hard-codes GET
        headers: req.headers,
        agent: false
    }

    const httpProxy = http.request(options, (response) => {
        response.pipe(res)
    })

    req.pipe(httpProxy)
}

app.listen(3000, function () {
    const PORT = app.address().port
    console.log(`Server running at http://127.0.0.1:${PORT}/`)
})

The example implements a simple HTTP proxy nicely. While reading the code, though, I found a few points worth digging into a little, and I record them below.

`ReadableStream` and `WritableStream`

Let's start from the code. The core of this proxy is the following function:

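function proxy(req, res) {
    const options = {
        // Node's http.IncomingMessage has no `host` property, so req.host
        // would be undefined; the target host is written out explicitly
        hostname: '127.0.0.1',
        port: 3000,
        path: '/remote', // a request path must start with a slash
        method: req.method, // the original example hard-codes GET
        headers: req.headers,
        agent: false
    }

    const httpProxy = http.request(options, (response) => {
        response.pipe(res)
    })

    req.pipe(httpProxy)
}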

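This function builds the options needed to forward the request: the target server, the request path, and the request headers.
The two parameters of proxy are req and res. req is a Readable stream and res is a Writable stream. Note that "readable" and "writable" are judged from the server's point of view: the server reads the request from req and writes the response into res.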

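The whole proxy is wired together with pipe, which connects a Readable stream to a Writable stream. Data always flows from the readable side to the writable side, never the other way around.
We have just worked out the read/write roles of req and res. Now look at httpProxy and the response passed to the callback, this time from the client's point of view. response carries what the remote service returns, so it is a Readable stream. httpProxy is the return value of http.request; its type is http.ClientRequest, which inherits from OutgoingMessage, so it is written to like a Writable stream.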

To summarize:

Object	Read/Write
`req`	Readable
`res`	Writable
`httpProxy`	Writable
`response`	Readable
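
The roles in this table can be checked against Node's own class hierarchy. The following is a minimal sketch that only inspects prototype chains, so no server is started and no request is sent; the `checks` object and its key names are illustrative and not part of the original example.

```javascript
const http = require('http')
const { Readable } = require('stream')

// Illustrative names: `checks` and its keys are not from the original article.
const checks = {
  // req and response are both http.IncomingMessage, which extends stream.Readable
  incomingIsReadable: http.IncomingMessage.prototype instanceof Readable,
  // res is an http.ServerResponse, which extends http.OutgoingMessage
  serverResponseIsOutgoing:
    http.ServerResponse.prototype instanceof http.OutgoingMessage,
  // httpProxy is an http.ClientRequest, which also extends http.OutgoingMessage
  clientRequestIsOutgoing:
    http.ClientRequest.prototype instanceof http.OutgoingMessage,
  // OutgoingMessage provides write() and end(), which is what pipe() calls
  outgoingHasWriteAndEnd:
    typeof http.OutgoingMessage.prototype.write === 'function' &&
    typeof http.OutgoingMessage.prototype.end === 'function'
}

console.log(checks)
```

The check deliberately tests for write() and end() rather than `instanceof stream.Writable`, since what matters to pipe is that the destination can be written to and ended.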

At this point the flow is clear:

req (readable) ⇒ httpProxy (writable)
response (readable) ⇒ res (writable)

Those two pipes are the entire proxy.

`req.pipe`

While reading the code I ran into another question: doesn't the line httpProxy = http.request already send the request? Why is req.pipe still needed at the end?

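This comes down to how http.request works. Calling it only creates the request, with its request line and headers prepared. The request is not complete until end() is called on it, and a POST request, for example, may still need to write body data into the stream first. req.pipe redirects the original request's stream into the proxy request, so every chunk of the body is written, and it calls end() on the proxy request once the original request has ended.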
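
The two things pipe does here, forwarding every chunk and then ending the destination, can be seen without any network at all. In this rough sketch, `source` stands in for the incoming req and `sink` stands in for the http.ClientRequest; both names, and the `forwarded` promise, are hypothetical stand-ins rather than anything from the original example.

```javascript
const { Readable, Writable } = require('stream')

// Stand-in for the incoming request body, arriving in several chunks.
const source = Readable.from(['key1=value1', '&', 'key2=value2'])

// Stand-in for the proxy request: it just records what is written to it.
const chunks = []
const sink = new Writable({
  write(chunk, encoding, callback) {
    chunks.push(chunk.toString())
    callback()
  }
})

// 'finish' fires only after end() has been called on the sink.
// Nothing in this file calls sink.end() directly; pipe() does it
// when the source has no more data.
const forwarded = new Promise((resolve, reject) => {
  sink.on('finish', () => resolve(chunks.join('')))
  sink.on('error', reject)
})

source.pipe(sink)

forwarded.then((body) => console.log(body))
```

If the pipe call is removed, nothing is ever written to `sink` and 'finish' never fires, which mirrors a proxy request that is created but never ended.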

A simple example makes this clear:

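const http = require('http');

// Options for the POST request
const options = {
  hostname: 'www.example.com',
  port: 80,
  path: '/submit-form',
  method: 'POST',
  headers: {
    'Content-Type': 'application/x-www-form-urlencoded'
  }
};

// Create the POST request; the callback runs once the response arrives
const req = http.request(options, (res) => {
  let data = '';

  res.on('data', (chunk) => {
    data += chunk;
  });

  res.on('end', () => {
    console.log(data);
  });
});

// Report connection failures instead of crashing on an unhandled 'error' event
req.on('error', (err) => {
  console.error(err.message);
});

// Write the request body, then call end() to finish the request
req.write('key1=value1&key2=value2');
req.end();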

In this example, the request body has to be written into req with write, and end() then finishes the request.

So req.pipe is required: it ensures that all data in the incoming request's stream is forwarded.
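
To see the body actually travel through both pipes, here is a small end-to-end sketch. It assumes a hypothetical echo server (`upstream`) as the target and uses the helper names `listen` and `post`, none of which appear in the original example; both servers bind to ephemeral ports on 127.0.0.1 so the script does not depend on port 3000 being free.

```javascript
const http = require('http')

// Hypothetical upstream: echoes the request body back, prefixed with the method.
const upstream = http.createServer((req, res) => {
  let body = ''
  req.on('data', (chunk) => { body += chunk })
  req.on('end', () => {
    res.writeHead(200, { 'Content-Type': 'text/plain' })
    res.end(`${req.method}:${body}`)
  })
})

// Proxy: the same two pipes as in the article, pointed at the upstream.
const proxyServer = http.createServer((req, res) => {
  const options = {
    hostname: '127.0.0.1',
    port: upstream.address().port,
    path: req.url,
    method: req.method,
    headers: req.headers,
    agent: false
  }
  const httpProxy = http.request(options, (response) => {
    response.pipe(res) // response (readable) => res (writable)
  })
  // Answer with 502 if the upstream cannot be reached.
  httpProxy.on('error', () => { res.statusCode = 502; res.end() })
  req.pipe(httpProxy) // req (readable) => httpProxy (writable)
})

// Start a server on an ephemeral port.
function listen(server) {
  return new Promise((resolve) => server.listen(0, '127.0.0.1', resolve))
}

// Send a POST with the given payload and resolve with the response text.
function post(port, payload) {
  return new Promise((resolve, reject) => {
    const clientReq = http.request(
      { hostname: '127.0.0.1', port, path: '/echo', method: 'POST', agent: false },
      (res) => {
        let data = ''
        res.on('data', (chunk) => { data += chunk })
        res.on('end', () => resolve(data))
      }
    )
    clientReq.on('error', reject)
    clientReq.end(payload)
  })
}

const result = listen(upstream)
  .then(() => listen(proxyServer))
  .then(() => post(proxyServer.address().port, 'key1=value1&key2=value2'))
  .finally(() => { upstream.close(); proxyServer.close() })

result.then((text) => console.log(text))
```

The upstream only answers after it has seen the end of the request body, so getting a response back at all shows that req.pipe both forwarded the data and ended the proxy request.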

Summary

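This post briefly covered a few easily overlooked points in implementing a proxy with Node.js: readable and writable streams, and how an HTTP request is actually sent. Comments and discussion are welcome.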
