Implementing a Multithreaded PromisePool in Node.js with the Async Library

What's Async

Async is a utility module that provides straightforward, powerful functions for working with asynchronous JavaScript. Although originally designed for use with Node.js and installable via npm i async, it can also be used directly in the browser. An ESM/MJS version is included in the main async package and should be picked up automatically by compatible bundlers such as Webpack and Rollup.

Async - NpmJS

async - npm (npmjs.com): https://www.npmjs.com/package/async

Async - Documentation

https://caolan.github.io/async/

※ Async Multithreading in Practice ※

Create a file named threadPool.js and add the following code:

const async = require('async');

// Create a "thread pool": a queue whose worker runs at most 5 tasks at once
const threadPool = async.queue((task, callback) => {
  // Simulate a time-consuming operation
  setTimeout(() => {
    console.log('Task completed:', task);
    callback();
  }, 1000);
}, 5);

// Add 10 tasks to the pool; each push callback fires when that task finishes
for (let i = 0; i < 10; i++) {
  threadPool.push(i, (err) => {
    if (err) {
      console.error('Error:', err);
    } else {
      console.log('Task finished:', i);
    }
  });
}

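In this example we create a queue with five workers, which plays the role of a thread pool, and then push ten tasks onto it. The queue runs the tasks concurrently, but never more than five at a time; whenever a task finishes, the next waiting task is handed to the free worker. Note that async.queue limits concurrency on Node.js's single-threaded event loop; it does not create operating-system threads.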
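
To make the "at most five at a time" behaviour concrete, here is a minimal Promise-based pool written in plain JavaScript. It is only a sketch of the idea and not how the Async library is implemented; `runWithLimit`, `demoTasks`, and `poolDemo` are hypothetical names chosen for this illustration.

```javascript
// Hypothetical helper: run promise-returning task functions with at most
// `limit` of them in flight. Illustration only, not Async's implementation.
function runWithLimit(tasks, limit) {
  return new Promise((resolve, reject) => {
    const results = new Array(tasks.length);
    let next = 0;     // index of the next task to start
    let active = 0;   // tasks currently running
    let finished = 0; // tasks that have completed
    let failed = false;

    if (tasks.length === 0) return resolve(results);

    function launch() {
      // Start tasks until the limit is reached or none are left.
      while (!failed && active < limit && next < tasks.length) {
        const index = next++;
        active++;
        Promise.resolve()
          .then(() => tasks[index]())
          .then((value) => {
            results[index] = value;
            active--;
            finished++;
            if (finished === tasks.length) resolve(results);
            else launch(); // a slot is free, start the next task
          })
          .catch((err) => {
            failed = true; // stop starting new tasks after the first failure
            reject(err);
          });
      }
    }
    launch();
  });
}

// Demo: 10 tasks of 50 ms each, at most 5 at once.
let running = 0;
let peak = 0;
const demoTasks = Array.from({ length: 10 }, (_, i) => () =>
  new Promise((done) => {
    running++;
    peak = Math.max(peak, running);
    setTimeout(() => {
      running--;
      done(i * 2);
    }, 50);
  })
);

const poolDemo = runWithLimit(demoTasks, 5).then((results) => {
  console.log('results:', results);
  console.log('peak concurrency:', peak);
  return { results, peak };
});
```

Results are stored by task index, so the output order matches the input order even though tasks may finish in a different order.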

Some useful examples from the official documentation

Quick Examples

const fs = require('fs');
const async = require('async');

async.map(['file1','file2','file3'], fs.stat, function(err, results) {
    // results is now an array of stats for each file
});

async.filter(['file1','file2','file3'], function(filePath, callback) {
  fs.access(filePath, function(err) {
    callback(null, !err)
  });
}, function(err, results) {
    // results now equals an array of the existing files
});

async.parallel([
    function(callback) { ... },
    function(callback) { ... }
], function(err, results) {
    // optional callback
});

async.series([
    function(callback) { ... },
    function(callback) { ... }
]);

There are many more functions available, so take a look at the documentation linked above for a full list. This module aims to be comprehensive, so if you feel anything is missing please create a GitHub issue for it.

Common Pitfalls (StackOverflow)

Synchronous iteration functions

If you get an error like RangeError: Maximum call stack size exceeded, or run into other stack overflow issues when using async, you are likely using a synchronous iteratee. By synchronous we mean a function that calls its callback on the same tick of the JavaScript event loop, without doing any I/O or using any timers. Calling many callbacks iteratively will quickly overflow the stack. If you run into this issue, defer your callback with async.setImmediate to start a new call stack on the next tick of the event loop.

This can also arise by accident if you call the callback early in certain cases:

async.eachSeries(hugeArray, function iteratee(item, callback) {
    if (inCache(item)) {
        callback(null, cache[item]); // if many items are cached, you'll overflow
    } else {
        doSomeIO(item, callback);
    }
}, function done() {
    //...
});

Just change it to:

async.eachSeries(hugeArray, function iteratee(item, callback) {
    if (inCache(item)) {
        async.setImmediate(function() {
            callback(null, cache[item]);
        });
    } else {
        doSomeIO(item, callback);
        //...
    }
});

Async does not guard against synchronous iteratees for performance reasons. If you are still running into stack overflows, you can defer as suggested above, or wrap functions with async.ensureAsync. Functions that are asynchronous by their nature do not have this problem and don't need the extra callback deferral.
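
The mechanism can be reproduced without the library. The sketch below uses a deliberately simplified series iterator (`naiveEachSeries`, a hypothetical helper, not Async's real code) and Node's global `setImmediate` in place of `async.setImmediate`. With a synchronous iteratee every item adds frames to the same call stack; deferring the callback lets the stack unwind between items.

```javascript
// Hypothetical, simplified series iterator (not Async's real implementation):
// it starts the next item from inside the previous item's callback.
function naiveEachSeries(items, iteratee, done) {
  let index = 0;
  function step(err) {
    if (err || index >= items.length) return done(err || null);
    iteratee(items[index++], step);
  }
  step(null);
}

const manyItems = new Array(100000).fill(0);

// 1) Synchronous iteratee: every callback nests deeper on the same stack.
let syncOutcome = 'completed';
try {
  naiveEachSeries(manyItems, (item, callback) => callback(null), () => {});
} catch (e) {
  syncOutcome = e instanceof RangeError ? 'stack overflow' : 'other error';
}
console.log('synchronous iteratee:', syncOutcome);

// 2) Deferred iteratee: setImmediate starts each callback on a fresh stack.
const deferredRun = new Promise((resolve, reject) => {
  naiveEachSeries(
    manyItems,
    (item, callback) => setImmediate(callback, null),
    (err) => (err ? reject(err) : resolve('completed'))
  );
});
deferredRun.then((outcome) => console.log('deferred iteratee:', outcome));
```

With Node's default stack size the first run should end in a RangeError, while the second completes because no call ever nests inside the previous one.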

If JavaScript's event loop is still a bit nebulous, check out this article or this talk for more detailed information about how it works.

Multiple callbacks

Make sure to always return when calling a callback early, otherwise you will cause multiple callbacks and unpredictable behavior in many cases.

async.waterfall([
    function(callback) {
        getSomething(options, function (err, result) {
            if (err) {
                callback(new Error("failed getting something:" + err.message));
                // we should return here
            }
            // since we did not return, this callback still will be called and
            // `processData` will be called twice
            callback(null, result);
        });
    },
    processData
], done)

It is always good practice to return callback(err, result) whenever a callback call is not the last statement of a function.
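
The effect of the missing return can be observed by counting callback invocations. The following is a small stand-alone sketch; `getSomething` here is a hypothetical stand-in that always fails, and `withoutReturn`, `withReturn`, and `countCalls` are names invented for the illustration.

```javascript
// Hypothetical stand-in for an operation that always fails.
function getSomething(callback) {
  callback(new Error('boom'));
}

// Missing `return`: execution falls through to the second callback call.
function withoutReturn(callback) {
  getSomething((err, result) => {
    if (err) {
      callback(new Error('failed getting something: ' + err.message));
    }
    callback(null, result);
  });
}

// With `return`: the callback is invoked exactly once.
function withReturn(callback) {
  getSomething((err, result) => {
    if (err) {
      return callback(new Error('failed getting something: ' + err.message));
    }
    callback(null, result);
  });
}

// Count how many times each variant invokes its callback.
function countCalls(fn) {
  let calls = 0;
  fn(() => { calls++; });
  return calls;
}

const callsWithoutReturn = countCalls(withoutReturn);
const callsWithReturn = countCalls(withReturn);
console.log(callsWithoutReturn, callsWithReturn); // prints: 2 1
```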

Using ES2017 async functions

Async accepts async functions wherever we accept a Node-style callback function. However, we do not pass them a callback, and instead use the return value and handle any promise rejections or errors thrown.

async.mapLimit(files, 10, async file => { // <- no callback!
    const text = await util.promisify(fs.readFile)(dir + file, 'utf8')
    const body = JSON.parse(text) // <- a parse error here will be caught automatically
    if (!(await checkValidity(body))) {
        throw new Error(`${file} has invalid contents`) // <- this error will also be caught
    }
    return body // <- return a value!
}, (err, contents) => {
    if (err) throw err
    console.log(contents)
})

We can only detect native async functions, not transpiled versions (e.g. with Babel). For transpiled code, wrap the async function in async.asyncify() instead.
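
Why transpiled functions go undetected can be seen in plain JavaScript. The sketch below shows one way to recognize a native async function (checking `Symbol.toStringTag`; Async's internal check is assumed to be similar but may differ in detail) and a hand-written wrapper in the spirit of asyncify. `looksNativeAsync`, `transpiledStyle`, and `toCallbackStyle` are hypothetical names, and the wrapper is not the library's implementation.

```javascript
// One way to recognize a native async function: its Symbol.toStringTag is
// 'AsyncFunction'. (Assumption: shown for illustration only.)
function looksNativeAsync(fn) {
  return typeof fn === 'function' && fn[Symbol.toStringTag] === 'AsyncFunction';
}

const nativeAsync = async (x) => x * 2;

// Simplified picture of transpiler output: an ordinary function that happens
// to return a promise. Nothing marks it as "async" from the outside.
function transpiledStyle(x) {
  return Promise.resolve(x * 2);
}

// Hypothetical wrapper in the spirit of asyncify: adapts a promise-returning
// function to the Node-style (err, result) callback convention.
function toCallbackStyle(fn) {
  return function (...args) {
    const callback = args.pop(); // last argument is the callback
    Promise.resolve()
      .then(() => fn(...args))
      .then(
        (value) => callback(null, value),
        (err) => callback(err)
      );
  };
}

const wrapped = toCallbackStyle(transpiledStyle);
const wrappedResult = new Promise((resolve, reject) => {
  wrapped(21, (err, value) => (err ? reject(err) : resolve(value)));
});

console.log(looksNativeAsync(nativeAsync));     // true
console.log(looksNativeAsync(transpiledStyle)); // false
wrappedResult.then((value) => console.log('wrapped result:', value)); // 42
```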

Binding a context to an iteratee

This section is really about bind, not about Async. If you are wondering how to make Async execute your iteratees in a given context, or are confused as to why a method of another library isn't working as an iteratee, study this example:

// Here is a simple object with an (unnecessarily roundabout) squaring method
var AsyncSquaringLibrary = {
    squareExponent: 2,
    square: function(number, callback){
        var result = Math.pow(number, this.squareExponent);
        setTimeout(function(){
            callback(null, result);
        }, 200);
    }
};

async.map([1, 2, 3], AsyncSquaringLibrary.square, function(err, result) {
    // result is [NaN, NaN, NaN]
    // This fails because the `this.squareExponent` expression in the square
    // function is not evaluated in the context of AsyncSquaringLibrary, and is
    // therefore undefined.
});

async.map([1, 2, 3], AsyncSquaringLibrary.square.bind(AsyncSquaringLibrary), function(err, result) {
    // result is [1, 4, 9]
    // With the help of bind we can attach a context to the iteratee before
    // passing it to Async. Now the square function will be executed in its
    // 'home' AsyncSquaringLibrary context and the value of `this.squareExponent`
    // will be as expected.
});
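
An arrow-function wrapper is a common alternative to bind: because it calls square as a method, `this` stays the library object. The sketch below uses a hypothetical copy of the object above (`squaringLibrary`, with a shorter delay) and calls the iteratees directly rather than through Async.

```javascript
// Hypothetical copy of the squaring object from the example above.
const squaringLibrary = {
  squareExponent: 2,
  square(number, callback) {
    const result = Math.pow(number, this.squareExponent);
    setTimeout(() => callback(null, result), 10);
  },
};

// The arrow function calls `square` as a method, so `this` is the object.
const viaArrow = (number, callback) => squaringLibrary.square(number, callback);

// Equivalent approach using bind, as in the example above.
const viaBind = squaringLibrary.square.bind(squaringLibrary);

// Small helper to observe a Node-style callback result as a promise.
const callWith = (fn, n) =>
  new Promise((resolve, reject) =>
    fn(n, (err, value) => (err ? reject(err) : resolve(value)))
  );

const bindDemo = Promise.all([callWith(viaArrow, 3), callWith(viaBind, 3)]);
bindDemo.then(([a, b]) => console.log(a, b)); // prints: 9 9
```

Either function can be passed wherever an iteratee is expected; which one to use is mostly a matter of taste.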

Subtle Memory Leaks

There are cases where you might want to exit early from an async flow, when calling an Async method inside another async function:

function myFunction (args, outerCallback) {
    async.waterfall([
        //...
        function (arg, next) {
            if (someImportantCondition()) {
                return outerCallback(null)
            }
        },
        function (arg, next) {/*...*/}
    ], function done (err) {
        //...
    })
}

Something happened in a waterfall where you want to skip the rest of the execution, so you call an outer callback. However, Async will still wait for that inner next callback to be called, leaving some closure scope allocated.

As of version 3.0, you can call any Async callback with false as the error argument, and the rest of the execution of the Async method will be stopped or ignored.

        function (arg, next) {
            if (someImportantCondition()) {
                outerCallback(null)
                return next(false) // ← signal that you called an outer callback
            }
        },

Mutating collections while processing them

If you pass an array to a collection method (such as each, mapLimit, or filterSeries), and then attempt to push, pop, or splice additional items on to the array, this could lead to unexpected or undefined behavior. Async will iterate until the original length of the array is met, and the indexes of items pop()ed or splice()d could already have been processed. Therefore, it is not recommended to modify the array after Async has begun iterating over it. If you do need to push, pop, or splice, use a queue instead.
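
The difference between the two approaches can be illustrated in plain JavaScript. This is only a synchronous sketch of the idea, not Async's code: `processFixedLength` and `processAsQueue` are hypothetical helpers. The first captures the length once, so items added later are never visited; the second re-checks for pending work on every step, which is the property a queue gives you.

```javascript
// Illustration only: a loop that captures the length up front never sees
// items pushed while it is running.
function processFixedLength(items, visit) {
  const length = items.length; // captured once
  for (let i = 0; i < length; i++) visit(items[i], items);
}

const seenFixed = [];
processFixedLength([1, 2, 3], (item, items) => {
  seenFixed.push(item);
  if (item === 1) items.push(99); // added while iterating, never visited
});

// A queue-style loop re-checks for remaining work on every step, so items
// added during processing are still handled.
function processAsQueue(initial, visit) {
  const pending = initial.slice(); // work on a copy of the input
  while (pending.length > 0) {
    const item = pending.shift();
    visit(item, (extra) => pending.push(extra));
  }
}

const seenQueue = [];
processAsQueue([1, 2, 3], (item, enqueue) => {
  seenQueue.push(item);
  if (item === 1) enqueue(99);
});

console.log(seenFixed); // [ 1, 2, 3 ]
console.log(seenQueue); // [ 1, 2, 3, 99 ]
```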
