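Loader execution order and an introduction to loader pitch

Besides its main function (normal execution), a loader can also export a pitch function.

Loaders are always called from right to left. Before the loaders actually run (from right to left), webpack first calls the pitch method of each loader, if it has one, from left to right.

If a loader's pitch phase returns a value, webpack skips that loader's own normal execution as well as the pitch and normal execution phases of the remaining loaders (the ones to its right), and turns back to the normal execution of the loaders to its left.

The overall flow is: pitch -> the file is picked up as a dependency -> normal execution

In the example below there are three loaders: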

```js
module.exports = {
  //...
  module: {
    rules: [
      {
        //...
        use: ['a-loader', 'b-loader', 'c-loader'],
      },
    ],
  },
};
```

These steps will occur:

```
|- a-loader `pitch`   - pitch starts
  |- b-loader `pitch`
    |- c-loader `pitch`   - pitch ends
      |- requested module is picked up as a dependency
    |- c-loader normal execution
  |- b-loader normal execution
|- a-loader normal execution
```

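If b-loader returns a value in its pitch phase, c-loader is skipped entirely, and so is b-loader's own normal execution.

```javascript
// b-loader.js
// b-loader's normal execution: the loader function itself is the export
module.exports = function (content) {
  // ...
};

// b-loader's pitch: attached as a property of the exported function
module.exports.pitch = function (remainingRequest, precedingRequest, data) {
  if (someCondition()) {
    return (
      'module.exports = require(' +
      JSON.stringify('-!' + remainingRequest) +
      ');'
    );
  }
};
```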

The actual execution is then shortened to:

```
|- a-loader `pitch`
  |- b-loader `pitch` returns a module
|- a-loader normal execution
```
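
Both traces can be reproduced with a toy model of the loader runner. This is only a sketch for intuition, not webpack's actual runner: `runLoaders`, `makeLoader` and the trace strings are made-up names, and the real runner is asynchronous and also passes `remainingRequest`, `precedingRequest` and `data` to each pitch.

```javascript
// Toy model: pitch runs left to right, normal execution runs right to left.
function runLoaders(loaders, source) {
  const trace = [];
  let index = 0;
  let content;

  // Pitch phase. Stop as soon as one pitch returns a value.
  for (; index < loaders.length; index++) {
    const { name, pitch } = loaders[index];
    trace.push(`${name} pitch`);
    const result = pitch();
    if (result !== undefined) {
      content = result;
      break;
    }
  }

  // No pitch returned anything: the file itself is picked up.
  if (content === undefined) {
    trace.push('read source');
    content = source;
  }

  // Normal execution, starting with the loader to the left of where pitch stopped.
  for (let i = index - 1; i >= 0; i--) {
    trace.push(`${loaders[i].name} normal`);
    content = loaders[i].normal(content);
  }
  return { content, trace };
}

// Hypothetical loader factory: normal execution just wraps the content.
const makeLoader = (name, pitch = () => undefined) => ({
  name,
  pitch,
  normal: (content) => `${name}(${content})`,
});

const full = runLoaders(
  [makeLoader('a'), makeLoader('b'), makeLoader('c')],
  'src'
);
const shortened = runLoaders(
  [makeLoader('a'), makeLoader('b', () => 'from-b-pitch'), makeLoader('c')],
  'src'
);

console.log(full.trace.join(' -> '));
console.log(shortened.trace.join(' -> '), '=>', shortened.content);
```

In the second run, c-loader never appears in the trace and a-loader receives the value returned by b-loader's pitch.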

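Why have a `pitch` phase

1. To control or modify how a resource flows through the chain and how it is processed.
In the pitch phase the file has not been processed yet, so a loader can step in before the resource enters the loader chain and control or change where it goes and how it is handled.

For example, if youdao-loader is written for xxx.youdao files, its pitch phase can check whether the file name ends with .youdao and skip the file if it does not.
2. To share information.

The data object passed to the pitch method is also exposed as this.data during normal execution, so it can be used to capture and share information from earlier in the cycle.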

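```js
// Example of the pitch function's parameters
function pitch(remainingRequest, precedingRequest, data) {
    // ...
    data.name = 'mgh'
    // ...
}
// normal execution phase
function loader(content) {
    console.log(this.data.name) // prints mgh
    return content
}
module.exports = loader
module.exports.pitch = pitch
```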

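How cache-loader works

cache-loader caches the result of processing a file. On the next build, if the file has not changed, the cached content is used directly.

Note that cache-loader does not read the source file itself with fs.read: webpack obtains the content and passes it to the loader as an argument.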

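```js
function loader(content) {
    console.log('file content as a Buffer', content)
}
module.exports = loader
// raw = true makes webpack pass content as a Buffer instead of a string
module.exports.raw = true
```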

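The steps are as follows:

pitch phase
1. Take the remainingRequest argument and generate a unique hash from it. The hash is combined with cache-loader's cache directory to form cacheKey, the path of the cache file. Both remainingRequest and cacheKey are then attached to data so that the normal execution phase can read them.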
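```js
const os = require('os');
const path = require('path');
const crypto = require('crypto');
const findCacheDir = require('find-cache-dir');
// cache-loader's own package.json (path relative to its source file)
const pkg = require('../package.json');
const env = process.env.NODE_ENV || 'development';
const cacheIdentifier = `cache-loader:${pkg.version} ${env}`
// cacheDirectory is usually node_modules/.cache/cache-loader
const cacheDirectory = findCacheDir({
  name: 'cache-loader'
}) || os.tmpdir()
// hash helper: md5 hex digest of a string
const digest = (str) => crypto.createHash('md5').update(str).digest('hex')
function pitch(remainingRequest, prevRequest, dataInput){
    const hash = digest(`${cacheIdentifier}\n${remainingRequest}`)
    // pass data on to normal execution
    dataInput.remainingRequest = remainingRequest
    dataInput.cacheKey = path.join(cacheDirectory, `${hash}.json`)
}
```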
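As you can see, cacheKey does not depend on the file's content. It depends on the file's path (the request), the cache-loader version, and the environment (development / production).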
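
A small sketch makes these inputs concrete. `cacheKeyFor`, the version string, the directory and the request paths are placeholder values chosen for illustration, and the md5 digest is an assumption of this sketch. The file's content is never passed in, so it cannot influence the key.

```javascript
const crypto = require('crypto');
const path = require('path');

// Hash helper; the algorithm (md5) is an assumption of this sketch.
function digest(str) {
  return crypto.createHash('md5').update(str).digest('hex');
}

// Build the cache file path from the request plus version and environment.
function cacheKeyFor(remainingRequest, { version, env, cacheDirectory }) {
  const cacheIdentifier = `cache-loader:${version} ${env}`;
  const hash = digest(`${cacheIdentifier}\n${remainingRequest}`);
  return path.join(cacheDirectory, `${hash}.json`);
}

// Placeholder settings.
const base = {
  version: '4.1.0',
  env: 'development',
  cacheDirectory: 'node_modules/.cache/cache-loader',
};
const request = '/project/src/index.js';

const keyA = cacheKeyFor(request, base);
const keyB = cacheKeyFor(request, base); // same request, same key
const keyProd = cacheKeyFor(request, { ...base, env: 'production' });
const keyOther = cacheKeyFor('/project/src/other.js', base);

console.log(keyA === keyB);     // true
console.log(keyA === keyProd);  // false
console.log(keyA === keyOther); // false
```

Since the key ignores content, detecting a changed file is left to the mtime comparison performed after the cache file has been read.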
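2. Read the cache file at cacheKey. If the file does not exist or cannot be read, the pitch phase ends. Otherwise, its content is used to decide whether the cache is still valid.
The cache file has the following structure:

    // Example cache file content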
    {
      "remainingRequest": "/Users/maiguoheng/Desktop/code/dict-course-class/node_modules/babel-polyfill/node_modules/core-js/modules/es7.weak-map.from.js",
      "dependencies": [
         {
          "path": "/Users/maiguoheng/Desktop/code/dict-course-class/node_modules/babel-polyfill/node_modules/core-js/modules/es7.weak-map.from.js",
          "mtime": 499162500000
        },
        {
          "path": "/Users/maiguoheng/Desktop/code/dict-course-class/tiny-cache-loader.js",
          "mtime": 1698907103551
        }
      ],
      "contextDependencies": [],
      "result": [
        {
          "type": "Buffer",
          "data": "base64:Ly8gaHR0cHM6Ly90YzM5LmdpdGh1Yi5pby9wcm9wb3NhbC1zZXRtYXAtb2Zmcm9tLyNzZWMtd2Vha21hcC5mcm9tCnJlcXVpcmUoJy4vX3NldC1jb2xsZWN0aW9uLWZyb20nKSgnV2Vha01hcCcpOwo="
        }
      ]
    }

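As you can see, the cache file stores the following:

The path of the file being processed, remainingRequest.
Its dependencies, dependencies and contextDependencies, each with a path and an mtime. The stored mtime is the dependency's modification time at the moment the cache file was generated.
The file's content, result, as base64 data produced by buffer-json.
3. Check whether the cached result is stale. Each entry in dependencies and contextDependencies is looked up by its path, and its actual mtime is compared with the mtime stored in the cache file. If every mtime matches, the pitch phase returns the cached result directly and registers the recorded dependencies again through this.addDependency and this.addContextDependency (this is needed because normal execution does not run once pitch returns a value), so that the file is rebuilt when a dependency changes. If any mtime differs, the current time is recorded for later use: data.startTime = Date.now();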
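
The mtime comparison described above can be sketched with synchronous fs calls. This is a simplified illustration, not cache-loader's code (which works asynchronously): `isCacheFresh` and the temporary file are made up for the demo, and only the shape of the cache data follows the example cache file.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// True only if every recorded dependency still has the recorded mtime.
function isCacheFresh(cacheData) {
  const deps = [...cacheData.dependencies, ...cacheData.contextDependencies];
  return deps.every((dep) => {
    try {
      return fs.statSync(dep.path).mtime.getTime() === dep.mtime;
    } catch (err) {
      return false; // a missing or unreadable dependency invalidates the cache
    }
  });
}

// Demo: one temporary file plays the role of a dependency.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'dep.js');
fs.writeFileSync(file, 'module.exports = 1;');

const cacheData = {
  dependencies: [{ path: file, mtime: fs.statSync(file).mtime.getTime() }],
  contextDependencies: [],
};

const freshBefore = isCacheFresh(cacheData);

// Imitate a later edit by moving the file's mtime 10 seconds forward.
const later = new Date(cacheData.dependencies[0].mtime + 10000);
fs.utimesSync(file, later, later);
const freshAfter = isCacheFresh(cacheData);

console.log(freshBefore, freshAfter);

// Clean up the temporary files.
fs.unlinkSync(file);
fs.rmdirSync(dir);
```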
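normal execution phase
4. In cache-loader this phase exists to generate the cache file. First, the dependencies webpack has collected are obtained through getDependencies and getContextDependencies, and each one is looked up on disk in turn. If any lookup fails, caching is abandoned and the file content is returned as is. Next, each dependency's mtime is checked:
```js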
 const mtime = dependencyStats.mtime.getTime();

 if (mtime / 1000 >= Math.floor(data.startTime / 1000)) {
   // Don't trust mtime.
   // File was changed while compiling
   // or it could be an inaccurate filesystem.
   cache = false;
 }
```
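This code means that if a dependency was modified in the same second as the time recorded in the pitch phase, or later (for example because it changed while compiling), the result is not cached.
5. If all of the checks above pass, the cache file is written. Here args is [content, map, meta]. Before it is stored, the data is serialized to text by BJSON.stringify (from buffer-json).
```js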
function loader(...args){
    // ...
    writeFn(data.cacheKey, {
       remainingRequest: pathWithCacheContext(options.cacheContext, data.remainingRequest),
       dependencies: deps,
       contextDependencies: contextDeps,
       result: args
     }, () => {
       // ignore errors here
       callback(null, ...args);
     });
    // ...
    function writeFn(key, data, callback) {
       const dirname = path.dirname(key);
       const content = BJSON.stringify(data);
       if (directories.has(dirname)) {
         // for performance skip creating directory
         fs.writeFile(key, content, 'utf-8', callback);
       } else {
         mkdirp(dirname, mkdirErr => {
           if (mkdirErr) {
             callback(mkdirErr);
             return;
           }
           directories.add(dirname);
           fs.writeFile(key, content, 'utf-8', callback);
         });
       }
     }
    // ...
 }
```
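
Whether the write is reached at all depends on the mtime guard shown earlier, which compares at one-second granularity. The sketch below isolates that comparison so its effect is easy to see: `canTrustMtime` is a made-up helper name and the timestamps are arbitrary sample values.

```javascript
// Same comparison as the guard: an mtime in the same second as startTime,
// or later, is not trusted.
function canTrustMtime(mtimeMs, startTimeMs) {
  return !(mtimeMs / 1000 >= Math.floor(startTimeMs / 1000));
}

const startTime = 1698907103551; // sample value for data.startTime

// Modified 5 seconds before pitch started: safe to cache.
console.log(canTrustMtime(startTime - 5000, startTime)); // true
// Modified 200 ms after pitch started: changed while compiling.
console.log(canTrustMtime(startTime + 200, startTime));  // false
// Modified 300 ms before pitch started, but within the same second.
console.log(canTrustMtime(startTime - 300, startTime));  // false
```

The third case shows why startTime is rounded down: on a filesystem that only records whole seconds, a change made in that second could not be told apart from one made during compilation.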

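That is cache-loader's workflow in detail. Whether the cache is used comes down to comparing mtimes. In addition, the buffer-json library converts Buffer content into a string (base64). One more point to note: when the cache is hit in the pitch phase, the recorded dependencies must be added back to the loader, otherwise changes to those dependencies cannot be watched after a cache hit.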
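
The Buffer-to-base64 step can be illustrated with a small stand-in built on JSON.stringify and JSON.parse. This is not buffer-json's source: `stringifyWithBuffers` and `parseWithBuffers` are hypothetical names, and the sketch only imitates the `{"type": "Buffer", "data": "base64:..."}` shape visible in the example cache file.

```javascript
// Serialize: Buffers become { type: 'Buffer', data: 'base64:...' }.
function stringifyWithBuffers(value) {
  return JSON.stringify(value, (key, v) => {
    // Buffer.prototype.toJSON has already produced
    // { type: 'Buffer', data: [bytes] } by the time the replacer runs.
    if (v && v.type === 'Buffer' && Array.isArray(v.data)) {
      return {
        type: 'Buffer',
        data: 'base64:' + Buffer.from(v.data).toString('base64'),
      };
    }
    return v;
  });
}

// Parse: turn the base64 form back into a real Buffer.
function parseWithBuffers(text) {
  return JSON.parse(text, (key, v) => {
    if (
      v &&
      v.type === 'Buffer' &&
      typeof v.data === 'string' &&
      v.data.startsWith('base64:')
    ) {
      return Buffer.from(v.data.slice('base64:'.length), 'base64');
    }
    return v;
  });
}

const original = Buffer.from('const a = 1;\n');
const text = stringifyWithBuffers({ result: [original] });
const restored = parseWithBuffers(text);

console.log(text);
console.log(restored.result[0].toString());
```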

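Notes on asynchronous operations with the `neo-async` library

```javascript
const async = require('neo-async')
```

async.parallel works much like Promise.all: it starts all tasks at once and calls the final callback with the results in task order.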

```javascript
const tasks = [
  (callback) => {
    setTimeout(() => {
      console.log('Task 1 done');
      callback(null, 'Result 1');
    }, 2000);
  },
  (callback) => {
    setTimeout(() => {
      console.log('Task 2 done');
      callback(null, 'Result 2');
    }, 1000);
  },
];

// Run the tasks in parallel
async.parallel(tasks, (err, results) => {
  if (err) {
    console.error('Error:', err);
  } else {
    console.log('All tasks are done, Results:', results);
  }
});
```

async.mapLimit(arr, limit, asyncTask, (err, results)=>{}) runs tasks concurrently but caps how many run at once. Once the cap is reached, the next task starts only after an earlier one has finished.

```javascript
const arr = [1, 2, 3, 4, 5];
const limit = 2;

function asyncTask(item, callback) {
  setTimeout(() => {
    console.log(`Processing item ${item}`);
    callback(null, item * 2);
  }, 1000);
}

async.mapLimit(arr, limit, asyncTask, (err, results) => {
  if (err) {
    console.error('Error:', err);
  } else {
    console.log('All tasks are done');
    console.log('Results:', results);
  }
});
```
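
To see what the limit does, here is a minimal hand-written version of the same idea. It is a sketch for understanding, not neo-async's implementation: `mapLimitSketch`, `pending` and `peak` are made-up names, and the tasks are completed by hand instead of with timers so that the number in flight can be observed directly.

```javascript
// Run iterator over items with at most `limit` calls in flight.
function mapLimitSketch(items, limit, iterator, done) {
  const results = new Array(items.length);
  let next = 0;
  let running = 0;
  let finished = 0;
  let failed = false;
  if (items.length === 0) return done(null, results);

  function launch() {
    while (!failed && running < limit && next < items.length) {
      const i = next++;
      running++;
      iterator(items[i], (err, value) => {
        if (failed) return;
        running--;
        if (err) {
          failed = true;
          return done(err);
        }
        results[i] = value; // keep results in input order
        finished++;
        if (finished === items.length) return done(null, results);
        launch(); // a slot is free: start the next item
      });
    }
  }
  launch();
}

// Demo: each task parks its completion in `pending` until we release it.
const pending = [];
let peak = 0;
function task(item, callback) {
  pending.push(() => callback(null, item * 2));
  peak = Math.max(peak, pending.length);
}

let output;
mapLimitSketch([1, 2, 3, 4, 5], 2, task, (err, results) => {
  output = results;
});

// Finish tasks one at a time; each completion lets exactly one more start.
while (pending.length > 0) pending.shift()();

console.log('peak concurrency:', peak);
console.log('results:', output);
```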

async.each(arr, iterator, callback) is similar to arr.forEach, except that the iterator is asynchronous and the final callback runs once every item has finished.

```javascript
const arr = [1, 2, 3, 4, 5]
const asyncTask = (item, callback) => {
    setTimeout(() => {
        console.log('item', item)
        callback(null)
    })
}
async.each(arr, asyncTask, (err) => {
    if (err) {
        console.error('Error:', err);
    } else {
        console.log('All tasks are done');
    }
})
```

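Afterword

Because webpack 5 supports the cache option directly, cache-loader is no longer maintained and currently only supports use with webpack 4+. That said, the front-end projects at my company are still mostly on webpack 4.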
