Building a Personal Blog with Hexo + GitHub Pages, Part 2: Optimization (SEO and Image Hosting)

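SEO (search engine optimization) can improve a blog's ranking on Google, Bing, Baidu, and other search engines, increasing its exposure and attracting more visitors. To check whether the site has been indexed, enter the following in a search engine:

# Check whether the site is indexed
site:your-domain

If the search engine reports that no matching results were found, the site has not been indexed yet.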

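1 Getting the site indexed

In theory, as long as the site is publicly reachable and its content is valuable, search engine crawlers will discover and index it on their own, but this can be slow for a new site. Manually submitting a sitemap speeds up indexing.

1.1 Adding the site to search engines

Baidu

Register and log in to the Baidu Webmaster Platform.
Add the site: User Center -> Site Management -> Add Site, then enter your domain.
Verify the domain. Taking HTML tag verification as an example, the meta tag has to be placed in the <head> of the home page. The Butterfly theme has this built in, so you only need to copy the tag's content value into the site_verification field of the theme config file _config.butterfly.yml.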

# Baidu Webmaster tools verification.
# See: https://ziyuan.baidu.com/site
site_verification:
  # - name: google-site-verification
  #   content: xxxxxx
  - name: baidu-site-verification
    content:  # paste the verification string from above here

After the site is redeployed, complete the HTML tag verification on the Baidu Webmaster Platform to finish adding the site.
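Before clicking the verify button, it can help to confirm that the deployed home page really contains the tag. The following is a minimal sketch rather than anything the theme or Baidu provides: `findMetaContent` is a hypothetical helper, the sample HTML and the `codeva-EXAMPLE` value are made up, and the regex-based parsing assumes a simple, well-formed `<meta>` tag.

```javascript
// Hypothetical helper: returns the content attribute of the first <meta>
// tag whose name attribute equals `name`, or null if there is none.
function findMetaContent(html, name) {
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const nameMatch = tag.match(/\bname\s*=\s*["']([^"']*)["']/i);
    const contentMatch = tag.match(/\bcontent\s*=\s*["']([^"']*)["']/i);
    if (nameMatch && contentMatch && nameMatch[1] === name) {
      return contentMatch[1];
    }
  }
  return null;
}

// Made-up sample of what the rendered home page might contain.
const sampleHomePage =
  '<html><head><meta name="baidu-site-verification" content="codeva-EXAMPLE"></head><body></body></html>';

const foundContent = findMetaContent(sampleHomePage, 'baidu-site-verification');
console.log(foundContent);
```

In practice the HTML would come from the deployed home page (for example, saved with the browser's "view source") instead of the inline sample.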

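Google

Register and log in to Google Search Console.
Add the site; the process is much the same as above.
Enter the URL and verify it. Taking the "URL prefix" property type as an example, enter the URL yourwebsite.top and choose HTML tag verification, which works the same way as for Baidu.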

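1.2 Generating sitemaps

A sitemap tells search engines which pages on the site can be crawled, so that they can crawl the site more intelligently.

Install the sitemap generator plugins for Baidu and Google: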

npm install hexo-generator-baidu-sitemap --save
npm install hexo-generator-sitemap --save

Add the following fields to the site config file _config.yml:

# Sitemaps
sitemap:
  path: sitemap.xml
baidusitemap:
  path: baidusitemap.xml

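Run hexo g -d to redeploy, then open the following URLs. If XML content shows up in the page, the sitemaps were generated successfully.

https://your-domain/sitemap.xml
https://your-domain/baidusitemap.xml

Although search engines crawl sites on their own, manually submitting sitemap.xml helps them discover the blog sooner and index it more efficiently.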
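To see what the generated sitemap actually lists, the `<loc>` entries can be pulled out with a few lines of Node.js. This is only a sketch under stated assumptions: `extractSitemapUrls` is a hypothetical helper, the sample XML is invented and uses example.com, and a regex stands in for a real XML parser.

```javascript
// Hypothetical helper: collects every <loc> URL from sitemap XML text.
function extractSitemapUrls(xml) {
  const urls = [];
  const locPattern = /<loc>\s*([^<\s]+)\s*<\/loc>/g;
  let match;
  while ((match = locPattern.exec(xml)) !== null) {
    urls.push(match[1]);
  }
  return urls;
}

// Invented sample in the standard sitemap format.
const sampleSitemap = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/posts/hello/</loc></url>
  <url><loc>https://example.com/posts/seo/</loc></url>
</urlset>`;

const sitemapUrls = extractSitemapUrls(sampleSitemap);
console.log(sitemapUrls);
```

If the list is empty or the URLs use the wrong domain, the url field in _config.yml is the first thing to check.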

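1.3 Submitting links

Google

On the Search Console -> Sitemaps page, submit the sitemap generated in the previous step, i.e. https://your-domain/sitemap.xml

Baidu

The Baidu Webmaster Platform offers two kinds of link submission: automatic and manual. This section mainly covers automatic submission; for manual submission, follow the platform's instructions. Among the automatic options, active push is the fastest way to get pages indexed by Baidu.

Active push can be set up by installing a plugin: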

npm install hexo-baidu-url-submit --save

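In the config file _config.yml in the Hexo root directory, add:

# Active push to Baidu for indexing
baidu_url_submit:
  count: 10 # submit the 10 most recent links
  host: # the domain registered on the Baidu Webmaster Platform
  token: # secret token: Baidu Webmaster Platform > Normal Indexing > Push API > the token parameter in the API call URL
  path: baidu_urls.txt # file where new links are saved; no need to change

Next, check the url value in _config.yml in the Hexo root directory. It must contain the domain registered on the Baidu Webmaster Platform.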

# URL
## Set your site url here. For example, if you use GitHub Page, set url as 'https://username.github.io/project'
url: https://kdsunset.top

Add a new type under deploy:

# Deployment
## Docs: https://hexo.io/docs/one-command-deployment
deploy:
  - type: git 
    repo: git@github.com:kdsunset/kdsunset.github.io.git
    branch: main
  - type: baidu_url_submitter
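For orientation, the push that the deployer performs boils down to one HTTP POST of a plain-text URL list. The sketch below only builds that request and does not send it. It assumes the endpoint format shown in the platform's push API page (`http://data.zz.baidu.com/urls?site=...&token=...`); `buildBaiduPushRequest` is a hypothetical helper, not part of hexo-baidu-url-submit, and the domain and token are placeholders.

```javascript
// Hypothetical helper: builds (but does not send) a Baidu active-push request.
// The endpoint format is an assumption; copy the real one from the platform.
function buildBaiduPushRequest(site, token, urls) {
  const endpoint = new URL('http://data.zz.baidu.com/urls');
  endpoint.searchParams.set('site', site);
  endpoint.searchParams.set('token', token);
  return {
    url: endpoint.toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'text/plain' },
      body: urls.join('\n'), // one URL per line
    },
  };
}

const pushRequest = buildBaiduPushRequest('https://example.com', 'YOUR_TOKEN', [
  'https://example.com/posts/hello/',
  'https://example.com/posts/seo/',
]);
console.log(pushRequest.url);
console.log(pushRequest.options.body);
```

Passing `pushRequest.url` and `pushRequest.options` to `fetch` would perform the submission; the plugin does the equivalent with the links it saved to baidu_urls.txt.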

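For manual submission, paste the following URLs into the address box under "Manual Submission" on the Baidu Webmaster Platform:

https://your-domain/sitemap.xml
https://your-domain/baidusitemap.xml

1.4 robots

robots.txt is a crawler protocol file that tells search engines which pages may be crawled and which may not, which lets you control crawler behavior. For example, we usually want search engines to crawl the article pages while skipping listing pages such as tags and categories and the About page, and this file is where that is specified.

Create a robots.txt file in the source folder under the blog root with the following content: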

# hexo robots.txt
# The rules below apply to all crawlers (user agents)
User-agent: *

Allow: /
Allow: /posts/

Disallow: /tags/
Disallow: /categories/
Disallow: /about/
Disallow: /archives/
# If disallowing js, fonts, etc. causes Google crawling problems, remove these rules
Disallow: /js/
Disallow: /css/
Disallow: /fonts/
Disallow: /lib/

# The last two lines point to the sitemaps
Sitemap: https://kdsunset.top/sitemap.xml
Sitemap: https://kdsunset.top/baidusitemap.xml

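Add robots.txt to the skip_render field of the Hexo config so that it is not rendered.

Hexo normally parses Markdown files and renders them to HTML. skip_render lists the files that should be copied as-is, which avoids formatting errors. In _config.yml in the blog root, add the following to the skip_render field: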

# Skip rendering
skip_render:
  - 'robots.txt'
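To sanity-check which paths the rules above leave open, a small matcher can be run against a few sample URLs. This is a simplified sketch, not a full robots.txt implementation: it assumes a single `User-agent: *` group, plain prefix matching without wildcards, and the common "longest matching rule wins, Allow wins ties" convention. `parseRobotsRules` and `isPathAllowed` are hypothetical names.

```javascript
// Hypothetical helper: extracts Allow/Disallow rules, ignoring comments
// and every other directive (User-agent, Sitemap, ...).
function parseRobotsRules(text) {
  const rules = [];
  for (const rawLine of text.split('\n')) {
    const line = rawLine.replace(/#.*$/, '').trim();
    const m = line.match(/^(allow|disallow)\s*:\s*(.*)$/i);
    if (m && m[2] !== '') {
      rules.push({ allow: m[1].toLowerCase() === 'allow', path: m[2] });
    }
  }
  return rules;
}

// Longest matching prefix decides; on equal length, Allow wins.
function isPathAllowed(rules, path) {
  let best = null;
  for (const rule of rules) {
    if (!path.startsWith(rule.path)) continue;
    if (
      best === null ||
      rule.path.length > best.path.length ||
      (rule.path.length === best.path.length && rule.allow)
    ) {
      best = rule;
    }
  }
  return best === null ? true : best.allow;
}

const robotsRules = parseRobotsRules(`
User-agent: *
Allow: /
Allow: /posts/
Disallow: /tags/
Disallow: /js/
`);

for (const p of ['/posts/hello/', '/tags/hexo/', '/js/main.js']) {
  console.log(p, isPathAllowed(robotsRules, p) ? 'allowed' : 'blocked');
}
```

Search engines' own robots.txt testers remain the authoritative check; the sketch is only meant to make the precedence between `Allow: /` and the more specific `Disallow` lines visible.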

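2 Image hosting

Blog posts can use a lot of images. Keeping them directly in the Hexo directory takes up repository space, and GitHub Pages has a size limit. Every Hexo deployment also re-uploads all images, which hurts efficiency. A better approach is to use an image host.

Image hosting options fall into two groups: third-party hosts and self-hosted ones. Most cloud providers in China offer image storage, but the good ones generally cost money. On the self-hosted side, Gitee no longer allows public repositories to be hotlinked as image hosts, and jsDelivr is blocked in mainland China. We therefore build a personal image host from a private GitHub repository plus Cloudflare Workers: the private repository stores the images, and a Cloudflare Worker proxies access to the GitHub raw files and provides CDN caching.

2.1 GitHub

2.1.1 Create a repository on GitHub

Create a new repository to serve as the image host. Because Cloudflare reverse-proxies the repository, it can be private, so the files are not exposed directly, which is more secure. The repository here is named imagehosting as an example.

2.1.2 Generate a GitHub token

A GitHub personal access token is a security credential used for authentication that lets you interact with GitHub from the command line or from applications. It is an alternative to a username and password, typically used to access GitHub resources through the API or to perform other GitHub operations.

In GitHub, go to Settings -> Developer settings -> Personal access tokens and create a new token. Fill in the fields as follows, then click Generate. The token is shown only once, so copy it and store it somewhere safe.

Note: # a label for the token, anything you like
Expiration: # expiry time, your choice; "No expiration" makes it permanent
Select scopes: # permission scopes; check repo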

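2.2 Cloudflare

2.2.1 Create the proxy Worker

After logging in to the Cloudflare dashboard, click "Workers" in the sidebar, then click "Create a Service" to create a Worker. Rename it and deploy.
Click Edit, paste the following code, and replace the proxy path and the GitHub token with your own values.

// What this code does:
// 1. Reverse-proxies the GitHub repository.
// 2. Uses the token to fetch files.
// 3. Enables caching to avoid repeated image requests.
// Website you intended to retrieve for users.
const upstream = "raw.githubusercontent.com";

// Custom pathname for the upstream website.
// (1) Set the proxied path, in the form /<user>/<repo>/<branch>
const upstream_path = "****";

// github personal access token.
// (2) Set your GitHub token
const github_token = "****";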

// Website you intended to retrieve for users using mobile devices.
const upstream_mobile = upstream;

// Countries and regions where you wish to suspend your service.
const blocked_region = [];

// IP addresses which you wish to block from using your service.
const blocked_ip_address = ["0.0.0.0", "127.0.0.1"];

// Whether to use HTTPS protocol for upstream address.
const https = true;

// Whether to disable cache.
const disable_cache = false;

// Replace texts.
const replace_dict = {
 $upstream: "$custom_domain",
};

addEventListener("fetch", (event) => {
 event.respondWith(fetchAndApply(event.request));
});

async function fetchAndApply(request) {
 const region = request.headers.get("cf-ipcountry")?.toUpperCase();
 const ip_address = request.headers.get("cf-connecting-ip");
 const user_agent = request.headers.get("user-agent");

 let response = null;
 let url = new URL(request.url);
 let url_hostname = url.hostname;

 if (https == true) {
   url.protocol = "https:";
 } else {
   url.protocol = "http:";
 }

 if (await device_status(user_agent)) {
   var upstream_domain = upstream;
 } else {
   var upstream_domain = upstream_mobile;
 }

 url.host = upstream_domain;
 if (url.pathname == "/") {
   url.pathname = upstream_path;
 } else {
   url.pathname = upstream_path + url.pathname;
 }

 if (blocked_region.includes(region)) {
   response = new Response(
     "Access denied: WorkersProxy is not available in your region yet.",
     {
       status: 403,
     }
   );
 } else if (blocked_ip_address.includes(ip_address)) {
   response = new Response(
     "Access denied: Your IP address is blocked by WorkersProxy.",
     {
       status: 403,
     }
   );
 } else {
   let method = request.method;
   let request_headers = request.headers;
   let new_request_headers = new Headers(request_headers);

   new_request_headers.set("Host", upstream_domain);
   new_request_headers.set("Referer", url.protocol + "//" + url_hostname);
   new_request_headers.set("Authorization", "token " + github_token);

   let original_response = await fetch(url.href, {
     method: method,
     headers: new_request_headers,
     body: request.body,
   });

   let connection_upgrade = new_request_headers.get("Upgrade");
   if (connection_upgrade && connection_upgrade.toLowerCase() == "websocket") {
     return original_response;
   }

   let original_response_clone = original_response.clone();
   let original_text = null;
   let response_headers = original_response.headers;
   let new_response_headers = new Headers(response_headers);
   let status = original_response.status;

   if (disable_cache) {
     new_response_headers.set("Cache-Control", "no-store");
   } else {
     new_response_headers.set("Cache-Control", "max-age=43200000");
   }

   new_response_headers.set("access-control-allow-origin", "*");
   new_response_headers.set("access-control-allow-credentials", "true");
   new_response_headers.delete("content-security-policy");
   new_response_headers.delete("content-security-policy-report-only");
   new_response_headers.delete("clear-site-data");

   if (new_response_headers.get("x-pjax-url")) {
     new_response_headers.set(
       "x-pjax-url",
       response_headers
         .get("x-pjax-url")
         .replace("//" + upstream_domain, "//" + url_hostname)
     );
   }

   const content_type = new_response_headers.get("content-type");
   if (
     content_type != null &&
     content_type.includes("text/html") &&
     content_type.includes("UTF-8")
   ) {
     original_text = await replace_response_text(
       original_response_clone,
       upstream_domain,
       url_hostname
     );
   } else {
     original_text = original_response_clone.body;
   }

   response = new Response(original_text, {
     status,
     headers: new_response_headers,
   });
 }
 return response;
}

async function replace_response_text(response, upstream_domain, host_name) {
 let text = await response.text();

 var i, j;
 for (i in replace_dict) {
   j = replace_dict[i];
   if (i == "$upstream") {
     i = upstream_domain;
   } else if (i == "$custom_domain") {
     i = host_name;
   }

   if (j == "$upstream") {
     j = upstream_domain;
   } else if (j == "$custom_domain") {
     j = host_name;
   }

   let re = new RegExp(i, "g");
   text = text.replace(re, j);
 }
 return text;
}

async function device_status(user_agent_info) {
 // Returns true for desktop clients and false for known mobile user agents.
 // A missing User-Agent header is treated as desktop instead of throwing.
 const agents = [
   "Android",
   "iPhone",
   "SymbianOS",
   "Windows Phone",
   "iPad",
   "iPod",
 ];
 const ua = user_agent_info || "";
 return !agents.some((agent) => ua.includes(agent));
}
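The core of the Worker is the URL rewrite: the host is swapped for the upstream and the configured path prefix is prepended. The sketch below isolates that step so it can be run in Node.js without Cloudflare. `toUpstreamUrl` is a hypothetical function that mirrors the logic above, and the domain, user, and repository names are placeholders.

```javascript
// Hypothetical helper mirroring the Worker's rewrite step:
// swap the host for the upstream and prepend the configured path prefix.
function toUpstreamUrl(requestUrl, upstreamHost, upstreamPath) {
  const url = new URL(requestUrl);
  url.protocol = 'https:';
  url.host = upstreamHost;
  url.pathname =
    url.pathname === '/' ? upstreamPath : upstreamPath + url.pathname;
  return url.href;
}

const rewrittenUrl = toUpstreamUrl(
  'https://img.example.com/img/markdown/a.png', // placeholder custom domain
  'raw.githubusercontent.com',
  '/user/imagehosting/main' // placeholder /<user>/<repo>/<branch>
);
console.log(rewrittenUrl);
```

Because the prefix already contains the user, repository, and branch, visitors only ever see the path inside the repository, and the token is added on the Worker side rather than in the page.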

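2.2.2 Move the domain's nameservers (NS) to Cloudflare

Custom domains for Cloudflare Workers only work with domains hosted on Cloudflare, so the domain's nameservers have to be moved to Cloudflare first. If you already use Cloudflare for CDN acceleration, this has been done; otherwise see Part 1 of this Hexo series.

2.2.3 Bind your own domain to the Worker

Once the nameservers have been moved to Cloudflare, open the Worker's detail page, click "Triggers", then "Add Custom Domain", enter the domain you want to bind, and confirm with "Add Custom Domain". Taking the domain kdsunset.top as an example, the custom domain could be "img.kdsunset.top".

2.3 Configuring PicGo

PicGo is an open-source image upload tool, mainly used to upload local images to a cloud storage service and generate accessible links. Fill in the GitHub repository settings as follows:

Repository: <GitHub username>/<repository name>
Branch: GitHub's default branch is now "main"; it used to be "master"
Token: the GitHub token, from GitHub -> Settings -> Personal access tokens -> Generate new token
Storage path: folder inside the repository, optional, e.g. "img/markdown/"
Custom domain: optional; mine is https://img.kdsunset.top

After a successful upload, the image link looks like this: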

![](https://img.kdsunset.top/img/markdown/20231230003624.png)
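The link is simply the custom domain, the storage path, and the file name joined together. The following sketch shows that composition; it is an illustration of the pattern, not PicGo's actual code, and `buildImageMarkdown` is a hypothetical helper.

```javascript
// Hypothetical helper: joins custom domain, storage path, and file name
// into a Markdown image link, normalizing stray slashes.
function buildImageMarkdown(customDomain, storagePath, fileName, alt = '') {
  const base = customDomain.replace(/\/+$/, '');
  const dir = storagePath.replace(/^\/+|\/+$/g, '');
  const parts = [base, dir, encodeURIComponent(fileName)].filter(Boolean);
  return `![${alt}](${parts.join('/')})`;
}

const imageMarkdown = buildImageMarkdown(
  'https://img.kdsunset.top',
  'img/markdown/',
  '20231230003624.png'
);
console.log(imageMarkdown);
```

When the Worker receives this URL it prepends the configured repository prefix, so the path after the domain must match the file's path inside the repository.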

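3 Miscellaneous

3.1 Adding watermarks to blog images

Use the hexo-images-watermark plugin. It leaves the original images untouched: during the static site build it reads each original image and outputs a watermarked copy. At present, however, the plugin only supports local images under source/_posts and cannot watermark remote images.

1. Install sharp and hexo-images-watermark

# The plugin depends on `sharp`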
npm install sharp

npm install hexo-images-watermark

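The plugin depends on sharp. On Windows, the C++ build tools (windows-build-tools) must be installed before installing sharp through npm. If they are not installed yet, go to the Visual Studio website, choose "Tools for Visual Studio" -> Build Tools for Visual Studio 2022, download the installer, and select the C++ desktop tools.

2. Add the following to the main Hexo config file: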

watermark:
    enable: true
    textEnable: true
    rotate: -45
    gravity: centre

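3.2 Compressing static assets

Static asset compression means minifying the static files Hexo generates, such as HTML, CSS, JavaScript, and images, to reduce file size, speed up page loads, and save bandwidth.

Gulp is a popular JavaScript task automation tool, widely used in web development for build and automation tasks such as minifying CSS and JavaScript files, optimizing images, and refreshing the browser automatically.

Commonly used gulp plugins:
- gulp-htmlclean: cleans HTML
- gulp-htmlmin: minifies HTML
- gulp-minify-css: minifies CSS
- gulp-uglify: minifies and mangles JS
- gulp-imagemin: compresses images

Install gulp

# Install gulp in the project root
npm install gulp --save-dev

# Install the gulp plugins (in the project directory)
npm install gulp-htmlclean gulp-htmlmin gulp-minify-css gulp-uglify gulp-imagemin --save

Create gulpfile.js in the Hexo blog root and fill in the following: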

var gulp = require('gulp');
var minifycss = require('gulp-minify-css');
var uglify = require('gulp-uglify');
var htmlmin = require('gulp-htmlmin');
var htmlclean = require('gulp-htmlclean');
// var imagemin = require('gulp-imagemin');

// Minify HTML
gulp.task('minify-html', function() {
    return gulp.src('./public/**/*.html')
        .pipe(htmlclean())
        .pipe(htmlmin({
            collapseWhitespace: true, // collapse whitespace; this option usually gives the largest size reduction
            collapseBooleanAttributes: true, // omit boolean attribute values, e.g. <input checked="checked"/> becomes <input checked/>
            removeComments: true, // strip HTML comments
            removeEmptyAttributes: true, // remove all empty attributes
            removeScriptTypeAttributes: true, // remove type="text/javascript" from script tags
            removeStyleLinkTypeAttributes: true, // remove the type attribute from style and link tags
            minifyJS: true,
            minifyCSS: true,
            minifyURLs: true,
        }))
        .pipe(gulp.dest('./public'));
});
// Minify CSS
gulp.task('minify-css', function() {
    return gulp.src('./public/**/*.css')
        .pipe(minifycss({
            compatibility: 'ie8'
        }))
        .pipe(gulp.dest('./public'));
});
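// Minify JS. A leading ! excludes files, e.g. ['!./public/js/**/*min.js']
gulp.task('minify-js', function() {
    return gulp.src(['./public/js/**/*.js'])
        .pipe(uglify()) // minify and mangle
        .pipe(gulp.dest('./public/js')); // write back in place; the glob base is ./public/js
});
// Compress images
// (already compressed at upload time)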
// gulp.task('minify-images', function() {
//     return gulp.src('./public/images/**/*.*')
//         .pipe(imagemin(
//         [imagemin.gifsicle({'optimizationLevel': 3}),
//         imagemin.jpegtran({'progressive': true}),
//         imagemin.optipng({'optimizationLevel': 7}),
//         imagemin.svgo()],
//         {'verbose': true}))
//         .pipe(gulp.dest('./public/images'));
// });
// Default task
gulp.task('default',gulp.series(gulp.parallel('minify-html','minify-css','minify-js')));

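When generating the site, running the gulp command compresses the static assets in the public directory according to gulpfile.js. After redeploying, the CSS and other files should be minified (clear the browser cache if the change does not show up).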

echo "Starting Hexo cmd"
hexo clean && hexo g && gulp && hexo d
echo "Hexo deploy success !"
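To confirm that the gulp step actually shrinks the output, the size of the public directory can be totalled per file extension before and after running it. The script below is a small sketch using only Node.js built-ins; `sizeByExtension` is a hypothetical helper and `./public` is assumed to be Hexo's default output directory.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: walks a directory tree and sums file sizes (bytes)
// per lowercase extension, e.g. { '.html': 12345, '.css': 6789 }.
function sizeByExtension(dir, totals = {}) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      sizeByExtension(fullPath, totals);
    } else if (entry.isFile()) {
      const ext = path.extname(entry.name).toLowerCase() || '(none)';
      totals[ext] = (totals[ext] || 0) + fs.statSync(fullPath).size;
    }
  }
  return totals;
}

// Only report when run from a Hexo project that has been generated.
if (fs.existsSync('./public')) {
  console.log(sizeByExtension('./public'));
}
```

Running it once after hexo g and again after gulp gives two sets of totals to compare.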

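3.3 Disabling automatic numbering in the article TOC

By default, Butterfly numbers the entries in an article's table of contents automatically, which may clash with headings that are already numbered by hand. This can be changed through the toc field of the Butterfly config file: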

toc:
  enable: true
  number: false  # automatic numbering

3.4 Enabling the built-in 404 page

Set enable to true:

# A simple 404 page
error_404:
  enable: true
  subtitle: '页面沒有找到'
  background: https://i.loli.net/2020/05/19/aKOcLiyPl2JQdFD.png

To change the page style, open /layout/include/404.pug in the Butterfly theme directory:

- var top_img_404 = theme.error_404.background || theme.default_top_img

body(style='background-image: url(' + url_for(theme.default_top_img) + '); background-size: cover; background-position: center center;')

  #body-wrap.error404
    include ./header/index.pug

    #error-wrap
      .error-content
        .error-img
          img(src=url_for(top_img_404) alt='Page not found')
        .error-info
          h1.error_title= '404'
          .error_subtitle= theme.error_404.subtitle || _p('error404')

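3.5 Fixing the Valine comment section failing to load in mainland China

The browser console (F12) shows that av-min.js, which Valine.min.js depends on, cannot be downloaded. The original URL is: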

//unpkg.com/leancloud-storage@3/dist/av-min.js

Switch to another CDN. For the Butterfly theme, this is done by editing the inject field in the theme config file and inserting the Ele.me CDN link:

inject:
  head:
    # - <link rel="stylesheet" href="/xxx.css">
    - <link rel="stylesheet" href="/css/my_bg_color.css">
  bottom:
    - <script src="//github.elemecdn.com/leancloud-storage@3/dist/av-min.js"></script>
    # - <script src="xxxx"></script>

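3.6 Adding an RSS feed

RSS lets readers subscribe to a site's latest articles, news, and blog updates. Compared with algorithmic recommendations it is quite old, but as long as blogs exist, RSS has a reason to exist too.

1 Install the RSS generator plugin

# Install hexo-generator-feed (https://github.com/hexojs/hexo-generator-feed)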
npm install hexo-generator-feed --save

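2 Configure Hexo by adding the feed field

In hexo-blog/_config.yml, find the feed section (add it if it does not exist):

feed:
  enable: true  # enable the RSS feed
  type: atom  # feed format; options: atom, rss2
  path: atom.xml  # output path of the feed (default `atom.xml`; the final URL is `https://your-blog.com/atom.xml`)
  limit: 20  # maximum number of posts in the feed, default 20 (adjust as needed)
  hub:  # URL of the WebSub (formerly PubSubHubbub) hub (usually left empty)
  content: true  # whether to include the full post body in the feed
  content_limit: 140  # maximum length of the post content (default 140; adjust, or set to 0 to show the full text)
  content_limit_delim: ' '  # delimiter used when truncating post content (a space by default)
  order_by: -date  # sort order; `-date` means newest first
  icon: /img/avatar.jpg  # feed icon (the blog's logo or avatar is a good choice)
  autodiscovery: true  # enable autodiscovery so that browsers and RSS readers can detect the feed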

3 Add an RSS link in the Butterfly theme

In the Butterfly theme config file _config.butterfly.yml, add the following to the menu field:

menu:
  RSS: /atom.xml

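4 Publish the update

After redeploying, the home page menu shows an RSS entry, which leads to a page like https://your-blog.com/atom.xml. To use the feed, pick an RSS reader service such as Feedly, Inoreader (which has a version for China), or The Old Reader. In Inoreader, for example, after registering, enter the atom.xml URL under add feed -> Website to subscribe.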
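With autodiscovery enabled, feed readers locate the feed through a `<link rel="alternate">` tag in the page head. The sketch below looks for such tags in a piece of HTML; `findFeedLinks` is a hypothetical helper, the sample markup is invented, and the regex-based attribute parsing assumes simple, quoted attributes.

```javascript
// Hypothetical helper: lists Atom/RSS feeds advertised via <link rel="alternate">.
function findFeedLinks(html) {
  const linkTags = html.match(/<link\b[^>]*>/gi) || [];
  const feeds = [];
  for (const tag of linkTags) {
    // Reads one quoted attribute value from the current tag.
    const attr = (name) => {
      const m = tag.match(
        new RegExp('\\b' + name + '\\s*=\\s*["\']([^"\']*)["\']', 'i')
      );
      return m ? m[1] : null;
    };
    const type = attr('type');
    const isFeedType =
      type === 'application/atom+xml' || type === 'application/rss+xml';
    if (attr('rel') === 'alternate' && isFeedType) {
      feeds.push({ type, href: attr('href') });
    }
  }
  return feeds;
}

// Invented sample of a rendered page head.
const sampleHead = `
<head>
  <link rel="stylesheet" href="/css/index.css">
  <link rel="alternate" href="/atom.xml" title="My Blog" type="application/atom+xml">
</head>`;

const feedLinks = findFeedLinks(sampleHead);
console.log(feedLinks);
```

If the list comes back empty for the deployed home page, the autodiscovery option or the plugin installation is worth rechecking.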

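3.7 Publishing a raw HTML page

Sometimes what we write is not a Markdown post and does not need the Butterfly page layout; we just want to publish an HTML page with its own styling. This can be done by telling Hexo to skip rendering. Create or paste the HTML page under hexo/source; a subdirectory maps to a second-level path on the site, for example raw/index.html. Then add it to skip_render in the Hexo config: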

skip_render:
  - 'robots.txt'
  - 'raw/index.html'

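Note that if the page uses external JS, it has to be loaded from the \themes\butterfly\source\js directory. After redeploying, the page is available at https://your-blog.com/raw/index.html.

More articles are available on my blog.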