elasticsearch8.12.0安装分词

上篇说到,安装了es后正常运行

es分词下载地址

从 GitHub Release 下载(推荐)

👉 https://github.com/medcl/elasticsearch-analysis-ik/releases

https://release.infinilabs.com/analysis-ik/stable/

安装:

选择与你 ES 版本匹配的包,例如:

复制代码
elasticsearch-analysis-ik-8.12.0.zip

下载命令:

复制代码
cd /tmp
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v8.12.0/elasticsearch-analysis-ik-8.12.0.zip

⚠️ 注意:不要下载 source code,要下载 assets 里的 .zip 文件。


3. 创建 plugins 目录(如果不存在)

Elasticsearch 插件默认安装在:

复制代码
$ES_HOME/plugins/ik/

创建目录:

复制代码
mkdir -p $ES_HOME/plugins/ik

4. 解压插件到 plugins 目录

复制代码
unzip elasticsearch-analysis-ik-8.12.0.zip -d $ES_HOME/plugins/ik/

$ES_HOME 是你的 Elasticsearch 安装目录,例如 /data/isee/apps/elasticsearch-8.12.0


5. 检查目录结构

安装完成后,目录结构应如下:

复制代码
$ES_HOME/plugins/ik/
├── plugin-descriptor.properties
├── plugin-security.policy
├── config/
│   ├── IKAnalyzer.cfg.xml
│   ├── main.dic
│   └── stopword.dic
└── lib/
    ├── elasticsearch-analysis-ik-8.12.0.jar
    └── commons-codec-1.9.jar
    └── ...

6. 修改配置文件(可选)

配置文件路径:

复制代码
$ES_HOME/plugins/ik/config/IKAnalyzer.cfg.xml

你可以添加自定义词典:

复制代码
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer 扩展配置</comment>
    <entry key="ext_dict">custom.dic</entry>
    <entry key="ext_stopwords">stopwords.dic</entry>
</properties>

然后在 config/ 目录下创建 custom.dic,添加自定义词汇:

复制代码
人工智能
大模型
阿里云
Qwen

7. 设置权限(重要)

确保 Elasticsearch 用户有权限读取插件:

复制代码
chown -R isee:isee $ES_HOME/plugins/ik
# 或你运行 ES 的用户

8. 重启 Elasticsearch

复制代码
# 先停止
ps aux | grep elasticsearch
kill <pid>

# 启动
bin/elasticsearch -d

✅ 三、验证插件是否安装成功

1. 检查日志

查看 $ES_HOME/logs/isee_cluster.log,确认没有插件加载错误。

2. 调用分词 API 测试

bash 复制代码
# curl -X GET  -u elastic:9yZWp=3UnEVkBxYBhnlS "https://10.10.10.10:9200/_analyze" -H "Content-Type: application/json" -d'
> {
>   "analyzer": "ik_smart",
>   "text": "阿里巴巴推出通义千问大模型"
> }'
curl: (60) Peer's certificate issuer has been marked as not trusted by the user.
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). If the default
 bundle file isn't adequate, you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

才想起来,我们是https的服务,有ca证书,先不认证证书,-k

bash 复制代码
# curl -k -X GET  -u elastic:9yZWp=3UnEVkBxYBhnlS "https://10.10.10.10:9200/_analyze" -H "Content-Type: application/json" -d'
> {
>   "analyzer": "ik_smart",
>   "text": "阿里巴巴推出通义千问大模型"
> }'
{"tokens":[{"token":"阿里巴巴","start_offset":0,"end_offset":4,"type":"CN_WORD","position":0},{"token":"推出","start_offset":4,"end_offset":6,"type":"CN_WORD","position":1},{"token":"通义","start_offset":6,"end_offset":8,"type":"CN_WORD","position":2},{"token":"千","start_offset":8,"end_offset":9,"type":"TYPE_CNUM","position":3},{"token":"问","start_offset":9,"end_offset":10,"type":"CN_CHAR","position":4},{"token":"大模型","start_offset":10,"end_offset":13,"type":"CN_WORD","position":5}]}[isee@host-10-15-32-71 elasticsearch-8.12.0]$ 

分词安装成功。

相关推荐
Yeats_Liao14 分钟前
评估体系构建:基于自动化指标与人工打分的双重验证
运维·人工智能·深度学习·算法·机器学习·自动化
爱吃生蚝的于勒35 分钟前
【Linux】进程信号之捕捉(三)
linux·运维·服务器·c语言·数据结构·c++·学习
文艺理科生Owen1 小时前
Nginx 路径映射深度解析:从本地开发到生产交付的底层哲学
运维·nginx
期待のcode1 小时前
Redis的主从复制与集群
运维·服务器·redis
wangjialelele2 小时前
Linux下的IO操作以及ext系列文件系统
linux·运维·服务器·c语言·c++·个人开发
HypoxiaDream3 小时前
LINUX-Ext系列⽂件系统
linux·运维·服务器
小毛驴8503 小时前
Linux curl 命令用法
linux·运维·chrome
李斯啦果3 小时前
【Linux】Linux目录配置
linux·运维·服务器
AI+程序员在路上3 小时前
linux下线程中pthread_detach与pthread_join区别
linux·运维·服务器
logocode_li3 小时前
说透 Linux Shell:命令与语法的底层执行逻辑
linux·运维·ssh