Elasticsearch: usage, API calls, updates, and persistence

haogexiaole, 2025-09-24 21:29

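Basic usage of Elasticsearch

Elasticsearch is an open-source distributed search and analytics engine designed for large volumes of data. Its core operations are:

Index creation: create an index through the RESTful API and define the field types in its mapping.
Document operations: create, read, update, and delete (CRUD) documents, which are stored as JSON.
Search: full-text search, aggregations, fuzzy queries, and other advanced query features.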
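
As a rough sketch of the index-creation step, the snippet below creates an index with an explicit mapping. The field names `title` and `content` match the document examples in this post; the `text` field types, the `ES_URL` variable, and the `create_index` helper are assumptions made for illustration, and an unsecured local node is assumed.

```bash
#!/usr/bin/env bash
# Base URL of the cluster; localhost:9200 is assumed, as in the other examples.
ES_URL="${ES_URL:-http://localhost:9200}"

# Mapping body: both fields are declared as full-text "text" fields (an assumption).
MAPPING_BODY='{
  "mappings": {
    "properties": {
      "title":   { "type": "text" },
      "content": { "type": "text" }
    }
  }
}'

# Hypothetical helper: create the index named by $1 with the mapping above.
create_index() {
  curl -s -X PUT "$ES_URL/$1" -H 'Content-Type: application/json' -d "$MAPPING_BODY"
}

# Usage (needs a running node):
#   create_index my_index
```

If no mapping is supplied, Elasticsearch infers field types from the first documents it sees, so an explicit mapping is mainly useful when the inferred types would not be what you want.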

Calling the API

Elasticsearch exposes a RESTful API that you call over HTTP:

Index a document (example):

```bash
curl -X POST "http://localhost:9200/my_index/_doc/1" -H 'Content-Type: application/json' -d'
{
  "title": "Elasticsearch Guide",
  "content": "Distributed search engine"
}
'
```

Search for documents (example):

```bash
curl -X GET "http://localhost:9200/my_index/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "match": { "title": "Elasticsearch" }
  }
}
'
```
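
The read and delete halves of CRUD follow the same URL pattern as indexing. The sketch below is illustrative only: `get_doc`, `delete_doc`, and `ES_URL` are hypothetical names, and an unsecured local node is assumed.

```bash
#!/usr/bin/env bash
# Base URL of the cluster; localhost:9200 is assumed, as in the other examples.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: fetch one document by index ($1) and ID ($2).
get_doc() {
  curl -s -X GET "$ES_URL/$1/_doc/$2"
}

# Hypothetical helper: delete one document by index ($1) and ID ($2).
delete_doc() {
  curl -s -X DELETE "$ES_URL/$1/_doc/$2"
}

# Usage (needs a running node):
#   get_doc my_index 1
#   delete_doc my_index 1
```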

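How updates work

Elasticsearch supports both partial updates and full replacement:

Partial update: use the _update API to change specific fields; fields not listed under "doc" are left unchanged.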

```bash
curl -X POST "http://localhost:9200/my_index/_update/1" -H 'Content-Type: application/json' -d'
{
  "doc": { "content": "Distributed search and analytics" }
}
'
```

Full replacement: index the document again under the same ID; the new document overwrites the old one entirely.
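
A minimal sketch of full replacement, assuming the same local node and document ID 1 as above. The helper `replace_doc` and the new field values are made up for illustration. Because the stored document is overwritten as a whole, every field you want to keep has to be sent again.

```bash
#!/usr/bin/env bash
# Base URL of the cluster; localhost:9200 is assumed, as in the other examples.
ES_URL="${ES_URL:-http://localhost:9200}"

# Hypothetical helper: overwrite document $2 in index $1 with the JSON body $3.
# Fields missing from the new body are dropped, since the whole document is replaced.
replace_doc() {
  curl -s -X PUT "$ES_URL/$1/_doc/$2" -H 'Content-Type: application/json' -d "$3"
}

# The full document is sent again, not just the changed field (values are illustrative).
NEW_DOC='{
  "title": "Elasticsearch Guide, 2nd Edition",
  "content": "Distributed search and analytics"
}'

# Usage (needs a running node):
#   replace_doc my_index 1 "$NEW_DOC"
```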

Persistence settings

Elasticsearch persists data to disk by default. The key settings are:

Data path: set path.data in elasticsearch.yml to choose the data directory.
```yaml
path.data: /var/lib/elasticsearch
```
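Replica shards: index.number_of_replicas controls how many copies of each primary shard are kept, which provides redundancy and improves fault tolerance.
Snapshot backups: use the snapshot API (_snapshot) to back up indices regularly to external storage such as S3 or HDFS.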
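
The replica and snapshot settings can be sketched with three small helpers. Everything named here is an assumption for illustration: the helper names, the repository name `my_backup`, the snapshot name `snapshot_1`, and the use of a shared-filesystem (`fs`) repository rather than S3 or HDFS, chosen only because it needs no plugin or cloud credentials.

```bash
#!/usr/bin/env bash
# Base URL of the cluster; localhost:9200 is assumed, as in the other examples.
ES_URL="${ES_URL:-http://localhost:9200}"

# Set the number of replica copies ($2) kept for each primary shard of index $1.
set_replicas() {
  curl -s -X PUT "$ES_URL/$1/_settings" -H 'Content-Type: application/json' \
    -d "{\"index\": {\"number_of_replicas\": $2}}"
}

# Register a shared-filesystem snapshot repository named $1 at location $2.
# The location must be listed under path.repo in elasticsearch.yml.
register_fs_repo() {
  curl -s -X PUT "$ES_URL/_snapshot/$1" -H 'Content-Type: application/json' \
    -d "{\"type\": \"fs\", \"settings\": {\"location\": \"$2\"}}"
}

# Snapshot index $3 into repository $1 under snapshot name $2 and wait for completion.
snapshot_index() {
  curl -s -X PUT "$ES_URL/_snapshot/$1/$2?wait_for_completion=true" \
    -H 'Content-Type: application/json' -d "{\"indices\": \"$3\"}"
}

# Usage (needs a running node; the backup path is an example):
#   set_replicas my_index 1
#   register_fs_repo my_backup /mnt/es_backups
#   snapshot_index my_backup snapshot_1 my_index
```

On a single-node cluster replicas cannot be allocated, so a replica count above 0 leaves the index in yellow health until a second node joins.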

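Performance tuning tips

Choose the number of primary shards (index.number_of_shards) carefully, avoiding both too many and too few; it is set when the index is created and cannot be changed in place afterwards.
Use refresh_interval (default 1s) to adjust how often the index is refreshed, trading near-real-time visibility of new documents against indexing throughput.
index.store.type: hybridfs selects a store that mixes NIO and memory-mapped file access. In current releases the default store type (fs) already resolves to hybridfs on supported systems, so setting it explicitly is rarely needed.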
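
The shard-count and refresh-interval tips can be sketched as follows. The helper names are hypothetical and the values in the usage lines (3 shards, a 30s interval, the index name `my_index_v2`) are placeholders, not recommendations. A new index name is used because the shard count of an existing index cannot be changed in place.

```bash
#!/usr/bin/env bash
# Base URL of the cluster; localhost:9200 is assumed, as in the other examples.
ES_URL="${ES_URL:-http://localhost:9200}"

# Create index $1 with $2 primary shards; the shard count is fixed once the index exists.
create_index_with_shards() {
  curl -s -X PUT "$ES_URL/$1" -H 'Content-Type: application/json' \
    -d "{\"settings\": {\"index\": {\"number_of_shards\": $2}}}"
}

# Change the refresh interval of existing index $1 to $2 (for example "30s").
set_refresh_interval() {
  curl -s -X PUT "$ES_URL/$1/_settings" -H 'Content-Type: application/json' \
    -d "{\"index\": {\"refresh_interval\": \"$2\"}}"
}

# Usage (needs a running node):
#   create_index_with_shards my_index_v2 3
#   set_refresh_interval my_index_v2 30s
```

A longer refresh interval means newly indexed documents take longer to show up in search results, so it suits bulk loads better than interactive workloads.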