Learn English, Learn Tech: Elasticsearch Thread Pools

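Word Meaning Pronunciation (IPA)
allocate to assign or distribute /ˈæləˌkeɪt/
coordination organizing parts to work together /koʊˌɔːrdɪˈneɪʃn/
deprecated obsolete, discouraged from use /ˈdɛprəˌkeɪtɪd/
elasticsearch "elastic search" (proper noun) /ɪˌlæstɪkˈsɜːrtʃ/
execute to run or carry out /ˈɛksɪˌkjuːt/
generic general-purpose /dʒəˈnɛrɪk/
initial starting, at the beginning /ɪˈnɪʃəl/
metadata data that describes other data /ˈmɛtəˌdeɪtə/
pending waiting to be processed /ˈpɛndɪŋ/
proportional in proportion to /prəˈpɔːrʃənl/
queue a waiting line of items /kjuː/
repository a storage location /rɪˈpɑːzɪˌtɔːri/
scaling growing or shrinking with demand /ˈskeɪlɪŋ/
snapshot a point-in-time copy /ˈsnæpˌʃɑːt/
synced synchronized /sɪŋkt/
throttled rate-limited /ˈθrɑːtld/
translog transaction log /ˈtrænsˌlɔːɡ/
unbounded without a limit /ʌnˈbaʊndɪd/
warm-up preparation before use /ˈwɔːrmˌʌp/
workload amount of work to be done /ˈwɜːrkˌloʊd/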

Thread pools

A node uses several thread pools to manage memory consumption. Queues associated with many of the thread pools enable pending requests to be held instead of discarded.

There are several thread pools, but the important ones include:

generic

For generic operations (for example, background node discovery). Thread pool type is scaling.

search

For count/search/suggest operations. Thread pool type is fixed_auto_queue_size with a size of int((# of allocated processors * 3) / 2) + 1, and initial queue_size of 1000.

search_throttled

For count/search/suggest/get operations on search_throttled indices. Thread pool type is fixed_auto_queue_size with a size of 1, and initial queue_size of 100.

search_coordination

For lightweight search-related coordination operations. Thread pool type is fixed with a maximum size of min(5, (# of allocated processors) / 2), and queue_size of 1000.

get

For get operations. Thread pool type is fixed with a size of # of allocated processors and a queue_size of 1000.

analyze

For analyze requests. Thread pool type is fixed with a size of 1 and a queue_size of 16.

write

For single-document index/delete/update and bulk requests. Thread pool type is fixed with a size of # of allocated processors and a queue_size of 10000. The maximum size for this pool is 1 + # of allocated processors.

snapshot

For snapshot/restore operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(5, (# of allocated processors) / 2).

snapshot_meta

For snapshot repository metadata read operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(50, (# of allocated processors * 3)).

warmer

For segment warm-up operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(5, (# of allocated processors) / 2).

refresh

For refresh operations. Thread pool type is scaling with a keep-alive of 5m and a max of min(10, (# of allocated processors) / 2).

listener

Mainly for the Java client, to execute actions when listener threaded is set to true. Thread pool type is scaling with a default max of min(10, (# of allocated processors) / 2).

fetch_shard_started

For listing shard states. Thread pool type is scaling with a keep-alive of 5m and a default maximum size of 2 * # of allocated processors.

fetch_shard_store

For listing shard stores. Thread pool type is scaling with a keep-alive of 5m and a default maximum size of 2 * # of allocated processors.

flush

For flush, synced flush, and translog fsync operations. Thread pool type is scaling with a keep-alive of 5m and a default maximum size of min(5, (# of allocated processors) / 2).

force_merge

For force merge operations. Thread pool type is fixed with a size of 1 and an unbounded queue size.

management

For cluster management. Thread pool type is scaling with a keep-alive of 5m and a default maximum size of 5.

system_read

For read operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

system_write

For write operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

system_critical_read

For critical read operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

system_critical_write

For critical write operations on system indices. Thread pool type is fixed with a default maximum size of min(5, (# of allocated processors) / 2).

watcher

For watch executions. Thread pool type is fixed with a default maximum size of min(5 * (# of allocated processors), 50) and queue_size of 1000.
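To make the sizing formulas above concrete, here is a minimal Python sketch that evaluates some of them for a given number of allocated processors. The function name and dictionary keys are made up for illustration, and the sketch assumes that "/" in the formulas means integer (floor) division; the exact rounding and lower bounds used inside Elasticsearch may differ.

```python
def default_pool_sizes(allocated_processors: int) -> dict:
    """Evaluate the sizing formulas quoted in the list above.

    Assumption: '/' is integer (floor) division. Elasticsearch itself may
    round differently or enforce a lower bound of 1 thread.
    """
    p = allocated_processors
    half = p // 2
    return {
        "search.size": (p * 3) // 2 + 1,
        "get.size": p,
        "write.size": p,
        "write.max_size": 1 + p,
        "snapshot.max": min(5, half),
        "snapshot_meta.max": min(50, p * 3),
        "warmer.max": min(5, half),
        "refresh.max": min(10, half),
        "fetch_shard_started.max": 2 * p,
        "fetch_shard_store.max": 2 * p,
        "flush.max": min(5, half),
        "watcher.size": min(5 * p, 50),
    }


if __name__ == "__main__":
    # Print the values for a node with 8 allocated processors.
    for name, value in default_pool_sizes(8).items():
        print(f"{name}: {value}")
```

With 8 allocated processors, for example, the search pool works out to 13 threads and the snapshot pool to at most 4.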

Thread pool settings are static and can be changed by editing elasticsearch.yml. To change a specific thread pool, set its type-specific parameters; for example, set thread_pool.write.size to change the number of threads in the write thread pool.
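As a rough illustration, the following Python sketch builds the line you might put into elasticsearch.yml to resize the write pool, and checks the value against the maximum stated above (1 + # of allocated processors). The helper name is hypothetical, and the sketch assumes the flat dotted key form thread_pool.write.size.

```python
def write_pool_size_setting(size: int, allocated_processors: int) -> str:
    """Return an elasticsearch.yml line that sets the write pool size.

    Hypothetical helper: it only enforces the bound quoted in the text
    (maximum size is 1 + allocated processors).
    """
    max_size = 1 + allocated_processors
    if not 1 <= size <= max_size:
        raise ValueError(
            f"write pool size must be between 1 and {max_size}, got {size}"
        )
    return f"thread_pool.write.size: {size}"


# On a node with 8 allocated processors, 9 is the largest accepted value.
print(write_pool_size_setting(9, 8))
```

Because the settings are static, a line like this only takes effect after the node is restarted.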

Thread pool types

The following are the types of thread pools and their respective parameters:

fixed

The fixed thread pool holds a fixed number of threads to handle requests, with a queue (optionally bounded) for pending requests that have no threads to service them.

The size parameter controls the number of threads.

The queue_size parameter controls the size of the queue of pending requests that have no threads to execute them. By default, it is set to -1, which means it's unbounded. When a request comes in and the queue is full, the request is rejected.
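The behaviour described above can be pictured with a small toy model. This is not how Elasticsearch implements its pools; it is a simplified, single-threaded sketch (the class and method names are invented) that only tracks counts to show when a request runs, waits, or is turned away.

```python
class FixedPoolModel:
    """Toy model of a fixed pool: `size` threads plus a pending queue."""

    def __init__(self, size: int, queue_size: int = -1):
        self.size = size
        self.queue_size = queue_size  # -1 means unbounded, as in the text
        self.running = 0
        self.queued = 0

    def submit(self) -> str:
        """Classify a new request as running, queued, or rejected."""
        if self.running < self.size:
            self.running += 1
            return "running"
        if self.queue_size == -1 or self.queued < self.queue_size:
            self.queued += 1
            return "queued"
        return "rejected"

    def complete_one(self) -> None:
        """A running task finishes; a queued task, if any, takes its thread."""
        if self.queued > 0:
            self.queued -= 1
        elif self.running > 0:
            self.running -= 1


pool = FixedPoolModel(size=1, queue_size=2)
print([pool.submit() for _ in range(4)])
# prints ['running', 'queued', 'queued', 'rejected']
```

One thread and a queue of two can absorb three requests; the fourth is rejected until a task completes and frees a slot.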

fixed_auto_queue_size

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

Deprecated in 7.7.0.

The experimental fixed_auto_queue_size thread pool type is deprecated and will be removed in 8.0.

The fixed_auto_queue_size thread pool holds a fixed number of threads to handle requests, with a bounded queue for pending requests that have no threads to service them. It's similar to the fixed thread pool, but the queue_size automatically adjusts according to calculations based on Little's Law. These calculations will potentially adjust the queue_size up or down by 50 every time auto_queue_frame_size operations have been completed.

The size parameter controls the number of threads.

The queue_size parameter controls the initial size of the queue of pending requests that have no threads to execute them.

The min_queue_size setting controls the minimum amount the queue_size can be adjusted to.

The max_queue_size setting controls the maximum amount the queue_size can be adjusted to.

The auto_queue_frame_size setting controls the number of operations during which measurement is taken before the queue is adjusted. It should be large enough that a single operation cannot unduly bias the calculation.

The target_response_time is a time value setting that indicates the targeted average response time for tasks in the thread pool queue. If tasks are routinely above this time, the thread pool queue will be adjusted down so that tasks are rejected.
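Little's Law says that the average number of items in a system (L) equals the arrival rate (λ) times the average time each item spends in it (W), so L = λ * W. The sketch below shows one way the adjustment described above could work: after a frame of operations, estimate the rate, compute the queue length that would meet target_response_time, and move the queue_size one step of 50 towards it within the min and max limits. The function and argument names are assumptions for illustration, and the real Elasticsearch logic may differ in its details.

```python
def adjust_queue_size(
    current: int,
    tasks_in_frame: int,            # plays the role of auto_queue_frame_size
    frame_seconds: float,           # time it took to complete the frame
    target_response_seconds: float, # plays the role of target_response_time
    min_queue_size: int,
    max_queue_size: int,
    step: int = 50,
) -> int:
    """Move the queue size one step towards the Little's Law estimate."""
    arrival_rate = tasks_in_frame / frame_seconds      # lambda
    desired = arrival_rate * target_response_seconds   # L = lambda * W
    if desired > current:
        candidate = current + step
    elif desired < current:
        candidate = current - step
    else:
        candidate = current
    # Keep the result inside [min_queue_size, max_queue_size].
    return max(min_queue_size, min(max_queue_size, candidate))


# 2000 tasks completed in 1 second with a 1 second target: the estimate
# (2000) is above the current size (1000), so the queue grows by one step.
print(adjust_queue_size(1000, 2000, 1.0, 1.0, 10, 2000))
```

When tasks are slow, the estimated rate drops, the desired length falls below the current size, and the queue shrinks, which is why slow pools start rejecting requests sooner.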

scaling

The scaling thread pool holds a dynamic number of threads. This number is proportional to the workload and varies between the value of the core and max parameters.

The keep_alive parameter determines how long a thread should be kept around in the thread pool without it doing any work.
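A simplified way to picture the scaling type is shown below. Both functions are illustrative models with invented names, not Elasticsearch code: the first clamps the thread count between core and max, and the second assumes that only threads above core are retired once they have been idle for keep_alive.

```python
def scaling_thread_count(active_tasks: int, core: int, max_threads: int) -> int:
    """Thread count follows the workload, clamped to [core, max_threads]."""
    return max(core, min(active_tasks, max_threads))


def should_retire(idle_seconds: float, keep_alive_seconds: float,
                  current_threads: int, core: int) -> bool:
    """Assumed rule: an idle thread is retired only when the pool is above core."""
    return current_threads > core and idle_seconds >= keep_alive_seconds


# A pool with core=1 and max=5 under rising load: 0, 3, then 20 active tasks.
print([scaling_thread_count(n, core=1, max_threads=5) for n in (0, 3, 20)])
# prints [1, 3, 5]
```

With a keep-alive of 5m (300 seconds), a thread that has been idle for 301 seconds in a pool of 3 threads would be retired under this model, while the last core thread would stay.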

Allocated processors setting

The number of processors is automatically detected, and the thread pool settings are automatically set based on it. In some cases it can be useful to override the number of detected processors. This can be done by explicitly setting the node.processors setting.

There are a few use-cases for explicitly overriding the node.processors setting:

  1. If you are running multiple instances of Elasticsearch on the same host but want each instance to size its thread pools as if it had only a fraction of the CPU, override node.processors with the desired fraction. For example, if you're running two instances of Elasticsearch on a 16-core machine, set node.processors to 8. Note that this is an expert-level use case: there's more involved than setting node.processors, including changing the number of garbage collector threads, pinning processes to cores, and so on.
  2. Sometimes the number of processors is wrongly detected. In such cases, explicitly setting node.processors works around the issue.

In order to check the number of processors detected, use the nodes info API with the os flag.
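As a sketch of what that check might look like, the snippet below reads the processor counts out of a nodes info response. It assumes the response of GET /_nodes/os contains os.available_processors and os.allocated_processors for each node; the sample JSON is a trimmed, made-up response (node ID and name are placeholders), and the helper name is hypothetical.

```python
import json

# Trimmed, hypothetical response body from GET /_nodes/os.
SAMPLE_RESPONSE = """
{
  "nodes": {
    "node-1-id": {
      "name": "node-1",
      "os": {"available_processors": 16, "allocated_processors": 8}
    }
  }
}
"""


def processors_by_node(nodes_info: dict) -> dict:
    """Map node name -> (available_processors, allocated_processors)."""
    result = {}
    for node in nodes_info.get("nodes", {}).values():
        os_info = node.get("os", {})
        result[node.get("name", "?")] = (
            os_info.get("available_processors"),
            os_info.get("allocated_processors"),
        )
    return result


print(processors_by_node(json.loads(SAMPLE_RESPONSE)))
```

In this sample the host exposes 16 processors but the node is sized for 8, which is what you would expect after setting node.processors to 8 in the two-instance scenario above.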

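Summary:

The key points of the article are:

  • Thread pools in Elasticsearch: Elasticsearch uses several thread pools to manage memory consumption. The queues attached to the pools hold pending requests.

  • Thread pools: The main thread pools include generic, search, search_throttled, search_coordination, get, analyze, write, snapshot, snapshot_meta, warmer, refresh, listener, fetch_shard_started, fetch_shard_store, flush, force_merge, management, system_read, system_write, system_critical_read, system_critical_write, and watcher.

  • Thread pool types: There are three types of thread pool:

    • fixed: uses a fixed number of threads, with an optionally bounded queue.

    • fixed_auto_queue_size: deprecated; adjusts the queue size automatically based on the workload.

    • scaling: adjusts the number of threads dynamically based on the workload.

  • Thread pool configuration: Thread pool settings are static and can be changed by editing elasticsearch.yml.

  • search thread pool: Type fixed_auto_queue_size, with a size of int((# of allocated processors * 3) / 2) + 1 and an initial queue_size of 1000.

  • write thread pool: Type fixed, with a size equal to the number of allocated processors and a queue_size of 10000.

  • Snapshot operations: Type scaling, with a keep-alive of 5m and a maximum number of threads derived from the number of processors.

  • Force merge operations: Type fixed, with a single thread and an unbounded queue.

  • watcher thread pool: Type fixed, with a size of min(5 * (# of allocated processors), 50) and a queue_size of 1000.

  • Allocated processors: The number of processors is detected automatically, but it can be overridden with the node.processors setting to tune performance or to correct a wrong detection.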
