Elasticsearch8容器化部署

介绍

Elasticsearch 是一个开源的分布式搜索和分析引擎,基于 Apache Lucene 构建。它被设计用于处理海量数据,并提供近实时的搜索和分析能力。

特性类别 核心要点 说明与应用价值
核心架构 分布式 & 高可用 数据自动分片并在集群中多节点存储,易于水平扩展,并能容忍节点故障。
近实时 (NRT) 搜索 数据写入后通常在 1 秒内即可被检索,非常适合监控和日志分析等场景。
RESTful API 通过简单的 HTTP 请求和 JSON 数据格式进行操作,极大降低了使用门槛。
核心技术 倒排索引 通过建立"词汇 -> 文档"的映射,实现了全文检索的高速查询。
强大的查询 DSL 支持全文搜索、模糊查询、范围过滤、地理位置查询等复杂查询需求。
聚合分析 提供强大的数据分组统计和计算能力,用于深度数据分析。
主要应用场景 应用程序/网站搜索 为电商平台、内容网站等提供高效、精准的搜索功能。
日志与指标分析 (ELK Stack) 与 Logstash、Kibana 等组成 ELK 技术栈,是日志集中管理和分析的经典方案。
业务智能与安全分析 通过对业务数据或安全事件数据进行聚合分析,发现趋势和异常。

核心概念速览

理解以下几个关键术语,能帮助你更好地掌握 Elasticsearch:

  • 索引:类似于关系型数据库中的"数据库",是存储相关文档的集合。
  • 文档:是信息的基本数据单元,以 JSON 格式表示,类似于数据库中的"一行记录"。
  • 节点与集群:一个节点是一个 Elasticsearch 实例,多个节点组成一个集群,共同提供服务。
  • 分片与副本:索引可以被分成多个分片,分布到不同节点上。每个分片都可以有副本,同时提供数据冗余和高可用性。

优势与挑战

  • 主要优势高性能 (尤其擅长全文检索和复杂分析)、高可扩展性 (可处理 PB 级别数据)、灵活性(支持各种数据类型)。
  • 需要注意的方面资源消耗 相对较高(内存和磁盘 I/O)。学习曲线 有一定坡度,尤其在分布式集群管理和性能调优方面。它不提供完整的事务支持(ACID),因此不适合像金融交易这类需要强一致性的场景。

生态与演进

Elasticsearch 通常与 Logstash (数据采集处理)和 Kibana (数据可视化)等组件共同构成 Elastic Stack(旧称 ELK Stack),提供一个完整的数据处理解决方案。另外,需要注意的是,在新版本中(7.x 以后),索引内的"类型"概念已被废弃,现在更推荐使用独立的索引来组织不同结构的数据。

快速开始

目录结构如下,详见

bash 复制代码
# .env.example
ELASTIC_PASSWORD=dAUl36wBVN
KIBANA_PASSWORD=CfHhxf8fzv
yml 复制代码
# docker-compose.yml
services:
  elasticsearch:
    container_name: elasticsearch
    image: elasticsearch:8.12.2
    #image: docker.cnb.cool/jinriyaojia_huigu/yaocai/devops/elasticsearch:8.12.2
    restart: unless-stopped
    user: 1000:1000
    environment:
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g -XX:+UseG1GC"
      - "TZ=Asia/Shanghai"
      # 禁用 HTTPS,启用 HTTP
      - "xpack.security.http.ssl.enabled=false"
      - "xpack.security.enabled=true"  # 保持安全启用,但使用 HTTP
      - "discovery.type=single-node"
      - "ELASTIC_PASSWORD=${ELASTIC_PASSWORD}"
    ports:
      - "19200:9200"  # Elasticsearch 默认的 HTTP 端口,用于接收 REST API 请求,比如搜索、索引文档等操作
      # - "19300:9300"  # Elasticsearch 默认的 Transport 端口,用于节点间通信和 Java 客户端的传输层连接
    volumes:
       - ./runtime/es/config:/usr/share/elasticsearch/config
       - ./runtime/es/data:/usr/share/elasticsearch/data
       - ./runtime/es/plugins:/usr/share/elasticsearch/plugins
       - ./runtime/es/logs:/usr/share/elasticsearch/logs  # 挂载日志目录
    networks:
      - elastic_net
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
      nproc:
        soft: 4096
        hard: 4096
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: '2.0'
        reservations:
          memory: 1G
          cpus: '1.0'
    logging:
      driver: json-file
      options:
        max-size: "30m"
        max-file: "10"
  kibana:
    container_name: kibana
    image: kibana:8.12.2
    #image: docker.cnb.cool/jinriyaojia_huigu/yaocai/devops/kibana:8.12.2
    restart: unless-stopped
    environment:
      - "TZ=Asia/Shanghai"
      - "I18N_LOCALE=zh-CN"
      - "ELASTICSEARCH_SSL_VERIFICATIONMODE=none"  # 开发环境禁用证书验证
      # 禁用不需要的功能
      - "XPACK_SECURITY_LOGIN_HELP_ENABLED=false"
      - "XPACK_SECURITY_PASSWORD_RESET_ENABLED=false"
      - "XPACK_SECURITY_SESSION_IDLE_TIMEOUT=1h"
      - "SERVER_HOST=0.0.0.0"
      - "SERVER_NAME=kibana"
      - "NODE_OPTIONS=--max-old-space-size=800"  # 设置 Node.js 最大堆内存为 800MB
      - "ELASTICSEARCH_HOSTS=http://elasticsearch:9200"
      # 1.1账号验证
      - "ELASTICSEARCH_USERNAME=kibana_system"
      - "ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD}"
      # 1.2服务账户令牌
      # - "ELASTICSEARCH_SERVICEACCOUNTTOKEN=AAEAAWVsYXN0aWMva2liYW5hL2tpYmFuYS10b2tlbjpWckFhYWl4U1EyNmNRMGR4OGtXYmVR"
    ports:
      - "5601:5601"
    deploy:
      resources:
        limits:
          memory: 1G
          cpus: '0.5'
        reservations:
          memory: 512M
          cpus: '0.25'
    logging:
      driver: json-file
      options:
        max-size: "30m"
        max-file: "10"
    networks:
      - elastic_net
    depends_on:
      - elasticsearch

# 网络配置
networks:
  elastic_net:
    driver: bridge

准备基础配置文件

bash 复制代码
mkdir -p runtime/es/plugins && mkdir -p runtime/es/data && mkdir -p runtime/es/config
# 修改文件权限,解决配置文件复制和IK中文分词插件安装权限问题
chown -R 1000:1000 runtime
docker run --rm -v ./runtime/es/config:/temp-config elasticsearch:8.12.2 cp -r /usr/share/elasticsearch/config/. /temp-config/
# 创建同义词文件
mkdir -p runtime/es/config/certs && touch runtime/es/config/certs/synonym.txt

启动

bash 复制代码
# 设置环境变量,修改.env文件中的变量
cp .env.example .env
cat .env
docker compose up -d

首次启动初始化配置

安装插件

安装IK中文分词插件、拼音分词插件

bash 复制代码
[root@yaocai-local-dev elasticsearch]# docker compose exec elasticsearch bash
elasticsearch@33d5877b32ba:~$ ./bin/elasticsearch-plugin install https://get.infini.cloud/elasticsearch/analysis-ik/8.12.2
-> Installing https://get.infini.cloud/elasticsearch/analysis-ik/8.12.2
-> Downloading https://get.infini.cloud/elasticsearch/analysis-ik/8.12.2
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@     WARNING: plugin requires additional permissions     @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
* java.net.SocketPermission * connect,resolve
See https://docs.oracle.com/javase/8/docs/technotes/guides/security/permissions.html
for descriptions of what these permissions allow and the associated risks.

Continue with installation? [y/N]y
-> Installed analysis-ik
-> Please restart Elasticsearch to activate any plugins installed
elasticsearch@33d5877b32ba:~$ 
elasticsearch@33d5877b32ba:~$ 
elasticsearch@33d5877b32ba:~$ ./bin/elasticsearch-plugin install https://get.infini.cloud/elasticsearch/analysis-pinyin/8.12.2
-> Installing https://get.infini.cloud/elasticsearch/analysis-pinyin/8.12.2
-> Downloading https://get.infini.cloud/elasticsearch/analysis-pinyin/8.12.2
-> Installed analysis-pinyin
-> Please restart Elasticsearch to activate any plugins installed
elasticsearch@33d5877b32ba:~$ exit
exit
[root@yaocai-local-dev elasticsearch]# 

配置Kibana 连接 Elasticsearch 的身份验证

账号或token均可

1.1创建kibana需要的es账号

设置docker-compose中指的es内置账号(kibana_system)用于kibana访问es

密码:CfHhxf8fzv,修改密码需同步修改docker-compose.yml中kibana指定的账号密码

bash 复制代码
[root@VM-4-10-centos elasticsearch]# docker compose exec elasticsearch bash
elasticsearch@0a5d8d64466e:~$ elasticsearch-reset-password -i -u kibana_system
This tool will reset the password of the [kibana_system] user.
You will be prompted to enter the password.
Please confirm that you would like to continue [y/N]y


Enter password for [kibana_system]: 
Re-enter password for [kibana_system]: 
Password for the [kibana_system] user successfully reset.
elasticsearch@0a5d8d64466e:~$ 

创建账号示例

bash 复制代码
[root@yaocai-local-dev elasticsearch]# docker compose exec elasticsearch bash
elasticsearch@33d5877b32ba:~$ 
elasticsearch@33d5877b32ba:~$ elasticsearch-users useradd elastic-test -p 123456 -r kibana_system
WARNING: Group of file [/usr/share/elasticsearch/config/users] used to be [root], but now is [elasticsearch]
WARNING: Group of file [/usr/share/elasticsearch/config/users_roles] used to be [root], but now is [elasticsearch]
elasticsearch@33d5877b32ba:~$ exit
exit
[root@yaocai-local-dev elasticsearch]#
1.2创建 Kibana 服务账户令牌

创建docker-compose.yml中kibana指定的账户

bash 复制代码
PS D:\workspace\devops\elasticsearch> docker compose exec -it elasticsearch elasticsearch-service-tokens create elastic/kibana kibana-token
SERVICE_TOKEN elastic/kibana/kibana-token = AAEAAWVsYXN0aWMva2liYW5hL2tpYmFuYS10b2tlbjpWckFhYWl4U1EyNmNRMGR4OGtXYmVR
PS D:\workspace\devops\elasticsearch>

配置完成后重启es

bash 复制代码
docker compost researt elasticsearch

访问服务

ES

bash 复制代码
http://127.0.0.1:19200

kibana

bash 复制代码
http://127.0.0.1:5601
  • 开发工具控制台
  • 索引管理

操作示例

kibana创建索引

开发工具控制台执行

bash 复制代码
# 为所有 .kibana*索引(如 .kibana, .kibana_1等)统一应用配置,解决磁盘空间问题:通过减少分片和副本数量(1主分片+0副本),显著降低存储开销
PUT _index_template/.kibana
{
  "index_patterns": [".kibana*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 0,
      "auto_expand_replicas": false
    }
  },
  "priority": 1
}


# 新建索引
PUT /yaocai_drug
{
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "analysis": {
            "filter": {  
              "synonym_filter": {  
                "type": "synonym",
                "synonyms_path": "certs/synonym.txt"  
              }
            }, 
            "analyzer": {
                "pinyin_analyzer": {
                    "tokenizer": "my_pinyin"
                },
                "synonym_analyzer": {  
                  "tokenizer": "standard",  
                  "filter": ["lowercase", "synonym_filter"]  
                }
            },
            "tokenizer": {
                "my_pinyin": {
                    "type": "pinyin",
                    "keep_first_letter": false,
                    "keep_separate_first_letter": true,
                    "keep_full_pinyin": true,
                    "keep_original": false,
                    "keep_none_chinese_together": true,
                    "limit_first_letter_length": 16,
                    "lowercase": true,
                    "remove_duplicated_term": true
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "update_time": {
                "type": "date",
                "format": "strict_date_optional_time||epoch_millis",
                "store": true
            },
            "is_delete": {
                "type": "boolean",
                "store": true
            },
            "name": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    },
                    "pinyin": {
                        "type": "text",
                        "store": false,
                        "term_vector": "with_offsets",
                        "analyzer": "pinyin_analyzer"
                    }
                }
            },
            "trade_name": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    },
                    "pinyin": {
                        "type": "text",
                        "store": false,
                        "term_vector": "with_offsets",
                        "analyzer": "pinyin_analyzer"
                    }
                }
            },
            "alias_name": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    },
                    "pinyin": {
                        "type": "text",
                        "store": false,
                        "term_vector": "with_offsets",
                        "analyzer": "pinyin_analyzer"
                    }
                }
            },
            "specification": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    },
                    "pinyin": {
                        "type": "text",
                        "store": false,
                        "term_vector": "with_offsets",
                        "analyzer": "pinyin_analyzer"
                    }
                }
            },
            "price": {
                "type": "float",
                "store": true
            },
            "factory": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    },
                    "pinyin": {
                        "type": "text",
                        "store": false,
                        "term_vector": "with_offsets",
                        "analyzer": "pinyin_analyzer"
                    }
                }
            },
            "barcode": {
                "type": "keyword",
                "store": true
            },
            "classify_id": {
                "type": "keyword",
                "store": true
            },
            "status": {
                "type": "integer",
                "store": true
            }
        }
    }
}


# 删除索引
DELETE /yaocai_drug


# 查询文档总数
GET /yaocai_drug/_count


# 查询示例
GET /yaocai_drug/_search
{  
  "query": {  
    "bool": {  
      "should": [  
        {  
          "match_phrase": {  
            "name": {  
              "query": "感冒颗粒",  
              "boost": 10,
              "slop": 15   
            }  
          }  
        },  
        {
          "match_phrase": {  
            "factory": {  
              "query": "感冒颗粒",  
              "boost": 1,
              "slop": 15   
            }  
          }  
        },
        {  
          "match_phrase": {  
            "factory.pinyin": {  
              "query": "感冒颗粒",  
              "boost": 1,
              "slop": 15   
            }  
          }  
        },
        {  
          "match_phrase": {  
            "name.pinyin": {  
              "query": "感冒颗粒",  
              "boost": 1,
              "slop": 15   
            }  
          }  
        },
        {  
          "match_phrase": {  
            "brand": {  
              "query": "感冒颗粒",  
              "boost": 1,
              "slop": 15   
            }  
          }  
        }
      ],  
      "minimum_should_match": 1 
    }  
  },
  "aggs": {  
    "group_by_name_keyword": {  
      "terms": {  
        "field": "name.keyword",  
        "size": 100 
      }  
    },
    "group_by_factory_keyword": {  
      "terms": {  
        "field": "factory.keyword",  
        "size": 100  
      }  
    },
    "group_by_specification_keyword": {  
      "terms": {  
        "field": "specification.keyword",  
        "size": 100
      }  
    }
  }
}

数据同步(logstash)

参考

全量

bash 复制代码
/usr/share/logstash/bin/logstash -f all_yaocai_drug.conf
config 复制代码
# all_yaocai_drug.conf
# pipeline.ecs_compatibility: disabled
input {
  # 全量同步
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://192.168.10.230:5432/yaocai_dev"
    jdbc_user => "yaocai_dev"
    jdbc_password => "hEFWbfsEkzEEMcj6"
    jdbc_driver_library => "/root/logstash/postgresql-42.7.3.jar"
    jdbc_driver_class => "org.postgresql.Driver"  
    jdbc_page_size => 1000
    jdbc_fetch_size => 1000

    statement => "SELECT id,update_time,is_delete,name,specification,price,factory,classify_id,barcode,status,is_otc,in_medical_insurance,medical_insurance_code,brand,unit,approval,business_type_text,images,upc,trade_name,alias_name FROM yaocai_drug where is_delete = false order by id asc;"

    # 重复执行的周期,每分钟执行1次
    #schedule => "* * * * *"
    
    # 是否将字段名小写化
    #lowercase_column_names => false
      
    # 增量同步的参数
    #use_column_value => true
    #tracking_column => "id"
    #record_last_run => true
    #last_run_metadata_path => "/root/logstash/your_last_run_metadata_file"

    # 启用调试日志
    #jdbc_validate_connection => true
    #jdbc_pagination_enabled => false
  }
}

# 规格添加一个新的格式,数字与英文分隔
filter {
  if [specification] {
    ruby {
      code => "
        spec = event.get('specification')
        # 使用正则表达式将所有的数字和字母之间插入空格
        spec_modified = spec.gsub(/(\d+(\.\d+)?)([a-zA-Z*]+)/, '\\1 \\3')
        
        event.set('specification2', spec_modified)
      "
    }
  }
}

output {
  elasticsearch {
    hosts => ["http://127.0.0.1:19200"]
    index => "yaocai_drug"

    document_id => "%{id}"
    
    manage_template => false

    user => "elastic"
    password => "dAUl36wBVN"
  }

  # 可以添加stdout输出用于调试
  #stdout { codec => rubydebug }
}

增量

config 复制代码
# pipeline.ecs_compatibility: disabled
input {
  # 增量同步
  jdbc {  
    jdbc_connection_string => "jdbc:postgresql://192.168.10.230:5432/yaocai_dev2"   
    jdbc_user => "yaocai_dev"  
    jdbc_password => "hEFWbfsEkzEEMcj6"  
    jdbc_driver_library => "/root/logstash/postgresql-42.7.3.jar"  
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_page_size => 1000
    jdbc_fetch_size => 1000

    statement => "SELECT id,update_time,is_delete,name,specification,price,factory,classify_id,barcode,status,is_otc,in_medical_insurance,medical_insurance_code,brand,unit,approval,business_type_text,images,upc,trade_name,alias_name FROM yaocai_drug where update_time > :sql_last_value order by update_time asc;"

    # 重复执行的周期,每分钟执行1次
    schedule => "*/3 * * * *"  
    
    # 是否将字段名小写化
    #lowercase_column_names => false  
      
    # 增量同步的参数  
    use_column_value => true  
    tracking_column => "update_time"  
    tracking_column_type => "timestamp"
    record_last_run => true  
    last_run_metadata_path => "/root/logstash/your_last_run_metadata_file"  

    # 启用调试日志  
    #jdbc_validate_connection => true  
    #jdbc_pagination_enabled => false  
  }  
}

# 规格添加一个新的格式,数字与英文分隔
filter {  
  if [specification] {
    ruby {  
      code => "
        spec = event.get('specification')
        # 使用正则表达式将所有的数字和字母之间插入空格
        spec_modified = spec.gsub(/(\d+(\.\d+)?)([a-zA-Z*]+)/, '\\1 \\3')
        
        event.set('specification2', spec_modified)  
      "  
    }  
  }  
}

output {  
  elasticsearch {  
    hosts => ["http://127.0.0.1:19200"]  
    index => "yaocai_drug"  

    document_id => "%{id}"  
    
    manage_template => false

    user => "elastic"
    password => "dAUl36wBVN"
  }

  # 可以添加stdout输出用于调试  
  #stdout { codec => rubydebug }  
}
相关推荐
1024find5 小时前
Docker安装(Centos/Ubuntu/UOS)
ubuntu·docker·centos·debian
cdg==吃蛋糕8 小时前
solr cloud集群搭建 solr5+zookeeper
tcp/ip·docker·容器
段振轩9 小时前
Docker nginx容器部署前端项目。
前端·nginx·docker
conkl9 小时前
在ARM64 Ubuntu 20.04上部署Mailu邮件系统:深度排查Docker Bridge网络通信失败问题
linux·ubuntu·docker
云游11 小时前
IP地址管理:docker方式部署phpIPAMv1.7.3
运维·docker·ip·ipv4·ipv6
小闫BI设源码12 小时前
Docker Swarm主机编排
运维·docker·容器·容器编排·docker compose·依赖管理·多服务启动
Reicher12 小时前
Docker的介绍和使用
运维·docker·容器
退役小学生呀13 小时前
二十一、DevOps:从零建设基于K8s的DevOps平台(二)
运维·docker·云原生·容器·kubernetes·devops
小样还想跑14 小时前
UniApp ConnectSocket连接websocket
websocket·elasticsearch·uni-app