Big Data 178 - Elasticsearch Query: Java API Index Operations & Document Operations

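Progress of the series so far:

Hadoop (complete)
HDFS (complete)
MapReduce (complete)
Hive (complete)
Flume (complete)
Sqoop (complete)
Zookeeper (complete)
HBase (complete)
Redis (complete)
Kafka (complete)
Spark (complete)
Flink (complete)
ClickHouse (complete)
Kudu (complete)
Druid (complete)
Kylin (complete)
Elasticsearch (in progress...)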

Chapter contents

In the previous section we covered:

Aggregation analysis
Metric aggregations
Bucket aggregations

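Index operations

Creating an index: Creating an index is the first step toward storing data. In Elasticsearch, an index is roughly comparable to a table in a relational database. When creating an index you can specify a mapping that defines the field types (text, keyword, date, geo_point, and so on). Through the Java API you can pass both the index settings and the mapping to define the structure of the index.
Getting index information: The Java API can return details about an existing index, such as its metadata, field mappings, number of shards, and number of replicas. This helps when analyzing and tuning index performance.
Checking whether an index exists: It is common to check for an index before acting on it, for example to make sure the index has been created before inserting data, or to confirm it exists before deleting it.
Deleting an index: Deleting indices that are no longer needed frees disk space. Be careful: deleting an index removes all of its data and cannot be undone, so it is usually advisable to take a backup (for example a snapshot) first.
Updating index settings: As the cluster or the data grows, you may need to tune an existing index. The number of replicas is a dynamic setting and can be changed at any time through the Java API. The number of primary shards is fixed when the index is created; changing it means building a new index, for example with the split or shrink APIs or by reindexing.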
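As a rough sketch of the settings update described above: the snippet below only builds the equivalent REST call (`PUT /{index}/_settings`) with the JDK's `java.net.http` types rather than the High Level REST Client, so it runs without a cluster or extra dependencies. The class name `ReplicaSettings`, its helper methods, and the choice of 2 replicas are illustrative assumptions, not code from this project.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class ReplicaSettings {

    // JSON body for PUT /{index}/_settings; number_of_replicas is a dynamic setting.
    static String body(int replicas) {
        if (replicas < 0) {
            throw new IllegalArgumentException("replicas must be >= 0");
        }
        return "{\"index\":{\"number_of_replicas\":" + replicas + "}}";
    }

    // Builds (but does not send) the settings update request.
    static HttpRequest request(String baseUrl, String index, int replicas) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index + "/_settings"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(body(replicas)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = request("http://h121.wzk.icu:9200", "wzk-icu-es-2", 2);
        System.out.println(req.method() + " " + req.uri());
        System.out.println(body(2));
    }
}
```

Sending the request is left out on purpose; the point is the shape of the call, which any HTTP client or the Elasticsearch client library can issue.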

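Document operations

Inserting a document: A document is the smallest unit of stored data in Elasticsearch, similar to a row in a relational database. Each document is stored in an index as JSON. Through the Java API you can insert a single document into a given index and specify its ID (if you do not, Elasticsearch generates one automatically).
Getting a document: The Java API can fetch a single document from an index by its ID. The response includes the document's metadata, such as _id, _index, and _version. This operation is typically used to look up and display one specific record.
Updating a document: When a document is updated, Elasticsearch does not modify the original in place; it writes a new version of the document. The Java API supports partial updates, which change only some fields of a document without resubmitting the whole thing.
Deleting a document: Deletion is also based on the document ID and can be done through the Java API. Documents can also be deleted in bulk by query condition (delete by query).
Bulk operations: When handling large numbers of documents, the Bulk API matters a great deal. The Java API can insert, update, and delete documents in batches, which makes large-scale data processing more efficient. Bulk operations are typically used for data migration, batch updates, or syncing data from other systems into Elasticsearch.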
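To make the bulk format concrete, here is a minimal sketch of the request body that `POST /_bulk` expects: newline-delimited JSON with one action line followed by one source line per document, ending in a newline. The class `BulkBody` is hypothetical, and it assumes that index names and IDs need no JSON escaping and that each source is already valid single-line JSON. With the High Level REST Client you would normally add `IndexRequest` objects to a `BulkRequest` instead of building this string by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class BulkBody {

    // Builds an NDJSON body for POST /_bulk: an "index" action line plus a source line per document.
    // Assumes index and IDs need no escaping and every source is single-line JSON.
    static String build(String index, Map<String, String> sourcesById) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : sourcesById.entrySet()) {
            sb.append("{\"index\":{\"_index\":\"").append(index)
              .append("\",\"_id\":\"").append(e.getKey()).append("\"}}\n");
            // Every line, including the last one, must end with a newline.
            sb.append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>(); // keeps insertion order
        docs.put("1", "{\"name\":\"spark\"}");
        docs.put("2", "{\"name\":\"flink\"}");
        System.out.print(build("wzk-icu-es-2", docs));
    }
}
```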

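Project setup

Create a new Maven project in IDEA to start working with Elasticsearch.

The steps are the same as in earlier articles, so they are skipped here; create the project yourself.

Importing dependencies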

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>org.example</groupId>
    <artifactId>study-es</artifactId>
    <version>1.0-SNAPSHOT</version>

    <properties>
        <maven.compiler.source>17</maven.compiler.source>
        <maven.compiler.target>17</maven.compiler.target>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.3.0</version>
            <exclusions>
                <exclusion>
                    <groupId>org.elasticsearch</groupId>
                    <artifactId>elasticsearch</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>7.3.0</version>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
            <version>4.12</version>
            <scope>compile</scope>
        </dependency>
        <dependency>
            <groupId>org.testng</groupId>
            <artifactId>testng</artifactId>
            <version>6.14.3</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.11.1</version>
        </dependency>
    </dependencies>
</project>

Configuration file

Create log4j2.xml under the resources directory:

<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="WARN">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
      <PatternLayout pattern="%d{yyyy-MM-dd HH:mm:ss} [%t] %-5p %c{1}:%L - %msg%n" />
    </Console>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="Console" />
    </Root>
  </Loggers>
</Configuration>
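A note on the `%d{...}` date pattern: Log4j2 accepts SimpleDateFormat-style letters, where `MM` is the month and `mm` is the minute, so the two are easy to mix up. The small demo below uses `java.time` (which uses the same letters for these fields) to show the difference; the class name `DatePatternDemo` and the sample timestamp are made up for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class DatePatternDemo {

    // Formats a timestamp with the given pattern.
    static String format(String pattern, LocalDateTime t) {
        return DateTimeFormatter.ofPattern(pattern).format(t);
    }

    public static void main(String[] args) {
        LocalDateTime t = LocalDateTime.of(2024, 3, 15, 10, 42, 7);
        // MM = month: prints 2024-03-15 10:42:07
        System.out.println(format("yyyy-MM-dd HH:mm:ss", t));
        // mm = minute, so the "month" slot shows 42: prints 2024-42-15 10:42:07
        System.out.println(format("yyyy-mm-dd HH:mm:ss", t));
    }
}
```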

Creating the client

package icu.wzk;


import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.After;
import org.junit.Before;

import java.io.IOException;

public class ElasticsearchTest {

    RestHighLevelClient client;

    @Before
    public void init() throws Exception {
        RestClientBuilder builder = RestClient.builder(
                new HttpHost("h121.wzk.icu", 9200, "http"),
                new HttpHost("h122.wzk.icu", 9200, "http"),
                new HttpHost("h123.wzk.icu", 9200, "http")
        );
        final RestHighLevelClient highLevelClient = new RestHighLevelClient(builder);
        System.out.println(highLevelClient.cluster().toString());
        client = highLevelClient;
    }

    @After
    public void destroy() throws IOException {
        if (null != client) {
            client.close();
        }
    }

}

Index operations

Creating an index

JSON style

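// Imports used here: org.junit.Test, org.elasticsearch.client.RequestOptions,
// org.elasticsearch.client.indices.CreateIndexRequest / CreateIndexResponse,
// org.elasticsearch.common.xcontent.XContentType
@Test
public void createIndex() throws Exception {
    final CreateIndexRequest indexRequest = new CreateIndexRequest("wzk-icu-es-test");
    // Settings and mapping as a JSON string.
    // Note: the ik_max_word analyzer requires the IK analysis plugin to be installed.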
    String mapping = "{\n" +
            "  \"settings\": {},\n" +
            "  \"mappings\": {\n" +
            "    \"properties\": {\n" +
            "      \"description\": {\n" +
            "        \"type\": \"text\",\n" +
            "        \"analyzer\": \"ik_max_word\"\n" +
            "      },\n" +
            "      \"name\": {\n" +
            "        \"type\": \"text\"\n" +
            "      },\n" +
            "      \"pic\": {\n" +
            "        \"type\": \"text\",\n" +
            "        \"index\": false\n" +
            "      },\n" +
            "      \"studymodel\": {\n" +
            "        \"type\": \"text\"\n" +
            "      }\n" +
            "    }\n" +
            "  }\n" +
            "}";
    indexRequest.source(mapping, XContentType.JSON);
    // Create the index
    CreateIndexResponse indexResponse = client.indices().create(indexRequest, RequestOptions.DEFAULT);
    boolean acknowledged = indexResponse.isAcknowledged();
    System.out.println("Create result: " + acknowledged);
}

The result is shown in the figure below; the index was created successfully.

The Elasticsearch-Head tool shows the following:
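The escaped, concatenated JSON string in `createIndex()` is hard to read. Since the project targets Java 17, one option is a text block (Java 15+), which holds the same JSON without `\"` or `\n`. This is only a readability sketch; the class name `MappingTextBlock` is an assumption, and the JSON content is the same as in the test above.

```java
final class MappingTextBlock {

    // Same settings/mappings JSON as in createIndex(), written as a text block.
    static final String MAPPING = """
            {
              "settings": {},
              "mappings": {
                "properties": {
                  "description": { "type": "text", "analyzer": "ik_max_word" },
                  "name":        { "type": "text" },
                  "pic":         { "type": "text", "index": false },
                  "studymodel":  { "type": "text" }
                }
              }
            }
            """;

    public static void main(String[] args) {
        System.out.print(MAPPING);
    }
}
```

The resulting string can be passed to `indexRequest.source(mapping, XContentType.JSON)` in the same way as the concatenated version.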

Object style

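// Additional imports used here: org.elasticsearch.common.settings.Settings,
// org.elasticsearch.common.xcontent.XContentBuilder, org.elasticsearch.common.xcontent.XContentFactory
@Test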
public void createIndex2() throws Exception {
    CreateIndexRequest createIndexRequest = new CreateIndexRequest("wzk-icu-es-2");
    createIndexRequest.settings(Settings
            .builder()
                    .put("index.number_of_shards", 5)
                    .put("index.number_of_replicas", 1)
            .build());
    // Build the mapping
    XContentBuilder xContentBuilder = XContentFactory.jsonBuilder();
    xContentBuilder.startObject();
    xContentBuilder.startObject("properties");
    xContentBuilder.startObject("description")
            .field("type", "text")
            .field("analyzer", "ik_max_word")
            .endObject();
    xContentBuilder.startObject("name")
            .field("type", "text")
            .endObject();
    xContentBuilder.startObject("pic")
            .field("type", "text")
            .field("index", false)
            .endObject();
    xContentBuilder.startObject("studymodel")
            .field("type", "text")
            .endObject();
    xContentBuilder.endObject();
    xContentBuilder.endObject();

    // Attach the mapping to the request
    createIndexRequest.mapping(xContentBuilder);
    final CreateIndexResponse createIndexResponse = client
            .indices()
            .create(createIndexRequest, RequestOptions.DEFAULT);
    boolean acknowledged = createIndexResponse.isAcknowledged();
    System.out.println("Create result 2: " + acknowledged);
}

The result is shown in the figure below:

In Elasticsearch-Head you can see the index that was just created and how its shards are distributed:
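A quick check on the numbers: with `index.number_of_shards` set to 5 and `index.number_of_replicas` set to 1, the index has 5 primary shards plus one replica of each, so 10 shard copies in total once every replica is assigned. The helper below only illustrates that arithmetic; `ShardMath` is a made-up name.

```java
final class ShardMath {

    // Total shard copies = primaries * (1 + replicas per primary).
    static int totalShardCopies(int primaries, int replicasPerPrimary) {
        if (primaries < 1 || replicasPerPrimary < 0) {
            throw new IllegalArgumentException("primaries >= 1 and replicas >= 0 required");
        }
        return primaries * (1 + replicasPerPrimary);
    }

    public static void main(String[] args) {
        // Settings used in createIndex2(): 5 primaries, 1 replica each.
        System.out.println(totalShardCopies(5, 1)); // prints 10
    }
}
```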

Deleting an index

// Imports used here: org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest,
// org.elasticsearch.action.support.master.AcknowledgedResponse
@Test
public void deleteIndex() throws Exception {
    DeleteIndexRequest deleteRequest = new DeleteIndexRequest("wzk-icu-es-test");
    AcknowledgedResponse deleteResponse = client
            .indices()
            .delete(deleteRequest, RequestOptions.DEFAULT);
    boolean acknowledged = deleteResponse.isAcknowledged();
    System.out.println("Index deleted: " + acknowledged);
}

The result is shown in the figure below:

In Elasticsearch-Head you can see that the index has been removed:
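Deleting an index that does not exist fails with an error, so an existence check beforehand is often useful. At the REST level this is `HEAD /{index}`, which answers 200 if the index exists and 404 if it does not. The sketch below builds that request with the JDK HTTP types and interprets the status code; `IndexExistsCheck` and its methods are hypothetical names. With the High Level REST Client the same check is usually done through `client.indices().exists(...)` with a `GetIndexRequest`.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class IndexExistsCheck {

    // Builds (but does not send) HEAD /{index}.
    static HttpRequest request(String baseUrl, String index) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/" + index))
                .method("HEAD", HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // Interprets the status code of the HEAD response.
    static boolean exists(int statusCode) {
        if (statusCode == 200) {
            return true;
        }
        if (statusCode == 404) {
            return false;
        }
        throw new IllegalStateException("unexpected status " + statusCode);
    }

    public static void main(String[] args) {
        HttpRequest req = request("http://h121.wzk.icu:9200", "wzk-icu-es-test");
        System.out.println(req.method() + " " + req.uri());
        System.out.println(exists(404)); // false: nothing to delete
    }
}
```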

Document operations

Adding a document

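// Imports used here: org.elasticsearch.action.index.IndexRequest,
// org.elasticsearch.action.index.IndexResponse
@Test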
public void addDoc() throws Exception {
    IndexRequest indexRequest = new IndexRequest("wzk-icu-es-2").id("1");
    String str = " {\n" +
            " \"name\": \"spark添加文档\",\n" +
            " \"description\": \"spark技术栈\",\n" +
            " \"studymodel\":\"online\",\n" +
            " \"pic\": \"http://www.baidu.com\"\n" +
            " }";
    indexRequest.source(str, XContentType.JSON);
    // Index (insert) the document
    IndexResponse index = client.index(indexRequest, RequestOptions.DEFAULT);
    System.out.println("Index result: " + index.status());
}

The result of running the code is shown in the figure below:

Getting a document

// Imports used here: java.util.Map, org.elasticsearch.action.get.GetRequest,
// org.elasticsearch.action.get.GetResponse
@Test
public void getDoc() throws Exception {
    GetRequest getRequest = new GetRequest("wzk-icu-es-2");
    getRequest.id("1");
    GetResponse getResponse = client.get(getRequest, RequestOptions.DEFAULT);
    Map<String, Object> sourceMap = getResponse.getSourceAsMap();
    System.out.println("Get result: " + sourceMap);
}

The result is shown in the figure below:

Querying all documents

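// Imports used here: org.elasticsearch.action.search.SearchRequest / SearchResponse,
// org.elasticsearch.index.query.QueryBuilders, org.elasticsearch.rest.RestStatus,
// org.elasticsearch.search.SearchHit, org.elasticsearch.search.SearchHits,
// org.elasticsearch.search.builder.SearchSourceBuilder
@Test
public void getAllDoc() throws Exception {
    SearchRequest searchRequest = new SearchRequest();
    // Target index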
    searchRequest.indices("wzk-icu-es-2");
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
    sourceBuilder.query(QueryBuilders.matchAllQuery());
    searchRequest.source(sourceBuilder);

    SearchResponse searchResponse = client.search(searchRequest, RequestOptions.DEFAULT);
    RestStatus status = searchResponse.status();
    System.out.println("Search status: " + status);
    SearchHits hits = searchResponse.getHits();
    SearchHit[] hits1 = hits.getHits();
    for (SearchHit sh : hits1) {
        System.out.println("---");
        Map<String, Object> map = sh.getSourceAsMap();
        System.out.println("Hit: " + map);
    }
}

The result is shown in the figure below:
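The overview at the start also mentions partial updates and delete by query, which the tests above do not exercise. As a minimal sketch of what those two calls look like at the REST level in Elasticsearch 7.x: a partial update is `POST /{index}/_update/{id}` with the changed fields under `doc`, and delete by query is `POST /{index}/_delete_by_query` with a query body. The class `DocRequests` and its helpers are hypothetical, and they assume field names and values that need no JSON escaping.

```java
final class DocRequests {

    // Path of the typeless update endpoint.
    static String updatePath(String index, String id) {
        return "/" + index + "/_update/" + id;
    }

    // Partial update body: only the fields under "doc" are merged into the stored document.
    static String partialUpdateBody(String field, String value) {
        return "{\"doc\":{\"" + field + "\":\"" + value + "\"}}";
    }

    // Path of the delete-by-query endpoint.
    static String deleteByQueryPath(String index) {
        return "/" + index + "/_delete_by_query";
    }

    // Delete-by-query body using a match query on one field.
    static String deleteByQueryBody(String field, String value) {
        return "{\"query\":{\"match\":{\"" + field + "\":\"" + value + "\"}}}";
    }

    public static void main(String[] args) {
        System.out.println("POST " + updatePath("wzk-icu-es-2", "1"));
        System.out.println(partialUpdateBody("studymodel", "offline"));
        System.out.println("POST " + deleteByQueryPath("wzk-icu-es-2"));
        System.out.println(deleteByQueryBody("studymodel", "online"));
    }
}
```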