technology-note/java/springboot系列/数据库/3.springboot整合elasticsearch.md

---
id: "2019-07-29-14-59"
date: "2019/07/29 14:59:00"
title: "springboot整合elasticsearch(基于es7.2和官方high level client)"
tag: ["java", "spring-boot", "elasticsearch", "es"]
categories:
  - "java"
  - "spring boot学习"
---

# 前言

最近写的一个个人项目(传送门：[全终端云书签](https://github.com/FleyX/bookmark))中需要用到全文检索功能，目前 mysql，es 都可以做全文检索，mysql 胜在配置方便很快就能搞定上线([参考这里](https://www.tapme.top/blog/detail/2019-03-27-09-48)),不考虑上手难度，es 在全文检索方面是完胜 mysql 的。

最后决定使用 es。使用最新的 7.2 版本。java 客户端使用 es 官方的 high level client([官方文档](https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-high-getting-started.html)),为什么用这个有以下几点原因：

- jest 毕竟不是官方的，更新速度较慢
- transportClient,速度太慢，连官方都嫌弃它了。在 7.x 中已经被弃用，8.x 中将完全删除
- high level client 的官方文档写的很清楚明了，虽然目前相关的中文资料还很少，也能够上手用起来

本文主要内容如下：

- docker 部署 es(支持 ik 中文分词)
- 在 springboot 中进行增删改查

<!-- more -->

# docker 部署 es(基于 linux)

es 的中文分词目前比较流行的分词插件为 ik([github 地址](https://github.com/medcl/elasticsearch-analysis-ik))。由于手写 docker 命令太繁杂，这里用 docker-compose 来管理。假定当前在/root 目录下

1.  下载 ik release 到/root/es/ik 目录下，并解压到当前文件夹。
2.  创建/root/es/data 目录，并将读写权限给所有用户.本目录用于存放 es 数据。由于 es 不能以 root 用户执行，所以对于此目录需要将读写权限给其他用户。
3.  编写 es 配置文件，7.2 的配置文件变化还是较大的（之前用的是 2.x 版本），一个简单的配置如下：

    ```yml
    cluster.name: elasticsearch

     # 配置的集群名称，默认是 elasticsearch，es 服务会通过广播方式自动连接在同一网段下的 es 服务，通过多播方式进行通信，同一网段下可以有多个集群，通过集群名称这个属性来区分不同的集群。

     node.name: bookmark-world

     # 当前配置所在机器的节点名，你不设置就默认随机指定一个 name 列表中名字，该 name 列表在 es 的 jar 包中 config 文件夹里 name.txt 文件中，其中有很多作者添加的有趣名字。

     node.master: true

     # 指定该节点是否有资格被选举成为 node（注意这里只是设置成有资格， 不代表该 node 一定就是 master），默认是 true，es 是默认集群中的第一台机器为 master，如果这台机挂了就会重新选举 master。

     node.data: true

     # 指定该节点是否存储索引数据，默认为 true。

     bootstrap.memory_lock: false

     # 设置为 true 来锁住内存不进行 swapping。因为当 jvm 开始 swapping 时 es 的效率 会降低，所以要保证它不 swap，可以把 ES_MIN_MEM 和 ES_MAX_MEM 两个环境变量设置成同一个值，并且保证机器有足够的内存分配给 es。 同时也要允许 elasticsearch 的进程可以锁住内存，linux 下启动 es 之前可以通过`ulimit -l unlimited`命令设置。
     # 设置为 true，会导致报警告实际未锁定内存，进而退出进程(es在生产模式下有警告就会退出)

     network.bind_host: 0.0.0.0

     # 设置绑定的 ip 地址，可以是 ipv4 或 ipv6 的，默认为 0.0.0.0，绑定这台机器的任何一个 ip。
     # 集群配置
     discovery.seed_hosts:
       - bookmark-es
     cluster.initial_master_nodes:
       - bookmark-world

    ```

4.  编写/root/docker-compose.yml

    ```yml
    version: "2"
    services:
      bookmark-es:
      image: docker.elastic.co/elasticsearch/elasticsearch:7.2.0
    container_name: bookmark-es
    volumes:
      - /etc/localtime:/etc/localtime
      - ./es/data:/usr/share/elasticsearch/data
      - ./es/ik:/usr/share/elasticsearch/plugins/ik
      - ./es/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
    ports:
      - 9200:9200
      - 9300:9300
    ```

5.  执行 `docker-compose up -d`启动 es

详细可参考这里：[云书签 docker 部署](https://github.com/FleyX/bookmark/tree/master/bookMarkDocker)。

# springboot 整合

## 创建 springboot 项目

首先创建一个 springboot 项目，然后引入`high level client`的依赖，pom 文件如下：

```xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.1.6.RELEASE</version>
        <relativePath/> <!-- lookup parent from repository -->
    </parent>
    <groupId>com.fanxb</groupId>
    <artifactId>es-demo</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>es-demo</name>
    <description>Elasticsearch Demo project for Spring Boot</description>

    <properties>
        <java.version>1.8</java.version>
    </properties>

    <!--注意：如果使用了parent那么需要在此定义es版本号,因为spring-boot-start-parent中已经定义了es相关依赖的版本号
    ，high-level-client中的部分依赖会被覆盖成低版本的,导出出现莫名其妙的错误-->
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>org.elasticsearch.client</groupId>
                <artifactId>elasticsearch-rest-high-level-client</artifactId>
                <version>7.2.0</version>
            </dependency>
            <!-- https://mvnrepository.com/artifact/org.elasticsearch/elasticsearch -->
            <dependency>
                <groupId>org.elasticsearch</groupId>
                <artifactId>elasticsearch</artifactId>
                <version>7.2.0</version>
            </dependency>
            <!--&lt;!&ndash; https://mvnrepository.com/artifact/org.elasticsearch.client/elasticsearch-rest-client &ndash;&gt;-->
            <dependency>
                <groupId>org.elasticsearch.client</groupId>
                <artifactId>elasticsearch-rest-client</artifactId>
                <version>7.2.0</version>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>

        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>7.2.0</version>
        </dependency>

        <dependency>
            <groupId>com.alibaba</groupId>
            <artifactId>fastjson</artifactId>
            <version>1.2.56</version>
        </dependency>

        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>

</project>
```

**注意:这里有一个依赖的大坑，要注意!**

如果定义了`<parent>`，就必须在`<dependencyManagement>`中指定部分依赖的版本，否则会因为依赖版本不对出现各种莫名其妙的错误,上面注释中已经指出。

## 创建 util/EsUtil.java 工具类

主要功能函数如下：

### 预创建 index

虽然 es 在插入数据时会自动根据字段类型来创建字段定义，但是自动创建并不总是和需要相符的，比如想让某个字段不分词，或者使用其他的分词器。所以在代码中先判断 index(es7 中已经废弃了 mapping，也就是一个 index 相当于一个表)是否存在，如果不存在就创建 index.

主要代码如下：

```java
//被@PostConstruct注释的方法将会在对应类注入到Spring后调用,确保index的生成
@PostConstruct
public void init() {
    try {
        if (client != null) {
            client.close();
        }
        client = new RestHighLevelClient(RestClient.builder(new HttpHost(host, port, scheme)));
        if (this.indexExist(INDEX_NAME)) {
            return;
        }
        CreateIndexRequest request = new CreateIndexRequest(INDEX_NAME);
        request.settings(Settings.builder().put("index.number_of_shards", 3).put("index.number_of_replicas", 2));
        request.mapping(CREATE_INDEX, XContentType.JSON);
        CreateIndexResponse res = client.indices().create(request, RequestOptions.DEFAULT);
        if (!res.isAcknowledged()) {
            throw new RuntimeException("初始化失败");
        }
    } catch (Exception e) {
        e.printStackTrace();
        System.exit(0);
    }
}
```

### 插入或者更新一个对象

通过指定 id，如果此 id 存在那么就是更新，否则是插入。

```java
public void insertOrUpdateOne(String index, EsEntity entity) {
    IndexRequest request = new IndexRequest(index);
    request.id(entity.getId());
    request.source(JSON.toJSONString(entity.getData()), XContentType.JSON);
    try {
        client.index(request, RequestOptions.DEFAULT);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
```

### 批量插入

high level client 提供了方便的批量操作接口，如下所示：

```java
public void insertBatch(String index, List<EsEntity> list) {
    BulkRequest request = new BulkRequest();
    list.forEach(item -> request.add(new IndexRequest(index).id(item.getId())
            .source(JSON.toJSONString(item.getData()), XContentType.JSON)));
    try {
        client.bulk(request, RequestOptions.DEFAULT);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
```

### 批量删除

和上面一样同样用到了`BulkRequest`

```java
public <T> void deleteBatch(String index, Collection<T> idList) {
    BulkRequest request = new BulkRequest();
    idList.forEach(item -> request.add(new DeleteRequest(index, item.toString())));
    try {
        client.bulk(request, RequestOptions.DEFAULT);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
```

### 搜索

通过构建`SearchSourceBuilder`查询参数

```java
public <T> List<T> search(String index, SearchSourceBuilder builder, Class<T> c) {
    SearchRequest request = new SearchRequest(index);
    request.source(builder);
    try {
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        SearchHit[] hits = response.getHits().getHits();
        List<T> res = new ArrayList<>(hits.length);
        for (SearchHit hit : hits) {
            res.add(JSON.parseObject(hit.getSourceAsString(), c));
        }
        return res;
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
```

### delete by query

es 插入数据容易，删除就比较麻烦了，特别是根据条件删除。

```java
public void deleteByQuery(String index, QueryBuilder builder) {
    DeleteByQueryRequest request = new DeleteByQueryRequest(index);
    request.setQuery(builder);
    //设置批量操作数量,最大为10000
    request.setBatchSize(10000);
    request.setConflicts("proceed");
    try {
        client.deleteByQuery(request, RequestOptions.DEFAULT);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}
```

# 结束

可通过测试类`com.fanxb.esdemo.service.BookServiceTest`查看运行结果。

源码地址：[github](https://github.com/FleyX/demo-project/tree/master/es-demo)

**本文原创发布于：**[https://www.tapme.top/blog/detail/2019-07-29-14-59](https://www.tapme.top/blog/detail/2019-07-29-14-59)