
Contents

base elasticsearch
1. Overview
1.1 Basic concepts
1.2 Installation
1.2.1 Preparation
1.2.2 Single-node deployment
1.2.2.1 elasticsearch
1.2.2.2 kibana
2. Search
2.1 Basic search
2.1.1 _cat
2.1.2 Indexing a document
2.1.3 Get a document & optimistic concurrency control
2.1.4 Updating a document
2.1.5 Deleting an index or a document
2.1.6 Bulk API: _bulk
2.2 Advanced search
2.2.1 query-dsl: basic syntax
2.2.2 query-dsl: match
2.2.3 query-dsl: multi_match
2.2.4 query-dsl: bool
2.2.5 query-dsl: filter
2.2.6 query-dsl: term
2.2.7 aggregations
2.3 Mapping
2.4 Analysis (tokenization)
3. java-api
3.1 Dependencies

base elasticsearch

1. Overview

1.1 Basic concepts

  • documents: the basic unit of information that can be indexed

  • index: a collection of documents with similar characteristics

  • type: formerly a logical category or partition within an index that allowed different kinds of documents to live in the same index; types have been deprecated since 6.0

  • mapping: the rules that define how a document and its fields are stored and indexed

  • Near Realtime (NRT): there is a slight delay (usually under one second) between indexing a document and it becoming searchable

1.2 Installation

1.2.1 Preparation

1.2.2 Single-node deployment

1.2.2.1 elasticsearch
  • Extract the archive:

    $ tar -zxf elasticsearch-6.8.15.tar.gz

  • Start it directly:

    $ cd elasticsearch-6.8.15

    $ ./bin/elasticsearch

    • Starting it directly as root fails with an exception:

    ```txt
    [2021-03-26T16:32:58,140][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [unknown] uncaught exception in thread [main]
    org.elasticsearch.bootstrap.StartupException: java.lang.RuntimeException: can not run elasticsearch as root
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:163) ~[elasticsearch-6.8.15.jar:6.8.15]
        at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:150) ~[elasticsearch-6.8.15.jar:6.8.15]
        at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.8.15.jar:6.8.15]
        at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.8.15.jar:6.8.15]
        at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.8.15.jar:6.8.15]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:116) ~[elasticsearch-6.8.15.jar:6.8.15]
        at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) ~[elasticsearch-6.8.15.jar:6.8.15]
    Caused by: java.lang.RuntimeException: can not run elasticsearch as root
        at org.elasticsearch.bootstrap.Bootstrap.initializeNatives(Bootstrap.java:103) ~[elasticsearch-6.8.15.jar:6.8.15]
        at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:170) ~[elasticsearch-6.8.15.jar:6.8.15]
        at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:333) ~[elasticsearch-6.8.15.jar:6.8.15]
        at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:159) ~[elasticsearch-6.8.15.jar:6.8.15]
        ... 6 more
    ```
    • As the log says, Elasticsearch cannot be run as root, so create a dedicated user:

    // add the user elasticsearch-user
    $ adduser elasticsearch-user

    // set a password for elasticsearch-user
    $ passwd elasticsearch-user
    Changing password for user elasticsearch-user.
    New password:
    Retype new password:
    passwd: all authentication tokens updated successfully.

    // change the owner of the install directory
    $ chown -R elasticsearch-user /data/elasticsearch

    // switch to the new user
    $ su elasticsearch-user

    • Try starting it again, this time in the background:

    nohup ./bin/elasticsearch &

    ```log
    [elasticsearch-user@elasticsearch1 elasticsearch-6.8.15]$ nohup ./bin/elasticsearch &
    [1] 9755
    [elasticsearch-user@elasticsearch1 elasticsearch-6.8.15]$ nohup: ignoring input and appending output to ‘nohup.out’
    ^C
    [elasticsearch-user@elasticsearch1 elasticsearch-6.8.15]$ tail -100f nohup.out
    [2021-03-31T09:17:08,259][INFO ][o.e.e.NodeEnvironment ] [No5lu90] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [41.9gb], net total_space [49.9gb], types [rootfs]
    [2021-03-31T09:17:08,274][INFO ][o.e.e.NodeEnvironment ] [No5lu90] heap size [989.8mb], compressed ordinary object pointers [true]
    [2021-03-31T09:17:08,277][INFO ][o.e.n.Node ] [No5lu90] node name derived from node ID [No5lu90XRxiKNN-5R0aXfQ]; set [node.name] to override
    [2021-03-31T09:17:08,278][INFO ][o.e.n.Node ] [No5lu90] version[6.8.15], pid[9755], build[default/tar/c9a8c60/2021-03-18T06:33:32.588487Z], OS[Linux/3.10.0-693.el7.x86_64/amd64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_151/25.151-b12]
    [2021-03-31T09:17:08,278][INFO ][o.e.n.Node ] [No5lu90] JVM arguments [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Des.networkaddress.cache.ttl=60, -Des.networkaddress.cache.negative.ttl=10, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -XX:-OmitStackTraceInFastThrow, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Djava.io.tmpdir=/tmp/elasticsearch-4819315221049591280, -XX:+HeapDumpOnOutOfMemoryError, -XX:HeapDumpPath=data, -XX:ErrorFile=logs/hs_err_pid%p.log, -XX:+PrintGCDetails, -XX:+PrintGCDateStamps, -XX:+PrintTenuringDistribution, -XX:+PrintGCApplicationStoppedTime, -Xloggc:logs/gc.log, -XX:+UseGCLogFileRotation, -XX:NumberOfGCLogFiles=32, -XX:GCLogFileSize=64m, -Des.path.home=/data/elasticsearch/elasticsearch-6.8.15, -Des.path.conf=/data/elasticsearch/elasticsearch-6.8.15/config, -Des.distribution.flavor=default, -Des.distribution.type=tar]
    [2021-03-31T09:17:10,670][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [aggs-matrix-stats]
    [2021-03-31T09:17:10,670][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [analysis-common]
    [2021-03-31T09:17:10,670][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [ingest-common]
    [2021-03-31T09:17:10,670][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [ingest-geoip]
    [2021-03-31T09:17:10,671][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [ingest-user-agent]
    [2021-03-31T09:17:10,671][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [lang-expression]
    [2021-03-31T09:17:10,671][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [lang-mustache]
    [2021-03-31T09:17:10,671][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [lang-painless]
    [2021-03-31T09:17:10,671][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [mapper-extras]
    [2021-03-31T09:17:10,671][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [parent-join]
    [2021-03-31T09:17:10,671][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [percolator]
    [2021-03-31T09:17:10,671][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [rank-eval]
    [2021-03-31T09:17:10,671][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [reindex]
    [2021-03-31T09:17:10,672][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [repository-url]
    [2021-03-31T09:17:10,672][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [transport-netty4]
    [2021-03-31T09:17:10,672][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [tribe]
    [2021-03-31T09:17:10,672][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-ccr]
    [2021-03-31T09:17:10,672][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-core]
    [2021-03-31T09:17:10,672][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-deprecation]
    [2021-03-31T09:17:10,672][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-graph]
    [2021-03-31T09:17:10,673][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-ilm]
    [2021-03-31T09:17:10,673][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-logstash]
    [2021-03-31T09:17:10,673][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-ml]
    [2021-03-31T09:17:10,673][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-monitoring]
    [2021-03-31T09:17:10,673][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-rollup]
    [2021-03-31T09:17:10,673][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-security]
    [2021-03-31T09:17:10,673][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-sql]
    [2021-03-31T09:17:10,674][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-upgrade]
    [2021-03-31T09:17:10,674][INFO ][o.e.p.PluginsService ] [No5lu90] loaded module [x-pack-watcher]
    [2021-03-31T09:17:10,674][INFO ][o.e.p.PluginsService ] [No5lu90] no plugins loaded
    [2021-03-31T09:17:15,589][INFO ][o.e.x.s.a.s.FileRolesStore] [No5lu90] parsed [0] roles from file [/data/elasticsearch/elasticsearch-6.8.15/config/roles.yml]
    [2021-03-31T09:17:16,329][INFO ][o.e.x.m.p.l.CppLogMessageHandler] [No5lu90] [controller/9931] [Main.cc@114] controller (64 bit): Version 6.8.15 (Build 4cd69e3d9d9390) Copyright (c) 2021 Elasticsearch BV
    [2021-03-31T09:17:17,068][DEBUG][o.e.a.ActionModule ] [No5lu90] Using REST wrapper from plugin org.elasticsearch.xpack.security.Security
    [2021-03-31T09:17:17,393][INFO ][o.e.d.DiscoveryModule ] [No5lu90] using discovery type [zen] and host providers [settings]
    [2021-03-31T09:17:18,543][INFO ][o.e.n.Node ] [No5lu90] initialized
    [2021-03-31T09:17:18,544][INFO ][o.e.n.Node ] [No5lu90] starting ...
    [2021-03-31T09:17:23,820][INFO ][o.e.t.TransportService ] [No5lu90] publish_address {127.0.0.1:9300}, bound_addresses {[::1]:9300}, {127.0.0.1:9300}
    [2021-03-31T09:17:23,917][WARN ][o.e.b.BootstrapChecks ] [No5lu90] max file descriptors [4096] for elasticsearch process is too low, increase to at least [65535]
    [2021-03-31T09:17:27,015][INFO ][o.e.c.s.MasterService ] [No5lu90] zen-disco-elected-as-master ([0] nodes joined), reason: new_master {No5lu90}{No5lu90XRxiKNN-5R0aXfQ}{EauWlcgjRVaCWUOC9LDE3Q}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=16824958976, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}
    [2021-03-31T09:17:27,023][INFO ][o.e.c.s.ClusterApplierService] [No5lu90] new_master {No5lu90}{No5lu90XRxiKNN-5R0aXfQ}{EauWlcgjRVaCWUOC9LDE3Q}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=16824958976, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true}, reason: apply cluster state (from master [master {No5lu90}{No5lu90XRxiKNN-5R0aXfQ}{EauWlcgjRVaCWUOC9LDE3Q}{127.0.0.1}{127.0.0.1:9300}{ml.machine_memory=16824958976, xpack.installed=true, ml.max_open_jobs=20, ml.enabled=true} committed version [1] source [zen-disco-elected-as-master ([0] nodes joined)]])
    [2021-03-31T09:17:27,105][INFO ][o.e.h.n.Netty4HttpServerTransport] [No5lu90] publish_address {127.0.0.1:9200}, bound_addresses {[::1]:9200}, {127.0.0.1:9200}
    [2021-03-31T09:17:27,106][INFO ][o.e.n.Node ] [No5lu90] started
    [2021-03-31T09:17:27,490][WARN ][o.e.x.s.a.s.m.NativeRoleMappingStore] [No5lu90] Failed to clear cache for realms [[]]
    [2021-03-31T09:17:27,557][INFO ][o.e.l.LicenseService ] [No5lu90] license [91784dfe-a371-44cd-b9e1-0bb3181a49ee] mode [basic] - valid
    [2021-03-31T09:17:27,571][INFO ][o.e.g.GatewayService ] [No5lu90] recovered [0] indices into cluster_state
    ```
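
    • The startup log also warns that max file descriptors [4096] is too low and should be at least [65535]. A sketch of one common fix on CentOS-style systems (an assumption beyond the original notes; it uses the elasticsearch-user created above and takes effect on the next login):

    ```txt
    # /etc/security/limits.conf
    elasticsearch-user soft nofile 65535
    elasticsearch-user hard nofile 65535
    ```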
    • Verify that it is running:

    $ curl http://127.0.0.1:9200

    ```json
    {
      "name" : "No5lu90",
      "cluster_name" : "elasticsearch",
      "cluster_uuid" : "cDaupHYNQRWw7vlocIbGoQ",
      "version" : {
        "number" : "6.8.15",
        "build_flavor" : "default",
        "build_type" : "tar",
        "build_hash" : "c9a8c60",
        "build_date" : "2021-03-18T06:33:32.588487Z",
        "build_snapshot" : false,
        "lucene_version" : "7.7.3",
        "minimum_wire_compatibility_version" : "5.6.0",
        "minimum_index_compatibility_version" : "5.0.0"
      },
      "tagline" : "You Know, for Search"
    }
    ```
1.2.2.2 kibana
  • Extract and rename:

    $ tar -zxf kibana-6.8.15-linux-x86_64.tar.gz

    $ mv kibana-6.8.15-linux-x86_64 kibana-6.8.15

  • Edit the configuration:

    $ vim config/kibana.yml

    ```yaml
    server.host: "192.168.242.121"
    ```
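
  • Kibana also needs to know where Elasticsearch is. A minimal sketch of the relevant kibana.yml lines, assuming Kibana runs on the same machine as the Elasticsearch node started above (elasticsearch.url is the 6.x setting name; 7.x renamed it to elasticsearch.hosts); after that it can be started in the background the same way:

    ```yaml
    server.host: "192.168.242.121"
    elasticsearch.url: "http://127.0.0.1:9200"
    ```

    $ nohup ./bin/kibana &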

2. Search

2.1 Basic search

2.1.1 _cat

  • Commands:

    GET /_cat/nodes: list all nodes

    GET /_cat/health: show cluster health

    GET /_cat/master: show the elected master node

    GET /_cat/indices: list all indices

  • Example:

```txt
$ GET _cat/nodes
192.168.240.63 16 81 0 0.04 0.04 0.05 cdhilmrstw * node-1

$ GET _cat/health
1617068246 01:37:26 elasticsearch yellow 1 1 11 11 0 0 1 0 - 91.7%

$ GET _cat/master
fU1EPHXAT-uVDhz05ctvmQ 192.168.240.63 192.168.240.63 node-1

$ GET _cat/indices
green  open .apm-custom-link                HRtTvn9jTXOCP-7RqZj7BQ 1 0  0      0 208b   208b
green  open .kibana_task_manager_1          KlmGX3mZQaW_vHPctcOy-g 1 0  5 346287 27.2mb 27.2mb
green  open .apm-agent-configuration        aB0fHKdnTRO14368O55PtQ 1 0  0      0 208b   208b
green  open .kibana-event-log-7.10.1-000002 Ysid-__OToyXmIWgUAxKHw 1 0  0      0 208b   208b
green  open .kibana-event-log-7.10.1-000001 KU0GM5MfT5ixtdrzEUVC_g 1 0  3      0 11.3kb 11.3kb
green  open .kibana_1                       WZlQtWvoRxOT7dmMj7z1qg 1 0 39      5 4.2mb  4.2mb
yellow open class                           cq30cyBHTEe437bX5-TyXA 1 1  2      0 8kb    8kb
green  open .kibana-event-log-7.10.1-000003 E9yWvm7dTvCugOCPt3nKVg 1 0  0      0 208b   208b
```

2.1.2 Indexing a document

Note: 7.x no longer has the concept of a type, so adding data with PUT <index>/<type>/<id> is not recommended any more; simply use _doc in place of the type.

  • Commands

    ```http
    PUT  /<target>/_doc/<_id>
    POST /<target>/_doc/
    PUT  /<target>/_create/<_id>
    POST /<target>/_create/<_id>
    ```
  • Example

    ```json
    $ PUT customer/_doc/2
    { "name": "小明2" }

    // result
    { "_index" : "customer", "_type" : "_doc", "_id" : "2", "_version" : 1, "result" : "created",
      "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 1, "_primary_term" : 1 }

    // ---- run the same PUT again
    $ PUT customer/_doc/2
    { "name": "小明2" }

    // result
    { "_index" : "customer", "_type" : "_doc", "_id" : "2", "_version" : 2, "result" : "updated",
      "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 4, "_primary_term" : 1 }

    // ---- POST without an id generates one
    $ POST customer/_doc
    { "name": "小明3" }

    { "_index" : "customer", "_type" : "_doc", "_id" : "KCbJhXgBy-3NOEnaoBlO", "_version" : 1, "result" : "created",
      "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 7, "_primary_term" : 1 }
    ```

2.1.3 Get a document & optimistic concurrency control

```json
$ GET customer/_doc/2
{
  "_index" : "customer",      // index
  "_type" : "_doc",
  "_id" : "2",                // id
  "_version" : 4,             // version, changes on every update
  "_seq_no" : 6,              // concurrency-control field, incremented on every write; used for optimistic locking
  "_primary_term" : 1,        // similar; changes when the primary shard is reallocated, e.g. after a restart
  "found" : true,
  "_source" : {               // the actual document
    "name" : "小明2"
  }
}
```
  • Update with optimistic locking

    ```json
    $ PUT customer/_doc/2?if_seq_no=10&if_primary_term=1
    { "name": "小明2" }

    { "_index" : "customer", "_type" : "_doc", "_id" : "2", "_version" : 6, "result" : "updated",
      "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 11, "_primary_term" : 1 }

    // run it again with the same if_seq_no
    $ PUT customer/_doc/2?if_seq_no=10&if_primary_term=1
    { "name": "小明2" }

    {
      "error" : {
        "root_cause" : [
          {
            "type" : "version_conflict_engine_exception",
            "reason" : "[2]: version conflict, required seqNo [10], primary term [1]. current document has seqNo [11] and primary term [1]",
            "index_uuid" : "p1v3s28RSempuLo3UBQyOg",
            "shard" : "0",
            "index" : "customer"
          }
        ],
        "type" : "version_conflict_engine_exception",
        "reason" : "[2]: version conflict, required seqNo [10], primary term [1]. current document has seqNo [11] and primary term [1]",
        "index_uuid" : "p1v3s28RSempuLo3UBQyOg",
        "shard" : "0",
        "index" : "customer"
      },
      "status" : 409
    }
    ```

2.1.4 Updating a document

  • Command

    ```http
    POST /<target>/_update/<id>
    ```
    Compares the submitted data with the stored document and only performs the update if something actually changed (legacy form: POST /<target>/<type>/<id>/_update).
    • If the data has not changed, result is noop and _version and _seq_no stay the same

    • The data must be wrapped in a doc property:

      ```json
      { "doc": { ${data} } }
      ```
    • By contrast, the plain POST (without _update) and PUT from 2.1.2 rewrite the document on every call and do no comparison

  • Example

    ```json
    $ POST customer/_update/2
    { "doc": { "name": "小明3" } }

    { "_index" : "customer", "_type" : "_doc", "_id" : "2", "_version" : 7, "result" : "updated",
      "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 12, "_primary_term" : 1 }

    // run it again
    $ POST customer/_update/2
    { "doc": { "name": "小明3" } }

    { "_index" : "customer", "_type" : "_doc", "_id" : "2", "_version" : 7, "result" : "noop",
      "_shards" : { "total" : 0, "successful" : 0, "failed" : 0 }, "_seq_no" : 12, "_primary_term" : 1 }
    ```

2.1.5 Deleting an index or a document

  • Commands

    ```http
    DELETE /<target>
    DELETE /<target>/_doc/<id>
    ```
  • Example

    ```json
    $ DELETE customer/_doc/1
    { "_index" : "customer", "_type" : "_doc", "_id" : "1", "_version" : 2, "result" : "deleted",
      "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 13, "_primary_term" : 1 }

    $ GET customer/_doc/1
    { "_index" : "customer", "_type" : "_doc", "_id" : "1", "found" : false }
    ```

2.1.6 Bulk API: _bulk

  • Syntax

    ```http
    POST /_bulk
    POST /<target>/_bulk

    {"action": {metadata}}
    {request body}
    {"action": {metadata}}
    {request body}
    ```
    The body is newline-delimited JSON: each action/metadata line is followed, where the action needs one, by a request-body line.
    • action:
      • create: fails if a document with the same _id already exists
      • index: creates the document, or replaces it if it already exists
      • delete: deletes the document
      • update: partially updates the document
  • Example

    ```json
    POST /_bulk
    {"create":{"_index":"mytest","_id":1}}
    {"name":"test1","age":11}
    {"index":{"_index":"mytest","_id":2}}
    {"name":"test2","age":12}
    {"delete":{"_index":"mytest","_id":1}}
    {"update":{"_index":"mytest","_id":2}}
    {"doc":{"name":"test3","age":13}}
    {"update":{"_index":"mytest","_id":1}}
    {"doc":{"name":"test1","age":10}}

    {
      "took" : 160,
      "errors" : true,
      "items" : [
        { "create" : { "_index" : "mytest", "_type" : "_doc", "_id" : "1", "_version" : 1, "result" : "created",
            "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 0, "_primary_term" : 1, "status" : 201 } },
        { "index" : { "_index" : "mytest", "_type" : "_doc", "_id" : "2", "_version" : 1, "result" : "created",
            "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 1, "_primary_term" : 1, "status" : 201 } },
        { "delete" : { "_index" : "mytest", "_type" : "_doc", "_id" : "1", "_version" : 2, "result" : "deleted",
            "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 2, "_primary_term" : 1, "status" : 200 } },
        { "update" : { "_index" : "mytest", "_type" : "_doc", "_id" : "2", "_version" : 2, "result" : "updated",
            "_shards" : { "total" : 2, "successful" : 1, "failed" : 0 }, "_seq_no" : 3, "_primary_term" : 1, "status" : 200 } },
        { "update" : { "_index" : "mytest", "_type" : "_doc", "_id" : "1", "status" : 404,
            "error" : { "type" : "document_missing_exception", "reason" : "[_doc][1]: document missing",
              "index_uuid" : "1L8DZPSITY6Xs91xR0xDbA", "shard" : "0", "index" : "mytest" } } }
      ]
    }
    ```
  • Import the official Elastic sample data

    • Download: account.json
    • Load the sample file with a POST {yourIndexName}/_bulk request; here POST bank/_bulk is used (a curl sketch follows below)
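
    A minimal curl sketch for loading the file from the shell. It assumes the downloaded file is in the current directory and Elasticsearch is listening on localhost:9200; --data-binary keeps the newline-delimited body intact, which _bulk requires:

    $ curl -H "Content-Type: application/x-ndjson" -X POST "localhost:9200/bank/_bulk?pretty" --data-binary "@account.json"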

2.2 Advanced search

  • ES supports two basic ways of searching:
    1. sending the search parameters as part of the REST request URI (url + query parameters)
    2. sending them in the REST request body (url + request body): query-dsl
  • query-dsl is the approach mainly used below
  • References: getting-started-search, query-dsl, search-aggregations

2.2.1 query-dsl: basic syntax

A query-dsl query is composed of two kinds of clauses: Leaf query clauses and Compound query clauses.

Leaf query clauses look for a particular value in a particular field, e.g. the match, term or range queries; a leaf clause can be used on its own. Compound query clauses combine multiple leaf or compound queries into a single query in a logical way.

Query clauses behave differently depending on whether they are used in query context or filter context.

```json
GET /bank/_search
{
  "query": { "match_all": {} },
  "from": 0,
  "size": 3,
  "sort": [
    { "age": "asc" }
  ]
}
```

2.2.2 query-dsl: match

```json
{
  "query": {
    "match": { "FIELD": "TEXT" }
  }
}
```
  • On non-string fields, match behaves as an exact-value query
  • On string (text) fields, match performs a full-text search (see the sketch below)
    • By default the results are sorted by relevance score
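
A small sketch against the bank sample data imported in 2.1.6 (the field names come from that dataset; exact hits and scores depend on your data):

```json
// full-text search on a text field: matches documents whose address contains "mill" or "road"
GET /bank/_search
{
  "query": {
    "match": { "address": "mill road" }
  }
}

// exact match on a numeric field
GET /bank/_search
{
  "query": {
    "match": { "age": 38 }
  }
}
```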

2.2.3 query-dsl: multi_match

  • Syntax
```json
GET /<target>/_search
{
  "query": {
    "multi_match": {
      "query": "search keywords",
      "fields": ["field1", "field2"]
    }
  }
}
```
  • Example
```json
GET /bank/_search
{
  "query": {
    "multi_match": {
      "query": "mill",
      "fields": ["address", "city"]
    }
  }
}

// result
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 4, "relation" : "eq" },
    "max_score" : 5.4032025,
    "hits" : [
      { "_index" : "bank", "_type" : "_doc", "_id" : "970", "_score" : 5.4032025,
        "_source" : { "account_number" : 970, "balance" : 19648, "firstname" : "Forbes", "lastname" : "Wallace", "age" : 28, "gender" : "M", "address" : "990 Mill Road", "employer" : "Pheast", "email" : "forbeswallace@pheast.com", "city" : "Lopezo", "state" : "AK" } },
      { "_index" : "bank", "_type" : "_doc", "_id" : "136", "_score" : 5.4032025,
        "_source" : { "account_number" : 136, "balance" : 45801, "firstname" : "Winnie", "lastname" : "Holland", "age" : 38, "gender" : "M", "address" : "198 Mill Lane", "employer" : "Neteria", "email" : "winnieholland@neteria.com", "city" : "Urie", "state" : "IL" } },
      { "_index" : "bank", "_type" : "_doc", "_id" : "345", "_score" : 5.4032025,
        "_source" : { "account_number" : 345, "balance" : 9812, "firstname" : "Parker", "lastname" : "Hines", "age" : 38, "gender" : "M", "address" : "715 Mill Avenue", "employer" : "Baluba", "email" : "parkerhines@baluba.com", "city" : "Blackgum", "state" : "KY" } },
      { "_index" : "bank", "_type" : "_doc", "_id" : "472", "_score" : 5.4032025,
        "_source" : { "account_number" : 472, "balance" : 25571, "firstname" : "Lee", "lastname" : "Long", "age" : 32, "gender" : "F", "address" : "288 Mill Street", "employer" : "Comverges", "email" : "leelong@comverges.com", "city" : "Movico", "state" : "MT" } }
    ]
  }
}
```

2.2.4 query-dsl: bool

A compound query that combines multiple query conditions.

  • Syntax
```json
GET /<target>/_search
{
  "query": {
    "bool": {
      "must": [        // conditions that must match
        { "match": { ... } }
      ],
      "must_not": [    // conditions that must not match
        { "match": { ... } }
      ],
      "should": [      // conditions that should preferably match (they boost the score)
        { "match": { ... } }
      ]
    }
  }
}
```
  • Example
```json
GET /bank/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "gender": "M" } },
        { "match": { "age": "38" } },
        { "match": { "address": "Mill" } }
      ],
      "should": [
        { "match": { "address": "Avenue" } }
      ]
    }
  }
}

// result
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
  "hits" : {
    "total" : { "value" : 2, "relation" : "eq" },
    "max_score" : 8.622485,
    "hits" : [
      { "_index" : "bank", "_type" : "_doc", "_id" : "345", "_score" : 8.622485,
        "_source" : { "account_number" : 345, "balance" : 9812, "firstname" : "Parker", "lastname" : "Hines", "age" : 38, "gender" : "M", "address" : "715 Mill Avenue", "employer" : "Baluba", "email" : "parkerhines@baluba.com", "city" : "Blackgum", "state" : "KY" } },
      { "_index" : "bank", "_type" : "_doc", "_id" : "136", "_score" : 7.0824604,
        "_source" : { "account_number" : 136, "balance" : 45801, "firstname" : "Winnie", "lastname" : "Holland", "age" : 38, "gender" : "M", "address" : "198 Mill Lane", "employer" : "Neteria", "email" : "winnieholland@neteria.com", "city" : "Urie", "state" : "IL" } }
    ]
  }
}
```

2.2.5 query-dsl: filter

Filters the result set.

```json
GET /<target>/_search
{
  "query": {
    "bool": {
      "filter": { ... }
    }
  }
}
```
  • filter does not compute a relevance score (see the sketch below)
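
A small sketch on the bank sample data, filtering on a balance range (the bounds are arbitrary); the hits come back with a score of 0 because nothing was scored:

```json
GET /bank/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "balance": { "gte": 10000, "lte": 20000 }
        }
      }
    }
  }
}
```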

2.2.6 query-dsl: term

Exact-value lookup.

  • Used the same way as match
  • Only suitable for non-text fields
  • For exact matching on text, use match_phrase or the <field>.keyword sub-field (see the sketch below)
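
A short sketch on the bank sample data contrasting the options above (field names come from that dataset):

```json
// term on a non-text (numeric) field: exact match
GET /bank/_search
{
  "query": {
    "term": { "age": 38 }
  }
}

// exact match on the whole text value via the keyword sub-field
GET /bank/_search
{
  "query": {
    "match": { "address.keyword": "990 Mill Road" }
  }
}

// match_phrase: the phrase must appear in the analyzed text, in order
GET /bank/_search
{
  "query": {
    "match_phrase": { "address": "mill road" }
  }
}
```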

2.2.7 aggregations

Aggregations provide the ability to group data and extract statistics from it. The simplest aggregations are roughly equivalent to SQL GROUP BY and the SQL aggregate functions.

A query and several aggregations can be combined in a single request, which keeps the API simple and avoids multiple network round trips.

  • Syntax

    ```json
    GET /<target>/_search
    {
      "aggs": {
        "agg-name": {
          "agg-type": { ... }
        }
      }
    }
    ```
    • Several aggregations can be declared side by side

    • An aggregation can contain further sub-aggregations

  • Example

    ```json
    // Age distribution, and within each age bucket the average balance of M and of F accounts,
    // plus the overall average balance for that age bucket
    GET bank/_search
    {
      "query": { "match_all": {} },
      "aggs": {
        "ageTerm": {
          "terms": { "field": "age", "size": 100 },
          "aggs": {
            "genderTerm": {
              "terms": { "field": "gender.keyword", "size": 100 },
              "aggs": {
                "balanceAvg": { "avg": { "field": "balance" } }
              }
            },
            "balanceAvg": { "avg": { "field": "balance" } }
          }
        }
      },
      "size": 0
    }

    // result
    {
      "took" : 6,
      "timed_out" : false,
      "_shards" : { "total" : 1, "successful" : 1, "skipped" : 0, "failed" : 0 },
      "hits" : {
        "total" : { "value" : 1000, "relation" : "eq" },
        "max_score" : null,
        "hits" : [ ]
      },
      "aggregations" : {
        "ageTerm" : {
          "doc_count_error_upper_bound" : 0,
          "sum_other_doc_count" : 0,
          "buckets" : [
            {
              "key" : 31,
              "doc_count" : 61,
              "genderTerm" : {
                "doc_count_error_upper_bound" : 0,
                "sum_other_doc_count" : 0,
                "buckets" : [
                  { "key" : "M", "doc_count" : 35, "balanceAvg" : { "value" : 29565.628571428573 } },
                  { "key" : "F", "doc_count" : 26, "balanceAvg" : { "value" : 26626.576923076922 } }
                ]
              },
              "balanceAvg" : { "value" : 28312.918032786885 }
            },
            ...
          ]
        }
      }
    }
    ```

2.3 Mapping

  • Reference: mapping

  • Look up the type of every field

    ```json
    GET /<target>/_mapping
    ```
  • Create a mapping

    ```json
    PUT /<target>
    {
      "mappings": {
        "properties": {
          "field1": { "type": "integer" },
          "field2": { "type": "keyword" },
          "field3": { "type": "text" }
        }
      }
    }
    ```
    • Mappings that already exist cannot be modified
  • Add a new field

    ```json
    PUT /<target>/_mapping
    {
      "properties": {
        "field4": { "type": "keyword", "index": false }
      }
    }
    ```
  • Change a mapping

    • The mapping of an existing field cannot be changed
    • If you really need a different mapping, create a new index with the correct mapping and migrate the data with reindex
    ```json
    POST _reindex
    {
      "source": {
        "index": "my-index-000001"    // older versions may also need a "type" field here
      },
      "dest": {
        "index": "my-new-index-000001"
      }
    }
    ```

2.4 Analysis (tokenization)

  • A tokenizer receives a stream of characters, splits it into individual tokens (usually individual words), and then outputs a stream of tokens

  • The tokenizer also records the order or position of each term (used for phrase and word-proximity queries), as well as the start and end character offsets of the original word each term represents (used to highlight matched text)

  • Elasticsearch ships with many built-in tokenizers that can be used to build custom analyzers

  • Reference: analysis

  • Test an analyzer

    ```json
    POST _analyze
    {
      "analyzer": "standard",
      "text": "Hello World 2."
    }

    {
      "tokens" : [
        { "token" : "hello", "start_offset" : 0, "end_offset" : 5, "type" : "<ALPHANUM>", "position" : 0 },
        { "token" : "world", "start_offset" : 6, "end_offset" : 11, "type" : "<ALPHANUM>", "position" : 1 },
        { "token" : "2", "start_offset" : 12, "end_offset" : 13, "type" : "<NUM>", "position" : 2 }
      ]
    }
    ```

3. java-api

3.1 Dependencies

```xml
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.12.0</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.12.0</version>
</dependency>
```

Test code: MyClient.java

```java
package com.study.elasticsearch.client;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

import java.io.IOException;

/**
 * @author XuZhuohao
 * @date 2021/4/7
 */
public class MyClient {
    public static void main(String[] args) throws IOException {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("192.168.240.63", 9200, "http")));
        client.close();
    }
}
```

Output

```text
10:22:13.302 [main] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection manager is shutting down
10:22:13.309 [main] DEBUG org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager - Connection manager shut down
```
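
The client above only opens and closes a connection. As a next step, a minimal search sketch with the same high-level client is shown below; the class name MySearch, the bank index and the match query on address are illustrative only (they reuse the sample data from section 2.2), and the host/port are this deployment's values:

```java
package com.study.elasticsearch.client;

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.io.IOException;

/**
 * Roughly equivalent to: GET /bank/_search { "query": { "match": { "address": "mill" } } }
 */
public class MySearch {
    public static void main(String[] args) throws IOException {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("192.168.240.63", 9200, "http")));
        try {
            // build the request body, mirroring the query-dsl examples above
            SearchSourceBuilder source = new SearchSourceBuilder()
                    .query(QueryBuilders.matchQuery("address", "mill"))
                    .from(0)
                    .size(3);
            SearchRequest request = new SearchRequest("bank").source(source);

            // execute synchronously and print the hits
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            for (SearchHit hit : response.getHits().getHits()) {
                System.out.println(hit.getId() + " -> " + hit.getSourceAsString());
            }
        } finally {
            client.close();
        }
    }
}
```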
