Elastic Search
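- Installing ElasticSearch in Docker
- Using the Elastic Search API
  - Create an index:
  - Update an index mapping:
  - Update index settings:
  - Create an index template:
  - Single-condition query: [query->term]
  - Multi-condition query: [query->bool->must->term]
  - Collapse (deduplicate) query results on one field: [collapse]
  - Collapse (deduplicate) query results on two fields: [collapse->inner_hits->collapse]
  - Aggregate query results to implement GROUP BY: [aggregations]
  - Aggregate query results for max/min: [aggs->min/max]
  - Use other document data while aggregating query results: [aggs->top_hits]
  - Write logic in the query payload: [script]
  - Write logic in the update payload: [script]
  - Update by query
  - Delete by query
  - Simple pagination: [from size]
  - Complex pagination: [scroll]
  - Insert multiple documents: [_bulk]
  - Reindex: [reindex]
  - List all indexes: [_cat/indices/]
  - Configure the cluster: [_cluster]
  - Delete all created scrolls
  - Upsert when the document ID is known (update if it exists, insert if not): [update]
- Improving ES performance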

Elastic Search

Installing ElasticSearch in Docker

A Java environment is required.

Download the tar.gz archive, extract it, and move it:

mv elasticsearch-7.1.0 /usr/local/elasticsearch

Edit the configuration:

vi /usr/local/elasticsearch/config/elasticsearch.yml

The yml file:

network.host: 0.0.0.0
http.port: 9200
discovery.seed_hosts: ["127.0.0.1", "[::1]"]
# In 7.1, discovery must be configured even for a single node; otherwise startup fails with:
#the default discovery settings are unsuitable for production use; at least one of [discovery.seed_hosts, discovery.seed_providers, cluster.initial_master_nodes] must be configured
cluster.initial_master_nodes: ["node-1"]
# Memory for the indices fielddata cache; entries are evicted once usage exceeds 80%
indices.fielddata.cache.size: 80%
# Memory limit for the request circuit breaker; defaults to 60% of the JVM heap.
indices.breaker.request.limit: 80%

Create a non-root user, elsearch, to run the elasticsearch scripts. ES cannot be started as root.

# elasticsearch can not run elasticsearch as root
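adduser elsearch # also creates a group named elsearch automatically
# change ownership of the directory and all of its subdirectories to the elsearch user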
chown -R elsearch:elsearch elasticsearch
ll
# drwxr-xr-x 1 elsearch elsearch 4096 May 28 16:54 elasticsearch

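New in 7.X

Removal of mapping types, official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html

The current version has a single default type, _doc. Document APIs no longer need a type in the URL; addressing the index is enough. Most APIs work simply by removing the type from the URL.

not_analyzed no longer exists. If you need values not to be tokenized:

You can configure an analyzer on the index: setting the default analyzer to keyword disables tokenization.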
----------------------------------------------------------------
Setting the analyzer: the index must be closed first
1. POST http://server_ip/index_name/_close?pretty
2. PUT http://server_ip/index_name/_settings?pretty BODY: {"index": {"analysis": {"analyzer": {"default": {"type": "keyword"}}}}}
3. POST http://server_ip/index_name/_open?pretty
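
The three steps above can be sketched in Python using only the standard library. This is a minimal sketch, not a definitive client: the function names `analyzer_update_requests` and `send_all` are hypothetical, and the base URL passed in is an assumption about where the cluster is reachable.

```python
import json
import urllib.request


def analyzer_update_requests(base_url, index_name):
    """Build the three unsent requests in order: close, update settings, reopen."""
    settings = {"index": {"analysis": {"analyzer": {"default": {"type": "keyword"}}}}}
    index_url = f"{base_url.rstrip('/')}/{index_name}"
    steps = [
        ("POST", f"{index_url}/_close", None),
        ("PUT", f"{index_url}/_settings", settings),
        ("POST", f"{index_url}/_open", None),
    ]
    requests = []
    for method, url, body in steps:
        # Only the settings call carries a JSON body.
        data = json.dumps(body).encode("utf-8") if body is not None else None
        requests.append(urllib.request.Request(
            url, data=data, method=method,
            headers={"Content-Type": "application/json"},
        ))
    return requests


def send_all(requests):
    """Send each request in order; urlopen raises HTTPError on a 4xx/5xx reply."""
    for req in requests:
        with urllib.request.urlopen(req) as resp:
            print(req.get_method(), req.full_url, resp.status)
```

Calling `send_all(analyzer_update_requests("http://server_ip:9200", "index_name"))` would run the sequence against a reachable cluster. Because an error raises before the loop continues, a failed settings update leaves the index closed and it has to be reopened by hand.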

The string column type no longer exists; use text or keyword instead.
Adding {"track_total_hits": true} to a query returns the exact total hit count instead of capping it at 10000.
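
As an illustration of track_total_hits, the sketch below builds a search body and reads the total from a response. The helper names `search_body` and `total_hits` and the field name used are hypothetical; the response shape assumed here is the 7.x one, where `hits.total` is an object with `value` and `relation`.

```python
def search_body(field, value, size=10):
    """Build a term query that asks ES for the exact total hit count."""
    return {
        "track_total_hits": True,  # without this, the reported total stops at 10000
        "size": size,
        "query": {"term": {field: value}},
    }


def total_hits(response):
    """Return (count, relation) from a parsed 7.x search response.

    relation is "eq" when the count is exact and "gte" when it is a lower bound.
    """
    total = response["hits"]["total"]
    return total["value"], total["relation"]
```

With track_total_hits enabled the relation should come back as "eq", so the count can be used directly, for example to compute the number of pages.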

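Using the Elastic Search API

This section describes how the APIs used in this BGP project are called, and how certain specific payloads are written.

Note: the APIs are listed in no particular order, and each payload shown is only one of several possible ways to write it.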

Create an index:

PUT your_server_ip:9200/index_name

Payload

{
  "settings": {
    "number_of_shards": 5,
    "analysis": {"analyzer": {"default": {"type": "keyword"}}},
    "refresh_interval": "30s",
    "max_result_window": "1000000",
    "max_rescore_window": "1000000"
  },
  "mappings": {
    "properties": {"test": {"type": "keyword"}}
  }
}

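Notes

settings configures the index; mappings defines the index's columns (fields)
number_of_shards: the number of shards
analysis: set here so that values are not tokenized; this is the new way to configure it in 7.x
refresh_interval: the refresh interval; to get the most out of _bulk, a value of about 30s works best
max_result_window: by default ES returns at most 10000 documents (from + size); this setting raises that limit, and it also caps the batch size of a scroll request
max_rescore_window: used by the rescore API; not used in this project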
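
The create-index call described above could be assembled like this in Python. It is a sketch under assumptions: `build_create_index_request` is a hypothetical helper, the base URL is a placeholder, and the setting values simply mirror the payload in this section.

```python
import json
import urllib.request


def build_create_index_request(base_url, index_name, field_types, shards=5):
    """Return an unsent PUT request that creates index_name.

    field_types maps a field name to its type, e.g. {"test": "keyword"}.
    """
    body = {
        "settings": {
            "number_of_shards": shards,
            # Default analyzer "keyword": values are indexed whole, not tokenized.
            "analysis": {"analyzer": {"default": {"type": "keyword"}}},
            "refresh_interval": "30s",
            "max_result_window": "1000000",
            "max_rescore_window": "1000000",
        },
        "mappings": {
            "properties": {name: {"type": t} for name, t in field_types.items()}
        },
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. Building the body from a dict keeps the braces balanced, which is easy to get wrong when the JSON is written by hand.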

Update an index mapping:

PUT/POST your_server_ip:9200/index_name/_mappings

Payload

{
  "properties": {"test": {"type": "keyword"}}
}

Notes

New columns can be added
Some field type changes are not allowed; those require the _reindex API
Pass properties directly in the payload
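
One way to apply these notes is to decide, before calling _mappings, which fields can simply be added and which would need a reindex. The sketch below is a deliberate simplification: `plan_mapping_update` is a hypothetical helper, and it treats every type difference on an existing field as a conflict, which is stricter than the rules ES itself applies.

```python
def plan_mapping_update(existing, desired):
    """Split desired field types into addable fields and type conflicts.

    existing, desired: dicts of field name -> type name, e.g. {"test": "keyword"}.
    Returns (payload, conflicts). payload is a _mappings body holding only the
    new fields; conflicts lists fields whose type differs from the existing one.
    """
    new_fields = {}
    conflicts = []
    for name, field_type in desired.items():
        if name not in existing:
            new_fields[name] = {"type": field_type}
        elif existing[name] != field_type:
            # Assumption: any type change on an existing field needs _reindex.
            conflicts.append(name)
    return {"properties": new_fields}, conflicts
```

The returned payload can be sent as the body of the _mappings call, while a non-empty conflict list signals that a new index plus _reindex is the safer route.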

Update index settings:

PUT/POST your_server_ip:9200/index_name/_settings

Payload

{
  "index": {
    "analysis": {"analyzer": {"default": {"type": "keyword"}}},
    "refresh_interval": "30s",
    "max_result_window": "1000000",
    "max_rescore_window": "1000000"
  }
}

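Notes

Several indexes can be updated in one call by listing them in the URL: index_name = index1,index2,index3
Some settings, such as analysis, can only be changed after the index has been closed with the _close API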
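
Both notes can be combined into a small planner that joins the index names and adds the close and open steps only when needed. This is a sketch: `plan_settings_update` and `STATIC_KEYS` are hypothetical names, and the set of settings treated as requiring a closed index contains only analysis, the one example named in this document.

```python
# Assumption: of the settings used in this document, only "analysis"
# requires the index to be closed before it can be changed.
STATIC_KEYS = {"analysis"}


def plan_settings_update(base_url, index_names, settings):
    """Return the ordered (method, url, body) calls for a settings update."""
    # Several indexes are addressed at once by joining their names with commas.
    target = f"{base_url.rstrip('/')}/{','.join(index_names)}"
    update = ("PUT", f"{target}/_settings", {"index": settings})
    if STATIC_KEYS.intersection(settings):
        return [
            ("POST", f"{target}/_close", None),
            update,
            ("POST", f"{target}/_open", None),
        ]
    return [update]
```

Changing only refresh_interval yields a single PUT, while including analysis yields the close, update, open sequence, so indexes are not taken offline unless the change calls for it.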

Create an index template:

PUT your_server_ip:9200/_template/template_name

Payload

{
  "index_patterns": ["test*"],
  "settings": {
    "number_of_shards": 1,
    "analysis": {"analyzer": {"default": {"type": "keyword"}}},
    "refresh_interval": "30s",
    "max_result_window": "1000000",
    "max_rescore_window": "1000000"
  },
  "mappings": {
    "properties": {"state": {"type": "keyword"}}
  }
}

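Notes

index_patterns means that every index whose name starts with test gets the settings and mappings in this payload
The point of a template is that when indexes are created by date, the settings and mappings do not have to be supplied each time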
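
To make the time-based use case concrete, the sketch below generates a daily index name and checks it against the template's patterns. The naming scheme `test-YYYY.MM.DD` and both helper names are assumptions, and `fnmatch` is used only as a local approximation of ES pattern matching for patterns that contain nothing but `*` wildcards.

```python
import datetime
import fnmatch


def daily_index_name(prefix, day):
    """Return a date-suffixed index name, e.g. prefix "test-" -> "test-2019.05.28"."""
    return f"{prefix}{day:%Y.%m.%d}"


def template_applies(index_name, index_patterns):
    """Approximate whether a template with these patterns covers index_name.

    Only meant for simple patterns such as "test*"; fnmatch also gives meaning
    to "?" and "[...]", which should not be relied on here.
    """
    return any(fnmatch.fnmatchcase(index_name, p) for p in index_patterns)
```

For example, `daily_index_name("test-", datetime.date.today())` gives the name to write to today; since it starts with test, the first write creates the index with the template's settings and mappings and no separate create-index call is needed.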