Elasticsearch DSL

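Contents

Check the status of each node
Check cluster health
List indices and their status
Index a document (idempotent, with an explicit document ID)
Query all documents in an index
View an index's mapping
Delete an index
Index a document (non-idempotent, without a document ID)
Update a document
Delete a document
Get a document by ID
Full-text (analyzed) query
Full-text query on a nested property
Phrase matching
Term filtering
Range filtering
Sorting
Pagination
Highlighting
Combined exercises
Chinese word segmentation
- Inspect analyzer output
- Build an index with Chinese analysis
Index aliases
Index templates
- Create an index template
- List index templates
- Delete an index template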

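Check the status of each node

GET _cat/nodes?v

Check cluster health

GET _cat/health?v

List indices and their status

GET _cat/indices?v

Index a document (idempotent, with an explicit document ID)

The objects are related as follows: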

public class Movie {
    String id;
    String name;
    Double doubanScore;
    List<Actor> actorList;
}

public class Actor {
    String id;
    String name;
}

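The data:

Movie ID (id)	Movie name (name)	Douban score (doubanScore)	Actor ID (id)	Actor name (name)
1	operation red sea	8.5	1	zhang yi
			2	hai qing
			3	zhang han yu
2	operation mekong river	8.0	3	zhang han yu
3	incident red sea	5.0	4	lao wang

The movie names are English renderings of Chinese titles:

operation red sea: 红海行动
operation mekong river: 湄公河行动
incident red sea: 红海事件

Task: given an index named movie_index, insert the data above into it, assigning document IDs sequentially starting from 1.

Note: the document ID (_id) used here is unrelated to the id field inside the data.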

PUT /movie_index/_doc/1
{
  "id": 1,
  "name": "operation red sea",
  "doubanScore": 8.5,
  "actorList": [
    {"id": 1, "name": "zhang yi"},
    {"id": 2, "name": "hai qing"},
    {"id": 3, "name": "zhang han yu"}
  ]
}

PUT /movie_index/_doc/2
{
  "id": 2,
  "name": "operation mekong river",
  "doubanScore": 8.0,
  "actorList": [{"id": 3, "name": "zhang han yu"}]
}

PUT /movie_index/_doc/3
{
  "id": 3,
  "name": "incident red sea",
  "doubanScore": 5.0,
  "actorList": [{"id": 4, "name": "lao wang"}]
}

Query all documents in an index

# Short form
GET movie_index/_search

# Full form
GET movie_index/_search
{
  "query": {"match_all": {}}
}

Result

{"took" : 1,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : 1.0,"hits" : [{"_index" : "movie_index","_type" : "_doc","_id" : "1","_score" : 1.0,"_source" : {"id" : 1,"name" : "operation red sea","doubanScore" : 8.5,"actorList" : [{"id" : 1,"name" : "zhang yi"},{"id" : 2,"name" : "hai qing"},{"id" : 3,"name" : "zhang han yu"}]}},{"_index" : "movie_index","_type" : "_doc","_id" : "2","_score" : 1.0,"_source" : {"id" : 2,"name" : "operation mekong river","doubanScore" : 8.0,"actorList" : [{"id" : 3,"name" : "zhang han yu"}]}},{"_index" : "movie_index","_type" : "_doc","_id" : "3","_score" : 1.0,"_source" : {"id" : 3,"name" : "incident red sea","doubanScore" : 5.0,"actorList" : [{"id" : 4,"name" : "lao wang"}]}}]}
}

View an index's mapping

This is similar to DESC in MySQL.

GET movie_index/_mapping

Result

{"movie_index" : {"mappings" : {"properties" : {"actorList" : {"properties" : {"id" : {"type" : "long"},"name" : {"type" : "text","fields" : {"keyword" : {"type" : "keyword","ignore_above" : 256}}}}},"doubanScore" : {"type" : "float"},"id" : {"type" : "long"},"name" : {"type" : "text","fields" : {"keyword" : {"type" : "keyword","ignore_above" : 256}}}}}}
}

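The mapping tells us the following:

If no field types are declared before data is inserted, ES infers a type for each field from the first value it sees for that field (dynamic mapping). Here id became long and doubanScore became float.
By default a string is mapped as text, which is analyzed and stored in an inverted index, plus a keyword sub-field that holds the exact, unanalyzed value. Numeric fields are not analyzed; they are indexed in a separate structure (a BKD tree) suited to range lookups.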

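Delete an index

DELETE /movie_index

Index a document (non-idempotent, without a document ID)

The data:

Movie ID (id)	Movie name (name)	Douban score (doubanScore)	Actor ID (id)	Actor name (name)
3	incident red sea	5.0	5	cui hua

Note: when no document ID is given, ES generates a random one.

POST /movie_index/_doc/
{
  "id": 3,
  "name": "incident red sea",
  "doubanScore": 5.0,
  "actorList": [{"id": 5, "name": "cui hua"}]
}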
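The difference between the two ways of indexing can be sketched with a toy in-memory store. This is only an illustration in Python, not how ES stores data: TinyIndex, put and post are hypothetical names, and uuid4 stands in for the ID that ES would generate.

```python
import uuid


class TinyIndex:
    """A toy stand-in for one index: a mapping from document ID to source."""

    def __init__(self):
        self.docs = {}

    def put(self, doc_id, source):
        # Like PUT /index/_doc/<id>: the same ID always addresses the same slot,
        # so repeating the request overwrites instead of duplicating.
        self.docs[str(doc_id)] = source
        return str(doc_id)

    def post(self, source):
        # Like POST /index/_doc/: a fresh ID is generated on every call
        # (uuid4 is an assumption of this sketch, not the ES ID format).
        doc_id = uuid.uuid4().hex
        self.docs[doc_id] = source
        return doc_id


index = TinyIndex()
movie = {"id": 3, "name": "incident red sea", "doubanScore": 5.0}

index.put(3, movie)
index.put(3, movie)            # repeated PUT: still one document
count_after_put = len(index.docs)

index.post(movie)
index.post(movie)              # repeated POST: two more documents
count_after_post = len(index.docs)

print(count_after_put, count_after_post)
```

Retrying a PUT with an explicit ID is therefore safe, while retrying a POST without one produces duplicates.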

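Update a document

Only the specified fields are changed; all other fields are kept as they are.

Task: change the Douban score of the movie named incident red sea to 6.0. This assumes the document with ID 3 exists, that is, the sample data has been inserted again after the index was deleted.

POST movie_index/_update/3
{
  "doc": {"doubanScore": 6.0}
}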

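Delete a document

Task: delete the movie in which cui hua appears.

Its document ID was generated by ES, so yours will differ: wr0Bl4UBTlMIkXSYx4iC

DELETE movie_index/_doc/wr0Bl4UBTlMIkXSYx4iC

Get a document by ID

Fetch the document whose document ID is 1:

GET movie_index/_doc/1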

Full-text (analyzed) query

Find movies whose name matches operation red sea:

GET movie_index/_search
{
  "query": {"match": {"name": "operation red sea"}}
}

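How the inverted index is used

The inverted index maps each term to the documents that contain it:

operation	1, 2
red	1, 3
sea	1, 3
mekong	2
river	2
incident	3

The query string operation red sea is analyzed into operation | red | sea.

Matching:

operation -> documents 1, 2
red -> documents 1, 3
sea -> documents 1, 3

Result: 1 1 1 3 3 2. Document 1 matches three terms, document 3 matches two, and document 2 matches one. ES ranks hits by a relevance score, and in this example more matched terms means a higher score, so the hits come back in the order 1, 3, 2.

Note: the real score is not a plain match count. By default ES uses BM25, which also takes term rarity and field length into account. The count above is a simplified model, and it is enough to follow the examples here.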
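The walk-through above can be reproduced with a small Python sketch. It uses whitespace splitting in place of a real analyzer and a plain match count in place of BM25, so it mirrors the simplified model only; build_inverted_index and match are hypothetical helper names.

```python
from collections import Counter, defaultdict

docs = {
    1: "operation red sea",
    2: "operation mekong river",
    3: "incident red sea",
}


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def match(index, query):
    """Count matched query terms per document, highest count first."""
    hits = Counter()
    for term in query.lower().split():
        for doc_id in index.get(term, ()):
            hits[doc_id] += 1
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


inverted = build_inverted_index(docs)
ranking = match(inverted, "operation red sea")
print(ranking)  # [(1, 3), (3, 2), (2, 1)]
```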

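Full-text query on a nested property

Find movies in which zhang yi appears:

GET movie_index/_search
{
  "query": {"match": {"actorList.name": "zhang yi"}}
}

Because match analyzes the query into zhang | yi and a hit on either term is enough, movies featuring zhang han yu also match, with a lower score.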

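Phrase matching

Find movies whose name contains red sea as a phrase. The terms must appear next to each other and in this order, rather than being matched independently.

GET movie_index/_search
{
  "query": {"match_phrase": {"name": "red sea"}}
}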
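The contrast between match and match_phrase can be sketched in Python. This is a simplified model that assumes whitespace tokenization and no slop; match_any and match_phrase are hypothetical helpers, not ES APIs.

```python
def tokens(text):
    return text.lower().split()


def match_any(name, query):
    """Like match with the default OR operator: any shared term is a hit."""
    name_terms = set(tokens(name))
    return any(term in name_terms for term in tokens(query))


def match_phrase(name, query):
    """Like match_phrase: the query terms must be adjacent and in order."""
    n, q = tokens(name), tokens(query)
    return any(n[i:i + len(q)] == q for i in range(len(n) - len(q) + 1))


names = ["operation red sea", "operation mekong river", "incident red sea"]

phrase_hits = [n for n in names if match_phrase(n, "red sea")]
reversed_any = [n for n in names if match_any(n, "sea red")]
reversed_phrase = [n for n in names if match_phrase(n, "sea red")]

print(phrase_hits)
print(reversed_any)
print(reversed_phrase)
```

Reversing the word order leaves the match result unchanged but empties the phrase result, which is exactly the positional constraint that match_phrase adds.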

Term filtering

eg1: filter for movies whose name is exactly operation red sea

# Standard form
GET movie_index/_search
{
  "query": {
    "bool": {
      "filter": [{"term": {"name.keyword": "operation red sea"}}]
    }
  }
}

# Short form
GET movie_index/_search
{
  "query": {"term": {"name.keyword": {"value": "operation red sea"}}}
}

eg2: full-text match on the movie name red sea, filtered by the actor name zhang han yu

Analysis: the filter on zhang han yu keeps two documents, 1 and 2. Adding the full-text match on red sea, which is a must clause, leaves only document 1.

GET movie_index/_search
{
  "query": {
    "bool": {
      "must": [{"match": {"name": "red sea"}}],
      "filter": [{"term": {"actorList.name.keyword": "zhang han yu"}}]
    }
  }
}

eg3: full-text match on the movie name red sea, filtered by the actor name zhang han yu, where only the filter is mandatory. With should, a document that fails the full-text match is still returned; matching only raises its score.

GET movie_index/_search
{
  "query": {
    "bool": {
      "should": [{"match": {"name": "red sea"}}],
      "filter": [{"term": {"actorList.name.keyword": "zhang han yu"}}]
    }
  }
}
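The difference between eg2 and eg3 can be modelled in a few lines of Python. This is a rough sketch, not the ES implementation: bool_search, name_matches and actor_is are hypothetical helpers, the score is a simple count, and should is always treated as optional, which matches ES only when a must or filter clause is also present.

```python
movies = [
    {"id": 1, "name": "operation red sea",
     "actors": ["zhang yi", "hai qing", "zhang han yu"]},
    {"id": 2, "name": "operation mekong river", "actors": ["zhang han yu"]},
    {"id": 3, "name": "incident red sea", "actors": ["lao wang"]},
]


def name_matches(query):
    """Full-text style clause: true if the name shares any term with the query."""
    terms = set(query.split())
    return lambda movie: bool(terms & set(movie["name"].split()))


def actor_is(actor):
    """Exact-value clause, like a term query on the keyword sub-field."""
    return lambda movie: actor in movie["actors"]


def bool_search(movies, must=(), should=(), filters=()):
    hits = []
    for movie in movies:
        # filter and must clauses are mandatory; filter adds nothing to the score.
        if not all(clause(movie) for clause in filters):
            continue
        if not all(clause(movie) for clause in must):
            continue
        # should clauses are optional here and only raise the score.
        score = len(must) + sum(1 for clause in should if clause(movie))
        hits.append((score, movie["id"]))
    hits.sort(key=lambda hit: -hit[0])
    return [doc_id for _, doc_id in hits]


eg2 = bool_search(movies, must=[name_matches("red sea")],
                  filters=[actor_is("zhang han yu")])
eg3 = bool_search(movies, should=[name_matches("red sea")],
                  filters=[actor_is("zhang han yu")])
print(eg2, eg3)  # [1] [1, 2]
```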

Range filtering

Select movies with a Douban score >= 8.0 and <= 9.0:

GET movie_index/_search
{
  "query": {
    "bool": {
      "filter": [{"range": {"doubanScore": {"gte": 8.0, "lte": 9.0}}}]
    }
  }
}

Sorting

Select movies with a Douban score >= 8.0 and <= 9.0, sorted by Douban score in ascending order:

GET movie_index/_search
{
  "query": {
    "bool": {
      "filter": [{"range": {"doubanScore": {"gte": 8.0, "lte": 9.0}}}]
    }
  },
  "sort": [{"doubanScore": {"order": "asc"}}]
}

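Pagination

Parameters

from: zero-based offset of the first hit to return (0 is the first hit)
size: number of hits to return

Start from the first hit and return 2 hits per page:

GET movie_index/_search
{
  "from": 0,
  "size": 2
}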
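Since from is an offset rather than a page number, a page number has to be converted before building the request. A minimal Python sketch, assuming pages are numbered from 1; page_to_from and search_body are hypothetical helper names.

```python
def page_to_from(page, size):
    """Convert a 1-based page number into the zero-based from offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be at least 1")
    return (page - 1) * size


def search_body(page, size):
    """Build the pagination part of a _search request body."""
    return {"from": page_to_from(page, size), "size": size}


first_page = search_body(1, 2)
third_page = search_body(3, 2)
print(first_page)  # {'from': 0, 'size': 2}
print(third_page)  # {'from': 4, 'size': 2}
```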

Highlighting

Search for movies whose name matches red sea and highlight the matched terms:

GET movie_index/_search
{
  "query": {"match": {"name": "red sea"}},
  "highlight": {"fields": {"name": {}}}
}
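What the highlighter returns can be approximated in Python: every word of the field that matches a query term is wrapped in em tags, the default markers seen in ES responses. This sketch assumes simple word tokenization and is not the ES highlighter itself; highlight is a hypothetical helper.

```python
import re


def highlight(text, query, pre="<em>", post="</em>"):
    """Wrap each word of text that equals a query term in pre/post tags."""
    terms = set(query.lower().split())

    def wrap(match):
        word = match.group(0)
        return f"{pre}{word}{post}" if word.lower() in terms else word

    return re.sub(r"\w+", wrap, text)


highlighted = highlight("incident red sea", "red sea")
print(highlighted)  # incident <em>red</em> <em>sea</em>
```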

Combined exercises

eg1: filter on doubanScore >= 5.0, search for the keywords red sea, highlight the keywords, return the first page with 2 hits per page, and sort by doubanScore in ascending order

GET movie_index/_search
{
  "query": {
    "bool": {
      "must": [{"match": {"name": "red sea"}}],
      "filter": [{"range": {"doubanScore": {"gte": 5.0}}}]
    }
  },
  "highlight": {"fields": {"name": {}}},
  "from": 0,
  "size": 2,
  "sort": [{"doubanScore": {"order": "asc"}}]
}

Result

{"took" : 4,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 2,"relation" : "eq"},"max_score" : null,"hits" : [{"_index" : "movie_index","_type" : "_doc","_id" : "3","_score" : null,"_source" : {"id" : 3,"name" : "incident red sea","doubanScore" : 5.0,"actorList" : [{"id" : 4,"name" : "lao wang"}]},"highlight" : {"name" : ["incident <em>red</em> <em>sea</em>"]},"sort" : [5.0]},{"_index" : "movie_index","_type" : "_doc","_id" : "1","_score" : null,"_source" : {"id" : 1,"name" : "operation red sea","doubanScore" : 8.5,"actorList" : [{"id" : 1,"name" : "zhang yi"},{"id" : 2,"name" : "hai qing"},{"id" : 3,"name" : "zhang han yu"}]},"highlight" : {"name" : ["operation <em>red</em> <em>sea</em>"]},"sort" : [8.5]}]}
}

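eg2: aggregation, counting how many movies each actor has appeared in

GET movie_index/_search
{
  "aggs": {
    "groupByActorName": {
      "terms": {"field": "actorList.name.keyword", "size": 10}
    }
  },
  "size": 0
}

Parameters:

groupByActorName: a name of our own choosing for the aggregation result.
terms: groups documents by the value of field. ES counts the documents in each group automatically, so no explicit count is needed.
field: must be a keyword field. The analyzed text field actorList.name cannot be aggregated on by default.
size (inside terms): the maximum number of groups returned. Unlike GROUP BY in MySQL, ES does not return every group automatically, so to see all of them choose a value at least as large as the expected number of groups.
size (top level): 0 suppresses the individual hits, leaving only the aggregation result.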
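The terms aggregation behaves much like a counter followed by a top-N cut. The Python sketch below uses the three sample movies from the start of the article; terms_agg is a hypothetical helper, and breaking ties by name is a choice made for this sketch.

```python
from collections import Counter

movies = [
    {"name": "operation red sea", "actors": ["zhang yi", "hai qing", "zhang han yu"]},
    {"name": "operation mekong river", "actors": ["zhang han yu"]},
    {"name": "incident red sea", "actors": ["lao wang"]},
]


def terms_agg(movies, size):
    """Count movies per actor and keep only the size largest groups."""
    counts = Counter(actor for movie in movies for actor in movie["actors"])
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ordered[:size]


all_groups = terms_agg(movies, size=10)
top_two = terms_agg(movies, size=2)
print(all_groups)
print(top_two)
```

With size=2 only two of the four groups come back, which is why size should be at least the expected number of groups when every group is needed.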

eg3: the average score of the movies each actor has appeared in, sorted by that average in descending order

GET movie_index/_search
{
  "aggs": {
    "groupByActorName": {
      "terms": {
        "field": "actorList.name.keyword",
        "size": 10,
        "order": {"avgDoubanScore": "desc"}
      },
      "aggs": {
        "avgDoubanScore": {"avg": {"field": "doubanScore"}}
      }
    }
  },
  "size": 0
}
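The nested aggregation amounts to grouping by actor, averaging the scores within each group, and sorting the groups by that average. A Python sketch over the original three sample movies (before any update or delete); avg_score_by_actor is a hypothetical helper.

```python
from collections import defaultdict

scored_movies = [
    {"doubanScore": 8.5, "actors": ["zhang yi", "hai qing", "zhang han yu"]},
    {"doubanScore": 8.0, "actors": ["zhang han yu"]},
    {"doubanScore": 5.0, "actors": ["lao wang"]},
]


def avg_score_by_actor(movies):
    """Group scores by actor, average them, and sort by average descending."""
    scores = defaultdict(list)
    for movie in movies:
        for actor in movie["actors"]:
            scores[actor].append(movie["doubanScore"])
    averages = {actor: sum(vals) / len(vals) for actor, vals in scores.items()}
    return sorted(averages.items(), key=lambda kv: -kv[1])


averages = avg_score_by_actor(scored_movies)
for actor, avg in averages:
    print(actor, avg)
```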

Chinese word segmentation

Inspect analyzer output

GET _analyze
{
  "text": "红海战役"
}

Result

{"tokens": [{"token": "红","start_offset": 0,"end_offset": 1,"type": "<IDEOGRAPHIC>","position": 0},{"token": "海","start_offset": 1,"end_offset": 2,"type": "<IDEOGRAPHIC>","position": 1},{"token": "战","start_offset": 2,"end_offset": 3,"type": "<IDEOGRAPHIC>","position": 2},{"token": "役","start_offset": 3,"end_offset": 4,"type": "<IDEOGRAPHIC>","position": 3}]
}

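By default (the standard analyzer) Chinese text is split into single characters. To segment it into words, install a third-party analysis plugin that ships its own dictionary, here IK: elasticsearch-analysis-ik-7.8.0.zip

After installing the IK plugin, test the effect.

Inspect analyzer output with the analyzer set to ik_smart:

GET _analyze
{
  "text": "我是中国人",
  "analyzer": "ik_smart"
}

Result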

{"tokens": [{"token": "我","start_offset": 0,"end_offset": 1,"type": "CN_CHAR","position": 0},{"token": "是","start_offset": 1,"end_offset": 2,"type": "CN_CHAR","position": 1},{"token": "中国人","start_offset": 2,"end_offset": 5,"type": "CN_WORD","position": 2}]
}

Inspect analyzer output with the analyzer set to ik_max_word:

GET _analyze
{
  "text": "我是中国人",
  "analyzer": "ik_max_word"
}

Result

{"tokens" : [{"token" : "我","start_offset" : 0,"end_offset" : 1,"type" : "CN_CHAR","position" : 0},{"token" : "是","start_offset" : 1,"end_offset" : 2,"type" : "CN_CHAR","position" : 1},{"token" : "中国人","start_offset" : 2,"end_offset" : 5,"type" : "CN_WORD","position" : 2},{"token" : "中国","start_offset" : 2,"end_offset" : 4,"type" : "CN_WORD","position" : 3},{"token" : "国人","start_offset" : 3,"end_offset" : 5,"type" : "CN_WORD","position" : 4}]
}

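Build an index with Chinese analysis

For Chinese text, dynamic mapping is no longer enough: the analyzer for each text field has to be declared in the mapping when the index is created.

1) Declare the field types when creating the index

PUT movie_index_cn
{
  "mappings": {
    "properties": {
      "id": {"type": "long"},
      "name": {"type": "text", "analyzer": "ik_smart"},
      "doubanScore": {"type": "double"},
      "actorList": {
        "properties": {
          "id": {"type": "long"},
          "name": {"type": "keyword"}
        }
      }
    }
  }
}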

2) Insert the data

PUT /movie_index_cn/_doc/1
{
  "id": 1,
  "name": "红海行动",
  "doubanScore": 8.5,
  "actorList": [
    {"id": 1, "name": "张译"},
    {"id": 2, "name": "海清"},
    {"id": 3, "name": "张涵予"}
  ]
}

PUT /movie_index_cn/_doc/2
{
  "id": 2,
  "name": "湄公河行动",
  "doubanScore": 8.0,
  "actorList": [{"id": 3, "name": "张涵予"}]
}

PUT /movie_index_cn/_doc/3
{
  "id": 3,
  "name": "红海事件",
  "doubanScore": 5.0,
  "actorList": [{"id": 4, "name": "老王"}]
}

3) Query the data

A match query on name, for example, is analyzed with ik_smart before matching:

GET movie_index_cn/_search
{
  "query": {"match": {"name": "红海行动"}}
}

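Index aliases

1) Group several indices

Splitting an index, for example by time period, copes with changes in the data structure. With frequent splits, however, the data for a longer period such as a quarter or a year ends up spread across several indices, which makes statistics over that period awkward. Giving the split indices the same alias solves this: the query simply targets the alias.

2) Create a view on a subset of an index

Suppose the actor zhang han yu wins wide acclaim for one of his roles, and people start searching for every movie he has appeared in. For this case we can create a filtered alias that covers only the movies he appears in, which keeps such queries simple.

POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "movie_index",
        "alias": "movie_index_zhy",
        "filter": {"term": {"actorList.name.keyword": "zhang han yu"}}
      }
    }
  ]
}

# Query the data
GET movie_index_zhy/_search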

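3) Switch seamlessly from one index to another on a running cluster

First add an alias to an existing index, here the index with the English data, movie_index:

POST _aliases
{
  "actions": [
    {"add": {"index": "movie_index", "alias": "movie_index_b"}}
  ]
}

Switching seamlessly from one index to another

Suppose a user, Li Si, does not read English well and only wants to see the Chinese data. The alias movie_index_b then has to be changed from pointing at movie_index to pointing at movie_index_cn. Both actions are sent in one request and applied atomically, so the alias never points at nothing.

POST _aliases
{
  "actions": [
    {"remove": {"index": "movie_index", "alias": "movie_index_b"}},
    {"add": {"index": "movie_index_cn", "alias": "movie_index_b"}}
  ]
}

# Query the data
GET movie_index_b/_search

Result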

{"took" : 3,"timed_out" : false,"_shards" : {"total" : 1,"successful" : 1,"skipped" : 0,"failed" : 0},"hits" : {"total" : {"value" : 3,"relation" : "eq"},"max_score" : 1.0,"hits" : [{"_index" : "movie_index_cn","_type" : "_doc","_id" : "1","_score" : 1.0,"_source" : {"id" : 1,"name" : "红海行动","doubanScore" : 8.5,"actorList" : [{"id" : 1,"name" : "张译"},{"id" : 2,"name" : "海清"},{"id" : 3,"name" : "张涵予"}]}},{"_index" : "movie_index_cn","_type" : "_doc","_id" : "2","_score" : 1.0,"_source" : {"id" : 2,"name" : "湄公河行动","doubanScore" : 8.0,"actorList" : [{"id" : 3,"name" : "张涵予"}]}},{"_index" : "movie_index_cn","_type" : "_doc","_id" : "3","_score" : 1.0,"_source" : {"id" : 3,"name" : "红海事件","doubanScore" : 5.0,"actorList" : [{"id" : 4,"name" : "老王"}]}}]}
}
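The effect of the remove and add actions can be modelled as one update of an alias table. This Python sketch is only a mental model, with apply_actions as a hypothetical helper: it works on a copy and returns the new table in one step, mirroring the way the actions of a single _aliases request take effect together.

```python
def apply_actions(aliases, actions):
    """Return a new alias table with all actions applied; the input is untouched."""
    updated = {name: set(targets) for name, targets in aliases.items()}
    for action in actions:
        (kind, spec), = action.items()   # each action has exactly one key
        targets = updated.setdefault(spec["alias"], set())
        if kind == "add":
            targets.add(spec["index"])
        elif kind == "remove":
            targets.discard(spec["index"])
        else:
            raise ValueError(f"unsupported action: {kind}")
    return updated


before = {"movie_index_b": {"movie_index"}}
swap = [
    {"remove": {"index": "movie_index", "alias": "movie_index_b"}},
    {"add": {"index": "movie_index_cn", "alias": "movie_index_b"}},
]
after = apply_actions(before, swap)
print(before)  # {'movie_index_b': {'movie_index'}}
print(after)   # {'movie_index_b': {'movie_index_cn'}}
```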

Index templates

Create an index template

PUT _template/template_movie1111
{
  "index_patterns": ["movie_test*"],
  "settings": {"number_of_shards": 1},
  "aliases": {
    "{index}-query": {},
    "movie_test2-query": {}
  },
  "mappings": {
    "properties": {
      "id": {"type": "keyword"},
      "movie_name": {"type": "text", "analyzer": "ik_smart"}
    }
  }
}

PUT movie_test1/_doc/111
{
  "id": "1001",
  "movie_name": "大宅门"
}

Parameters:

index_patterns: any new index whose name starts with movie_test, followed by any characters, picks up this template, for example movie_test1, movie_testj and movie_test_aa.
aliases: aliases added to every matching index. {index} is replaced by the index name, so movie_test1 gets the alias movie_test1-query; movie_test2-query is a fixed alias shared by all matching indices.
mappings: declares the type of each field.
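How a template is selected and how the {index} placeholder expands can be sketched in Python. The sketch uses shell-style wildcard matching from the standard library as a stand-in for ES pattern matching, and template_applies and aliases_for are hypothetical helpers.

```python
from fnmatch import fnmatchcase

template = {
    "index_patterns": ["movie_test*"],
    "aliases": ["{index}-query", "movie_test2-query"],
}


def template_applies(index_name, template):
    """True if the new index name matches any pattern of the template."""
    return any(fnmatchcase(index_name, pattern)
               for pattern in template["index_patterns"])


def aliases_for(index_name, template):
    """Aliases a new index would receive, with {index} expanded to its name."""
    if not template_applies(index_name, template):
        return []
    return [alias.replace("{index}", index_name) for alias in template["aliases"]]


print(aliases_for("movie_test1", template))
print(aliases_for("movie_test_aa", template))
print(aliases_for("movie_index", template))
```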

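Ways to query the data: by index name, or through either alias defined in the template

GET movie_test1/_search
GET movie_test1-query/_search
GET movie_test2-query/_search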

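List index templates

GET _cat/templates?v

Delete an index template

Pass the name of the template to delete. The template created above is named template_movie1111:

DELETE _template/template_movie1111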