Advanced Document Operations in Elasticsearch
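Contents
- Conditional queries in ES
- Querying with URL parameters
- Querying with a request body
- Paginated queries in ES
- Sorting query results in ES
- Multi-condition queries in ES
- All conditions must hold: must
- Some conditions hold: should
- Range conditions: filter
- Full-text search in ES
- Full-text search: match
- Phrase matching: match_phrase
- Highlighted queries: highlight
- Aggregation queries in ES
- Grouped counts: terms
- Averages: avg
- Document mappings in ES
- Preparing test data
- Creating the mapping
- Inserting test data
- Testing
- Querying by name
- Searching on the `keyword`-typed `sex` field
- Summary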
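Conditional queries in ES
The earlier document operations only fetched documents by id or retrieved everything. In practice those two approaches rarely cover the variety of searches an application needs, so querying by condition is essential.
Querying with URL parameters
The URL format for a conditional query is http://127.0.0.1:9200/{index}/_search?q={condition}.
For example, to fetch every document whose author is PPPsych:
- The URL is http://127.0.0.1:9200/golang/_search?q=author:PPPsych
- The request method is GET
Send the request and the response is: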
{"took": 16,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 3,"relation": "eq"},"max_score": 0.25613075,"hits": [{"_index": "golang","_type": "_doc","_id": "003","_score": 0.25613075,"_source": {"name": "Psych","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}},{"_index": "golang","_type": "_doc","_id": "001","_score": 0.25613075,"_source": {"name": "ROOT","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}},{"_index": "golang","_type": "_doc","_id": "002","_score": 0.25613075,"_source": {"name": "root","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}}]} }
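The URL format above can be sketched in Python. This is a minimal illustration using only the standard library; the helper name build_search_url is hypothetical, and the host and index are the ones used in this article. It also shows how percent-encoding keeps a Chinese value intact when it has to travel in a URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

BASE_URL = "http://127.0.0.1:9200"  # local node, as in the article


def build_search_url(index: str, field: str, value: str) -> str:
    """Build a URL of the form /INDEX/_search?q=FIELD:VALUE (hypothetical helper)."""
    # urlencode percent-encodes the query string as UTF-8, so non-ASCII
    # values such as Chinese text are transmitted unambiguously.
    query_string = urlencode({"q": f"{field}:{value}"})
    return f"{BASE_URL}/{index}/_search?{query_string}"


url = build_search_url("golang", "author", "钟离")
print(url)

# Decoding the query string gives back the original condition.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
```

The colon is encoded as well (as %3A); the server decodes it back before parsing the condition.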
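Querying with a request body
As the example above shows, packing the condition into the URL gets unwieldy, and passing Chinese or other non-ASCII characters in a URL easily leads to garbled text and failed queries. Passing the condition in the request body avoids these problems, and ES conditional queries support it.
The URL is then fixed: http://127.0.0.1:9200/golang/_search
The request body is structured as follows:
The outermost key of the condition is query, the matching clause is match, and match holds an object in which you fill in the field name and the value to search for.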
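The structure just described can be written out as a small Python sketch. The field and value are taken from the author example used throughout this article; the variable names are only illustrative.

```python
import json

# Outer key "query", matching clause "match", then {field: value}.
body = {"query": {"match": {"author": "PPPsych"}}}

# ensure_ascii=False keeps any Chinese values readable in the payload.
payload = json.dumps(body, ensure_ascii=False)
print(payload)
```

The resulting string is what would be sent as the body of the request to the _search URL.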
Send the request and the response is:
{"took": 2,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 3,"relation": "eq"},"max_score": 0.25613075,"hits": [{"_index": "golang","_type": "_doc","_id": "003","_score": 0.25613075,"_source": {"name": "Psych","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}},{"_index": "golang","_type": "_doc","_id": "001","_score": 0.25613075,"_source": {"name": "ROOT","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}},{"_index": "golang","_type": "_doc","_id": "002","_score": 0.25613075,"_source": {"name": "root","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}}]}
}
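The result is identical to that of the URL-parameter query.
Paginated queries in ES
Pagination simply adds paging fields to the body of a conditional query.
Add two key fields to the body:
- from: the offset of the first hit to return, starting at 0.
- size: the number of hits per page.
For example, to query the documents whose author is PPPsych, the request body is:
{"query":{"match": {"author":"PPPsych"}},"from":1,"size":1
}
The response is: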
{"took": 3,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 3,"relation": "eq"},"max_score": 0.25613075,"hits": [{"_index": "golang","_type": "_doc","_id": "001","_score": 0.25613075,"_source": {"name": "ROOT","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}}]}
}
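Because size is 1, only one matching document is returned, and because from is 1 it is the second hit rather than the first. For a 1-based page number, from is computed as (page number - 1) * page size.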
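The page arithmetic can be sketched as follows. The function names page_to_from and paged_query are hypothetical helpers, not part of any ES client.

```python
def page_to_from(page: int, size: int) -> int:
    """Return the `from` offset for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must both be at least 1")
    return (page - 1) * size


def paged_query(field: str, value: str, page: int, size: int) -> dict:
    """Build a match query restricted to a single page of hits."""
    return {
        "query": {"match": {field: value}},
        "from": page_to_from(page, size),
        "size": size,
    }


# Page 2 with one hit per page gives from=1, size=1, as in the example above.
body = paged_query("author", "PPPsych", page=2, size=1)
print(body)
```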
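On top of this, we can control which fields are returned.
When only some fields are wanted rather than the whole document, use the _source field.
For example, to return only the name field of the documents whose author is PPPsych, the request body is:
{"query":{"match": {"author":"PPPsych"}},"from":1,"size":1,"_source":"name"
}
Each hit in the response then carries only the requested name field in its _source.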
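Sorting query results in ES
Results can be sorted on document fields by adding a sort field to the request body.
For example, to sort by _id, the request body is:
{"query":{"match_all": {}},"sort":{"_id":{"order":"asc"}}
}
Result: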
{"took": 22,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 4,"relation": "eq"},"max_score": null,"hits": [{"_index": "golang","_type": "_doc","_id": "001","_score": null,"_source": {"name": "ROOT","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"},"sort": ["001"]},{"_index": "golang","_type": "_doc","_id": "002","_score": null,"_source": {"name": "root","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"},"sort": ["002"]},{"_index": "golang","_type": "_doc","_id": "003","_score": null,"_source": {"name": "Psych","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"},"sort": ["003"]},{"_index": "golang","_type": "_doc","_id": "005","_score": null,"_source": {"name": "Morax","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "钟离"},"sort": ["005"]}]}
}
The sort order, ascending or descending, is specified under the sort field:
- asc: ascending order
- desc: descending order
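A small sketch of building the sort clause with either order. The helper sorted_match_all is a hypothetical name; it only assembles the request body and rejects anything other than the two orders listed above.

```python
VALID_ORDERS = ("asc", "desc")


def sorted_match_all(field: str, order: str = "asc") -> dict:
    """Build a match_all query sorted on one field."""
    if order not in VALID_ORDERS:
        raise ValueError(f"order must be one of {VALID_ORDERS}, got {order!r}")
    return {
        "query": {"match_all": {}},
        "sort": {field: {"order": order}},
    }


ascending = sorted_match_all("_id")           # same body as in the example
descending = sorted_match_all("_id", "desc")  # reversed order
print(ascending)
print(descending)
```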
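Multi-condition queries in ES
All conditions must hold: must
With several conditions, the query in the body can no longer be a bare match clause. Use bool instead, and list the conditions that must all hold under must.
For example, the request body is:
{"query":{"bool": {"must":[{"match":{"author":"PPPsych"}}]}}
}
With a single clause this behaves just like the single-condition query. Now add a second condition:
{"query":{"bool": {"must":[{"match":{"author":"PPPsych"}},{"match":{"name":"Psych"}}]}}
}
Result: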
{"took": 6,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 1,"relation": "eq"},"max_score": 0.9492779,"hits": [{"_index": "golang","_type": "_doc","_id": "003","_score": 0.9492779,"_source": {"name": "Psych","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}}]}
}
The returned document satisfies both "author":"PPPsych" and "name":"Psych".
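The must structure generalizes to any number of conditions. A minimal sketch, assuming every condition is a simple match on one field (bool_must is a hypothetical helper name):

```python
def bool_must(conditions: dict) -> dict:
    """Build a bool query whose match clauses must all hold (logical AND)."""
    clauses = [{"match": {field: value}} for field, value in conditions.items()]
    return {"query": {"bool": {"must": clauses}}}


# Two conditions produce the same body as the second example above.
body = bool_must({"author": "PPPsych", "name": "Psych"})
print(body)
```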
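Some conditions hold: should
should is similar to OR in SQL: a document matches when at least one of the clauses holds.
For example, to find the documents whose author is either PPPsych or 钟离, the request body is:
{"query":{"bool": {"should":[{"match":{"author":"PPPsych"}},{"match":{"author":"钟离"}}]}}
}
Result: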
{"took": 192,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 4,"relation": "eq"},"max_score": 2.3842063,"hits": [{"_index": "golang","_type": "_doc","_id": "005","_score": 2.3842063,"_source": {"name": "Morax","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "钟离"}},{"_index": "golang","_type": "_doc","_id": "003","_score": 0.25613075,"_source": {"name": "Psych","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}},{"_index": "golang","_type": "_doc","_id": "001","_score": 0.25613075,"_source": {"name": "ROOT","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}},{"_index": "golang","_type": "_doc","_id": "002","_score": 0.25613075,"_source": {"name": "root","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}}]}
}
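Range conditions: filter
A filter can be added to the query to restrict the range of the results.
For example, to search a product index for items in the categories 床上用品 (bedding) or 3C数码 (3C digital products) priced above 500, use the request body below. Once a bool query contains a filter, its should clauses become optional by default, so minimum_should_match is set to 1 to keep the category condition mandatory:
{"query":{"bool": {"should":[{"match":{"category":"床上用品"}},{"match":{"category":"3C数码"}}],"minimum_should_match":1,"filter":{"range":{"price":{"gt":500}}}}}
}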
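A sketch of assembling this kind of query in Python. The helper name category_price_query is hypothetical. It sets minimum_should_match to 1 because, in a bool query that also has a filter, should clauses are optional by default and the category condition would otherwise not be enforced.

```python
def category_price_query(categories: list, min_price: float) -> dict:
    """Match any of the categories, restricted to price > min_price."""
    return {
        "query": {
            "bool": {
                "should": [{"match": {"category": c}} for c in categories],
                # Require at least one should clause to match.
                "minimum_should_match": 1,
                # gt means strictly greater than.
                "filter": {"range": {"price": {"gt": min_price}}},
            }
        }
    }


body = category_price_query(["床上用品", "3C数码"], 500)
print(body)
```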
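Full-text search in ES
Full-text search: match
Earlier we searched for documents whose author is PPPsych or 钟离. What happens if we search for author=PPPsych钟离 instead? Will anything come back? In a traditional relational database, an equality condition on that value would find nothing.
The request body is:
{"query":{"match": {"author":"PPPsych钟离"}}
}
Result: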
{"took": 2,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 4,"relation": "eq"},"max_score": 2.3842063,"hits": [{"_index": "golang","_type": "_doc","_id": "005","_score": 2.3842063,"_source": {"name": "Morax","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "钟离"}},{"_index": "golang","_type": "_doc","_id": "003","_score": 0.25613075,"_source": {"name": "Psych","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}},{"_index": "golang","_type": "_doc","_id": "001","_score": 0.25613075,"_source": {"name": "ROOT","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}},{"_index": "golang","_type": "_doc","_id": "002","_score": 0.25613075,"_source": {"name": "root","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "PPPsych"}}]}
}
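As shown above, the query returns 4 hits covering both groups of documents: those whose author is PPPsych and the one whose author is 钟离. The query text itself is analyzed into terms, each term is looked up in the inverted index, and a document matches if it contains any of them. That is the effect of full-text search.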
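The mechanism can be imitated with a toy inverted index. This is only an illustration of the idea, not how ES is implemented: the analyze function is a simplified stand-in for a real analyzer (it lowercases, keeps runs of ASCII letters and digits as one term, and treats each Chinese character as its own term), and the four author values are copied from the responses above.

```python
import re
from collections import defaultdict

# Toy analyzer pattern: an ASCII word, or a single CJK character.
TOKEN_RE = re.compile(r"[a-z0-9]+|[\u4e00-\u9fff]")


def analyze(text: str) -> list:
    """Split text into lowercase terms (simplified analyzer)."""
    return TOKEN_RE.findall(text.lower())


def build_inverted_index(docs: dict) -> dict:
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def match(index: dict, query: str) -> set:
    """Return ids of documents containing at least one query term."""
    hits = set()
    for term in analyze(query):
        hits |= index.get(term, set())
    return hits


authors = {"001": "PPPsych", "002": "PPPsych", "003": "PPPsych", "005": "钟离"}
index = build_inverted_index(authors)
print(analyze("PPPsych钟离"))
print(sorted(match(index, "PPPsych钟离")))
```

The query splits into three terms, and all four documents contain at least one of them, which mirrors the 4 hits in the response.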
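Phrase matching: match_phrase
If a single matching term should not be enough, and the field must instead contain the whole search text, use phrase matching: replace match with match_phrase. Change the request body to:
{"query":{"match_phrase": {"author":"PPPsych钟离"}}
}
Result: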
{"took": 75,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 0,"relation": "eq"},"max_score": null,"hits": []}
}
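No document in ES has an author containing the phrase PPPsych钟离, so nothing is found.
What if we change the author value to 钟? Does this behave like a LIKE fuzzy query in a relational database?
{"query":{"match_phrase": {"author":"钟"}}
}
Result: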
{"took": 1,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 1,"relation": "eq"},"max_score": 1.1921031,"hits": [{"_index": "golang","_type": "_doc","_id": "005","_score": 1.1921031,"_source": {"name": "Morax","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "钟离"}}]}
}
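The query finds the matching document. Note that match_phrase still analyzes the query text; the difference from match is that all the resulting terms must appear in the field, adjacent and in the same order. With the standard analyzer, Chinese text is split into single characters, so 钟 is itself an indexed term of 钟离, and the result looks like LIKE '%钟%' in a relational database. The resemblance holds only for whole terms: a fragment of an English word, such as PPPsy, would not match PPPsych.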
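Phrase matching can be imitated with a toy check that the query terms occur consecutively and in order among the field's terms. As before, analyze is a simplified stand-in for a real analyzer (one term per ASCII word, one term per Chinese character), so this is a sketch of the idea rather than the actual ES algorithm.

```python
import re

TOKEN_RE = re.compile(r"[a-z0-9]+|[\u4e00-\u9fff]")


def analyze(text: str) -> list:
    """Toy analyzer: lowercase words, one term per CJK character."""
    return TOKEN_RE.findall(text.lower())


def phrase_match(field_value: str, phrase: str) -> bool:
    """True if the phrase's terms occur consecutively, in order, in the field."""
    doc_terms = analyze(field_value)
    query_terms = analyze(phrase)
    if not query_terms:
        return False
    width = len(query_terms)
    return any(
        doc_terms[start:start + width] == query_terms
        for start in range(len(doc_terms) - width + 1)
    )


print(phrase_match("钟离", "钟"))           # 钟 is a whole term of the field
print(phrase_match("钟离", "PPPsych钟离"))  # the field lacks the term pppsych
print(phrase_match("PPPsych", "PPPsy"))    # only a fragment of a term
```

Only the first call succeeds, which is where the comparison with LIKE breaks down: LIKE '%PPPsy%' would match PPPsych, while a phrase query works on whole terms.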
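Highlighted queries: highlight
To have matched keywords highlighted in the results, as search engines such as Baidu and Google do, add a highlight section to the request.
The request body is:
{"query":{"match_phrase": {"author":"钟"}},"highlight": {"fields": {"author": {}}}
}
Result: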
{"took": 27,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 1,"relation": "eq"},"max_score": 1.1921031,"hits": [{"_index": "golang","_type": "_doc","_id": "005","_score": 1.1921031,"_source": {"name": "Morax","url": "https://blog.csdn.net/qq_39280718?type=blog","author": "钟离"},"highlight": {"author": ["<em>钟</em>离"]}}]}
}
As shown above, each hit in the response gains an extra highlight field, in which the matched keyword is wrapped in an em tag.
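Aggregation queries in ES
Operations such as counting and grouping over query results call for aggregations, which are requested through a dedicated parameter named aggs.
Grouped counts: terms
Suppose there is an index named shopping with the following documents: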
{"took": 654,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 3,"relation": "eq"},"max_score": 1.0,"hits": [{"_index": "shopping","_type": "_doc","_id": "1","_score": 1.0,"_source": {"title": "sk2爽肤水","category": "护肤品","image": "https://www.tb.com","price": 899.0}},{"_index": "shopping","_type": "_doc","_id": "2","_score": 1.0,"_source": {"title": "眼霜","category": "化妆品","image": "https://www.tb.com","price": 249.0}},{"_index": "shopping","_type": "_doc","_id": "3","_score": 1.0,"_source": {"title": "拍立得","category": "数码产品","image": "https://www.tb.com","price": 299.0}}]}
}
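To group and count the documents by price, use the request body below. aggs introduces the aggregations, category_group is an arbitrary name chosen for this one, and terms groups by the given field:
{"aggs":{"category_group":{"terms":{"field":"price"}}}
}
Sending it as a GET request gives: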
{"took": 16,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 3,"relation": "eq"},"max_score": 1.0,"hits": [{"_index": "shopping","_type": "_doc","_id": "1","_score": 1.0,"_source": {"title": "sk2爽肤水","category": "护肤品","image": "https://www.tb.com","price": 899.0}},{"_index": "shopping","_type": "_doc","_id": "2","_score": 1.0,"_source": {"title": "眼霜","category": "化妆品","image": "https://www.tb.com","price": 249.0}},{"_index": "shopping","_type": "_doc","_id": "3","_score": 1.0,"_source": {"title": "拍立得","category": "数码产品","image": "https://www.tb.com","price": 299.0}}]},"aggregations": {"category_group": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": 249.0,"doc_count": 1},{"key": 299.0,"doc_count": 1},{"key": 899.0,"doc_count": 1}]}}
}
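The aggregations section of the response holds the groups (buckets) by price.
The response also still contains the original documents. To leave them out, add a size parameter set to 0 to the request body.
Result: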
{"took": 4,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 3,"relation": "eq"},"max_score": null,"hits": []},"aggregations": {"category_group": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": 249.0,"doc_count": 1},{"key": 299.0,"doc_count": 1},{"key": 899.0,"doc_count": 1}]}}
}
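The request with size added, together with a local illustration of what the buckets mean, can be sketched as follows. The prices are copied from the three shopping documents; sorting the buckets by key is a simplification that happens to match the response for this sample, where every bucket has the same count.

```python
from collections import Counter

# Request body: a terms aggregation on price, with size 0 so that only the
# aggregation result is returned and the hits list stays empty.
body = {
    "aggs": {"category_group": {"terms": {"field": "price"}}},
    "size": 0,
}

# Local illustration: one bucket per distinct value, each holding the
# number of documents with that value.
prices = [899.0, 249.0, 299.0]
buckets = [
    {"key": key, "doc_count": count}
    for key, count in sorted(Counter(prices).items())
]
print(buckets)
```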
Averages: avg
To get the average price, the request body is:
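A body consistent with the response that follows (an aggregation named category_group and an empty hits list) would be an avg aggregation on price with size set to 0. The sketch below is a reconstruction under that assumption, not necessarily the author's exact request; it also cross-checks the average locally against the three shopping documents.

```python
from statistics import mean

# Assumed request body: avg aggregation on price, reusing the name
# category_group, with size 0 to suppress the hits.
body = {
    "aggs": {"category_group": {"avg": {"field": "price"}}},
    "size": 0,
}

# Cross-check: the mean of the three prices in the shopping index.
prices = [899.0, 249.0, 299.0]
average = mean(prices)
print(average)
```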
The result is:
{"took": 2,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 3,"relation": "eq"},"max_score": null,"hits": []},"aggregations": {"category_group": {"value": 482.3333333333333}}
}
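Document mappings in ES
mapping: the mapping of an index.
In a relational database, the table structure has to be created before any data is inserted, yet all the document operations so far were done without defining any structure. ES really does not require anything like a table structure to be created up front, but it does allow one to be defined.
In ES this structure is called a mapping. Its main job is to define whether a field is analyzed (split into terms) and whether it can be searched.
Preparing test data
First create a new index named student.
Creating the mapping
With the new index in place, define the mapping on it.
Defining a mapping also uses a PUT request, sent to the URL http://127.0.0.1:9200/student/_mapping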
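A mapping body for this index might look like the sketch below. It is a hypothetical reconstruction: the types of name (text) and sex (keyword) follow the tests later in this article, while the tel entry, including its type and "index": false, is an assumption added to illustrate a non-searchable field.

```python
import json

# Hypothetical mapping for the student index.
mapping = {
    "properties": {
        "name": {"type": "text", "index": True},     # analyzed, searchable
        "sex": {"type": "keyword", "index": True},   # not analyzed, searchable
        "tel": {"type": "keyword", "index": False},  # assumed: not searchable
    }
}

# Python booleans serialize to JSON true/false.
payload = json.dumps(mapping)
print(payload)
```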
Inserting test data
Insert three documents:
Document 1:
{"name": "可莉","sex": "女","tel": "14100000000"
}
Document 2:
{"name": "魈","sex": "男","tel": "16100000000"
}
Document 3:
{"name": "一斗","sex": "男","tel": "19100000000"
}
Query the data now in the index:
{"took": 554,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 3,"relation": "eq"},"max_score": 1.0,"hits": [{"_index": "student","_type": "_doc","_id": "1","_score": 1.0,"_source": {"name": "可莉","sex": "女","tel": "14100000000"}},{"_index": "student","_type": "_doc","_id": "2","_score": 1.0,"_source": {"name": "魈","sex": "男","tel": "16100000000"}},{"_index": "student","_type": "_doc","_id": "3","_score": 1.0,"_source": {"name": "一斗","sex": "男","tel": "19100000000"}}]}
}
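The result shows that the test data was inserted successfully.
Testing
Querying by name
The results returned by the query show that the name field supports full-text search, which confirms that the text type supports full-text search.
Searching on the keyword-typed sex field
First set the value searched for in sex to 男:
{"query":{"match":{"sex":"男"}}
}
Result: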
{"took": 1,"timed_out": false,"_shards": {"total": 1,"successful": 1,"skipped": 0,"failed": 0},"hits": {"total": {"value": 2,"relation": "eq"},"max_score": 0.4700036,"hits": [{"_index": "student","_type": "_doc","_id": "2","_score": 0.4700036,"_source": {"name": "魈","sex": "男","tel": "16100000000"}},{"_index": "student","_type": "_doc","_id": "3","_score": 0.4700036,"_source": {"name": "一斗","sex": "男","tel": "19100000000"}}]}
}
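The result shows that the field matched exactly the documents whose sex is 男 and none of the others. In other words, a keyword field is indexed as a single term, without being analyzed.
Fields with index set to true can be searched
Conversely, a field whose index is set to false cannot be searched; that case is not tested here.
Summary
text type
- Analyzed: the value is split into terms, and the terms are indexed.
- Supports fuzzy queries and exact queries.
- Does not support aggregations by default.
keyword type
- Not analyzed: the whole value is indexed as one term.
- Supports fuzzy queries and exact queries.
- Supports aggregations.
index
- Controls whether the field can be used for searching.
- false: the field cannot be used for searching.
- true: the field can be used for searching.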