Elasticsearch Study Notes: Aggregations
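Elasticsearch has a feature called aggregations, which lets us generate fine-grained analytics over our data. Aggregations are similar to GROUP BY in SQL, but more powerful.
First, let's look at the data currently stored under the employee type of my megacorp index by running the following statement:
Statement 1: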
GET /megacorp/employee/_search
{
  "query": {
    "match_all": {}
  }
}
Result:
{
  "took": 0,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {"_index": "megacorp", "_type": "employee", "_id": "2", "_score": 1,
       "_source": {"first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": ["music"]}},
      {"_index": "megacorp", "_type": "employee", "_id": "1", "_score": 1,
       "_source": {"first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": ["sports", "music"]}},
      {"_index": "megacorp", "_type": "employee", "_id": "3", "_score": 1,
       "_source": {"first_name": "Douglas", "last_name": "Fir", "age": 35, "about": "I like to build cabinets", "interests": ["forestry"]}}
    ]
  }
}
Main content:
As an example, let's find the most popular interests among the employees in the data above:
Statement 2:
GET /megacorp/employee/_search
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests" }
    }
  }
}
The query returns:
{
  ...
  "hits": { ... },
  "aggregations": {
    "all_interests": {
      "buckets": [
        {"key": "music", "doc_count": 2},
        {"key": "forestry", "doc_count": 1},
        {"key": "sports", "doc_count": 1}
      ]
    }
  }
}
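Conclusion: the terms aggregation lists every distinct value of interests across all documents, together with the number of documents that contain each value.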
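To make the bucket counts concrete, here is a minimal Python sketch that reproduces them locally from the three sample documents. It is only an illustration, not how Elasticsearch computes the result: `terms_buckets` is a hypothetical helper, and the real aggregation works on the analyzed terms of the field, which for this data happen to equal the raw values.

```python
from collections import Counter

# Sample documents copied from the search result above (only the fields needed here).
employees = [
    {"first_name": "Jane", "last_name": "Smith", "age": 32, "interests": ["music"]},
    {"first_name": "John", "last_name": "Smith", "age": 25, "interests": ["sports", "music"]},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35, "interests": ["forestry"]},
]


def terms_buckets(docs, field):
    """Count documents per distinct value of `field` (hypothetical helper)."""
    counts = Counter()
    for doc in docs:
        # set() so that a document is counted at most once per value, like doc_count
        for value in set(doc.get(field, [])):
            counts[value] += 1
    # Highest doc_count first; ties ordered by key, matching the result shown above
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ordered]


buckets = terms_buckets(employees, "interests")
print(buckets)
```

The printed list has the same keys and counts as the buckets array in the response: music appears in two documents, forestry and sports in one each.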
Note that before running Statement 2 you must first run the following request (as for why, see my other blog post):
PUT megacorp/_mapping/employee/
{
  "properties": {
    "interests": { "type": "text", "fielddata": true }
  }
}
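This request enables fielddata on the interests field of the employee type in the megacorp index, so that the field can be used in an aggregation such as all_interests. Fielddata is disabled by default on text fields, so the same step is required for any other text field you want to aggregate on (numeric fields such as age do not need it). For example, to aggregate on last_name you must first run: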
PUT megacorp/_mapping/employee/
{
  "properties": {
    "last_name": { "type": "text", "fielddata": true }
  }
}
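The two mapping requests differ only in the field name. As a small illustration, the Python sketch below builds the same request body for an arbitrary text field. `fielddata_mapping` is a hypothetical helper introduced here, and the sketch only prints the JSON; sending it to the cluster is left out.

```python
import json


def fielddata_mapping(field):
    """Build the mapping body that enables fielddata on one text field (hypothetical helper)."""
    return {"properties": {field: {"type": "text", "fielddata": True}}}


# Body for the last_name request shown above
body = fielddata_mapping("last_name")
print(json.dumps(body))
```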
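There are many kinds of aggregations besides terms, for example avg, which computes an average. Names such as all_interests are just labels chosen for the aggregation in the request.
In addition, to find the most popular interests among employees whose last name is Smith, simply add a matching query to the same request: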
GET /megacorp/employee/_search
{
  "query": {
    "match": { "last_name": "smith" }
  },
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests" }
    }
  }
}
Result:
{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {"_index": "megacorp", "_type": "employee", "_id": "2", "_score": 0.2876821,
       "_source": {"first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": ["music"]}},
      {"_index": "megacorp", "_type": "employee", "_id": "1", "_score": 0.2876821,
       "_source": {"first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": ["sports", "music"]}}
    ]
  },
  "aggregations": {
    "all_interests": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {"key": "music", "doc_count": 2},
        {"key": "sports", "doc_count": 1}
      ]
    }
  }
}
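The point of this request is that the aggregation only sees the documents matched by the query. The Python sketch below mimics that order of operations on the sample data: filter first, then count. It is a rough stand-in under stated assumptions: `match_last_name` and `interest_counts` are hypothetical helpers, and the match query is approximated by a case-insensitive comparison of a single word.

```python
from collections import Counter

# Sample documents copied from the search result above (only the fields needed here).
employees = [
    {"first_name": "Jane", "last_name": "Smith", "age": 32, "interests": ["music"]},
    {"first_name": "John", "last_name": "Smith", "age": 25, "interests": ["sports", "music"]},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35, "interests": ["forestry"]},
]


def match_last_name(docs, text):
    """Rough stand-in for the match query: case-insensitive single-word comparison."""
    return [d for d in docs if d["last_name"].lower() == text.lower()]


def interest_counts(docs):
    """Count documents per interest, highest count first, ties ordered by key."""
    counts = Counter(i for d in docs for i in set(d["interests"]))
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Step 1: the query narrows the document set; step 2: the aggregation runs on what is left
smiths = match_last_name(employees, "smith")
smith_interests = interest_counts(smiths)
print(len(smiths), smith_interests)
```

Douglas Fir is dropped by the filter, so forestry no longer appears among the buckets, just as in the response above.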
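Aggregations also support hierarchical rollups through nested sub-aggregations. For example, to find the average age of the employees who share each interest: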
GET /megacorp/employee/_search
{
  "aggs": {
    "all_interests": {
      "terms": { "field": "interests" },
      "aggs": {
        "avg_age": { "avg": { "field": "age" } }
      }
    }
  }
}
Result:
{
  "took": 1,
  "timed_out": false,
  "_shards": {"total": 5, "successful": 5, "failed": 0},
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {"_index": "megacorp", "_type": "employee", "_id": "2", "_score": 1,
       "_source": {"first_name": "Jane", "last_name": "Smith", "age": 32, "about": "I like to collect rock albums", "interests": ["music"]}},
      {"_index": "megacorp", "_type": "employee", "_id": "1", "_score": 1,
       "_source": {"first_name": "John", "last_name": "Smith", "age": 25, "about": "I love to go rock climbing", "interests": ["sports", "music"]}},
      {"_index": "megacorp", "_type": "employee", "_id": "3", "_score": 1,
       "_source": {"first_name": "Douglas", "last_name": "Fir", "age": 35, "about": "I like to build cabinets", "interests": ["forestry"]}}
    ]
  },
  "aggregations": {
    "all_interests": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {"key": "music", "doc_count": 2, "avg_age": {"value": 28.5}},
        {"key": "forestry", "doc_count": 1, "avg_age": {"value": 35}},
        {"key": "sports", "doc_count": 1, "avg_age": {"value": 25}}
      ]
    }
  }
}
The statement above counts how many people have each interest and computes the average age of those people.
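The nested aggregation can be checked by hand: group the sample documents by interest, then average the ages inside each group. The Python sketch below does exactly that. `avg_age_by_interest` is a hypothetical helper written for this note, not part of any Elasticsearch client.

```python
from collections import defaultdict

# Sample documents copied from the search result above (only the fields needed here).
employees = [
    {"first_name": "Jane", "last_name": "Smith", "age": 32, "interests": ["music"]},
    {"first_name": "John", "last_name": "Smith", "age": 25, "interests": ["sports", "music"]},
    {"first_name": "Douglas", "last_name": "Fir", "age": 35, "interests": ["forestry"]},
]


def avg_age_by_interest(docs):
    """Group documents by interest, then average the ages in each group (hypothetical helper)."""
    ages = defaultdict(list)
    for doc in docs:
        # Outer level: one bucket per interest, as in the terms aggregation
        for interest in set(doc["interests"]):
            ages[interest].append(doc["age"])
    # Inner level: one avg metric per bucket, as in the avg sub-aggregation
    return {
        interest: {"doc_count": len(values), "avg_age": sum(values) / len(values)}
        for interest, values in ages.items()
    }


result = avg_age_by_interest(employees)
for interest, stats in result.items():
    print(interest, stats)
```

For music the two ages are 32 and 25, which gives the 28.5 shown in the response; forestry and sports each contain a single employee, so their average is that employee's age.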