本文收录于github和gitee ,里面有我完整的Java系列文章,学习或面试都可以看看

(一)概述

在前面关于ES的一系列文章中,已经介绍了ES的概念、常用操作、JavaAPI以及实际的一个小demo,但是在真实的应用场景中,还有可能会有更高阶的一些用法,今天主要介绍两种相对来说会更难一些的操作,聚合查询。该文档基于ElasticSearch7.6,将介绍restful查询语法以及JavaApi。

阅读本文需要你有ElasticSearch的基础​。

(二)前期数据准备

这里准备了包含姓名、年龄、教室、性别和成绩五个字段的数据

PUT /test4
{"mappings" : {"properties" : {"name" : {"type" : "text"},"age":{"type": "integer"},"classroom":{"type": "keyword"},"gender":{"type": "keyword"},"grade":{"type": "integer"}}}
}
PUT /test4/_bulk
{"index": {"_id": 1}}
{"name":"张三","age":18,"classroom":"1","gender":"男","grade":80}
{"index": {"_id": 2}}
{"name":"李四","age":20,"classroom":"2","gender":"男","grade":60}
{"index": {"_id": 3}}
{"name":"王五","age":20,"classroom":"2","gender":"女","grade":70}
{"index": {"_id": 4}}
{"name":"赵六","age":19,"classroom":"1","gender":"女","grade":90}
{"index": {"_id": 5}}
{"name":"毛七","age":20,"classroom":"1","gender":"男","grade":90}

(三)聚合查询

ES中的聚合操作提供了强大的分组及数理计算的能力,ES中聚合从大体上可以分为四种方式:

1、Metrics Aggregation 提供了诸如Max,Min,Avg的数值计算能力

2、Bucket Aggregation 提供了分桶的能力,简单来讲就是将一类相同的数据聚合到一起

3、Pipeline Aggregation 管道聚合,对其他聚合进行二次聚合

4、Matrix Aggregation 对多个字段进行操作并返回矩阵结果

ES官网提供了全部聚合查询文档,这篇文章将介绍常用的几种聚合查询的语法以及JavaApi:

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.6/java-rest-high-aggregation-builders.html#_metrics_aggregations

(四)Metrics Aggregation

4.1 AVG

avg用于计算聚合文档中提取的数值的平均值,restful查询语法如下:

POST /test4/_search
{"aggs": {"avg_grade": {"avg": {"field": "grade"}}}
}

查询得到的结果如下:

接着是JavaApi,核心在于使用AggregationBuilders的avg方法,第七行代码对应于上面的操作。

@Test
public void testAvg() throws Exception {//封装了获取RestHighLevelClient的方法RestHighLevelClient client=ElasticSearchClient.getClient();SearchRequest request = new SearchRequest("test4");SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();searchSourceBuilder.aggregation(AggregationBuilders.avg("agg_grade").field("grade")).size(0);request.source(searchSourceBuilder);SearchResponse search = client.search(request, RequestOptions.DEFAULT);//注意这里要把Aggregation类型转化为ParsedAvg类型ParsedAvg aggregation = search.getAggregations().get("agg_grade");System.out.println(aggregation.getValue()); //返回78.0
}

接下来就直接贴代码了

4.2 Min

获取聚合数据的最小值:

POST /test4/_search
{"aggs": {"min_grade": {"min": {"field": "grade"}}}
}
@Test
public void testMin() throws Exception {//封装了获取RestHighLevelClient的方法RestHighLevelClient client=ElasticSearchClient.getClient();SearchRequest request = new SearchRequest("test4");SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();searchSourceBuilder.aggregation(AggregationBuilders.min("min_grade").field("grade")).size(0);request.source(searchSourceBuilder);SearchResponse search = client.search(request, RequestOptions.DEFAULT);ParsedMin aggregation = search.getAggregations().get("min_grade");System.out.println(aggregation.getValue());
}

4.3 Max

获取聚合数据的最大值:

POST /test4/_search
{"aggs": {"max_grade": {"max": {"field": "grade"}}}
}
@Test
public void testMax() throws Exception {//封装了获取RestHighLevelClient的方法RestHighLevelClient client=ElasticSearchClient.getClient();SearchRequest request = new SearchRequest("test4");SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();searchSourceBuilder.aggregation(AggregationBuilders.max("max_grade").field("grade")).size(0);request.source(searchSourceBuilder);SearchResponse search = client.search(request, RequestOptions.DEFAULT);ParsedMax aggregation = search.getAggregations().get("max_grade");System.out.println(aggregation.getValue());
}

4.4 Sum

获取聚合数据的和:

POST /test4/_search
{"aggs": {"sum_grade": {"sum": {"field": "grade"}}}
}
@Test
public void testSum() throws Exception {//封装了获取RestHighLevelClient的方法RestHighLevelClient client=ElasticSearchClient.getClient();SearchRequest request = new SearchRequest("test4");SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();searchSourceBuilder.aggregation(AggregationBuilders.sum("sum_grade").field("grade")).size(0);request.source(searchSourceBuilder);SearchResponse search = client.search(request, RequestOptions.DEFAULT);ParsedSum aggregation = search.getAggregations().get("sum_grade");System.out.println(aggregation.getValue());
}

4.5 Stats

stats集成了上面的所有计算操作。

POST /test4/_search
{"aggs": {"stats_grade": {"stats": {"field": "grade"}}}
}
@Test
public void testStats() throws Exception {//封装了获取RestHighLevelClient的方法RestHighLevelClient client=ElasticSearchClient.getClient();SearchRequest request = new SearchRequest("test4");SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();searchSourceBuilder.aggregation(AggregationBuilders.stats("sum_grade").field("grade")).size(0);request.source(searchSourceBuilder);SearchResponse search = client.search(request, RequestOptions.DEFAULT);ParsedStats aggregation = search.getAggregations().get("sum_grade");System.out.println(aggregation.getMax());System.out.println(aggregation.getAvg());System.out.println(aggregation.getCount());System.out.println(aggregation.getMin());System.out.println(aggregation.getSum());
}

(五)Bucket Aggregation

桶聚合是按照某个字段将同类型的数据聚合为一类,最常用对桶聚合就是terms聚合了。

5.1 terms

terms查询类似于group by,返回查询字段分组后的值以及数量,比如我对classroom字段terms查询

POST /test4/_search
{"aggs": {"classroom_term": {"terms": {"field": "classroom"}}}
}

返回值就是classroom的分组后的值以及每个组的数量:classroom是1的有3条记录,classroom是2的有2条记录

"aggregations" : {"classroom_term" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "1","doc_count" : 3},{"key" : "2","doc_count" : 2}]}

我们也可以对多个字段进行terms分组,比如我现在对classroom和gender两个字段进行分组:

POST /test4/_search
{"aggs": {"classroom_term": {"terms": {"field": "classroom"},"aggs": {"gender": {"terms": {"field": "gender"}}}}}
}

最后对返回值就是classroom和gender分组后的值和数量:

classroom是1,gender是男有两条;

classroom是1,gender是女有一条;

classroom是2,gender是男有一条;

classroom是2,gender是女有一条;

"aggregations" : {"classroom_term" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "1","doc_count" : 3,"gender" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "男","doc_count" : 2},{"key" : "女","doc_count" : 1}]}},{"key" : "2","doc_count" : 2,"gender" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "女","doc_count" : 1},{"key" : "男","doc_count" : 1}]}}]}

对应的JavaApi使用如下:

@Test
public void testTerms() throws Exception {//封装了获取RestHighLevelClient的方法RestHighLevelClient client=ElasticSearchClient.getClient();SearchRequest request = new SearchRequest("test4");SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();searchSourceBuilder.aggregation(AggregationBuilders.terms("classroom_term").field("classroom").subAggregation(AggregationBuilders.terms("gender").field("gender")));request.source(searchSourceBuilder);SearchResponse search = client.search(request, RequestOptions.DEFAULT);//获取数据时首先对classroom分桶,再对gender分桶Terms classroomTerm = search.getAggregations().get("classroom_term");for(Terms.Bucket classroomBucket:classroomTerm.getBuckets()){Terms genderTerm=classroomBucket.getAggregations().get("gender");for (Terms.Bucket genderBucket:genderTerm.getBuckets()){System.out.println("classRoom:"+classroomBucket.getKeyAsString()+"gender:"+genderBucket.getKeyAsString()+"count:"+genderBucket.getDocCount());}}
}

这里比较难理解对是获取数据时的处理,聚合查询时有个桶的概念,在获取数据时需要遍历获取桶,以上面的代码为例,先获取到classroom的桶,再遍历classroom的桶获取gender的桶,从桶中获取到具体的内容。看下图:

5.2 range

range查询可以统计出每个数据区间内的数量:比如我要统计分数为*~70,70~85,80~*的数据,就可以通过下面的方式:

POST /test4/_search
{"aggs": {"grade_range": {"range": {"field": "grade","ranges": [{"to":70},{"from":70,"to":85},{"from":85}]}}}
}

JavaAPI如下:

@Test
public void testRange() throws Exception {//封装了获取RestHighLevelClient的方法RestHighLevelClient client=ElasticSearchClient.getClient();SearchRequest request = new SearchRequest("test4");SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();searchSourceBuilder.aggregation(AggregationBuilders.range("grade_range").field("grade").addUnboundedTo(70).addRange(70,85).addUnboundedFrom(85));request.source(searchSourceBuilder);SearchResponse search = client.search(request, RequestOptions.DEFAULT);//获取数据时首先对classroom分桶,再对gender分桶Range gradeRange = search.getAggregations().get("grade_range");for(Range.Bucket gradeBucket:gradeRange.getBuckets()){System.out.println("key:"+gradeBucket.getKey()+"count:"+gradeBucket.getDocCount());}
}

(六)总结

至此,关于ES的聚合查询一些常用方法就讲解完毕了,ES提供的其他更多方法可以直接在官方文档中看,讲解的十分详细。我是鱼仔,我们下期再见!

ElasticSearch聚合查询Restful语法和JavaApi详解(基于ES7.6)相关推荐

  1. php聚合查询,php elasticsearch 聚合查询(Aggregation)

    Elasticsearch中的聚合查询,类似SQL的SUM/AVG/COUNT/GROUP BY分组查询,主要用于统计分析场景. 这里主要介绍PHP Elasticsearch 聚合查询的写法,如果不 ...

  2. Elasticsearch聚合查询案例分享

    为什么80%的码农都做不了架构师?>>>    Elasticsearch聚合查询案例分享 1.案例介绍 本文包含三个案例: 案例1:统计特定时间范围内每个应用的总访问量.访问成功数 ...

  3. Elasticsearch聚合查询多字段设置权重

    Elasticsearch聚合查询多字段设置权重 背景 环境说明 script设置权重 小结 背景 实际应用中,可能会需要为为doc文档中某个字段的某些特定的值设置权重,影响排序.es提供了比较灵活的 ...

  4. ElasticSearch聚合查询返回结果buckets取值

    ElasticSearch聚合查询返回结果buckets取值 1.聚合查询如下: {"size":0,"query":{"bool":{&q ...

  5. Elasticsearch之settings和mappings(图文详解)

    Elasticsearch之settings和mappings的意义 简单的说,就是 settings是修改分片和副本数的. mappings是修改字段和类型的. 记住,可以用url方式来操作它们,也 ...

  6. python选择排序从大到小_经典排序算法和Python详解之(一)选择排序和二元选择排序...

    本文源自微信公众号[Python编程和深度学习]原文链接:经典排序算法和Python详解之(一)选择排序和二元选择排序,欢迎扫码关注鸭! 扫它!扫它!扫它 排序算法是<数据结构与算法>中最 ...

  7. python中怎么计数_浅谈python中统计计数的几种方法和Counter详解

    1) 使用字典dict() 循环遍历出一个可迭代对象中的元素,如果字典没有该元素,那么就让该元素作为字典的键,并将该键赋值为1,如果存在就将该元素对应的值加1. lists = ['a','a','b ...

  8. SL651-2014 《水文监测数据通信规约》 中心站查询遥测站实时数据详解

     SL651-2014 <水文监测数据通信规约> 中心站查询遥测站实时数据详解 全国水文标准化技术委员会水文仪器分技术委员会为适应我国水文仪器标准化工作的迅速发展,对用来监测河流.水库等水 ...

  9. 一对一关联查询注解@OneToOne的实例详解(一)

    转载自: https://www.cnblogs.com/boywwj/p/8092915.html 一对一关联查询注解@OneToOne的实例详解 表的关联查询比较复杂,应用的场景很多,本文根据自己 ...

最新文章

  1. Postgresql:删除及查询字段中包含单引号的数据
  2. hashcode的作用_看似简单的hashCode和equals面试题,竟然有这么多坑!
  3. matlab mbuild setup,关于mbuild的一个问题
  4. HDLBits答案(14)_Verilog有限状态机(1)
  5. 西门子数控面板图解_学好四要点让你迅速成为数控机床“操作高手”
  6. Machine Learning Mastery 博客文章翻译:深度学习与 Keras
  7. 手机这5个反人类的设计,你能容忍到第几个?
  8. python派落塔问题_浅析python递归函数和河内塔问题
  9. Oracle Web链接客户端
  10. poj_1390 动态规划
  11. python中pass的使用_Python pass详细介绍及实例代码
  12. 二级c语言 办公软件高级应用,高级应用题库_计算机国二office高级应用考试的题目是从题库20套里抽其中一套还是别的题目_淘题吧...
  13. 无经验想入行程序员该怎么自学
  14. Linux学习笔记CentOS6.5(七)--如何开启8080端口供外界访问
  15. 缺少编译器要求的成员“System.Runtime.CompilerServices.ExtensionAttribute..ctor” 解决方案
  16. 使用v-show时,当isshow:false时,在页面刷新的过程中,isshow依然会短暂显示一下...
  17. 结对-人机对战象棋游戏-设计文档
  18. selenium详细介绍
  19. 英国通信公司XOR推出3000英镑起硬件加密手机
  20. 编写一个简单的生日快乐APP

热门文章

  1. 【工业智能】人工智能之于工业,应当是融入者而非颠覆者;记一场工业场景下的AI技术实践
  2. 【C库函数】memmove函数
  3. 计算机音乐谱handclap,HandClap钢琴简谱-数字双手-Fitz and The Tantrums
  4. 为什么要做SMT贴片加工
  5. 流利阅读Day11 杜克大学道歉
  6. 离散型随机变量的概率密度(概率质量)
  7. Flask-模板-静态资源加载-jsonify
  8. 计算机应用能力考试是指,2019年计算机应用能力考试报考指南:考试介绍
  9. matlab实验7绘图操作绘制三维曲线,上机习题6 MATLAB7.0三维绘图
  10. HEIC图片转JPG和其他格式