Contents

  • 1. Overview of bucket aggregation types
  • 2. Data preparation
  • 3. Usage examples
    • 1. Terms Aggregation:
      • 1. A plain terms agg
      • 2. Nesting a metric agg as a sub agg
      • 3. Nesting a terms agg as a sub agg
    • 2. Range Aggregation:
    • 3. Date Histogram Aggregation:
    • 4. Date Range Aggregation
    • 5. Filter Aggregation
    • 6. Filters Aggregation
    • 7. Histogram Aggregation
    • 8. Missing Aggregation: counting docs in which a field is absent
    • 9. Nested aggs: aggregating over nested docs, usually with a sub agg that computes the statistic
    • 10. Children agg: querying join-type data
    • 11. Parent agg: querying join-type data
    • 12. Composite Aggregation: combines terms from several dimensions. It resembles multi-level nested terms aggs, but the result is flat rather than nested, much like GROUP BY over several columns in MySQL
    • 13. Adjacency Matrix Aggregation
    • 14. Global agg: aggregating over all data
    • 15. Significant Terms Aggregation: automatically finding significant keywords
    • 16. Significant Text Aggregation: automatically finding significant keywords
    • 17. Sampler Aggregation: aggregating over a sample
    • 18. Reverse nested Aggregation: aggregating over parent data from inside a nested agg

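Elasticsearch's aggregation queries keep getting richer. There are currently four families:

  1. metric aggregation: single-value statistics such as min, max, avg, sum, and percentile
  2. bucket aggregation: queries that work like GROUP BY
  3. matrix aggregation: computes over the values of several fields to produce a multi-dimensional matrix
  4. pipeline aggregation: applies additional processing to the output of other aggregations to enrich the data

This article covers bucket aggregations. A bucket aggregation works like a GROUP BY query and, unlike a metric aggregation, it can have sub-aggregations, i.e. it can be nested. A nested sub agg can be either a bucket agg or a metric agg.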

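1. Overview of bucket aggregation types

Terms Aggregation: a typical GROUP BY style aggregation that buckets documents by the value of a field; if the field's value is an array, the document is counted in several buckets
Range Aggregation: usually applied to a numeric field; buckets are defined by a list of ranges
Date Histogram Aggregation: buckets by time, splitting automatically by calendar units such as month
Date Range Aggregation: buckets by date ranges, similar to the range aggregation
Filter Aggregation: a single filter, comparable to a filter in a query
Filters Aggregation: several filters, one bucket per filter
Histogram Aggregation: histogram-style aggregation with fixed-width numeric buckets

Missing Aggregation: counts the docs in which a field is absent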
Adjacency Matrix Aggregation
Auto-interval Date Histogram Aggregation
Children Aggregation
Composite Aggregation
Diversified Sampler Aggregation
Geo Distance Aggregation
GeoHash grid Aggregation
GeoTile Grid Aggregation
Global Aggregation
IP Range Aggregation
Nested Aggregation
Parent Aggregation
Reverse nested Aggregation
Sampler Aggregation
Significant Terms Aggregation
Significant Text Aggregation

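2. Data preparation

Ticket information for shows
GET seats1028/_search

{
"play" : "Auntie Jo",   # name of the show
"date" : "2018-11-6",  # date
"theatre" : "Skyline",  # venue
"sold" : false,      # whether this ticket has been sold
"actors" : [         # actors
"Jo Hangum","Jon Hittle","Rob Kettleman","Laura Conrad","Simon Hower","Nora Blue"],
"datetime" : 1541497200000,
"price" : 8321,    # ticket price
"tip" : 17.5,      # discount
"time" : "5:40PM"
}

There are more than 30,000 such documents in total. The queries below also reference the row and number fields, which are not shown in this sample.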

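3. Usage examples

1. Terms Aggregation:

A typical GROUP BY style aggregation: documents are bucketed by the value of a field. If the field's value is an array, the document is counted in several buckets.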
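To make the array behaviour concrete, here is a minimal Python sketch of the counting rule. It is an illustration only, not how Elasticsearch is implemented; the helper name `terms_agg` and the second sample document are made up for this example.

```python
from collections import Counter

def terms_agg(docs, field):
    """Count documents per distinct value of `field` (hypothetical helper)."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # a doc without the field lands in no bucket
        values = value if isinstance(value, list) else [value]
        # A doc is counted once for every distinct value it holds.
        for v in set(values):
            counts[v] += 1
    return counts

# Made-up sample data shaped like the ticket documents.
docs = [
    {"play": "Auntie Jo", "actors": ["Jo Hangum", "Jon Hittle"]},
    {"play": "Other", "actors": ["Jo Hangum"]},
]
buckets = terms_agg(docs, "actors")
print(buckets)
```

Two documents produce three bucket entries in total here, which is why the doc counts of a terms agg on an array field can add up to more than the number of documents.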

1. A plain terms agg

Bucket by price and return only the buckets that contain at least 13 docs.

GET seats1028/_search
{"size": 0,"aggs": {"term_price":{"terms": {"field": "price","min_doc_count": 13,"size": 50}}}
}

Response:
"aggregations" : {"term_price" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 35384,"buckets" : [{"key" : 910,"doc_count" : 13},{"key" : 3273,"doc_count" : 13},{"key" : 3648,"doc_count" : 13}]}}

2. Nesting a metric agg as a sub agg

Group by row, keep the 3 buckets with the most docs, and compute the maximum price within each bucket.

GET seats1028/_search
{"size": 0,"aggs": {"term_price":{"terms": {"field": "row","min_doc_count": 13,"size": 3,"order": {"_count": "desc"}},"aggs": {"max_price": {"max": {"field": "price"}}}}}
}

Response:
"aggregations" : {"term_price" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 13608,"buckets" : [{"key" : 2,"doc_count" : 5796,"max_price" : {"value" : 9998.0}},{"key" : 3,"doc_count" : 5796,"max_price" : {"value" : 9999.0}},{"key" : 1,"doc_count" : 5791,"max_price" : {"value" : 9999.0}}]}}
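The "bucket first, then compute a metric per bucket" idea can be sketched in a few lines of Python. This is a rough model under stated assumptions: the helper `terms_with_max` and the sample rows are invented, and breaking ties by ascending key is an assumption of this sketch.

```python
from collections import defaultdict

def terms_with_max(docs, group_field, metric_field, size):
    """Group docs by `group_field`, keep the `size` largest groups,
    and report the max of `metric_field` in each (hypothetical helper)."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_field]].append(doc[metric_field])
    # Order by doc count descending; ties by key ascending (assumption).
    ordered = sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [
        {"key": key, "doc_count": len(values), "max_price": max(values)}
        for key, values in ordered[:size]
    ]

# Made-up sample data.
docs = [
    {"row": 1, "price": 100}, {"row": 1, "price": 300},
    {"row": 2, "price": 250},
    {"row": 3, "price": 50}, {"row": 3, "price": 75}, {"row": 3, "price": 60},
]
result = terms_with_max(docs, "row", "price", size=2)
print(result)
```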

3. Nesting a terms agg as a sub agg

First bucket by row and return the 3 row buckets with the most docs. Then bucket each of them again by number and return the 3 number buckets with the most docs.

GET seats1028/_search
{"size": 0,"aggs": {"term_price":{"terms": {"field": "row","min_doc_count": 13,"size": 3,"order": {"_count": "desc"}},"aggs": {"number_term": {"terms": {"field": "number","size": 3,"order": {"_count": "desc"}}}}}}
}

Response:
"aggregations" : {"term_price" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 13608,"buckets" : [{"key" : 2,"doc_count" : 5796,"number_term" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 4368,"buckets" : [{"key" : 1,"doc_count" : 476},{"key" : 2,"doc_count" : 476},{"key" : 3,"doc_count" : 476}]}},{"key" : 3,"doc_count" : 5796,"number_term" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 4368,"buckets" : [{"key" : 1,"doc_count" : 476},{"key" : 2,"doc_count" : 476},{"key" : 3,"doc_count" : 476}]}},{"key" : 1,"doc_count" : 5791,"number_term" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 4363,"buckets" : [{"key" : 5,"doc_count" : 476},{"key" : 6,"doc_count" : 476},{"key" : 7,"doc_count" : 476}]}}]}}

2. Range Aggregation:

Usually applied to a numeric field: buckets are defined by a list of ranges. Each range includes its from value and excludes its to value.
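The from-inclusive, to-exclusive rule is easy to get wrong at the boundaries, so here is a small Python sketch of it. The helper `range_agg` and the price list are made up for illustration.

```python
def range_agg(values, ranges):
    """Count values per range: `from` is inclusive, `to` is exclusive."""
    buckets = []
    for r in ranges:
        lo, hi = r.get("from"), r.get("to")
        count = sum(
            1 for v in values
            if (lo is None or v >= lo) and (hi is None or v < hi)
        )
        buckets.append({"from": lo, "to": hi, "doc_count": count})
    return buckets

# Made-up prices around the boundaries of the 5000-6000 range.
prices = [4999, 5000, 5999, 6000, 8321]
buckets = range_agg(prices, [{"from": 5000, "to": 6000}])
print(buckets)
```

Only 5000 and 5999 fall into the bucket; 6000 is excluded because it equals the to value.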

GET seats1028/_search
{"size": 0,"aggs": {"price_range": {"range": {"field": "price","ranges": [{"from": 5000,"to": 6000}]}}}
}

Response:
"aggregations" : {"price_range" : {"buckets" : [{"key" : "5000.0-6000.0","from" : 5000.0,"to" : 6000.0,"doc_count" : 3646}]}}

3. Date Histogram Aggregation:

Buckets documents by time, splitting automatically by a calendar unit such as month.

GET seats1028/_search
{"size": 0,"aggs": {"price_date_histogram": {"date_histogram": {"field": "datetime","calendar_interval": "month"}}}
}

Response:
"aggregations" : {"price_date_histogram" : {"buckets" : [{"key_as_string" : "2018-03-01T00:00:00.000Z","key" : 1519862400000,"doc_count" : 2310},{"key_as_string" : "2018-04-01T00:00:00.000Z","key" : 1522540800000,"doc_count" : 3946},{"key_as_string" : "2018-05-01T00:00:00.000Z","key" : 1525132800000,"doc_count" : 3948},{"key_as_string" : "2018-06-01T00:00:00.000Z","key" : 1527811200000,"doc_count" : 3948},{"key_as_string" : "2018-07-01T00:00:00.000Z","key" : 1530403200000,"doc_count" : 3948}]}}
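Each bucket key is the start of the calendar month, in epoch milliseconds. The Python sketch below shows how the sample document's datetime value maps to its bucket, assuming UTC bucketing as in the response above; the helper name `month_bucket_key` is made up.

```python
from datetime import datetime, timezone

def month_bucket_key(epoch_millis):
    """Round an epoch-millisecond timestamp down to the start of its
    calendar month in UTC (hypothetical helper)."""
    dt = datetime.fromtimestamp(epoch_millis / 1000, tz=timezone.utc)
    start = dt.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    return int(start.timestamp() * 1000)

# The `datetime` value of the sample ticket document.
key = month_bucket_key(1541497200000)
key_as_string = datetime.fromtimestamp(key / 1000, tz=timezone.utc).strftime(
    "%Y-%m-%dT%H:%M:%S.000Z"
)
print(key, key_as_string)
```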

4. Date Range Aggregation

Buckets by date ranges, similar to the range aggregation.

GET seats1028/_search
{"size": 0,"aggs": {"price_date_histogram": {"date_range": {"field": "datetime","ranges": [{"from": "2018-10-01T00:00:00.000Z","to": "2018-11-01T00:00:00.000Z"}]}}}
}

Response:
"aggregations" : {"price_date_histogram" : {"buckets" : [{"key" : "2018-10-01T00:00:00.000Z-2018-11-01T00:00:00.000Z","from" : 1.538352E12,"from_as_string" : "2018-10-01T00:00:00.000Z","to" : 1.5410304E12,"to_as_string" : "2018-11-01T00:00:00.000Z","doc_count" : 3948}]}}

5. Filter Aggregation

A single filter, comparable to a filter in a query. The sub-aggregation only sees the docs that pass the filter.

GET seats1028/_search
{"size": 0,"aggs": {"sold_filter": {"filter": {"range": {"tip": {"gte": 10,"lte": 20}}},"aggs": {"max_price": {"max": {"field": "price"}}}}}
}

Response:
"aggregations" : {"sold_filter" : {"doc_count" : 6300, # doc count after the filter is applied
"max_price" : {"value" : 9996.0}}}

6. Filters Aggregation

Applies several filters; the sub agg is then run on the result of each filter.

GET seats1028/_search
{"size": 0,"aggs": {"sold_filter": {"filters": {"filters": {    # the syntax looks a little odd: a "filters" map nested inside the "filters" aggregation
"tip_filter": {"range": {"tip": {"gte": 10,"lte": 20}}},"number_filter": {"range": {"number": {"gte": 5,"lte":10}}}}},"aggs": {"max_price": {"max": {"field": "price"}}}}}
}

Response:
"aggregations" : {"sold_filter" : {"buckets" : {"number_filter" : {"doc_count" : 16072,"max_price" : {"value" : 9999.0}},"tip_filter" : {  "doc_count" : 6300,"max_price" : {"value" : 9996.0}}}}}

As the response shows, each named filter produces its own bucket, and max_price is computed separately per bucket.
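As a rough mental model, each named filter is evaluated independently, so one document can be counted in several buckets. The Python sketch below mimics that with made-up documents and a hypothetical `filters_agg` helper.

```python
def filters_agg(docs, named_filters, metric_field):
    """One bucket per named filter, each with its own doc count and max
    of `metric_field` (hypothetical helper, not the Elasticsearch code)."""
    buckets = {}
    for name, predicate in named_filters.items():
        matched = [d for d in docs if predicate(d)]
        buckets[name] = {
            "doc_count": len(matched),
            "max_price": max((d[metric_field] for d in matched), default=None),
        }
    return buckets

# Made-up sample data; the first doc satisfies both filters.
docs = [
    {"tip": 17.5, "number": 7, "price": 8321},
    {"tip": 25.0, "number": 6, "price": 9000},
    {"tip": 12.0, "number": 2, "price": 4000},
]
buckets = filters_agg(
    docs,
    {
        "tip_filter": lambda d: 10 <= d["tip"] <= 20,
        "number_filter": lambda d: 5 <= d["number"] <= 10,
    },
    "price",
)
print(buckets)
```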

7. Histogram Aggregation

Histogram-style aggregation. The field being aggregated is usually numeric, which makes it easy to group values into fixed-width intervals.

GET seats1028/_search
{"size": 0,"aggs": {"tip_histogram":{"histogram": {"field": "tip","interval": 4}}}
}

Response:
"aggregations" : {"tip_histogram" : {"buckets" : [{"key" : 16.0,"doc_count" : 4200},{"key" : 20.0,"doc_count" : 8400},{"key" : 24.0,"doc_count" : 17808},{"key" : 28.0,"doc_count" : 5794}]}}
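The bucket key of a value is obtained by rounding it down to a multiple of the interval. The Python sketch below follows the formula given in the Elasticsearch documentation, bucket_key = floor((value - offset) / interval) * interval + offset; the list of tips is made up.

```python
import math
from collections import Counter

def histogram_key(value, interval, offset=0.0):
    """Round `value` down to the bucket it belongs to."""
    return math.floor((value - offset) / interval) * interval + offset

# Made-up tip values; 17.5 is the tip of the sample ticket document.
tips = [17.5, 19.9, 20.0, 23.0, 24.0]
buckets = Counter(histogram_key(t, 4) for t in tips)
print(sorted(buckets.items()))
```

With interval 4 the sample document's tip of 17.5 lands in the bucket with key 16.0, which covers values from 16 up to but not including 20.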

8. Missing Aggregation: counting docs in which a field is absent

GET seats1028/_search
{"size":0,"aggs": {"miss_f": {"missing": {"field": "row"}}}
}

Response:
"aggregations" : {"miss_f" : {"doc_count" : 1}}
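In plain Python terms, the aggregation counts documents that either lack the field or hold a null value for it. The sketch below uses made-up documents and a hypothetical `missing_agg` helper.

```python
def missing_agg(docs, field):
    """Count docs where `field` is absent or null (hypothetical helper)."""
    return sum(1 for d in docs if d.get(field) is None)

# Made-up sample data: one doc has row, one has a null row, one has no row.
docs = [{"row": 1}, {"row": None}, {"price": 10}]
doc_count = missing_agg(docs, "row")
print(doc_count)
```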

9. Nested aggs: aggregating over nested docs, usually with a sub agg that computes the statistic

Sample data: a class holds a list of students, and each student has an age and a name.

GET nest_test/_mapping

Response:
{"mappings" : {"properties" : {"c_name" : {"type" : "text"},"class" : {"type" : "nested","properties" : {"students" : {"type" : "nested","properties" : {"age" : {"type" : "integer"},"name" : {"type" : "text"}}}}}}}}

There are two documents:
"_source" : {"c_name" : "start_class","class" : {"students" : [{"name" : "jack chen","age" : 30},{"name" : "jack man","age" : 20},{"name" : "pony wang","age" : 60},{"name" : "gebi wang","age" : 90}]}}
"_source" : {"c_name" : "sun_class","class" : {"students" : [{"name" : "lucy chen","age" : 30},{"name" : "lucy man","age" : 20},{"name" : "dong wang","age" : 60},{"name" : "chess wang","age" : 90}]}}

The corresponding query:

GET nest_test/_search
{"size": 0,"aggs": {"nested_agg": {"nested": {"path": "class.students"},"aggs": {"min_age": {"min": {"field": "class.students.age"}}}}}
}

Response:
"aggregations" : {"nested_agg" : {"doc_count" : 8,"min_age" : {"value" : 20.0}}}
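The doc_count of 8 is the number of nested student objects, not the number of class documents. A small Python sketch, using the two documents above and a hypothetical `nested_objects` helper, reproduces both numbers.

```python
def nested_objects(docs, path):
    """Collect every object found under the dotted `path`
    (hypothetical helper that mimics stepping into nested docs)."""
    current = list(docs)
    for part in path.split("."):
        found = []
        for obj in current:
            child = obj.get(part, [])
            found.extend(child if isinstance(child, list) else [child])
        current = found
    return current

docs = [
    {"c_name": "start_class", "class": {"students": [
        {"name": "jack chen", "age": 30}, {"name": "jack man", "age": 20},
        {"name": "pony wang", "age": 60}, {"name": "gebi wang", "age": 90}]}},
    {"c_name": "sun_class", "class": {"students": [
        {"name": "lucy chen", "age": 30}, {"name": "lucy man", "age": 20},
        {"name": "dong wang", "age": 60}, {"name": "chess wang", "age": 90}]}},
]
students = nested_objects(docs, "class.students")
doc_count = len(students)
min_age = min(s["age"] for s in students)
print(doc_count, min_age)
```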

10. Children agg: querying join-type data

Data preparation: each class room (class_room) can teach several subjects (subject), and students (student) attend a class_room, so class_room and student form a parent/child relation. Note that in a join field every child document has exactly one parent, so a student attending several class rooms needs one child document per class room.

PUT join_class
{"mappings": {"properties": {"subject":{"type": "keyword"},"class_student":{"type": "join","relations":{"class_room":"student"}}}}
}

PUT join_class/_doc/1
{"subject":["english","Chinese","Russia"],"class_student":{"name":"class_room"},"des":"this class room teach english, Chinese, Russia"
}

PUT join_class/_doc/2?routing=1
{"class_student":{"name":"student","parent":1},"name":"jack"
}

PUT join_class/_doc/3?routing=1
{"class_student":{"name":"student","parent":1},"name":"pony"
}

The following query finds, for each subject, which students are taking it:

GET join_class/_search
{"size":0,"query": {"match_all": {}},"aggs": {"subject_term": {"terms": {"field": "subject","size": 10},"aggs": {"subject_student": {"children": {"type": "student"},"aggs": {"term_name": {"terms": {"field": "name.keyword","size": 10}}}}}}}
}

Response:
"aggregations" : {"subject_term" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "Chinese","doc_count" : 1,"subject_student" : {"doc_count" : 2,"term_name" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "jack","doc_count" : 1},{"key" : "pony","doc_count" : 1}]}}},{"key" : "Russia","doc_count" : 1,"subject_student" : {"doc_count" : 2,"term_name" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "jack","doc_count" : 1},{"key" : "pony","doc_count" : 1}]}}},{"key" : "english","doc_count" : 1,"subject_student" : {"doc_count" : 2,"term_name" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "jack","doc_count" : 1},{"key" : "pony","doc_count" : 1}]}}}]}}
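Conceptually the query buckets the parent documents by subject and then hops from each parent to its children. The Python sketch below models that hop with the three documents above; the flattened `relation` and `parent` keys and the helper name are simplifications made up for this example.

```python
from collections import Counter

# The three documents from above, keyed by _id (join field flattened).
docs = {
    1: {"relation": "class_room", "subject": ["english", "Chinese", "Russia"]},
    2: {"relation": "student", "parent": 1, "name": "jack"},
    3: {"relation": "student", "parent": 1, "name": "pony"},
}

def students_per_subject(docs):
    """Bucket parents by subject, then count the names of their children."""
    result = {}
    for doc_id, doc in docs.items():
        if doc["relation"] != "class_room":
            continue
        children = [d["name"] for d in docs.values() if d.get("parent") == doc_id]
        for subject in doc["subject"]:
            result.setdefault(subject, Counter()).update(children)
    return result

buckets = students_per_subject(docs)
print(buckets)
```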

11. Parent agg: querying join-type data

Continuing with the sample data above, the following request finds the subjects each student has chosen:

GET join_class/_search
{"size":0,"query": {"match_all": {}},"aggs": {"student_term": {"terms": {"field": "name.keyword","size": 10},"aggs": {"subject_student": {"parent": {"type": "student"},"aggs": {"choose_subject": {"terms": {"field": "subject","size": 10}}}}}}}
}

Response:

 "aggregations" : {"student_term" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "jack","doc_count" : 1,"subject_student" : {"doc_count" : 1,"choose_subject" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "Chinese","doc_count" : 1},{"key" : "Russia","doc_count" : 1},{"key" : "english","doc_count" : 1}]}}},{"key" : "pony","doc_count" : 1,"subject_student" : {"doc_count" : 1,"choose_subject" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "Chinese","doc_count" : 1},{"key" : "Russia","doc_count" : 1},{"key" : "english","doc_count" : 1}]}}}]}}

12. Composite Aggregation: combines terms from several dimensions. It resembles multi-level nested terms aggs, but the result is flat rather than nested, much like GROUP BY over several columns in MySQL

Data setup:

PUT composite_test
{"mappings": {"properties": {"area": {"type": "keyword"},"userid": {"type": "keyword"},"sendtime": {"type": "date","format": "yyyy-MM-dd HH:mm:ss"}}}
}

POST composite_test/_bulk
{ "index" : {} }
{"area":"33","userid":"400015","sendtime":"2019-01-17 00:00:00"}
{ "index" : {} }
{"area":"33","userid":"400015","sendtime":"2019-01-17 00:00:00"}
{ "index" : {} }
{"area":"35","userid":"400016","sendtime":"2019-01-18 00:00:00"}
{ "index" : {} }
{"area":"35","userid":"400016","sendtime":"2019-01-18 00:00:00"}
{ "index" : {} }
{"area":"33","userid":"400017","sendtime":"2019-01-17 00:00:00"}

The following query groups by the three fields area, userid, and sendtime:

GET composite_test/_search
{"size": 0,"aggs": {"my_buckets": {"composite": {"sources": [{"area": {"terms": {"field": "area"}}},{"userid": {"terms": {"field": "userid"}}},{"sendtime": {"date_histogram": {"field": "sendtime","fixed_interval": "1d","format": "yyyy-MM-dd"}}}]}}}
}

Response:

"aggregations" : {"my_buckets" : {"after_key" : {"area" : "35","userid" : "400016","sendtime" : "2019-01-18"},"buckets" : [{"key" : {"area" : "33","userid" : "400015","sendtime" : "2019-01-17"},"doc_count" : 2},{"key" : {"area" : "33","userid" : "400017","sendtime" : "2019-01-17"},"doc_count" : 1},{"key" : {"area" : "35","userid" : "400016","sendtime" : "2019-01-18"},"doc_count" : 2}]}}
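The flat result is what a GROUP BY over a tuple of keys gives. As a minimal sketch, the same buckets can be reproduced in Python from the five bulk documents; the helper name `composite_agg` is made up, and truncating sendtime to its date part stands in for the 1d date_histogram source.

```python
from collections import Counter

docs = [
    {"area": "33", "userid": "400015", "sendtime": "2019-01-17 00:00:00"},
    {"area": "33", "userid": "400015", "sendtime": "2019-01-17 00:00:00"},
    {"area": "35", "userid": "400016", "sendtime": "2019-01-18 00:00:00"},
    {"area": "35", "userid": "400016", "sendtime": "2019-01-18 00:00:00"},
    {"area": "33", "userid": "400017", "sendtime": "2019-01-17 00:00:00"},
]

def composite_agg(docs):
    """Count docs per (area, userid, day) tuple, sorted by key."""
    counts = Counter(
        (d["area"], d["userid"], d["sendtime"][:10]) for d in docs
    )
    return sorted(counts.items())

buckets = composite_agg(docs)
for key, doc_count in buckets:
    print(key, doc_count)
```

The last key in the sorted list plays the role of after_key in the response: passing it back lets the next page continue from that position.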

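13. Adjacency Matrix Aggregation

The composite aggregation above intersects terms across several dimensions. The adjacency matrix is more limited: it builds the matrix only from explicitly named filters on chosen field values.
Using the sample data above, the query below returns the doc count for area=33, the doc count for userid=400015, and the doc count for area=33 & userid=400015.

GET composite_test/_search
{"size": 0,"aggs": {"composite_two": {"adjacency_matrix": {"filters": {"area_filter":{"terms":{"area":["33"]}},"user_id_filter":{"terms":{"userid":["400015"]}}}}}}
}

Response: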

"aggregations" : {"composite_two" : {"buckets" : [{"key" : "area_filter","doc_count" : 3},{"key" : "area_filter&user_id_filter","doc_count" : 2},{"key" : "user_id_filter","doc_count" : 2}]}}
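The three buckets are the two filters on their own plus their pairwise intersection. A short Python sketch over the five composite_test documents reproduces the counts; the helper name `adjacency_matrix` is made up, and dropping empty buckets is an assumption of this sketch.

```python
from itertools import combinations

docs = [
    {"area": "33", "userid": "400015"},
    {"area": "33", "userid": "400015"},
    {"area": "35", "userid": "400016"},
    {"area": "35", "userid": "400016"},
    {"area": "33", "userid": "400017"},
]

def adjacency_matrix(docs, filters):
    """Count docs per filter and per pair of filters matched together."""
    matches = {
        name: {i for i, d in enumerate(docs) if predicate(d)}
        for name, predicate in filters.items()
    }
    buckets = {name: len(ids) for name, ids in matches.items() if ids}
    for a, b in combinations(sorted(matches), 2):
        both = matches[a] & matches[b]
        if both:  # empty intersections are left out (assumption)
            buckets[f"{a}&{b}"] = len(both)
    return buckets

buckets = adjacency_matrix(docs, {
    "area_filter": lambda d: d["area"] in ["33"],
    "user_id_filter": lambda d: d["userid"] in ["400015"],
})
print(buckets)
```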

14. Global agg: aggregating over all data

This ignores the filtering done by the query and runs its sub-aggregations over all the data in the index.

GET seats1028/_search
{"size": 0, "query": {"term": {"row": {"value": 5}}},"aggs": {"global_row": {"global": {},"aggs": {"avg_row": {"avg": {"field": "row"}}}},"avg_row02":{"avg": {"field": "row"}}}
}

Response:

"aggregations" : {"global_row" : {"doc_count" : 30992,"avg_row" : {"value" : 4.333871123874673   # computed over all docs
}},"avg_row02" : {"value" : 5.0  # computed over the docs matched by the query
}}
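The difference between the two averages comes down to which document set each aggregation sees. A tiny Python sketch with made-up rows illustrates it.

```python
def avg(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

# Made-up sample data.
docs = [{"row": 5}, {"row": 5}, {"row": 1}, {"row": 3}]

# The query keeps only docs with row == 5.
matched = [d for d in docs if d["row"] == 5]

# A normal aggregation sees the matched docs only.
avg_matched = avg([d["row"] for d in matched])
# An aggregation under `global` ignores the query and sees every doc.
avg_all = avg([d["row"] for d in docs])
print(avg_matched, avg_all)
```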

15. Significant Terms Aggregation: automatically finding significant keywords

This aggregation looks in a keyword field for terms that are significant for the current result set, i.e. terms that occur noticeably more often in the current (foreground) documents than in the index as a whole (background).
An example explains it best. Take web news articles: each news item has an author, a title, a topic, and so on.
The data is set up as follows:

PUT news
{"mappings": {"properties": {"published": {"type": "date","format": "date_optional_time"},"author": {"type": "keyword"},"title": {"type": "text"},"topic": {"type": "keyword"},"views": {"type": "integer"}}}
}

POST news/_bulk
{"index": {"_index": "news"}}
{"author": "John Michael","published": "2018-07-08","title": "Tesla is flirting with its lowest close in over 1 1/2 years (TSLA)","topic": "automobile","views": "431"}
{"index": {"_index": "news"}}
{"author": "John Michael","published": "2018-07-22","title": "Tesla to end up like Lehman Brothers (TSLA)","topic": "automobile","views": "1921"}
{"index": {"_index": "news"}}
{"author": "John Michael","published": "2018-07-29","title": "Tesla (TSLA) official says that they are going to release a new self-driving car model in the coming year","topic": "automobile","views": "1849"}
{"index": {"_index": "news"}}
{"author": "John Michael","published": "2018-08-14","title": "Five ways Tesla uses AI and Big Data","topic": "ai","views": "871"}
{"index": {"_index": "news"}}
{"author": "John Michael","published": "2018-08-14","title": "Toyota partners with Tesla (TSLA) to improve the security of self-driving cars","topic": "automobile","views": "871"}
{"index": {"_index": "news"}}
{"author": "Robert Cann","published": "2018-08-25","title": "Is AI dangerous for humanity","topic": "ai","views": "981"}
{"index": {"_index": "news"}}
{"author": "Robert Cann","published": "2018-09-13","title": "Is AI dangerous for humanity","topic": "ai","views": "871"}
{"index": {"_index": "news"}}
{"author": "Robert Cann","published": "2018-09-27","title": "Introduction to Generative Adversarial Networks (GANs) in self-driving cars","topic": "automobile","views": "1183"}
{"index": {"_index": "news"}}
{"author": "Robert Cann","published": "2018-10-09","title": "Introduction to Natural Language Processing","topic": "ai","views": "786"}
{"index": {"_index": "news"}}
{"author": "Robert Cann","published": "2018-10-15","title": "New Distant Objects Found in the Fight for Planet X ","topic": "astronomy","views": "542"}

Find the topic each author focuses on most, i.e. the topic in which that author has published a disproportionate share of articles.

GET news/_search
{"size": 0,"aggregations": {"authors": {"terms": {"field": "author"},"aggregations": {"significant_topic_types": {"significant_terms": {"field": "topic"}}}}}
}

Response:

"aggregations" : {"authors" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "John Michael","doc_count" : 5,"significant_topic_types" : {"doc_count" : 5,"bg_count" : 10,"buckets" : [{"key" : "automobile","doc_count" : 4,"score" : 0.4800000000000001,"bg_count" : 5}]}},{"key" : "Robert Cann","doc_count" : 5,"significant_topic_types" : {"doc_count" : 5,  # Robert Cann has 5 docs in total
"bg_count" : 10,  # the index holds 10 docs in total
"buckets" : [{"key" : "ai","doc_count" : 3,  # 3 of Robert Cann's docs have topic ai
"score" : 0.2999999999999999,"bg_count" : 4   # 4 docs in the whole index have topic ai
}]}}]}}

The result shows that John Michael focuses most on automobile, while Robert Cann focuses most on ai. See the comments above for the meaning of bg_count.
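The scores in the response can be reproduced by hand. The Python sketch below assumes the default JLH heuristic, which multiplies the absolute change in frequency (foreground minus background) by the relative change (foreground divided by background); the function name `jlh_score` is made up for this example.

```python
def jlh_score(fg_count, fg_total, bg_count, bg_total):
    """JLH significance score (sketch of the default heuristic).

    fg_* describe the foreground set (e.g. one author's docs),
    bg_* describe the background set (the whole index).
    """
    fg = fg_count / fg_total
    bg = bg_count / bg_total
    if fg <= bg:
        return 0.0  # not more frequent than in the background
    return (fg - bg) * (fg / bg)

# John Michael: 4 of his 5 docs are "automobile"; 5 of 10 docs overall.
automobile = jlh_score(4, 5, 5, 10)
# Robert Cann: 3 of his 5 docs are "ai"; 4 of 10 docs overall.
ai = jlh_score(3, 5, 4, 10)
print(automobile, ai)
```

Up to floating-point rounding these are the 0.48 and 0.3 reported in the response.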

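16. Significant Text Aggregation: automatically finding significant keywords

This is similar to the Significant Terms Aggregation above, except that it targets text fields and analyzes (tokenizes) their content.
Run the following query against the data above:

GET news/_search
{"query": {"match": {"title": " AI "}},"size": 0,"aggs": {"significant_title": {"significant_text": {"field": "title"}}}
}

Response: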

"aggregations" : {"significant_title" : {"doc_count" : 3,"bg_count" : 10,"buckets" : [{"key" : "ai","doc_count" : 3,"score" : 2.3333333333333335,"bg_count" : 3}]}}

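17. Sampler Aggregation: aggregating over a sample

This is typically used together with significant_terms. When an index holds a very large amount of data the query can become slow; the sampler restricts the sub-aggregations to a sample of the most relevant documents.

POST /stackoverflow/_search?size=0
{"query": {"query_string": {"query": "tags:kibana OR tags:javascript"}},"aggs": {"sample": {"sampler": {"shard_size": 200},"aggs": {"keywords": {"significant_terms": {"field": "tags","exclude": ["kibana", "javascript"]}}}}}
}

The shard_size parameter is the number of sample docs collected per shard; the default is 100.
Response: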

{..."aggregations": {"sample": {"doc_count": 200,"keywords": {"doc_count": 200,"bg_count": 650,"buckets": [{"key": "elasticsearch","doc_count": 150,"score": 1.078125,"bg_count": 200},{"key": "logstash","doc_count": 50,"score": 0.5625,"bg_count": 50}]}}}
}

18. Reverse nested Aggregation: aggregating over parent data from inside a nested agg

The main purpose of the Reverse nested Aggregation is to let an aggregation that sits below a Nested Aggregation step out of the nested type and aggregate over data of the root document.
For example:

PUT /issues
{"mappings": {"properties" : {"tags" : { "type" : "keyword" },"comments" : { "type" : "nested","properties" : {"username" : { "type" : "keyword" },"comment" : { "type" : "text" }}}}}
}

PUT issues/_doc/1
{"tags": ["bug","improve"],"comments": [{"username": "jack","comment": " this is a bug"},{"username": "pony","comment": " this is a improve"}]
}

PUT issues/_doc/2
{"tags": ["advice","improve"],"comments": [{"username": "jack","comment": " this is a good job "},{"username": "nacy","comment": " this is a improvement"}]
}

Query:

GET /issues/_search
{"size": 0,"query": {"match_all": {}},"aggs": {"comments": {"nested": {"path": "comments"},"aggs": {"top_usernames": {"terms": {"field": "comments.username"},"aggs": {"comment_to_issue": {"reverse_nested": {},"aggs": {"top_tags_per_comment": {"terms": {"field": "tags"}}}}}}}}}
}

Response:

"aggregations" : {"comments" : {"doc_count" : 4,"top_usernames" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "jack","doc_count" : 2,"comment_to_issue" : {"doc_count" : 2,"top_tags_per_comment" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "improve","doc_count" : 2},{"key" : "advice","doc_count" : 1},{"key" : "bug","doc_count" : 1}]}}},{"key" : "nacy","doc_count" : 1,"comment_to_issue" : {"doc_count" : 1,"top_tags_per_comment" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "advice","doc_count" : 1},{"key" : "improve","doc_count" : 1}]}}},{"key" : "pony","doc_count" : 1,"comment_to_issue" : {"doc_count" : 1,"top_tags_per_comment" : {"doc_count_error_upper_bound" : 0,"sum_other_doc_count" : 0,"buckets" : [{"key" : "bug","doc_count" : 1},{"key" : "improve","doc_count" : 1}]}}}]}}}

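Under a Nested Aggregation, the sub-aggregations of a Reverse nested Aggregation operate on the root documents of the nested documents.
Given that purpose, it is an aggregation designed specifically to sit below a Nested Aggregation; using it as a top-level aggregation or below a non-nested aggregation makes no sense.
By default the Reverse nested Aggregation joins back to the root document. With several levels of nesting, the path parameter can be used to choose which level to join back to.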
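The jump from nested comments back to the root issue can be sketched in Python with the two issue documents above. The helper name `tags_per_commenter` is made up; the set of usernames per issue reflects that each root document is counted once per commenter.

```python
from collections import Counter

issues = [
    {"tags": ["bug", "improve"],
     "comments": [{"username": "jack"}, {"username": "pony"}]},
    {"tags": ["advice", "improve"],
     "comments": [{"username": "jack"}, {"username": "nacy"}]},
]

def tags_per_commenter(issues):
    """For each commenter, count the tags of the issues they commented on."""
    result = {}
    for issue in issues:
        # Step into the nested comments and bucket by username ...
        usernames = {c["username"] for c in issue["comments"]}
        # ... then step back out to the root issue to read its tags.
        for username in usernames:
            result.setdefault(username, Counter()).update(issue["tags"])
    return result

buckets = tags_per_commenter(issues)
print(buckets)
```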
