Contents

  • 1. Pipeline aggregation query syntax
    • 1. Path symbols
    • 2. Aggregation levels
  • 2. Overview of pipeline aggregation types
    • 1. sibling aggregation
    • 2. parent aggregation
  • 3. Preparing the data
  • 4. Usage examples
    • 1. sibling aggregation
      • 1. Avg Bucket Aggregation: sibling agg, averages a metric across buckets
        • 1. Averaging an ordinary metric
        • 2. Averaging the special metric _count
      • 2. Max Bucket Aggregation: sibling agg, finds the bucket with the largest value
      • 3. Min Bucket Aggregation: sibling agg, finds the bucket with the smallest value
      • 4. Sum Bucket Aggregation: sibling agg, sums a metric across buckets
      • 5. Stats Bucket Aggregation: sibling agg, computes stats across buckets
      • 6. Extended Stats Bucket Aggregation: sibling agg, computes extended stats across buckets
      • 7. Percentiles Bucket Aggregation: sibling agg, computes percentiles across buckets
    • 2. parent aggregation
      • 1. Derivative Aggregation: parent agg, computes the derivative over a histogram or date_histogram
      • 2. Moving Average Aggregation: parent agg, moving average over a series of buckets
      • 3. Moving Function Aggregation: parent agg, applies a function over a sliding window of buckets
      • 4. Cumulative Sum Aggregation
      • 5. Bucket Script Aggregation: parent agg, runs a script over one or more metrics of the parent aggregation's buckets
      • 6. Bucket Selector Aggregation: parent agg, filters buckets; only buckets that satisfy the condition stay in the result
      • 7. Bucket Sort Aggregation:
      • 8. Serial Differencing Aggregation: parent agg, serial differencing

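Elasticsearch's aggregation queries keep getting richer. There are currently four families:

  1. metric aggregation: single statistics such as min, max, avg, sum and percentile
  2. bucket aggregation: grouping operations similar to SQL GROUP BY
  3. matrix aggregation: combines the values of several fields to produce a multi-dimensional matrix
  4. pipeline aggregation: post-processes the output of other aggregations to enrich the result

This post covers pipeline aggregations.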

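1. Pipeline aggregation query syntax

1. Path symbols

  1. Aggregation separator >: expresses the parent-child relationship between aggregations, e.g. "my_bucket>my_stats.avg"
  2. Metric separator .: selects one statistic of an aggregation
  3. Aggregation name <name of the aggregation>: the name of an aggregation
  4. Metric <name of the metric>: the name of a statistic
  5. Full path agg_name[> agg_name]*[. metrics]: combines the elements above into a complete path
  6. Special value _count: the number of documents in a bucket. It is a special metric that lets a pipeline aggregation use each bucket's doc count as its input.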
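To make the path grammar concrete, here is a minimal Python sketch that splits a path into its aggregation names and optional metric. It is only an illustration of the syntax listed above, not how Elasticsearch parses paths; the helper name parse_buckets_path is made up for this post, and paths whose aggregation names contain dots or brackets are not handled.

```python
def parse_buckets_path(path):
    """Split a buckets_path string into (aggregation names, metric or None)."""
    # '>' separates nested aggregation names.
    names = [part.strip() for part in path.split(">")]
    # A '.' in the last element separates the aggregation from its metric.
    last_agg, _, metric = names[-1].partition(".")
    names[-1] = last_agg.strip()
    return names, (metric.strip() or None)


print(parse_buckets_path("my_bucket>my_stats.avg"))
print(parse_buckets_path("month_term._count"))
```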

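2. Aggregation levels

1. parent
The input of a parent pipeline aggregation is the output of its parent aggregation, which it processes further. It generally does not create new buckets; instead it enriches the parent's buckets by adding a new statistic to each one.
Typical examples are the moving average and the derivative: every bucket of the parent gains one extra value.

2. sibling
The input of a sibling pipeline aggregation is the output of a sibling aggregation. It computes a new aggregation at the same level as that sibling, so its result appears alongside the sibling's buckets rather than inside them.
Typical examples are min and max: a new entry holding the min or max value is added next to the original buckets.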

2. Overview of pipeline aggregation types

1. sibling aggregation

  1. Avg Bucket Aggregation: sibling agg, averages a metric across buckets
  2. Max Bucket Aggregation: sibling agg, finds the bucket with the largest value
  3. Min Bucket Aggregation: sibling agg, finds the bucket with the smallest value
  4. Sum Bucket Aggregation: sibling agg, sums a metric across buckets
  5. Stats Bucket Aggregation: sibling agg, computes stats across buckets
  6. Extended Stats Bucket Aggregation: sibling agg, computes extended stats across buckets
  7. Percentiles Bucket Aggregation: sibling agg, computes percentiles across buckets

2. parent aggregation

  1. Derivative Aggregation: parent agg, computes the derivative over a histogram or date_histogram
  2. Moving Average Aggregation: parent agg, moving average over a series of buckets (deprecated)
  3. Moving Function Aggregation: parent agg, applies a function over a sliding window of buckets
  4. Cumulative Sum Aggregation: running total up to the current bucket
  5. Bucket Script Aggregation: parent agg, runs a script over one or more metrics of the parent aggregation's buckets
  6. Bucket Selector Aggregation: parent agg, filters buckets; only buckets that satisfy the condition stay in the result
  7. Bucket Sort Aggregation: sorts the buckets
  8. Serial Differencing Aggregation: parent agg, serial differencing

3. Preparing the data

traffic_stats stores a blog's daily reading statistics: the number of visits and the longest time spent reading.

PUT traffic_stats
{"mappings": {"properties": {"date": {"type": "date","format": "date_optional_time"},"visits": {"type": "integer"},"max_time_spent": {"type": "integer"}}}
}

Data

PUT _bulk
{"index":{"_index":"traffic_stats"}}
{"visits":"488", "date":"2018-10-1", "max_time_spent":"900"}
{"index":{"_index":"traffic_stats"}}
{"visits":"783", "date":"2018-10-6", "max_time_spent":"928"}
{"index":{"_index":"traffic_stats"}}
{"visits":"789", "date":"2018-10-12", "max_time_spent":"1834"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1299", "date":"2018-11-3", "max_time_spent":"592"}
{"index":{"_index":"traffic_stats"}}
{"visits":"394", "date":"2018-11-6", "max_time_spent":"1249"}
{"index":{"_index":"traffic_stats"}}
{"visits":"448", "date":"2018-11-24", "max_time_spent":"874"}
{"index":{"_index":"traffic_stats"}}
{"visits":"768", "date":"2018-12-18", "max_time_spent":"876"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1194", "date":"2018-12-24", "max_time_spent":"1249"}
{"index":{"_index":"traffic_stats"}}
{"visits":"987", "date":"2018-12-28", "max_time_spent":"1599"}
{"index":{"_index":"traffic_stats"}}
{"visits":"872", "date":"2019-01-1", "max_time_spent":"828"}
{"index":{"_index":"traffic_stats"}}
{"visits":"972", "date":"2019-01-5", "max_time_spent":"723"}
{"index":{"_index":"traffic_stats"}}
{"visits":"827", "date":"2019-02-5", "max_time_spent":"1300"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1584", "date":"2019-02-15", "max_time_spent":"1500"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1604", "date":"2019-03-2", "max_time_spent":"1488"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1499", "date":"2019-03-27", "max_time_spent":"1399"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1392", "date":"2019-04-8", "max_time_spent":"1294"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1247", "date":"2019-04-15", "max_time_spent":"1194"}
{"index":{"_index":"traffic_stats"}}
{"visits":"984", "date":"2019-05-15", "max_time_spent":"1184"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1228", "date":"2019-05-18", "max_time_spent":"1485"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1423", "date":"2019-06-14", "max_time_spent":"1452"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1238", "date":"2019-06-24", "max_time_spent":"1329"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1388", "date":"2019-07-14", "max_time_spent":"1542"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1499", "date":"2019-07-24", "max_time_spent":"1742"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1523", "date":"2019-08-13", "max_time_spent":"1552"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1443", "date":"2019-08-19", "max_time_spent":"1511"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1587", "date":"2019-09-14", "max_time_spent":"1497"}
{"index":{"_index":"traffic_stats"}}
{"visits":"1534", "date":"2019-09-27", "max_time_spent":"1434"}

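4. Usage examples

1. sibling aggregation

1. Avg Bucket Aggregation: sibling agg, averages a metric across buckets

1. Averaging an ordinary metric

1. First use date_histogram to find, for each month, how many days have reading records and the highest daily visit count of that month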


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}}}}}
}

Response

"aggregations" : {"month_term" : {"buckets" : [{"key_as_string" : "2018-10-01T00:00:00.000Z","key" : 1538352000000,"doc_count" : 3,"max_view_count" : {"value" : 789.0}},{"key_as_string" : "2018-11-01T00:00:00.000Z","key" : 1541030400000,"doc_count" : 3,"max_view_count" : {"value" : 1299.0}},{"key_as_string" : "2018-12-01T00:00:00.000Z","key" : 1543622400000,"doc_count" : 3,"max_view_count" : {"value" : 1194.0}},{"key_as_string" : "2019-01-01T00:00:00.000Z","key" : 1546300800000,"doc_count" : 2,"max_view_count" : {"value" : 972.0}},{"key_as_string" : "2019-02-01T00:00:00.000Z","key" : 1548979200000,"doc_count" : 2,"max_view_count" : {"value" : 1584.0}},{"key_as_string" : "2019-03-01T00:00:00.000Z","key" : 1551398400000,"doc_count" : 2,"max_view_count" : {"value" : 1604.0}},{"key_as_string" : "2019-04-01T00:00:00.000Z","key" : 1554076800000,"doc_count" : 2,"max_view_count" : {"value" : 1392.0}},{"key_as_string" : "2019-05-01T00:00:00.000Z","key" : 1556668800000,"doc_count" : 2,"max_view_count" : {"value" : 1228.0}},{"key_as_string" : "2019-06-01T00:00:00.000Z","key" : 1559347200000,"doc_count" : 2,"max_view_count" : {"value" : 1423.0}},{"key_as_string" : "2019-07-01T00:00:00.000Z","key" : 1561939200000,"doc_count" : 2,"max_view_count" : {"value" : 1499.0}},{"key_as_string" : "2019-08-01T00:00:00.000Z","key" : 1564617600000,"doc_count" : 2,"max_view_count" : {"value" : 1523.0}},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}}]}}

Now add a sibling avg_bucket aggregation that averages the monthly maxima, i.e. the mean of each month's busiest day.


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}}}},"average_month_max": {"avg_bucket": {"buckets_path": "month_term>max_view_count.value"}}}
}

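Note the buckets_path here:

"buckets_path": "month_term>max_view_count.value"

month_term and max_view_count are both aggregation names, so they are joined with >.
value is the statistic produced by max_view_count, so it is attached with a dot.

The result is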

"aggregations" : {"month_term" : {"buckets" : [{"key_as_string" : "2018-10-01T00:00:00.000Z","key" : 1538352000000,"doc_count" : 3,"max_view_count" : {"value" : 789.0}},......{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}}]},"average_month_max" : {"value" : 1341.1666666666667}}
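As a sanity check on the value above, the same computation can be reproduced in plain Python. This is only a sketch of what date_histogram + max + avg_bucket compute on the sample data, not how Elasticsearch implements them; the helper names monthly_max and avg_bucket are made up for illustration.

```python
from collections import defaultdict

# (date, visits) pairs copied from the bulk request in the data section.
DOCS = [
    ("2018-10-1", 488), ("2018-10-6", 783), ("2018-10-12", 789),
    ("2018-11-3", 1299), ("2018-11-6", 394), ("2018-11-24", 448),
    ("2018-12-18", 768), ("2018-12-24", 1194), ("2018-12-28", 987),
    ("2019-01-1", 872), ("2019-01-5", 972),
    ("2019-02-5", 827), ("2019-02-15", 1584),
    ("2019-03-2", 1604), ("2019-03-27", 1499),
    ("2019-04-8", 1392), ("2019-04-15", 1247),
    ("2019-05-15", 984), ("2019-05-18", 1228),
    ("2019-06-14", 1423), ("2019-06-24", 1238),
    ("2019-07-14", 1388), ("2019-07-24", 1499),
    ("2019-08-13", 1523), ("2019-08-19", 1443),
    ("2019-09-14", 1587), ("2019-09-27", 1534),
]


def monthly_max(docs):
    """Group by month (like date_histogram) and keep the max visits per month."""
    buckets = defaultdict(list)
    for date, visits in docs:
        buckets[date[:7]].append(visits)  # "2018-10-1" -> "2018-10"
    return {month: max(values) for month, values in sorted(buckets.items())}


def avg_bucket(values):
    """Average one metric across all buckets, as the sibling agg does."""
    return sum(values) / len(values)


month_term = monthly_max(DOCS)
average_month_max = avg_bucket(list(month_term.values()))
print(month_term)
print(average_month_max)
```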

Note that the following request does not produce a usable result. It addresses the bucket document count as month_term.doc_count, which is not a valid path; a bucket's document count is exposed through the special metric _count, so the path should be month_term._count.


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"}},"avg_month_max": {"avg_bucket": {"buckets_path": "month_term.doc_count"}}}
}

Response

  "aggregations" : {"month_term" : {"buckets" : [{"key_as_string" : "2018-10-01T00:00:00.000Z","key" : 1538352000000,"doc_count" : 3},......{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2}]},"avg_month_max" : {"value" : null}}

avg_month_max comes back as null because doc_count, although it appears in every bucket of the response, is not a metric name that buckets_path can reference. Changing the path to month_term._count, as in the next example, returns the expected value.

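2. Averaging the special metric _count

Count the days with reading records in each month, then report the month(s) with the most such days and the average number of such days per month.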


In the request below, month_days_count is a date_histogram whose per-bucket doc_count (metric _count) is the number of days with reading records. max_day_month applies max_bucket to _count to find the month with the most reading days, and avg_day_each_month applies avg_bucket to _count to get the average number of reading days per month.

GET traffic_stats/_search
{"size": 0,"aggs": {"month_days_count": {"date_histogram": {"field": "date","calendar_interval": "month"}},"max_day_month": {"max_bucket": {"buckets_path": "month_days_count._count"}},"avg_day_each_month": {"avg_bucket": {"buckets_path": "month_days_count._count"}}}
}

Response

"aggregations" : {"month_days_count" : {"buckets" : [{"key_as_string" : "2018-10-01T00:00:00.000Z","key" : 1538352000000,"doc_count" : 3},{"key_as_string" : "2018-11-01T00:00:00.000Z","key" : 1541030400000,"doc_count" : 3},{"key_as_string" : "2018-12-01T00:00:00.000Z","key" : 1543622400000,"doc_count" : 3},{"key_as_string" : "2019-01-01T00:00:00.000Z","key" : 1546300800000,"doc_count" : 2},{"key_as_string" : "2019-02-01T00:00:00.000Z","key" : 1548979200000,"doc_count" : 2},{"key_as_string" : "2019-03-01T00:00:00.000Z","key" : 1551398400000,"doc_count" : 2},{"key_as_string" : "2019-04-01T00:00:00.000Z","key" : 1554076800000,"doc_count" : 2},{"key_as_string" : "2019-05-01T00:00:00.000Z","key" : 1556668800000,"doc_count" : 2},{"key_as_string" : "2019-06-01T00:00:00.000Z","key" : 1559347200000,"doc_count" : 2},{"key_as_string" : "2019-07-01T00:00:00.000Z","key" : 1561939200000,"doc_count" : 2},{"key_as_string" : "2019-08-01T00:00:00.000Z","key" : 1564617600000,"doc_count" : 2},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2}]},"max_day_month" : {"value" : 3.0,"keys" : ["2018-10-01T00:00:00.000Z","2018-11-01T00:00:00.000Z","2018-12-01T00:00:00.000Z"]},"avg_day_each_month" : {"value" : 2.25}}
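The two _count results can be checked by hand. The sketch below, in plain Python with made-up helper names, recomputes them from the per-month doc counts shown in the response; it also shows why max_bucket returns several keys when buckets tie.

```python
# doc_count per month, copied from the response above (keys shortened to yyyy-MM).
DOC_COUNTS = {
    "2018-10": 3, "2018-11": 3, "2018-12": 3, "2019-01": 2,
    "2019-02": 2, "2019-03": 2, "2019-04": 2, "2019-05": 2,
    "2019-06": 2, "2019-07": 2, "2019-08": 2, "2019-09": 2,
}


def max_bucket(buckets):
    """Return the largest value and every bucket key that reaches it."""
    top = max(buckets.values())
    keys = [key for key, value in buckets.items() if value == top]
    return {"value": float(top), "keys": keys}


def avg_bucket(buckets):
    """Return the mean of the bucket values."""
    return {"value": sum(buckets.values()) / len(buckets)}


max_day_month = max_bucket(DOC_COUNTS)
avg_day_each_month = avg_bucket(DOC_COUNTS)
print(max_day_month)
print(avg_day_each_month)
```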

2. Max Bucket Aggregation: sibling agg, finds the bucket with the largest value

This continues the average example; usage mirrors avg_bucket.


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}}}},"max_month_max": {"max_bucket": {"buckets_path": "month_term>max_view_count.value"}}}
}

Response

"aggregations" : {"month_term" : {"buckets" : [......{"key_as_string" : "2019-08-01T00:00:00.000Z","key" : 1564617600000,"doc_count" : 2,"max_view_count" : {"value" : 1523.0}},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}}]},"max_month_max" : {"value" : 1604.0,"keys" : ["2019-03-01T00:00:00.000Z"]}}

3. Min Bucket Aggregation: sibling agg, finds the bucket with the smallest value


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}}}},"min_month_max": {"min_bucket": {"buckets_path": "month_term>max_view_count.value"}}}
}

Response

 "aggregations" : {"month_term" : {"buckets" : [......{"key_as_string" : "2019-08-01T00:00:00.000Z","key" : 1564617600000,"doc_count" : 2,"max_view_count" : {"value" : 1523.0}},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}}]},"min_month_max" : {"value" : 789.0,"keys" : ["2018-10-01T00:00:00.000Z"]}}

4. Sum Bucket Aggregation: sibling agg, sums a metric across buckets

Usage example


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}}}},"sum_month_max": {"sum_bucket": {"buckets_path": "month_term>max_view_count.value"}}}
}

Response

 "aggregations" : {"month_term" : {"buckets" : [......{"key_as_string" : "2019-08-01T00:00:00.000Z","key" : 1564617600000,"doc_count" : 2,"max_view_count" : {"value" : 1523.0}},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}}]},"sum_month_max" : {"value" : 16094.0}}

5. Stats Bucket Aggregation: sibling agg, computes stats across buckets

Usage example


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}}}},"stats_month_max": {"stats_bucket": {"buckets_path": "month_term>max_view_count.value"}}}
}

Response

 "aggregations" : {"month_term" : {"buckets" : [......{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}}]},"stats_month_max" : {"count" : 12,"min" : 789.0,"max" : 1604.0,"avg" : 1341.1666666666667,"sum" : 16094.0}}
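stats_bucket bundles the values that min_bucket, max_bucket, avg_bucket and sum_bucket return individually. A minimal Python sketch over the twelve monthly maxima from the responses above reproduces the numbers; the function name is illustrative, not an Elasticsearch API.

```python
# max_view_count per month (2018-10 .. 2019-09), copied from the responses above.
MONTHLY_MAX = [789, 1299, 1194, 972, 1584, 1604, 1392, 1228, 1423, 1499, 1523, 1587]


def stats_bucket(values):
    """Compute count, min, max, avg and sum across bucket values."""
    total = float(sum(values))
    return {
        "count": len(values),
        "min": float(min(values)),
        "max": float(max(values)),
        "avg": total / len(values),
        "sum": total,
    }


stats_month_max = stats_bucket(MONTHLY_MAX)
print(stats_month_max)
```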

6. Extended Stats Bucket Aggregation: sibling agg, computes extended stats across buckets

Usage example


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}}}},"extend_stats_month_max": {"extended_stats_bucket": {"buckets_path": "month_term>max_view_count.value"}}}
}

Response

  "aggregations" : {"month_term" : {"buckets" : [......{"key_as_string" : "2018-10-01T00:00:00.000Z","key" : 1538352000000,"doc_count" : 3,"max_view_count" : {"value" : 789.0}},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}}]},"extend_stats_month_max" : {"count" : 12,"min" : 789.0,"max" : 1604.0,"avg" : 1341.1666666666667,"sum" : 16094.0,"sum_of_squares" : 2.231789E7,"variance" : 61096.13888888899,"std_deviation" : 247.17633157098393,"std_deviation_bounds" : {"upper" : 1835.5193298086347,"lower" : 846.8140035246988}}}
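The extra fields can be derived from the same twelve values. The sketch below assumes a population variance (dividing by n) and bounds of avg ± 2 standard deviations; those assumptions are consistent with the response above, but the code is an illustration rather than Elasticsearch's implementation.

```python
import math

# max_view_count per month (2018-10 .. 2019-09), copied from the responses above.
MONTHLY_MAX = [789, 1299, 1194, 972, 1584, 1604, 1392, 1228, 1423, 1499, 1523, 1587]


def extended_stats_bucket(values, sigma=2.0):
    """Population variance, std deviation and avg +/- sigma * std bounds."""
    n = len(values)
    avg = sum(values) / n
    variance = sum((v - avg) ** 2 for v in values) / n
    std = math.sqrt(variance)
    return {
        "sum_of_squares": float(sum(v * v for v in values)),
        "variance": variance,
        "std_deviation": std,
        "std_deviation_bounds": {"upper": avg + sigma * std, "lower": avg - sigma * std},
    }


ext = extended_stats_bucket(MONTHLY_MAX)
print(ext)
```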

7. Percentiles Bucket Aggregation: sibling agg, computes percentiles across buckets

Usage example


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}}}},"percentile_month_max": {"percentiles_bucket": {"buckets_path": "month_term>max_view_count.value"}}}
}

Response

 "aggregations" : {"month_term" : {"buckets" : [......{"key_as_string" : "2019-08-01T00:00:00.000Z","key" : 1564617600000,"doc_count" : 2,"max_view_count" : {"value" : 1523.0}},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}}]},"percentile_month_max" : {"values" : {"1.0" : 789.0,"5.0" : 972.0,"25.0" : 1228.0,"50.0" : 1423.0,"75.0" : 1523.0,"95.0" : 1587.0,"99.0" : 1604.0}}}
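Every percentile in the response is one of the actual bucket values, which suggests a nearest-rank rule without interpolation. The sketch below assumes the rule index = round(p / 100 * (n - 1)) with halves rounded up; that assumption reproduces the response above, but treat it as an illustration rather than a statement of Elasticsearch internals.

```python
import math

# max_view_count per month (2018-10 .. 2019-09), copied from the responses above.
MONTHLY_MAX = [789, 1299, 1194, 972, 1584, 1604, 1392, 1228, 1423, 1499, 1523, 1587]


def percentiles_bucket(values, percents=(1, 5, 25, 50, 75, 95, 99)):
    """Pick the bucket value nearest to each percentile rank (no interpolation)."""
    data = sorted(values)
    result = {}
    for p in percents:
        # Round half up; Python's built-in round() rounds halves to even instead.
        index = int(math.floor(p / 100.0 * (len(data) - 1) + 0.5))
        result[str(float(p))] = float(data[index])
    return result


percentile_month_max = percentiles_bucket(MONTHLY_MAX)
print(percentile_month_max)
```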

2. parent aggregation

1. Derivative Aggregation: parent agg, computes the derivative over a histogram or date_histogram

Query example


GET traffic_stats/_search
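{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}},"deriv_month_max": {"derivative": {"buckets_path": "max_view_count.value"}}}}}
}

The first-order derivative is simply the difference between adjacent buckets. Note that deriv_month_max sits at a different level than before: it is nested inside month_term, next to max_view_count, whereas the sibling aggregations above were declared at the same level as month_term.

Response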

  "aggregations" : {"month_term" : {"buckets" : [{"key_as_string" : "2018-10-01T00:00:00.000Z","key" : 1538352000000,"doc_count" : 3,"max_view_count" : {"value" : 789.0}},{"key_as_string" : "2018-11-01T00:00:00.000Z","key" : 1541030400000,"doc_count" : 3,"max_view_count" : {"value" : 1299.0},"deriv_month_max" : {"value" : 510.0}},{"key_as_string" : "2018-12-01T00:00:00.000Z","key" : 1543622400000,"doc_count" : 3,"max_view_count" : {"value" : 1194.0},"deriv_month_max" : {"value" : -105.0}},{"key_as_string" : "2019-01-01T00:00:00.000Z","key" : 1546300800000,"doc_count" : 2,"max_view_count" : {"value" : 972.0},"deriv_month_max" : {"value" : -222.0}},{"key_as_string" : "2019-02-01T00:00:00.000Z","key" : 1548979200000,"doc_count" : 2,"max_view_count" : {"value" : 1584.0},"deriv_month_max" : {"value" : 612.0}},{"key_as_string" : "2019-03-01T00:00:00.000Z","key" : 1551398400000,"doc_count" : 2,"max_view_count" : {"value" : 1604.0},"deriv_month_max" : {"value" : 20.0}},{"key_as_string" : "2019-04-01T00:00:00.000Z","key" : 1554076800000,"doc_count" : 2,"max_view_count" : {"value" : 1392.0},"deriv_month_max" : {"value" : -212.0}},{"key_as_string" : "2019-05-01T00:00:00.000Z","key" : 1556668800000,"doc_count" : 2,"max_view_count" : {"value" : 1228.0},"deriv_month_max" : {"value" : -164.0}},{"key_as_string" : "2019-06-01T00:00:00.000Z","key" : 1559347200000,"doc_count" : 2,"max_view_count" : {"value" : 1423.0},"deriv_month_max" : {"value" : 195.0}},{"key_as_string" : "2019-07-01T00:00:00.000Z","key" : 1561939200000,"doc_count" : 2,"max_view_count" : {"value" : 1499.0},"deriv_month_max" : {"value" : 76.0}},{"key_as_string" : "2019-08-01T00:00:00.000Z","key" : 1564617600000,"doc_count" : 2,"max_view_count" : {"value" : 1523.0},"deriv_month_max" : {"value" : 24.0}},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0},"deriv_month_max" : {"value" : 64.0}}]}}
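The deriv_month_max values are plain differences between consecutive buckets, which a few lines of Python can confirm. This is a sketch with an illustrative function name; the first bucket has no predecessor, so it gets no value (None here, an absent field in the response).

```python
# max_view_count per month (2018-10 .. 2019-09), copied from the response above.
MONTHLY_MAX = [789, 1299, 1194, 972, 1584, 1604, 1392, 1228, 1423, 1499, 1523, 1587]


def derivative(values):
    """First-order derivative: current bucket value minus the previous one."""
    return [None] + [cur - prev for prev, cur in zip(values, values[1:])]


deriv_month_max = derivative(MONTHLY_MAX)
print(deriv_month_max)
```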

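2. Moving Average Aggregation: parent agg, moving average over a series of buckets

This aggregation is deprecated; Moving Function Aggregation is the recommended replacement.
MovingFunctions.unweightedAvg(values) can be used in its place.

3. Moving Function Aggregation: parent agg, applies a function over a sliding window of buckets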


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}},"move_avg_view": {"moving_fn": {"buckets_path": "max_view_count","window": 2,"script": "MovingFunctions.unweightedAvg(values)"}}}}}
}

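The window is set to 2, so each value is the average of the two buckets that precede the current one.
The first bucket has no preceding bucket, so its value is null. The second bucket's moving average equals the first bucket's value, and the third bucket's is (bucket01 + bucket02) / 2.
Response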

  "aggregations" : {"month_term" : {"buckets" : [{"key_as_string" : "2018-10-01T00:00:00.000Z","key" : 1538352000000,"doc_count" : 3,"max_view_count" : {"value" : 789.0},"move_avg_view" : {"value" : null}},{"key_as_string" : "2018-11-01T00:00:00.000Z","key" : 1541030400000,"doc_count" : 3,"max_view_count" : {"value" : 1299.0},"move_avg_view" : {"value" : 789.0}},{"key_as_string" : "2018-12-01T00:00:00.000Z","key" : 1543622400000,"doc_count" : 3,"max_view_count" : {"value" : 1194.0},"move_avg_view" : {"value" : 1044.0}},{"key_as_string" : "2019-01-01T00:00:00.000Z","key" : 1546300800000,"doc_count" : 2,"max_view_count" : {"value" : 972.0},"move_avg_view" : {"value" : 1246.5}},{"key_as_string" : "2019-02-01T00:00:00.000Z","key" : 1548979200000,"doc_count" : 2,"max_view_count" : {"value" : 1584.0},"move_avg_view" : {"value" : 1083.0}},{"key_as_string" : "2019-03-01T00:00:00.000Z","key" : 1551398400000,"doc_count" : 2,"max_view_count" : {"value" : 1604.0},"move_avg_view" : {"value" : 1278.0}},{"key_as_string" : "2019-04-01T00:00:00.000Z","key" : 1554076800000,"doc_count" : 2,"max_view_count" : {"value" : 1392.0},"move_avg_view" : {"value" : 1594.0}},{"key_as_string" : "2019-05-01T00:00:00.000Z","key" : 1556668800000,"doc_count" : 2,"max_view_count" : {"value" : 1228.0},"move_avg_view" : {"value" : 1498.0}},......]}}
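The windowing behaviour is easy to get wrong, so here is a small Python sketch of it. It assumes, as the response above indicates, that the window covers the buckets before the current one and excludes the current bucket itself; moving_fn and unweighted_avg are illustrative stand-ins, not the Painless functions.

```python
# max_view_count per month (2018-10 .. 2019-09), copied from the responses above.
MONTHLY_MAX = [789, 1299, 1194, 972, 1584, 1604, 1392, 1228, 1423, 1499, 1523, 1587]


def unweighted_avg(window_values):
    """Plain mean of the window; None when the window is empty."""
    if not window_values:
        return None
    return sum(window_values) / len(window_values)


def moving_fn(values, window, fn):
    """Apply fn to the `window` values preceding each bucket (current excluded)."""
    return [fn(values[max(0, i - window):i]) for i in range(len(values))]


move_avg_view = moving_fn(MONTHLY_MAX, window=2, fn=unweighted_avg)
print(move_avg_view)
```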

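4. Cumulative Sum Aggregation

parent agg.
Cumulative sum: takes one metric of the parent aggregation (which must be a histogram or date_histogram) and, in each bucket, outputs the sum of that metric over the current bucket and all buckets before it,
i.e. the running total up to the current bucket.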

GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}},"cur_sum_view": {"cumulative_sum": {"buckets_path": "max_view_count"}}}}}
}

Response

 "aggregations" : {"month_term" : {"buckets" : [{"key_as_string" : "2018-10-01T00:00:00.000Z","key" : 1538352000000,"doc_count" : 3,"max_view_count" : {"value" : 789.0},"cur_sum_view" : {"value" : 789.0}},{"key_as_string" : "2018-11-01T00:00:00.000Z","key" : 1541030400000,"doc_count" : 3,"max_view_count" : {"value" : 1299.0},"cur_sum_view" : {"value" : 2088.0}}......]}}
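The running total can be reproduced with itertools.accumulate. This Python sketch only mirrors the arithmetic of cumulative_sum on the monthly maxima; the last value equals the sum_bucket result from earlier.

```python
from itertools import accumulate

# max_view_count per month (2018-10 .. 2019-09), copied from the responses above.
MONTHLY_MAX = [789, 1299, 1194, 972, 1584, 1604, 1392, 1228, 1423, 1499, 1523, 1587]

# Each entry is the sum of the current bucket and all buckets before it.
cur_sum_view = list(accumulate(MONTHLY_MAX))
print(cur_sum_view)
```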

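5. Bucket Script Aggregation: parent agg, runs a script over one or more metrics of the parent aggregation's buckets

bucket_script evaluates a script once per bucket of the parent aggregation. Its buckets_path maps variable names to metric paths, and the script reads those variables as params.<name>.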
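The original post gives no request for this aggregation, so the following is a hypothetical illustration in Python. The request body, the aggregation name visits_per_day and the script (busiest-day visits divided by the number of days with records) are assumptions chosen for this sketch; the loop below only emulates what such a script would compute per bucket.

```python
# Hypothetical aggregation that would be nested inside month_term, next to max_view_count.
VISITS_PER_DAY_AGG = {
    "visits_per_day": {
        "bucket_script": {
            "buckets_path": {"max_v": "max_view_count", "days": "_count"},
            "script": "params.max_v / params.days",
        }
    }
}

# (month, doc_count, max_view_count) for the first four buckets of the responses above.
BUCKETS = [
    ("2018-10", 3, 789.0),
    ("2018-11", 3, 1299.0),
    ("2018-12", 3, 1194.0),
    ("2019-01", 2, 972.0),
]


def bucket_script(buckets, script):
    """Evaluate `script` once per bucket with the variables named in buckets_path."""
    return {month: script({"max_v": max_v, "days": days}) for month, days, max_v in buckets}


# Python stand-in for the Painless expression "params.max_v / params.days".
visits_per_day = bucket_script(BUCKETS, lambda params: params["max_v"] / params["days"])
print(visits_per_day)
```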


6. Bucket Selector Aggregation: parent agg, filters buckets; only buckets that satisfy the condition stay in the result

GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}},"select_bucket": {"bucket_selector": {"buckets_path": {"var01": "max_view_count"},"script": "params.var01>1500"}}}}}
}

Response

"aggregations" : {"month_term" : {"buckets" : [{"key_as_string" : "2019-02-01T00:00:00.000Z","key" : 1548979200000,"doc_count" : 2,"max_view_count" : {"value" : 1584.0}},{"key_as_string" : "2019-03-01T00:00:00.000Z","key" : 1551398400000,"doc_count" : 2,"max_view_count" : {"value" : 1604.0}},{"key_as_string" : "2019-08-01T00:00:00.000Z","key" : 1564617600000,"doc_count" : 2,"max_view_count" : {"value" : 1523.0}},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}}]}}
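The filter amounts to evaluating the script per bucket and dropping buckets where it is false. A Python sketch of the same selection, with the lambda standing in for the Painless script "params.var01>1500":

```python
# (month, max_view_count) pairs copied from the responses above.
BUCKETS = [
    ("2018-10", 789.0), ("2018-11", 1299.0), ("2018-12", 1194.0), ("2019-01", 972.0),
    ("2019-02", 1584.0), ("2019-03", 1604.0), ("2019-04", 1392.0), ("2019-05", 1228.0),
    ("2019-06", 1423.0), ("2019-07", 1499.0), ("2019-08", 1523.0), ("2019-09", 1587.0),
]


def bucket_selector(buckets, predicate):
    """Keep only the buckets for which the predicate returns True."""
    return [(month, value) for month, value in buckets if predicate({"var01": value})]


selected = bucket_selector(BUCKETS, lambda params: params["var01"] > 1500)
print(selected)
```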

7. Bucket Sort Aggregation:

parent agg, sorts the buckets of its parent aggregation
Usage example

GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}},"select_bucket": {"bucket_sort": {"sort": [{"max_view_count": {"order": "desc"}}]}}}}}
}

Response

  "aggregations" : {"month_term" : {"buckets" : [{"key_as_string" : "2019-03-01T00:00:00.000Z","key" : 1551398400000,"doc_count" : 2,"max_view_count" : {"value" : 1604.0}},{"key_as_string" : "2019-09-01T00:00:00.000Z","key" : 1567296000000,"doc_count" : 2,"max_view_count" : {"value" : 1587.0}},{"key_as_string" : "2019-02-01T00:00:00.000Z","key" : 1548979200000,"doc_count" : 2,"max_view_count" : {"value" : 1584.0}},{"key_as_string" : "2019-08-01T00:00:00.000Z","key" : 1564617600000,"doc_count" : 2,"max_view_count" : {"value" : 1523.0}},{"key_as_string" : "2019-07-01T00:00:00.000Z","key" : 1561939200000,"doc_count" : 2,"max_view_count" : {"value" : 1499.0}},{"key_as_string" : "2019-06-01T00:00:00.000Z","key" : 1559347200000,"doc_count" : 2,"max_view_count" : {"value" : 1423.0}},{"key_as_string" : "2019-04-01T00:00:00.000Z","key" : 1554076800000,"doc_count" : 2,"max_view_count" : {"value" : 1392.0}}]}}
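Sorting by max_view_count in descending order can be mirrored in Python as below. The function is an illustrative sketch; its optional size argument models truncating the result to the top N buckets and is not part of the request above.

```python
# (month, max_view_count) pairs copied from the responses above.
BUCKETS = [
    ("2018-10", 789.0), ("2018-11", 1299.0), ("2018-12", 1194.0), ("2019-01", 972.0),
    ("2019-02", 1584.0), ("2019-03", 1604.0), ("2019-04", 1392.0), ("2019-05", 1228.0),
    ("2019-06", 1423.0), ("2019-07", 1499.0), ("2019-08", 1523.0), ("2019-09", 1587.0),
]


def bucket_sort(buckets, descending=True, size=None):
    """Sort buckets by their metric value and optionally keep the first `size`."""
    ordered = sorted(buckets, key=lambda bucket: bucket[1], reverse=descending)
    return ordered if size is None else ordered[:size]


top_months = [month for month, _ in bucket_sort(BUCKETS, size=7)]
print(top_months)
```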

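8. Serial Differencing Aggregation: parent agg, serial differencing

Configurable parameters

lag: the offset between the two buckets being compared (for example, lag=7 subtracts the value of the bucket 7 positions earlier from the current bucket's value)
buckets_path: path to the metric to be differenced
gap_policy: how to handle empty buckets (skip/insert_zeros)
format: output format of the aggregation's value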


GET traffic_stats/_search
{"size": 0,"aggs": {"month_term": {"date_histogram": {"field": "date","calendar_interval": "month"},"aggs": {"max_view_count": {"max": {"field": "visits"}},"diff_bucket": {"serial_diff": {"buckets_path": "max_view_count","lag": 2}}}}}
}

Response

 "aggregations" : {"month_term" : {"buckets" : [{"key_as_string" : "2018-10-01T00:00:00.000Z","key" : 1538352000000,"doc_count" : 3,"max_view_count" : {"value" : 789.0}},{"key_as_string" : "2018-11-01T00:00:00.000Z","key" : 1541030400000,"doc_count" : 3,"max_view_count" : {"value" : 1299.0}},{"key_as_string" : "2018-12-01T00:00:00.000Z","key" : 1543622400000,"doc_count" : 3,"max_view_count" : {"value" : 1194.0},"diff_bucket" : {"value" : 405.0}},{"key_as_string" : "2019-01-01T00:00:00.000Z","key" : 1546300800000,"doc_count" : 2,"max_view_count" : {"value" : 972.0},"diff_bucket" : {"value" : -327.0}},{"key_as_string" : "2019-02-01T00:00:00.000Z","key" : 1548979200000,"doc_count" : 2,"max_view_count" : {"value" : 1584.0},"diff_bucket" : {"value" : 390.0}},......]}}

diff_bucket only has a value from the third bucket onward: with lag 2, diff_bucket = (max_view_count of bucket 3) - (max_view_count of bucket 1), i.e. 1194 - 789 = 405.
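The same rule applied to every bucket, as a short Python sketch (illustrative function name; buckets without a value `lag` positions back get None, matching the missing diff_bucket field in the first two buckets):

```python
# max_view_count per month (2018-10 .. 2019-09), copied from the responses above.
MONTHLY_MAX = [789, 1299, 1194, 972, 1584, 1604, 1392, 1228, 1423, 1499, 1523, 1587]


def serial_diff(values, lag):
    """Subtract from each value the value `lag` buckets earlier."""
    return [None if i < lag else values[i] - values[i - lag] for i in range(len(values))]


diff_bucket = serial_diff(MONTHLY_MAX, lag=2)
print(diff_bucket)
```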
