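ES bucket aggregations: counting products per tag, restricting to documents whose field contains a keyword, grouping, averaging the price per tag, sorting, and grouping by price ranges
1. First analysis requirement: count the number of products under each tag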
GET /ecommerce/product/_search
{"aggs": {"group_by_tags": {"terms": { "field": "tags" }}}
}
Running this request returns the following error:
{"error": {"root_cause": [{"type": "illegal_argument_exception","reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."}],"type": "search_phase_execution_exception","reason": "all shards failed","phase": "query","grouped": true,"failed_shards": [{"shard": 0,"index": "ecommerce","node": "urqovJ9yQPCO6fNM70Lc8w","reason": {"type": "illegal_argument_exception","reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."}}],"caused_by": {"type": "illegal_argument_exception","reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [tags] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory."}},"status": 400
}
The error means that fielddata must be enabled (set to true) on the text field tags before that field can be used in an aggregation:
PUT /ecommerce/_mapping/product
{"properties": {"tags": {"type": "text","fielddata": true}}
}
The mapping update is acknowledged:
{"acknowledged": true
}
Then run the aggregation again:
GET /ecommerce/product/_search
{"aggs": {"group_by_tags": {"terms": {"field": "tags"}}}
}
Run it and look at the end of the response:
"aggregations": {"group_by_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "fangzhu","doc_count": 2},{"key": "meibai","doc_count": 2},{"key": "qingxin","doc_count": 1}]}
}
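The documents have been grouped into buckets by the values in tags. Each bucket's doc_count is the number of products that carry that tag, so a product with two tags is counted in two buckets.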
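Adding "size": 0 leaves the hits array empty, so the response carries only the aggregation result:
GET /ecommerce/product/_search
{"size": 0,"aggs": {"group_by_tags": {"terms": { "field": "tags" }}}
}
The response is: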
{"took": 20,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 4,"max_score": 0,"hits": []},"aggregations": {"group_by_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "fangzhu","doc_count": 2},{"key": "meibai","doc_count": 2},{"key": "qingxin","doc_count": 1}]}}
}
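To make the bucketing concrete, here is a minimal Python sketch that reproduces the counts above. The four products are an assumption: their prices and tags are inferred from the responses in this article, and the name PRODUCTS is invented for illustration. The sketch only mimics what a terms aggregation computes; it is not how Elasticsearch implements it.

```python
from collections import Counter

# Hypothetical sample data, inferred from the responses shown in this article.
PRODUCTS = [
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 25, "tags": ["fangzhu"]},
    {"price": 40, "tags": ["qingxin"]},
    {"price": 50, "tags": ["meibai"]},
]


def count_by_tag(products):
    """Count how many products carry each tag (one bucket per distinct tag)."""
    counts = Counter()
    for product in products:
        # A product with several tags is counted once in each of its buckets.
        counts.update(set(product["tags"]))
    return dict(counts)


print(count_by_tag(PRODUCTS))
```

The bucket counts add up to 5 although hits.total is 4, because one product carries two tags.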
2. Second aggregation requirement: for products whose name contains yagao, count the number of products under each tag
GET /ecommerce/product/_search
{"size": 0,"query": {"match": {"name": "yagao"}},"aggs": {"all_tags": {"terms": {"field": "tags"}}}
}
The result is:
{"took": 6,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 4,"max_score": 0,"hits": []},"aggregations": {"all_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "fangzhu","doc_count": 2},{"key": "meibai","doc_count": 2},{"key": "qingxin","doc_count": 1}]}}
}
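The query restricts which documents reach the aggregation: only matching products are bucketed. A rough Python sketch of that order of operations follows. The product names are invented, the prices and tags are inferred from the responses in this article, and the whitespace split is only a crude stand-in for the analysis a match query performs.

```python
from collections import Counter

# Hypothetical sample data: names are invented; prices and tags are inferred
# from the responses shown in this article.
PRODUCTS = [
    {"name": "alpha yagao", "price": 30, "tags": ["meibai", "fangzhu"]},
    {"name": "beta yagao", "price": 25, "tags": ["fangzhu"]},
    {"name": "gamma yagao", "price": 40, "tags": ["qingxin"]},
    {"name": "delta yagao", "price": 50, "tags": ["meibai"]},
]


def count_by_tag_where_name_has(products, word):
    """Keep products whose name contains `word` as a token, then count per tag."""
    counts = Counter()
    for product in products:
        # Crude stand-in for a match query: lowercase and split on whitespace.
        if word.lower() in product["name"].lower().split():
            counts.update(set(product["tags"]))
    return dict(counts)


print(count_by_tag_where_name_has(PRODUCTS, "yagao"))
print(count_by_tag_where_name_has(PRODUCTS, "alpha"))
```

Because every sample product name contains yagao, the first call gives the same counts as the unfiltered aggregation, which matches the response above. The second call shows the buckets shrinking when fewer documents match.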
3. Third aggregation requirement: group first, then compute a metric within each group, here the average price of the products under each tag
GET /ecommerce/product/_search
{"size": 0,"aggs" : {"group_by_tags" : {"terms" : { "field" : "tags" },"aggs" : {"avg_price" : {"avg" : { "field" : "price" }}}}}
}
{"took": 8,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 4,"max_score": 0,"hits": []},"aggregations": {"group_by_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "fangzhu","doc_count": 2,"avg_price": {"value": 27.5 }},{"key": "meibai","doc_count": 2,"avg_price": {"value": 40 }},{"key": "qingxin","doc_count": 1,"avg_price": {"value": 40 }}]}}
}
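The nested aggs block runs once per bucket: the outer terms aggregation decides which products belong to a tag, and the inner avg aggregation is computed over just those products. A minimal Python sketch of the same two-step computation, using hypothetical sample data inferred from the responses in this article:

```python
from collections import defaultdict

# Hypothetical sample data, inferred from the responses shown in this article.
PRODUCTS = [
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 25, "tags": ["fangzhu"]},
    {"price": 40, "tags": ["qingxin"]},
    {"price": 50, "tags": ["meibai"]},
]


def avg_price_by_tag(products):
    """Group prices by tag, then average the prices inside each group."""
    prices = defaultdict(list)
    for product in products:
        for tag in set(product["tags"]):
            prices[tag].append(product["price"])
    return {tag: sum(values) / len(values) for tag, values in prices.items()}


print(avg_price_by_tag(PRODUCTS))
```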
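4. Fourth analysis requirement: compute the average price of the products under each tag, and sort the tags by average price in descending order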
GET /ecommerce/product/_search
{"size": 0,"aggs" : {"all_tags" : {"terms" : { "field" : "tags", "order": { "avg_price": "desc" } },"aggs" : {"avg_price" : {"avg" : { "field" : "price" }}}}}
}
The clause below groups by tags and orders the buckets by the value of their avg_price sub-aggregation, highest first:
"terms" : { "field" : "tags", "order": { "avg_price": "desc" } }
The result of the request above is:
{"took": 3,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 4,"max_score": 0,"hits": []},"aggregations": {"all_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "meibai","doc_count": 2,"avg_price": {"value": 40 }},{"key": "qingxin","doc_count": 1,"avg_price": {"value": 40 }},{"key": "fangzhu","doc_count": 2,"avg_price": {"value": 27.5 }}]}}
}
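Ordering by a sub-aggregation amounts to sorting the buckets on a computed value. The sketch below applies that to the three buckets from the average-price response in section 3, with avg_price.value flattened to a plain number for brevity. The function name is invented, and the tie handling shown is Python's, not a statement about how Elasticsearch breaks ties.

```python
# Buckets taken from the average-price response in section 3 of this article,
# with "avg_price": {"value": x} flattened to "avg_price": x.
BUCKETS = [
    {"key": "fangzhu", "doc_count": 2, "avg_price": 27.5},
    {"key": "meibai", "doc_count": 2, "avg_price": 40.0},
    {"key": "qingxin", "doc_count": 1, "avg_price": 40.0},
]


def order_by_avg_price_desc(buckets):
    """Return the buckets sorted by average price, highest first."""
    # sorted() is stable, so buckets with equal averages keep their input order.
    return sorted(buckets, key=lambda bucket: bucket["avg_price"], reverse=True)


print([bucket["key"] for bucket in order_by_avg_price_desc(BUCKETS)])
```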
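5. Fifth analysis requirement: group the products by the specified price ranges, then group by tag within each range, and finally compute the average price of each group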
GET /ecommerce/product/_search
{"size": 0,"aggs": {"group_by_price": {"range": {"field": "price","ranges": [{"from": 0,"to": 20},{"from": 20,"to": 40},{"from": 40,"to": 50}]},"aggs": {"group_by_tags": {"terms": {"field": "tags"},"aggs": {"average_price": {"avg": {"field": "price"}}}}}}}
}
The final result:
{"took": 61,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 4,"max_score": 0,"hits": []},"aggregations": {"group_by_price": {"buckets": [{"key": "0.0-20.0","from": 0,"to": 20,"doc_count": 0,"group_by_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [] }},{"key": "20.0-40.0","from": 20,"to": 40,"doc_count": 2,"group_by_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [ { "key": "fangzhu", "doc_count": 2, "average_price": { "value": 27.5 } }, { "key": "meibai", "doc_count": 1, "average_price": { "value": 30 } } ] }},{"key": "40.0-50.0","from": 40,"to": 50,"doc_count": 1,"group_by_tags": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [ { "key": "qingxin", "doc_count": 1, "average_price": { "value": 40 } } ] }}]}}
}
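In a range aggregation each bucket includes its from value and excludes its to value. That is why the three range buckets hold only 3 documents in total while hits.total is 4: a product priced at exactly 50 falls outside the last range (40 to 50). The Python sketch below reproduces the three-level nesting under that rule; the sample data is hypothetical and inferred from the responses in this article, and the bucket labels are simplified.

```python
from collections import defaultdict

# Hypothetical sample data, inferred from the responses shown in this article.
PRODUCTS = [
    {"price": 30, "tags": ["meibai", "fangzhu"]},
    {"price": 25, "tags": ["fangzhu"]},
    {"price": 40, "tags": ["qingxin"]},
    {"price": 50, "tags": ["meibai"]},
]
RANGES = [(0, 20), (20, 40), (40, 50)]


def avg_price_by_range_and_tag(products, ranges):
    """Bucket by price range, then by tag, then average the price per tag."""
    result = {}
    for low, high in ranges:
        per_tag = defaultdict(list)
        for product in products:
            # Range buckets include `from` (low) and exclude `to` (high).
            if low <= product["price"] < high:
                for tag in set(product["tags"]):
                    per_tag[tag].append(product["price"])
        result[f"{low}-{high}"] = {
            tag: sum(values) / len(values) for tag, values in per_tag.items()
        }
    return result


print(avg_price_by_range_and_tag(PRODUCTS, RANGES))
```

To include the product priced at 50, the last range would need a larger to value, or no to at all.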