With https://github.com/taowen/es-monitor you can query Elasticsearch using SQL. This part covers the simplest aggregation queries.

COUNT(*)

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select count(*) from quote
EOF

{"count(*)": 20994400}

Elasticsearch

{"aggs": {}, "size": 0
}

{"hits": {"hits": [], "total": 20994400, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 26, "timed_out": false
}

This is not really an aggregation: the query only reads the total hit count, i.e. the number of documents that satisfy the filter.
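A minimal Python sketch of that mapping, not es-monitor's actual code: the helper names `count_star_request` and `extract_count_star` are made up for illustration, and the response is the one printed above. It assumes the response format used in this article, where `hits.total` is a plain integer (newer Elasticsearch versions return an object there instead).

```python
import json

def count_star_request():
    # Hypothetical helper: the body shown above for SELECT COUNT(*),
    # with no aggregations and no hits returned.
    return {"aggs": {}, "size": 0}

def extract_count_star(response):
    # COUNT(*) is simply the total hit count of the search.
    # Assumes hits.total is an integer, as in the response above.
    return {"count(*)": response["hits"]["total"]}

# The response printed in the article, parsed from JSON.
response = json.loads(
    '{"hits": {"hits": [], "total": 20994400, "max_score": 0.0}, '
    '"_shards": {"successful": 1, "failed": 0, "total": 1}, '
    '"took": 26, "timed_out": false}'
)

result = extract_count_star(response)
print(json.dumps(count_star_request()))
print(json.dumps(result))
```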

COUNT(ipo_year)

The difference from COUNT(*) is that COUNT(ipo_year) counts a document only if the field actually has a value.

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select count(ipo_year) from symbol
EOF

{"count(ipo_year)": 2898}

Elasticsearch

{"aggs": {"count(ipo_year)": {"value_count": {"field": "ipo_year"}}}, "size": 0
}

{"hits": {"hits": [], "total": 6714, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 55, "aggregations": {"count(ipo_year)": {"value": 2898}}, "timed_out": false
}

Profile

[{"query": [{"query_type": "MatchAllDocsQuery","lucene": "*:*","time": "0.3204170000ms","breakdown": {"score": 0,"create_weight": 10688,"next_doc": 278660,"match": 0,"build_scorer": 31069,"advance": 0}}],"rewrite_time": 2279,"collector": [{"name": "MultiCollector","reason": "search_multi","time": "2.957183000ms","children": [{"name": "TotalHitCountCollector","reason": "search_count","time": "0.2319240000ms"},{"name": "ValueCountAggregator: [count(ipo_year)]","reason": "aggregation","time": "1.999916000ms"}]}]}
]

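This is our first aggregation example. The profile output shows how it is implemented: while documents are being collected, a ValueCountAggregator runs alongside the hit counter and counts the documents in which the field is non-empty.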
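The difference between the two counts can be illustrated in plain Python. This is only a sketch of the semantics on a handful of invented documents, assuming a single-valued field; the helper names and the sample data are hypothetical and nothing here talks to Elasticsearch.

```python
# Invented sample documents; only two of them carry a usable ipo_year.
docs = [
    {"symbol": "AAA", "ipo_year": 1998},
    {"symbol": "BBB"},                    # field missing
    {"symbol": "CCC", "ipo_year": None},  # field present but null
    {"symbol": "DDD", "ipo_year": 2004},
]

def count_star(docs):
    # COUNT(*): every matching document counts.
    return len(docs)

def value_count(docs, field):
    # COUNT(field): a document counts only when the field has a value.
    return sum(1 for d in docs if d.get(field) is not None)

def value_count_request(field):
    # Same shape as the request body shown above.
    name = "count(%s)" % field
    return {"aggs": {name: {"value_count": {"field": field}}}, "size": 0}

print(count_star(docs), value_count(docs, "ipo_year"))
print(value_count_request("ipo_year"))
```

In the article's data the same gap shows up as 6714 total hits against a value count of 2898.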

COUNT(DISTINCT ipo_year)

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select count(distinct ipo_year) from symbol
EOF

{"count(distinct ipo_year)": 39}

Elasticsearch

{"aggs": {"count(distinct ipo_year)": {"cardinality": {"field": "ipo_year"}}}, "size": 0
}

{"hits": {"hits": [], "total": 6714, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 24, "aggregations": {"count(distinct ipo_year)": {"value": 39}}, "timed_out": false
}

Profile

[{"query": [{"query_type": "MatchAllDocsQuery","lucene": "*:*","time": "0.2033600000ms","breakdown": {"score": 0,"create_weight": 7501,"next_doc": 162905,"match": 0,"build_scorer": 32954,"advance": 0}}],"rewrite_time": 2300,"collector": [{"name": "MultiCollector","reason": "search_multi","time": "2.438386000ms","children": [{"name": "TotalHitCountCollector","reason": "search_count","time": "0.2240230000ms"},{"name": "CardinalityAggregator: [count(distinct ipo_year)]","reason": "aggregation","time": "1.471620000ms"}]}]}
]

In this example the ValueCountAggregator is replaced by a CardinalityAggregator.
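As a rough illustration of what COUNT(DISTINCT ...) means, the sketch below computes an exact distinct count over invented documents. Note the swap: Elasticsearch's cardinality aggregation is an approximate count (based on HyperLogLog++), whereas this sketch uses an exact Python set, so it shows the semantics rather than the algorithm. The helper names and data are hypothetical.

```python
# Invented sample documents with a repeated ipo_year and a missing one.
docs = [
    {"symbol": "AAA", "ipo_year": 1998},
    {"symbol": "BBB", "ipo_year": 1998},
    {"symbol": "CCC", "ipo_year": 2004},
    {"symbol": "DDD"},
]

def count_distinct(docs, field):
    # Exact distinct count; documents without a value are ignored.
    return len({d[field] for d in docs if d.get(field) is not None})

def cardinality_request(field):
    # Same shape as the request body shown above.
    name = "count(distinct %s)" % field
    return {"aggs": {name: {"cardinality": {"field": field}}}, "size": 0}

print(count_distinct(docs, "ipo_year"))
print(cardinality_request("ipo_year"))
```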

SUM(market_cap)

The simple aggregations MIN/MAX/AVG/SUM are supported as well.

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select sum(market_cap) from symbol
EOF

{"sum(market_cap)": 11454155180142.0}

Elasticsearch

{"aggs": {"sum(market_cap)": {"sum": {"field": "market_cap"}}}, "size": 0
}

{"hits": {"hits": [], "total": 6714, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 15, "aggregations": {"sum(market_cap)": {"value": 11454155180142.0}}, "timed_out": false
}

Profile

[{"query": [{"query_type": "MatchAllDocsQuery","lucene": "*:*","time": "0.2026870000ms","breakdown": {"score": 0,"create_weight": 8097,"next_doc": 163069,"match": 0,"build_scorer": 31521,"advance": 0}}],"rewrite_time": 2151,"collector": [{"name": "MultiCollector","reason": "search_multi","time": "2.461247000ms","children": [{"name": "TotalHitCountCollector","reason": "search_count","time": "0.3302140000ms"},{"name": "SumAggregator: [sum(market_cap)]","reason": "aggregation","time": "1.102363000ms"}]}]}
]
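The SUM example generalizes to the other simple metrics because Elasticsearch has `sum`, `min`, `max` and `avg` aggregations with the same request shape. The sketch below builds such a body and reads the value back from the response printed above; `metric_request` and `extract_metric` are hypothetical helpers and not necessarily how es-monitor does the translation.

```python
import json

SUPPORTED = {"sum", "min", "max", "avg"}

def metric_request(func, field):
    # Build the body for a single-value metric aggregation,
    # named after the SQL expression as in the examples above.
    if func not in SUPPORTED:
        raise ValueError("unsupported metric: %s" % func)
    name = "%s(%s)" % (func, field)
    return {"aggs": {name: {func: {"field": field}}}, "size": 0}

def extract_metric(response, func, field):
    # Read the single value of the named aggregation.
    name = "%s(%s)" % (func, field)
    return {name: response["aggregations"][name]["value"]}

# The SUM response printed in the article.
response = json.loads(
    '{"hits": {"hits": [], "total": 6714, "max_score": 0.0}, '
    '"_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 15, '
    '"aggregations": {"sum(market_cap)": {"value": 11454155180142.0}}, '
    '"timed_out": false}'
)

sum_body = metric_request("sum", "market_cap")
sum_result = extract_metric(response, "sum", "market_cap")
print(json.dumps(sum_body))
print(json.dumps(metric_request("avg", "market_cap")))
print(sum_result)
```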

Filter + aggregation

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select sum(market_cap) from symbol where ipo_year=1998
EOF

{"sum(market_cap)": 107049150786.0}

Elasticsearch

{"query": {"term": {"ipo_year": 1998}}, "aggs": {"sum(market_cap)": {"sum": {"field": "market_cap"}}}, "size": 0
}

{"hits": {"hits": [], "total": 56, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 11, "aggregations": {"sum(market_cap)": {"value": 107049150786.0}}, "timed_out": false
}

Profile

[{"query": [{"query_type": "TermQuery","lucene": "ipo_year:`N","time": "0.4526400000ms","breakdown": {"score": 0,"create_weight": 220579,"next_doc": 159412,"match": 0,"build_scorer": 72649,"advance": 0}}],"rewrite_time": 3750,"collector": [{"name": "MultiCollector","reason": "search_multi","time": "0.2203470000ms","children": [{"name": "TotalHitCountCollector","reason": "search_count","time": "0.009478000000ms"},{"name": "SumAggregator: [sum(market_cap)]","reason": "aggregation","time": "0.1557820000ms"}]}]}
]

The query filters the documents first, and the aggs are then computed only over the documents that matched.
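The filter-then-aggregate order can be mimicked in plain Python. This is a sketch on invented documents with hypothetical helper names; it only mirrors the two stages (term filter, then sum) and the request shape shown above.

```python
# Invented sample documents; two of them have ipo_year 1998.
docs = [
    {"symbol": "AAA", "ipo_year": 1998, "market_cap": 100.0},
    {"symbol": "BBB", "ipo_year": 2004, "market_cap": 999.0},
    {"symbol": "CCC", "ipo_year": 1998, "market_cap": 250.0},
]

def filtered_sum(docs, filter_field, filter_value, sum_field):
    # Stage 1: the query keeps only the matching documents.
    matched = [d for d in docs if d.get(filter_field) == filter_value]
    # Stage 2: the aggregation runs over the matches only.
    return len(matched), sum(d[sum_field] for d in matched)

def filtered_sum_request(filter_field, filter_value, sum_field):
    # Same shape as the request body shown above: a term query plus a sum agg.
    name = "sum(%s)" % sum_field
    return {
        "query": {"term": {filter_field: filter_value}},
        "aggs": {name: {"sum": {"field": sum_field}}},
        "size": 0,
    }

total_hits, total = filtered_sum(docs, "ipo_year", 1998, "market_cap")
print(total_hits, total)
print(filtered_sum_request("ipo_year", 1998, "market_cap"))
```

The same pattern is visible in the article's response: `hits.total` drops to 56 and the sum covers only those 56 documents.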

Original article: https://segmentfault.com/a/1190000003502849

