With https://github.com/taowen/es-monitor you can query Elasticsearch using SQL. Today we need to run some of the simplest aggregation queries.

COUNT(*)

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select count(*) from quote
EOF
{"count(*)": 20994400}

Elasticsearch

{"aggs": {}, "size": 0
}
{"hits": {"hits": [], "total": 20994400, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 26, "timed_out": false
}

This one is not really an aggregation: it simply reads the total hits count, i.e. the number of documents that satisfy the filter.
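As a minimal sketch of that mapping in Python: the request asks for zero hits and no aggregations, and the count is read from `hits.total`. The helper names `count_star_body` and `total_hits` are hypothetical and not part of es-monitor; the sample response is the one shown above. In Elasticsearch 7 and later, `hits.total` is an object with a `value` key rather than a bare integer, so the helper accepts both shapes.

```python
import json

def count_star_body():
    """Request body for SELECT COUNT(*): return no hits, run no aggregations."""
    return {"aggs": {}, "size": 0}

def total_hits(response):
    """Read the total hit count from a search response."""
    total = response["hits"]["total"]
    # Newer Elasticsearch versions wrap the number: {"value": N, "relation": "eq"}
    if isinstance(total, dict):
        return total["value"]
    return total

# The response printed in the example above
sample = json.loads(
    '{"hits": {"hits": [], "total": 20994400, "max_score": 0.0}, '
    '"_shards": {"successful": 1, "failed": 0, "total": 1}, '
    '"took": 26, "timed_out": false}'
)
result = {"count(*)": total_hits(sample)}
print(result)  # {'count(*)': 20994400}
```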

COUNT(ipo_year)

The difference from COUNT(*) is that COUNT(ipo_year) only counts a document when the field actually has a value.

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select count(ipo_year) from symbol
EOF
{"count(ipo_year)": 2898}

Elasticsearch

{"aggs": {"count(ipo_year)": {"value_count": {"field": "ipo_year"}}}, "size": 0
}
{"hits": {"hits": [], "total": 6714, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 55, "aggregations": {"count(ipo_year)": {"value": 2898}}, "timed_out": false
}

Profile

[{"query": [{"query_type": "MatchAllDocsQuery","lucene": "*:*","time": "0.3204170000ms","breakdown": {"score": 0,"create_weight": 10688,"next_doc": 278660,"match": 0,"build_scorer": 31069,"advance": 0}}],"rewrite_time": 2279,"collector": [{"name": "MultiCollector","reason": "search_multi","time": "2.957183000ms","children": [{"name": "TotalHitCountCollector","reason": "search_count","time": "0.2319240000ms"},{"name": "ValueCountAggregator: [count(ipo_year)]","reason": "aggregation","time": "1.999916000ms"}]}]}
]

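This is our first real aggregation. The profile output shows how it is implemented: while documents are being collected, a ValueCountAggregator runs alongside the hit counter and counts the field's values. For a single-valued field this equals the number of documents in which the field is non-null (2898 out of 6714 here).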
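To make the difference from COUNT(*) concrete, here is a small in-memory simulation in Python. It is not how Elasticsearch is implemented, and the three sample documents are made up for illustration. It only mimics the counting rule: every document adds to the hit count, but only values present in the field add to the value count.

```python
def simulate_counts(docs, field):
    """Return (total_hits, value_count) for a list of dict documents."""
    total_hits = 0
    value_count = 0
    for doc in docs:
        total_hits += 1                  # what COUNT(*) reports
        value = doc.get(field)
        if value is None:
            continue                     # missing or null: contributes nothing
        if isinstance(value, list):
            value_count += len(value)    # multi-valued field: every value counts
        else:
            value_count += 1
    return total_hits, value_count

# Hypothetical sample documents
docs = [
    {"symbol": "AAA", "ipo_year": 1998},
    {"symbol": "BBB"},                     # no ipo_year at all
    {"symbol": "CCC", "ipo_year": None},   # explicit null
]
print(simulate_counts(docs, "ipo_year"))  # (3, 1)
```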

COUNT(DISTINCT ipo_year)

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select count(distinct ipo_year) from symbol
EOF
{"count(distinct ipo_year)": 39}

Elasticsearch

{"aggs": {"count(distinct ipo_year)": {"cardinality": {"field": "ipo_year"}}}, "size": 0
}
{"hits": {"hits": [], "total": 6714, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 24, "aggregations": {"count(distinct ipo_year)": {"value": 39}}, "timed_out": false
}

Profile

[{"query": [{"query_type": "MatchAllDocsQuery","lucene": "*:*","time": "0.2033600000ms","breakdown": {"score": 0,"create_weight": 7501,"next_doc": 162905,"match": 0,"build_scorer": 32954,"advance": 0}}],"rewrite_time": 2300,"collector": [{"name": "MultiCollector","reason": "search_multi","time": "2.438386000ms","children": [{"name": "TotalHitCountCollector","reason": "search_count","time": "0.2240230000ms"},{"name": "CardinalityAggregator: [count(distinct ipo_year)]","reason": "aggregation","time": "1.471620000ms"}]}]}
]

In this example the ValueCountAggregator is replaced by a CardinalityAggregator.
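One point worth knowing about the `cardinality` aggregation: it returns an approximate count (it is based on the HyperLogLog++ algorithm), and its optional `precision_threshold` setting trades memory for accuracy. With only 39 distinct years, the result here can be expected to be close to exact. Below is a minimal Python sketch of building such a request body; `cardinality_body` is a hypothetical helper, not part of es-monitor, and the threshold value in the usage line is only an example.

```python
def cardinality_body(field, precision_threshold=None):
    """Build a search body for SELECT COUNT(DISTINCT field)."""
    name = "count(distinct %s)" % field
    agg = {"field": field}
    if precision_threshold is not None:
        # Optional accuracy/memory knob of the cardinality aggregation
        agg["precision_threshold"] = precision_threshold
    return {"aggs": {name: {"cardinality": agg}}, "size": 0}

print(cardinality_body("ipo_year"))
print(cardinality_body("ipo_year", precision_threshold=100))
```

Without the threshold argument, the body is the same as the request shown above.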

SUM(market_cap)

The simple MIN/MAX/AVG/SUM aggregations are supported as well.

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select sum(market_cap) from symbol
EOF
{"sum(market_cap)": 11454155180142.0}

Elasticsearch

{"aggs": {"sum(market_cap)": {"sum": {"field": "market_cap"}}}, "size": 0
}
{"hits": {"hits": [], "total": 6714, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 15, "aggregations": {"sum(market_cap)": {"value": 11454155180142.0}}, "timed_out": false
}

Profile

[{"query": [{"query_type": "MatchAllDocsQuery","lucene": "*:*","time": "0.2026870000ms","breakdown": {"score": 0,"create_weight": 8097,"next_doc": 163069,"match": 0,"build_scorer": 31521,"advance": 0}}],"rewrite_time": 2151,"collector": [{"name": "MultiCollector","reason": "search_multi","time": "2.461247000ms","children": [{"name": "TotalHitCountCollector","reason": "search_count","time": "0.3302140000ms"},{"name": "SumAggregator: [sum(market_cap)]","reason": "aggregation","time": "1.102363000ms"}]}]}
]
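The examples so far follow one pattern: each SQL metric function becomes a single-field metric aggregation named after the SQL expression. The Python sketch below captures that pattern. It is an illustration written for this article, not es-monitor's actual translator; `metric_to_body` and `METRIC_AGGS` are hypothetical names, and COUNT(*) is deliberately left out because it needs no aggregation at all.

```python
import re

# SQL function name -> Elasticsearch metric aggregation type
METRIC_AGGS = {
    "count": "value_count",
    "min": "min",
    "max": "max",
    "avg": "avg",
    "sum": "sum",
}

# Matches func(field) and func(distinct field)
_CALL = re.compile(r"^\s*(\w+)\s*\(\s*(distinct\s+)?(\w+)\s*\)\s*$", re.IGNORECASE)

def metric_to_body(expr):
    """Translate one SQL metric expression into a search request body."""
    match = _CALL.match(expr)
    if match is None:
        raise ValueError("unsupported expression: %r" % expr)
    func = match.group(1).lower()
    distinct = match.group(2)
    field = match.group(3)
    if distinct:
        if func != "count":
            raise ValueError("DISTINCT is only handled for COUNT in this sketch")
        agg_type = "cardinality"
    elif func in METRIC_AGGS:
        agg_type = METRIC_AGGS[func]
    else:
        raise ValueError("unsupported function: %r" % func)
    # The aggregation is named after the SQL expression, as in the examples
    return {"aggs": {expr: {agg_type: {"field": field}}}, "size": 0}

print(metric_to_body("sum(market_cap)"))
print(metric_to_body("count(distinct ipo_year)"))
```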

Filter + aggregation

SQL

$ cat << EOF | ./es_query.py http://127.0.0.1:9200
select sum(market_cap) from symbol where ipo_year=1998
EOF
{"sum(market_cap)": 107049150786.0}

Elasticsearch

{"query": {"term": {"ipo_year": 1998}}, "aggs": {"sum(market_cap)": {"sum": {"field": "market_cap"}}}, "size": 0
}
{"hits": {"hits": [], "total": 56, "max_score": 0.0}, "_shards": {"successful": 1, "failed": 0, "total": 1}, "took": 11, "aggregations": {"sum(market_cap)": {"value": 107049150786.0}}, "timed_out": false
}

Profile

[{"query": [{"query_type": "TermQuery","lucene": "ipo_year:`N","time": "0.4526400000ms","breakdown": {"score": 0,"create_weight": 220579,"next_doc": 159412,"match": 0,"build_scorer": 72649,"advance": 0}}],"rewrite_time": 3750,"collector": [{"name": "MultiCollector","reason": "search_multi","time": "0.2203470000ms","children": [{"name": "TotalHitCountCollector","reason": "search_count","time": "0.009478000000ms"},{"name": "SumAggregator: [sum(market_cap)]","reason": "aggregation","time": "0.1557820000ms"}]}]}
]

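The query filters the documents first, and the aggs are then computed only over the documents that matched (56 here, instead of all 6714).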
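This order of operations can be pictured with a small in-memory simulation that mirrors the collector tree in the profile output: the query selects the matching documents, and each match is handed to both a hit counter and a sum aggregator. This is an illustrative Python sketch rather than Elasticsearch's actual code, and the sample documents are invented.

```python
def search(docs, term, agg_field):
    """Filter docs by an exact-match term, then aggregate only the matches."""
    field, wanted = term
    total_hits = 0     # plays the role of TotalHitCountCollector
    total_sum = 0.0    # plays the role of SumAggregator
    for doc in docs:
        if doc.get(field) != wanted:
            continue   # rejected by the query, never reaches the collectors
        total_hits += 1
        value = doc.get(agg_field)
        if value is not None:
            total_sum += value
    return {
        "hits": {"total": total_hits},
        "aggregations": {"sum(%s)" % agg_field: {"value": total_sum}},
    }

# Hypothetical sample documents
docs = [
    {"symbol": "AAA", "ipo_year": 1998, "market_cap": 100.0},
    {"symbol": "BBB", "ipo_year": 2004, "market_cap": 250.0},
    {"symbol": "CCC", "ipo_year": 1998, "market_cap": 50.0},
]
result = search(docs, ("ipo_year", 1998), "market_cap")
print(result)
# {'hits': {'total': 2}, 'aggregations': {'sum(market_cap)': {'value': 150.0}}}
```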

Original article: https://segmentfault.com/a/1190000003502849
