Understanding GROUP BY in Depth [Notes]

Search the web for GROUP BY and you get a pile of results that all say the same thing.

……

This post walks through the uses of GROUP BY, from the basics to more advanced cases.

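1. What GROUP BY does

Put simply, GROUP BY splits rows into groups. It is used together with aggregate functions such as SUM, COUNT, AVG, MAX, and MIN.

Example use case: find the merchant accounts that had transactions today in the transaction table.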
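To make "group, then aggregate" concrete, here is a minimal sketch that runs a GROUP BY query from Python. It uses an in-memory SQLite database as a stand-in for MySQL, and the columns and sample rows of order_amount are assumptions made up for the illustration.

```python
import sqlite3

# In-memory stand-in for the transaction table; schema and rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_amount (id INTEGER PRIMARY KEY, account_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO order_amount (account_id, amount) VALUES (?, ?)",
    [(1, 100), (1, 50), (2, 70), (3, 10), (3, 20), (3, 30)],
)

# One output row per account; the aggregates are computed inside each group.
rows = conn.execute(
    """
    SELECT account_id, COUNT(*) AS cnt, SUM(amount) AS total, MAX(amount) AS biggest
    FROM order_amount
    GROUP BY account_id
    ORDER BY account_id
    """
).fetchall()

for row in rows:
    print(row)
```

Each printed tuple is (account_id, number of orders, total amount, largest single amount).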

2. A simple application of GROUP BY

To find out which accounts are present in the order_amount table (for example, to check for a given account), GROUP BY is enough:

select account_id from order_amount group by account_id;

DISTINCT does the same job:

select distinct account_id from order_amount;

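Used this way, GROUP BY and DISTINCT perform about the same. If you only need to know whether a given account exists, use WHERE with LIMIT 1 instead: the scan stops at the first matching row, which is the most efficient option.

select account_id from order_amount where account_id = xxx limit 1;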
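The three queries above can be tried side by side. This is a small sketch using an in-memory SQLite database in place of MySQL; the sample rows and the helper name account_exists are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_amount (id INTEGER PRIMARY KEY, account_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO order_amount (account_id, amount) VALUES (?, ?)",
    [(1, 100), (2, 70), (1, 50), (3, 10)],
)

# GROUP BY and DISTINCT both collapse the table to one row per account.
by_group = conn.execute(
    "SELECT account_id FROM order_amount GROUP BY account_id ORDER BY account_id"
).fetchall()
by_distinct = conn.execute(
    "SELECT DISTINCT account_id FROM order_amount ORDER BY account_id"
).fetchall()


def account_exists(account_id):
    """Existence check: WHERE + LIMIT 1 can stop at the first matching row."""
    row = conn.execute(
        "SELECT account_id FROM order_amount WHERE account_id = ? LIMIT 1",
        (account_id,),
    ).fetchone()
    return row is not None


print(by_group == by_distinct)
print(account_exists(2), account_exists(99))
```

The existence check takes the account as a bound parameter, so it never needs to build the full list of accounts.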

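Differences between GROUP BY and DISTINCT

Example 1:

select count(distinct account_id) from order_amount;

Example 2:

select count(account_id) from order_amount group by account_id;

If you are curious, run both and compare the results.

The difference:

GROUP BY: group first, then compute. Example 2 returns one count per account.

DISTINCT: remove duplicates first, then compute. Example 1 returns a single number, the count of distinct accounts. DISTINCT only removes duplicates, so it is a poor fit for queries that need conditions or per-group calculations.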
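For readers who do not have a database at hand, the two examples can be run with this sketch. SQLite stands in for MySQL and the sample rows are invented; an ORDER BY is added to Example 2 only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_amount (id INTEGER PRIMARY KEY, account_id INTEGER)")
conn.executemany(
    "INSERT INTO order_amount (account_id) VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,), (3,)],
)

# Example 1: de-duplicate first, then count -> a single row.
distinct_count = conn.execute(
    "SELECT COUNT(DISTINCT account_id) FROM order_amount"
).fetchall()

# Example 2: group first, then count inside each group -> one row per account.
per_group_counts = conn.execute(
    "SELECT COUNT(account_id) FROM order_amount GROUP BY account_id ORDER BY account_id"
).fetchall()

print(distinct_count)
print(per_group_counts)
```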

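A more complex application: pinning down one record with two conditions

Example: for a batch of merchants, fetch the details of the last order that meets given conditions (for instance, the last order of each merchant that took part in a flash sale between 12:00 and 15:00). In the queries below, xxx yyy zzz stands for the filter conditions and xy for the grouping column.

Method 1:

select * from (select * from order_amount where xxx yyy zzz) as sel group by xy

Method 2:

select * from order_amount group by xy having id = (select max(id) from order_amount where xxx yyy zzz)

Note: DISTINCT cannot handle this kind of requirement; all it can do is sit back and watch GROUP BY perform.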

Analysis of Method 1: it relies on a subquery, which exposes both the space and the speed problems of subqueries.

select * from (select * from order_amount where xxx yyy zzz) as sel group by xy

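Advantages:

It gets the job done.

Disadvantages:

1. In a GROUP BY query, every column other than the aggregated columns and the GROUP BY columns has an indeterminate value. By default it is filled from the first row of the group encountered under the index the query chose (for the WHERE or the GROUP BY). With ONLY_FULL_GROUP_BY enabled, which is the default since MySQL 5.7.5, such a query is rejected outright.

2. When the subquery result is very large, the database server's temporary table space can run out, and the excess rows are then lost.

3. The temporary table produced by the subquery has no usable index, so if it is large the outer select is slow as well.

4. When the subquery result is large, building the temporary table also takes a long time.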

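If the subquery data exceeds 1 GB (1 GB is the limit usually seen for subquery temporary tables in MySQL, roughly 5 million rows or more, though the exact figure depends on server configuration), the tail of the result is lost, which shows up as seemingly random missing data.

So ordinary data volumes never fall into this pit, and the data volumes that do fall into it are anything but ordinary!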

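Analysis of Method 2:

select * from order_amount group by xy having id = (select max(id) from order_amount where xxx yyy zzz)

Advantages:

It gets the job done and makes sensible use of the HAVING clause. The result set is small, so there is no risk of filling the temporary table space.

Disadvantages:

1. It is relatively slow.

2. The id compared in HAVING comes from an indeterminate row of each group, and the subquery as written yields a single max(id) rather than one per group, so the query needs care to return the intended row for every merchant.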

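Method 3:

select * from order_amount where id in (select max(id) from order_amount where xxx yyy zzz group by xy)

Advantages:

It gets the job done, the result set is small, there is no risk of filling the temporary table space, and it should be considerably faster than the approaches commonly suggested online.

Disadvantages:

It is not free of drawbacks, but for now it is the best choice.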
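Method 3 can be sketched end to end as follows. SQLite stands in for MySQL; the columns account_id (playing the role of xy) and order_hour (playing the role of the xxx yyy zzz filter), as well as the rows, are assumptions for the illustration. As in the original query, "last order" is taken to mean the largest id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE order_amount (
        id INTEGER PRIMARY KEY,
        account_id INTEGER,   -- plays the role of the grouping column xy
        order_hour INTEGER,   -- hypothetical column used for the time filter
        amount INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO order_amount (id, account_id, order_hour, amount) VALUES (?, ?, ?, ?)",
    [
        (1, 1, 11, 10),  # before the window
        (2, 1, 12, 20),
        (3, 1, 14, 30),  # last in-window order of account 1
        (4, 1, 16, 40),  # after the window
        (5, 2, 13, 50),
        (6, 2, 12, 60),  # last in-window order of account 2 (largest id)
        (7, 3, 18, 70),  # account 3 has nothing in the window
    ],
)

# Inner query: one max(id) per account among the filtered rows.
# Outer query: fetch the full rows for exactly those ids.
last_orders = conn.execute(
    """
    SELECT id, account_id, order_hour, amount
    FROM order_amount
    WHERE id IN (
        SELECT MAX(id)
        FROM order_amount
        WHERE order_hour BETWEEN 12 AND 15
        GROUP BY account_id
    )
    ORDER BY id
    """
).fetchall()

for row in last_orders:
    print(row)
```

Because the outer query looks rows up by primary key, every column in the output comes from the same, well-defined row.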

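More complex requirements

Step 1: locate the ID or index information of the unique record

Requirement 1: between 12:00 and 15:00, the order completed last and, among those, created last:

select max(concat(complete_time,create_time)) from order_amount where xxx yyy zzz group by xy

Requirement 2: between 12:00 and 15:00, the order completed last and, among those, created first:

select max(concat(complete_time,2000000000-create_time)) from order_amount where xxx yyy zzz group by xy

Requirement 3: between 12:00 and 15:00, the order completed first and, among those, created last:

select min(concat(complete_time,2000000000-create_time)) from order_amount where xxx yyy zzz group by xy

Requirement 4: between 12:00 and 15:00, the order completed first and, among those, created first:

select min(concat(complete_time,create_time)) from order_amount where xxx yyy zzz group by xy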

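Step 2: use that unique value to fetch the unique row

The queries above are only examples. The general idea is to use max/min(concat(xxx,yyy,bbb,...)) to find the single record that meets the requirement, where xxx, yyy, and bbb can be columns or expressions such as 2000000000-create_time. The guiding principle is to make max or min pick out exactly the unique value you want. If the data volume is modest, you can instead use

select * from (select * from order_amount where xxx yyy zzz order by complete_time desc, create_time asc) as sel group by xy, zz, dd

which sorts as required in the inner query and lets GROUP BY keep the first row of each group.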
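The two steps can be sketched for Requirement 2 (completed last, and among ties created first). This is an illustration under several assumptions: SQLite stands in for MySQL, a small Python function is registered under the name concat to mimic MySQL's CONCAT, both time columns hold 10-digit Unix timestamps, and the xxx yyy zzz filter is left out. The composite key is compared as a string, so the trick only works when every part has a fixed width, as these timestamps do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: emulate MySQL's CONCAT by joining the arguments as text.
conn.create_function("concat", -1, lambda *parts: "".join(str(p) for p in parts))

conn.execute(
    """
    CREATE TABLE order_amount (
        id INTEGER PRIMARY KEY,
        account_id INTEGER,      -- plays the role of xy
        complete_time INTEGER,   -- Unix timestamp (assumed)
        create_time INTEGER      -- Unix timestamp (assumed)
    )
    """
)
conn.executemany(
    "INSERT INTO order_amount (id, account_id, complete_time, create_time) VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1700000300, 1700000100),
        (2, 1, 1700000300, 1700000050),  # same completion time, created earlier
        (3, 1, 1700000200, 1700000010),
        (4, 2, 1700000400, 1700000000),
        (5, 2, 1700000100, 1700000090),
    ],
)

# Step 1: one composite key per account. Subtracting create_time from a
# constant flips its order, so MAX prefers the earliest creation time.
keys = conn.execute(
    """
    SELECT account_id, MAX(concat(complete_time, 2000000000 - create_time)) AS k
    FROM order_amount
    GROUP BY account_id
    ORDER BY account_id
    """
).fetchall()

# Step 2: use the key to fetch the unique row for each account.
picked_ids = conn.execute(
    """
    SELECT o.id
    FROM order_amount AS o
    JOIN (
        SELECT account_id, MAX(concat(complete_time, 2000000000 - create_time)) AS k
        FROM order_amount
        GROUP BY account_id
    ) AS best
      ON o.account_id = best.account_id
     AND concat(o.complete_time, 2000000000 - o.create_time) = best.k
    ORDER BY o.id
    """
).fetchall()

print(keys)
print(picked_ids)
```

Swapping MAX for MIN, or dropping the subtraction, gives the other three requirements.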

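GROUP BY summary

1. How do you control the columns that are neither aggregated nor in the GROUP BY list?

Solution: use a subquery and sort it as needed. This relies on the outer GROUP BY keeping the first row of the sorted subquery, which the server does not guarantee.

select * from (select * from order_amount where xxx yyy zzz order by complete_time desc, create_time asc) as sel group by xy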

2. GROUP BY also prefers to use an index when one is available.

3. A single GROUP BY can compute several aggregate functions at once and can group by several columns.

select count(amount) as cnt, sum(amount) as total_amount, avg(amount) as avg_amount, max(id) as max_id, min(id) as min_id, xy, za, hs from order_amount group by xy, za, hs
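A reduced version of this query, with two grouping columns instead of three, can be run as below. SQLite stands in for MySQL, and the values stored in xy and za are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_amount (id INTEGER PRIMARY KEY, xy TEXT, za TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO order_amount (xy, za, amount) VALUES (?, ?, ?)",
    [("a", "x", 10), ("a", "x", 30), ("a", "y", 5), ("b", "x", 7)],
)

# Several aggregates in one pass, grouped by the (xy, za) pair.
summary = conn.execute(
    """
    SELECT xy, za,
           COUNT(amount) AS cnt,
           SUM(amount)   AS total_amount,
           AVG(amount)   AS avg_amount,
           MAX(id)       AS max_id,
           MIN(id)       AS min_id
    FROM order_amount
    GROUP BY xy, za
    ORDER BY xy, za
    """
).fetchall()

for row in summary:
    print(row)
```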

4. You can also add WITH ROLLUP to get higher-level subtotals.

select count(amount) as cnt, sum(amount) as total_amount, avg(amount) as avg_amount, max(id) as max_id, min(id) as min_id, xy, za, hs from order_amount group by xy, za, hs with rollup
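To show what the extra rollup rows contain, the sketch below reproduces them by hand. SQLite, used here as a stand-in, has no WITH ROLLUP, so the subtotal and grand-total rows are emulated with UNION ALL; this is not how MySQL implements the feature, only an illustration of the rows it adds. Table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_amount (id INTEGER PRIMARY KEY, xy TEXT, za TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO order_amount (xy, za, amount) VALUES (?, ?, ?)",
    [("a", "x", 10), ("a", "x", 30), ("a", "y", 5), ("b", "x", 7)],
)

# Detail rows per (xy, za), subtotal rows per xy (za is NULL),
# and one grand-total row (both columns NULL).
rollup_rows = conn.execute(
    """
    SELECT xy, za, SUM(amount) FROM order_amount GROUP BY xy, za
    UNION ALL
    SELECT xy, NULL, SUM(amount) FROM order_amount GROUP BY xy
    UNION ALL
    SELECT NULL, NULL, SUM(amount) FROM order_amount
    """
).fetchall()

for row in rollup_rows:
    print(row)
```

In the output, a None in za marks a subtotal for that xy, and a row with None in both columns is the grand total.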

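5. The result of GROUP BY can also be sorted. The sort is not a select condition and does not change which rows the select returns.

select count(amount) as cnt, sum(amount) as total_amount, avg(amount) as avg_amount, max(id) as max_id, min(id) as min_id, xy, za, hs from order_amount group by xy asc, za desc, hs asc

The ASC/DESC modifiers on GROUP BY are a MySQL extension that newer MySQL 8.0 releases no longer accept; on those versions, put the ordering in a separate ORDER BY clause.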
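The portable way to sort grouped output is a separate ORDER BY clause, sketched here with mixed sort directions. SQLite stands in for MySQL and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_amount (id INTEGER PRIMARY KEY, xy TEXT, za TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO order_amount (xy, za, amount) VALUES (?, ?, ?)",
    [("a", "x", 10), ("a", "x", 30), ("a", "y", 5), ("b", "x", 7)],
)

# Grouping decides which rows exist; ORDER BY only decides their order.
ordered = conn.execute(
    """
    SELECT xy, za, SUM(amount) AS total_amount
    FROM order_amount
    GROUP BY xy, za
    ORDER BY xy ASC, za DESC
    """
).fetchall()

for row in ordered:
    print(row)
```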

6. GROUP BY often goes hand in hand with subqueries, so always scrutinize the size and performance of the subquery result.
