07. Practice: the SUM aggregate function

Aggregation questions
1.
Find the total quantity of poster_qty paper ordered in the orders table.
SELECT SUM(poster_qty) AS order_poster_number
FROM orders
# Result:
order_poster_number
723646
2.
Find the total quantity of standard_qty paper ordered in the orders table.
SELECT SUM(standard_qty) AS order_standard_number
FROM orders
# Result:
order_standard_number
1938346
3.
Find the total sales from total_amt_usd in the orders table.
SELECT SUM(total_amt_usd) AS order_total_amt
FROM orders
# Result:
order_total_amt
23141511.83
4.
Find the amount spent on standard and gloss paper for each order in the orders table. The result should be a dollar amount for every order in the table.
SELECT id AS id, standard_amt_usd + gloss_amt_usd AS order_twopart_amt
FROM orders
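# Explanation: no aggregate function is needed here; the two columns are simply added, once per order.
5.
The price/standard_qty of paper varies from one order to the next. Find this ratio across all of the sales in the orders table.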
SELECT SUM(standard_amt_usd)/SUM(standard_qty) AS standard_price_per_unit
FROM orders;
# Result:
standard_price_per_unit
4.9900000000000000
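Questions 4 and 5 differ in whether an aggregate is involved: adding two columns works row by row, while SUM(...)/SUM(...) collapses the whole table into one number. The sketch below illustrates the difference. It assumes Python's built-in sqlite3 module as a stand-in for the course database, and the three-row orders table is invented for illustration.

```python
import sqlite3

# Hypothetical stand-in for the orders table (values are made up).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, standard_qty INTEGER, standard_amt_usd REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 10, 50.0), (2, 20, 90.0), (3, 30, 160.0)],
)

# Ratio of sums: a single value for the whole table.
overall = conn.execute(
    "SELECT SUM(standard_amt_usd) / SUM(standard_qty) FROM orders"
).fetchone()[0]

# Per-row ratio: one value per order, no aggregate function involved.
per_order = conn.execute(
    "SELECT id, standard_amt_usd / standard_qty FROM orders ORDER BY id"
).fetchall()

print(overall)
print(per_order)
```

The ratio of sums weights each order by its quantity, so it generally differs from the average of the per-order ratios.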

11. Practice: the MIN, MAX and AVG aggregate functions

Questions: MIN, MAX and AVERAGE
1.
When was the earliest order placed?
SELECT MIN(occurred_at)
FROM web_events
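# Note: the question asks about orders, but this query reads web_events; use FROM orders to answer it literally.
# Result:
min
2013-12-04T04:18:29.000Z
2.
Run the same query as in the first question, but without an aggregate function.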
SELECT occurred_at
FROM web_events
ORDER BY occurred_at
LIMIT 1
# Result:
occurred_at
2013-12-04T04:18:29.000Z
3.
When did the most recent web_event occur?
SELECT MAX(occurred_at)
FROM web_events
# Result:
max
2017-01-01T23:51:09.000Z
4.
Run the previous question's query another way, without an aggregate function.
SELECT occurred_at
FROM web_events
ORDER BY occurred_at DESC
LIMIT 1
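# Result:
occurred_at
2017-01-01T23:51:09.000Z
5. (This question simply asks for the averages of the amounts and of the quantities.)
Find the mean (AVERAGE) amount spent per order on each paper type, as well as the mean quantity of each paper type purchased per order. The final answer should have 6 values: one average sales amount and one average quantity for each paper type.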
SELECT AVG(standard_qty) mean_standard, AVG(gloss_qty) mean_gloss, AVG(poster_qty) mean_poster, AVG(standard_amt_usd) mean_standard_usd, AVG(gloss_amt_usd) mean_gloss_usd, AVG(poster_amt_usd) mean_poster_usd
FROM orders;
# Result:
mean_standard   mean_gloss  mean_poster mean_standard_usd   mean_gloss_usd  mean_poster_usd
280.4320023148148148    146.6685474537037037    104.6941550925925926    1399.3556915509259259   1098.5474204282407407   850.1165393518518519
6.
You are probably curious about the median total_usd spent across all orders. This concept goes beyond what has been covered so far and is a little more advanced than the basics introduced up to this point, but the answer can be hard-coded as follows.
SELECT *
FROM (SELECT total_amt_usd
      FROM orders
      ORDER BY total_amt_usd
      LIMIT 3457) AS Table1
ORDER BY total_amt_usd DESC
LIMIT 2;
# Result:
total_amt_usd
2483.16
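The hard-coded 3457 in the query above is half of the 6912 orders plus one: the inner query keeps the 3457 smallest totals, and the outer query keeps the two largest of those, which are the two middle rows. The median is the mean of those two values. The sketch below computes the same thing without hard-coding the row count. It assumes Python's built-in sqlite3 as a stand-in database, a non-empty table, and invented sample values; median_total is a hypothetical helper name.

```python
import sqlite3

def median_total(conn):
    """Median of orders.total_amt_usd via ORDER BY plus LIMIT/OFFSET.

    Assumes the orders table is not empty.
    """
    n = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    # Even row count: average the two middle rows. Odd: take the single middle row.
    take = 2 if n % 2 == 0 else 1
    offset = (n - take) // 2
    rows = conn.execute(
        "SELECT total_amt_usd FROM orders "
        "ORDER BY total_amt_usd LIMIT ? OFFSET ?",
        (take, offset),
    ).fetchall()
    return sum(r[0] for r in rows) / len(rows)

# Hypothetical data: six invented order totals.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (total_amt_usd REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [(10.0,), (20.0,), (30.0,), (40.0,), (1000.0,), (5.0,)],
)
print(median_total(conn))
```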

14. Practice: GROUP BY

1.
SELECT a.name a_name, w.occurred_at a_time
FROM accounts a
JOIN web_events w
ON a.id = w.account_id
ORDER BY w.occurred_at
LIMIT 1
2.
SELECT a.name a_name, SUM(o.total_amt_usd) a_total_amt
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a_name
# It is easy to mistakenly write WHERE instead of ON after a JOIN. That is an error worth remembering.
3.
SELECT w.occurred_at a_date,w.channel a_channel,a.name a_name
FROM accounts a
JOIN web_events w
ON a.id = w.account_id
ORDER BY a_date DESC
LIMIT 1
# Result:
a_date                      a_channel           a_name
2017-01-01T23:51:09.000Z    organic             Molina Healthcare
4.
SELECT w.channel, COUNT(*)
FROM web_events w
JOIN accounts a
ON a.id = w.account_id
GROUP BY w.channel
# Result:
channel count
adwords 906
direct  5298
banner  476
facebook    967
organic 952
twitter 474
5.
SELECT a.name
FROM accounts a
JOIN web_events w
ON a.id = w.account_id
ORDER BY w.occurred_at
LIMIT 1
# Result:
name
DISH Network
6.
SELECT a.name a_name, MIN(o.total_amt_usd) AS a_total_amt
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a_name
ORDER BY a_total_amt
# Too many rows are returned to show here.
7.
SELECT r.name r_name,COUNT(s.name) as sale_count
FROM region r
JOIN Sales_reps s
ON r.id = s.region_id
GROUP BY r_name
ORDER BY sale_count
# The region table turns out to contain only 4 distinct regions.

17. Practice: GROUP BY (multiple columns)

1.
SELECT a.name a_name, AVG(o.standard_qty) as stand_number,AVG(o.gloss_qty) as gloss_number, AVG(o.poster_qty) as poster_number
FROM accounts a
JOIN orders o
ON a.id= o.account_id
GROUP BY a_name
2.
SELECT a.name a_name, AVG(o.standard_amt_usd) as stand_amt,AVG(o.gloss_amt_usd) as gloss_amt, AVG(o.poster_amt_usd) as poster_amt
FROM accounts a
JOIN orders o
ON a.id= o.account_id
GROUP BY a_name
3.
SELECT s.name s_name,w.channel channel,COUNT(*) as count
FROM Sales_reps s
JOIN accounts a
ON s.id = a.Sales_rep_id
JOIN web_events w
ON a.id = w.account_id
GROUP BY s_name,channel
ORDER BY s_name,count DESC
4.
SELECT r.name r_name,w.channel channel,COUNT(*) as count
FROM region r
JOIN Sales_reps s
ON r.id = s.region_id
JOIN accounts a
ON s.id = a.Sales_rep_id
JOIN web_events w
ON a.id = w.account_id
GROUP BY r_name,channel
ORDER BY r_name,count DESC
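Grouping by two columns produces one output row per distinct pair of values, which is what queries 3 and 4 above rely on. The sketch below shows this on a tiny example. It assumes Python's built-in sqlite3 as a stand-in database, and the account names and events are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature versions of accounts and web_events.
conn.executescript("""
CREATE TABLE accounts (id INTEGER, name TEXT);
CREATE TABLE web_events (account_id INTEGER, channel TEXT);
INSERT INTO accounts VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO web_events VALUES
    (1, 'direct'), (1, 'direct'), (1, 'facebook'),
    (2, 'direct'), (2, 'twitter'), (2, 'twitter'), (2, 'twitter');
""")

# One row per (account, channel) pair, most used channel first within each account.
rows = conn.execute("""
SELECT a.name, w.channel, COUNT(*) AS n
FROM accounts a
JOIN web_events w ON a.id = w.account_id
GROUP BY a.name, w.channel
ORDER BY a.name, n DESC
""").fetchall()

for row in rows:
    print(row)
```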

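20. Practice: DISTINCT

1. Use DISTINCT to check whether any account is associated with more than one region.
The two queries below return the same number of rows (351), so we know that every account is associated with exactly one region. If any account were associated with more than one region, the first query would return more rows than the second.
SELECT DISTINCT a.id, r.id, a.name, r.name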
FROM accounts a
JOIN sales_reps s
ON s.id = a.sales_rep_id
JOIN region r
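ON r.id = s.region_id;
and
SELECT DISTINCT id, name
FROM accounts;
2. Have any sales reps worked on more than one account?
In fact, all of the sales reps have worked on more than one account. The fewest accounts handled by any sales rep is 3. There are 50 sales reps, and every one of them has more than one account. Using DISTINCT in the second query confirms that the first query includes all of the sales reps.
SELECT s.id, s.name, COUNT(*) num_accounts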
FROM accounts a
JOIN sales_reps s
ON s.id = a.sales_rep_id
GROUP BY s.id, s.name
ORDER BY num_accounts;
and
SELECT DISTINCT id, name
FROM sales_reps;
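The row-count comparison used in question 1 can be automated by wrapping each DISTINCT query in COUNT(*). The sketch below assumes Python's built-in sqlite3 as a stand-in database; the regions, reps and accounts are invented, and the column aliases account_id and region_id are added only so the subquery has unique column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical miniature versions of region, sales_reps and accounts.
conn.executescript("""
CREATE TABLE region (id INTEGER, name TEXT);
CREATE TABLE sales_reps (id INTEGER, name TEXT, region_id INTEGER);
CREATE TABLE accounts (id INTEGER, name TEXT, sales_rep_id INTEGER);
INSERT INTO region VALUES (1, 'West'), (2, 'East');
INSERT INTO sales_reps VALUES (10, 'Rep A', 1), (11, 'Rep B', 2);
INSERT INTO accounts VALUES (100, 'Acme', 10), (101, 'Globex', 10), (102, 'Initech', 11);
""")

# Number of distinct (account, region) pairs reachable through the joins.
pair_count = conn.execute("""
SELECT COUNT(*) FROM (
    SELECT DISTINCT a.id AS account_id, r.id AS region_id
    FROM accounts a
    JOIN sales_reps s ON s.id = a.sales_rep_id
    JOIN region r ON r.id = s.region_id
)
""").fetchone()[0]

# Number of distinct accounts.
account_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT id, name FROM accounts)"
).fetchone()[0]

# Equal counts mean no account maps to more than one region.
print(pair_count, account_count, pair_count == account_count)
```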

23. Practice: HAVING

Questions: HAVING
How many sales reps manage more than 5 accounts?
SELECT s.id, s.name, COUNT(*) num_accounts
FROM accounts a
JOIN sales_reps s
ON s.id = a.sales_rep_id
GROUP BY s.id, s.name
HAVING COUNT(*) >5
ORDER BY num_accounts;
# Result: 34
How many accounts have more than 20 orders?
SELECT a.id, a.name, COUNT(*) num_orders
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING COUNT(*) > 20
ORDER BY num_orders;
# Result: 120
Which account has the most orders?
SELECT a.id, a.name, COUNT(*) num_orders
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY num_orders DESC;
# Result:
id  name    num_orders
3411    Leucadia National   71
How many accounts spent more than 30,000 USD in total across all orders?
SELECT a.id, a.name, SUM(total_amt_usd) total_amt
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING SUM(total_amt_usd) >  30000
ORDER BY total_amt DESC;
# Result: 204
How many accounts spent less than 1,000 USD in total across all orders?
SELECT a.id, a.name, SUM(total_amt_usd) total_amt
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING SUM(total_amt_usd) < 1000
ORDER BY total_amt DESC;
# Result: 3
Which account has spent the most?
SELECT a.id, a.name, SUM(total_amt_usd) total_amt
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING SUM(total_amt_usd) >  30000
ORDER BY total_amt DESC;
# Result:
id  name    total_amt
4211    EOG Resources   382873.30
Which account has spent the least?
SELECT a.id, a.name, SUM(total_amt_usd) total_amt
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING SUM(total_amt_usd) < 1000
ORDER BY total_amt;
# Result:
id  name    total_amt
1901    Nike    390.25
Which accounts used facebook as a channel to contact customers more than 6 times?
SELECT a.id, a.name, COUNT(*) channel_count
FROM accounts a
JOIN orders o
ON a.id = o.account_id
JOIN web_events w
ON w.account_id = a.id
WHERE w.channel = 'facebook'
GROUP BY a.id, a.name
HAVING  COUNT(*) >  6
ORDER BY channel_count DESC;
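# Result: 220
# Note: the extra JOIN with orders pairs every web event with every order of the same account, so these counts are inflated. The official answer further below joins only accounts and web_events.
Which account used facebook most often as a channel?
# Taken from the result of the previous query:
id  name    channel_count
2351    AutoNation  784
Which channel was most frequently used by accounts?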
SELECT a.id, a.name, w.channel,COUNT(*) channel_count
FROM accounts a
JOIN orders o
ON a.id = o.account_id
JOIN web_events w
ON w.account_id = a.id
GROUP BY a.id, a.name, w.channel
ORDER BY a.id, channel_count DESC;
# Result:
id  name    channel channel_count
1001    Walmart direct  616
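The channel queries above that join both orders and web_events to accounts count every combination of an order and a web event for the same account, not the web events themselves. The sketch below demonstrates the effect. It assumes Python's built-in sqlite3 as a stand-in database, with one invented account that has 3 orders and 2 facebook events.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: one account, three orders, two facebook web events.
conn.executescript("""
CREATE TABLE accounts (id INTEGER, name TEXT);
CREATE TABLE orders (id INTEGER, account_id INTEGER);
CREATE TABLE web_events (id INTEGER, account_id INTEGER, channel TEXT);
INSERT INTO accounts VALUES (1, 'Acme');
INSERT INTO orders VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO web_events VALUES (1, 1, 'facebook'), (2, 1, 'facebook');
""")

# Joining orders as well: each event is repeated once per order (3 * 2 rows).
inflated = conn.execute("""
SELECT COUNT(*)
FROM accounts a
JOIN orders o ON a.id = o.account_id
JOIN web_events w ON w.account_id = a.id
WHERE w.channel = 'facebook'
""").fetchone()[0]

# Joining only web_events: one row per event.
actual = conn.execute("""
SELECT COUNT(*)
FROM accounts a
JOIN web_events w ON w.account_id = a.id
WHERE w.channel = 'facebook'
""").fetchone()[0]

print(inflated, actual)  # prints: 6 2
```

Dropping the unneeded join is the simplest fix when only web events are being counted.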
# The official answers follow.
How many sales reps manage more than 5 accounts?
SELECT s.id, s.name, COUNT(*) num_accounts
FROM accounts a
JOIN sales_reps s
ON s.id = a.sales_rep_id
GROUP BY s.id, s.name
HAVING COUNT(*) > 5
ORDER BY num_accounts;
In fact, a SUBQUERY can return this count directly, as shown below. The same logic applies to the other queries, so it is not repeated for them.
SELECT COUNT(*) num_reps_above5
FROM (SELECT s.id, s.name, COUNT(*) num_accounts
      FROM accounts a
      JOIN sales_reps s
      ON s.id = a.sales_rep_id
      GROUP BY s.id, s.name
      HAVING COUNT(*) > 5
      ORDER BY num_accounts) AS Table1;
How many accounts have more than 20 orders?
SELECT a.id, a.name, COUNT(*) num_orders
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING COUNT(*) > 20
ORDER BY num_orders;
Which account has the most orders?
SELECT a.id, a.name, COUNT(*) num_orders
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY num_orders DESC
LIMIT 1;
How many accounts spent more than 30,000 USD in total across all orders?
SELECT a.id, a.name, SUM(o.total_amt_usd) total_spent
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING SUM(o.total_amt_usd) > 30000
ORDER BY total_spent;
How many accounts spent less than 1,000 USD in total across all orders?
SELECT a.id, a.name, SUM(o.total_amt_usd) total_spent
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
HAVING SUM(o.total_amt_usd) < 1000
ORDER BY total_spent;
Which account has spent the most?
SELECT a.id, a.name, SUM(o.total_amt_usd) total_spent
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY total_spent DESC
LIMIT 1;
Which account has spent the least?
SELECT a.id, a.name, SUM(o.total_amt_usd) total_spent
FROM accounts a
JOIN orders o
ON a.id = o.account_id
GROUP BY a.id, a.name
ORDER BY total_spent
LIMIT 1;
Which accounts used facebook as a channel to contact customers more than 6 times?
SELECT a.id, a.name, w.channel, COUNT(*) use_of_channel
FROM accounts a
JOIN web_events w
ON a.id = w.account_id
GROUP BY a.id, a.name, w.channel
HAVING COUNT(*) > 6 AND w.channel = 'facebook'
ORDER BY use_of_channel;
Which account used facebook most often as a channel?
SELECT a.id, a.name, w.channel, COUNT(*) use_of_channel
FROM accounts a
JOIN web_events w
ON a.id = w.account_id
WHERE w.channel = 'facebook'
GROUP BY a.id, a.name, w.channel
ORDER BY use_of_channel DESC
LIMIT 1;
Which channel was most frequently used by accounts?
SELECT a.id, a.name, w.channel, COUNT(*) use_of_channel
FROM accounts a
JOIN web_events w
ON a.id = w.account_id
GROUP BY a.id, a.name, w.channel
ORDER BY use_of_channel DESC
LIMIT 10;

27. Practice: DATE functions

1. Compute the quantity of standard paper sold on each day.
SELECT DATE_TRUNC('day',occurred_at) as day,SUM(standard_qty) as standard_qty_sum
FROM orders
GROUP BY DATE_TRUNC('day',occurred_at)
ORDER BY DATE_TRUNC('day',occurred_at)
2. Use the DATE_PART function to extract a specific part of the date (here, the day of the week).
SELECT DATE_PART('dow',occurred_at) as day_of_week,SUM(total) as total_qty
FROM orders
GROUP BY 1
ORDER BY 2
# Result:
day_of_week total_qty
4   510405
3   511525
2   517357
1   518016
6   521813
5   536776
0   559873
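DATE_TRUNC keeps a timestamp but resets everything below the chosen unit, while DATE_PART pulls out a single component such as the day of the week. The sketch below illustrates both ideas. It assumes Python's built-in sqlite3 as a stand-in database; SQLite has neither function, so strftime is swapped in here ('%Y-%m-01' to mimic truncation to the month, '%w' for the day of week with 0 meaning Sunday, the same convention as 'dow'). The four orders are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (occurred_at TEXT, total INTEGER)")
# Hypothetical orders: 2016-05-01 and 2016-06-05 are Sundays, 2016-05-02 is a Monday.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("2016-05-01 10:00:00", 10),
        ("2016-05-02 09:30:00", 20),
        ("2016-05-02 18:00:00", 5),
        ("2016-06-05 12:00:00", 7),
    ],
)

# Stand-in for DATE_TRUNC('month', occurred_at): every order in a month shares one key.
by_month = conn.execute("""
SELECT strftime('%Y-%m-01', occurred_at) AS month, SUM(total) AS total_qty
FROM orders
GROUP BY month
ORDER BY month
""").fetchall()

# Stand-in for DATE_PART('dow', occurred_at): orders from different weeks are pooled by weekday.
by_dow = conn.execute("""
SELECT CAST(strftime('%w', occurred_at) AS INTEGER) AS day_of_week, SUM(total) AS total_qty
FROM orders
GROUP BY day_of_week
ORDER BY day_of_week
""").fetchall()

print(by_month)
print(by_dow)
```

Truncation keeps May 2016 and June 2016 apart, whereas extracting a part merges the Sundays of both months into one group.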

1.
SELECT DATE_TRUNC('year', occurred_at) AS year_id, SUM(total_amt_usd) AS total_sale_amt
FROM orders
GROUP BY DATE_TRUNC('year', occurred_at)
ORDER BY total_sale_amt DESC
# Result:
year_id    total_sale_amt
2016-01-01T00:00:00.000Z    12864917.92
2015-01-01T00:00:00.000Z    5752004.94
2014-01-01T00:00:00.000Z    4069106.54
2013-01-01T00:00:00.000Z    377331.00
2017-01-01T00:00:00.000Z    78151.43
2.
SELECT DATE_TRUNC('month', occurred_at) as month_id,SUM(total_amt_usd) as total_sale_amt
FROM orders
GROUP BY DATE_TRUNC('month', occurred_at)
ORDER BY total_sale_amt DESC
# Result: 38 results
month_id    total_sale_amt
2016-12-01T00:00:00.000Z    1770282.62
2016-11-01T00:00:00.000Z    1396045.62
2016-10-01T00:00:00.000Z    1377981.57
2016-07-01T00:00:00.000Z    1227707.47
2016-09-01T00:00:00.000Z    1206399.93
2016-06-01T00:00:00.000Z    1152556.74
2016-08-01T00:00:00.000Z    1087667.48
………………
3.
SELECT DATE_TRUNC('year',occurred_at) as year,SUM(total) as total_sale
FROM orders
GROUP BY DATE_TRUNC('year',occurred_at)
ORDER BY total_sale DESC
# Result:
year    total_sale
2016-01-01T00:00:00.000Z    2041600
2015-01-01T00:00:00.000Z    912972
2014-01-01T00:00:00.000Z    650896
2013-01-01T00:00:00.000Z    58310
2017-01-01T00:00:00.000Z    11987
4.
SELECT DATE_TRUNC('month', occurred_at) as month_id,SUM(total) as total_sale
FROM orders
GROUP BY DATE_TRUNC('month', occurred_at)
ORDER BY total_sale DESC
# Result: 38 results
month_id    total_sale
2016-12-01T00:00:00.000Z    271062
2016-11-01T00:00:00.000Z    221166
2016-10-01T00:00:00.000Z    220590
2016-09-01T00:00:00.000Z    192720
2016-07-01T00:00:00.000Z    188680
2016-06-01T00:00:00.000Z    185808
2016-08-01T00:00:00.000Z    175218
2016-05-01T00:00:00.000Z    132025
2016-03-01T00:00:00.000Z    127496
……………………
5.
SELECT DATE_TRUNC('month', o.occurred_at) ord_date, SUM(o.gloss_amt_usd) tot_spent
FROM orders o
JOIN accounts a
ON a.id = o.account_id
WHERE a.name = 'Walmart'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 1;
# Result:
ord_date    tot_spent
2016-05-01T00:00:00.000Z    9257.64

Official answers:

1. In which year did Parch & Posey have the highest total sales in dollars? Are all years evenly represented in the dataset?
SELECT DATE_PART('year', occurred_at) ord_year,  SUM(total_amt_usd) total_spent
FROM orders
GROUP BY 1
ORDER BY 2 DESC;
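For 2013 and 2017 there is only one month of sales each (December for 2013, January for 2017), so neither year is evenly represented. Sales have been increasing year over year, with 2016 the highest so far. At this rate, 2017 might be expected to have the highest sales.
2. In which month did Parch & Posey have the highest total sales in dollars? Are all months evenly represented in the dataset?
To keep the comparison fair, the sales from 2013 and 2017 should be removed, for the reason given above.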
SELECT DATE_PART('month', occurred_at) ord_month, SUM(total_amt_usd) total_spent
FROM orders
WHERE occurred_at BETWEEN '2014-01-01' AND '2017-01-01'
GROUP BY 1
ORDER BY 2 DESC;
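December has the highest sales.
3. In which year did Parch & Posey have the highest total number of orders? Are all years evenly represented in the dataset?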
SELECT DATE_PART('year', occurred_at) ord_year,  COUNT(*) total_sales
FROM orders
GROUP BY 1
ORDER BY 2 DESC;
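Again, 2016 has by far the most orders, but 2013 and 2017 are not evenly represented compared with the other years in the dataset.
4. In which month did Parch & Posey have the highest total number of orders? Are all months evenly represented in the dataset?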
SELECT DATE_PART('month', occurred_at) ord_month, COUNT(*) total_sales
FROM orders
WHERE occurred_at BETWEEN '2014-01-01' AND '2017-01-01'
GROUP BY 1
ORDER BY 2 DESC;
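December still has the most sales, but interestingly, November has the second most. To keep the comparison fair, the data from 2013 and 2017 was removed.
5. In which month of which year did Walmart spend the most on gloss paper?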
SELECT DATE_TRUNC('month', o.occurred_at) ord_date, SUM(o.gloss_amt_usd) tot_spent
FROM orders o
JOIN accounts a
ON a.id = o.account_id
WHERE a.name = 'Walmart'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 1;
Walmart spent the most on gloss paper in May 2016.

31. Practice: CASE

# Exercises from the lecture
SELECT account_id,occurred_at,CASE WHEN total >500 THEN 'Over 500' WHEN total >300 AND total<=500 THEN '301-500' WHEN total>100  AND total<=300 THEN '101-300' ELSE '100 or under' END AS total_group
FROM orders
# Result:
account_id  occurred_at total_group
1001    2015-10-06T17:31:14.000Z    101-300
1001    2015-11-05T03:34:33.000Z    101-300
1001    2015-12-04T04:21:55.000Z    101-300
1001    2016-01-02T01:18:24.000Z    101-300
1001    2016-02-01T19:27:27.000Z    101-300
1001    2016-03-02T15:29:32.000Z    101-300
1001    2016-04-01T11:20:18.000Z    101-300
1001    2016-05-01T15:55:51.000Z    101-300
1001    2016-05-31T21:22:48.000Z    101-300
1001    2016-06-30T12:32:05.000Z    101-300
1001    2016-07-30T03:26:30.000Z    101-300
…………………………
6912 results
# Example 2: use CASE WHEN ... THEN ... END to avoid dividing by zero (the problem that came up earlier when computing the unit price).
SELECT account_id,CASE WHEN standard_qty=0 OR standard_qty IS NULL THEN 0 ELSE standard_amt_usd/standard_qty END AS unit_price
FROM orders
LIMIT 10;
# Result:
Output: 10 results
account_id  unit_price
1001    4.9900000000000000
1001    4.9900000000000000
1001    4.9900000000000000
1001    4.9900000000000000
1001    4.9900000000000000
1001    4.9900000000000000
1001    4.9900000000000000
1001    4.9900000000000000
1001    4.9900000000000000
1001    4.9900000000000000
# Example 3: CASE combined with aggregation. Classify each order, then aggregate over the resulting classes. This is well suited to reports.
SELECT CASE WHEN total >500 THEN 'Over 500' WHEN total >300 AND total<=500 THEN '301-500' WHEN total>100  AND total<=300 THEN '101-300' ELSE '100 or under' END AS total_group,COUNT(*) order_count
FROM orders
GROUP BY 1
# Result:
total_group order_count
301-500 1691
101-300 1404
Over 500    3196
100 or under    621
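Examples 2 and 3 above use CASE in two ways: as a guard around a division and as a bucketing expression that is then grouped. The sketch below runs both on a tiny table. It assumes Python's built-in sqlite3 as a stand-in database and five invented orders. Note that SQLite returns NULL for division by zero instead of raising an error, so the guard matters more in databases that do raise one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, total INTEGER, "
    "standard_qty INTEGER, standard_amt_usd REAL)"
)
# Hypothetical orders; two of them contain no standard paper at all.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 600, 100, 499.0),
        (2, 350, 0, 0.0),
        (3, 120, 20, 99.8),
        (4, 80, 10, 49.9),
        (5, 520, 0, 0.0),
    ],
)

# Guarded division: report 0 instead of dividing by a zero or missing quantity.
unit_prices = conn.execute("""
SELECT id,
       CASE WHEN standard_qty = 0 OR standard_qty IS NULL THEN 0
            ELSE standard_amt_usd / standard_qty END AS unit_price
FROM orders
ORDER BY id
""").fetchall()

# Bucketing plus aggregation. Branches are tested in order, so each WHEN
# only needs the lower bound of its bucket.
group_counts = conn.execute("""
SELECT CASE WHEN total > 500 THEN 'Over 500'
            WHEN total > 300 THEN '301-500'
            WHEN total > 100 THEN '101-300'
            ELSE '100 or under' END AS total_group,
       COUNT(*) AS order_count
FROM orders
GROUP BY total_group
ORDER BY total_group
""").fetchall()

print(unit_prices)
print(group_counts)
```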

1.
SELECT a.name a_name,o.total_amt_usd total_amt,
CASE WHEN total_amt_usd > 200000 THEN 'TOP' WHEN total_amt_usd > 100000 AND total_amt_usd <= 200000 THEN 'Medium' ELSE 'Low' END AS amt_level
FROM orders o
JOIN accounts a
ON a.id = o.account_id
GROUP BY 1,2
ORDER BY total_amt DESC;
# Note: this labels each individual order; the official answer further below labels each account by SUM(total_amt_usd).
# Result:
6910 results
a_name  total_amt   amt_level
Pacific Life    232207.07   TOP
Core-Mark Holding   112875.18   Medium
EOG Resources   107533.55   Medium
DISH Network    95005.82    Low
Fidelity National Financial 93547.84    Low
Republic Services   93505.69    Low
IBM 93106.81    Low
……………………
2.
SELECT a.name a_name,o.total_amt_usd total_amt,o.occurred_at,
CASE WHEN total_amt_usd > 200000 THEN 'TOP' WHEN total_amt_usd > 100000 AND total_amt_usd <= 200000 THEN 'Medium' ELSE 'Low' END AS amt_level
FROM orders o
JOIN accounts a
ON a.id = o.account_id
WHERE o.occurred_at >='2016-01-01'
GROUP BY 1,2,3
ORDER BY total_amt DESC;
# Result:
3782 results
a_name  total_amt   occurred_at amt_level
Pacific Life    232207.07   2016-12-26T08:53:24.000Z    TOP
Core-Mark Holding   112875.18   2016-06-24T13:32:55.000Z    Medium
Fidelity National Financial 93547.84    2016-07-17T14:50:43.000Z    Low
Disney  92991.05    2016-06-05T17:14:15.000Z    Low
State Farm Insurance Cos.   84099.62    2016-10-26T00:19:31.000Z    Low
Mosaic  82163.71    2016-10-19T18:03:43.000Z    Low
CHS 77285.75    2016-07-24T02:32:03.000Z    Low
Kohl's 45465.23    2016-10-21T21:08:01.000Z    Low
…………
3.
SELECT s.name, COUNT(*) count_orders,CASE WHEN COUNT(*) >200 THEN 'TOP' ELSE 'Not' END AS label
FROM Sales_reps s
JOIN accounts a
ON s.id = a.sales_rep_id
JOIN orders o
ON a.id = o.account_id
GROUP BY 1
ORDER BY 2 DESC
# Result:
Output: 50 results
name    count_orders    label
Earlie Schleusner   335 TOP
Vernita Plump   299 TOP
Tia Amato   267 TOP
Georgianna Chisholm 256 TOP
Moon Torian 250 TOP
Nelle Meaux 241 TOP
Maren Musto 224 TOP
Dorotha Seawell 208 TOP
Charles Bidwell 205 TOP
Maryanna Fiorentino 204 TOP
Calvin Ollison  199 Not
Sibyl Lauria    193 Not
Elwood Shutt    191 Not
Hilma Busick    191 Not
Arica Stoltzfus 186 Not
Delilah Krum    185 Not
Gianna Dossey   184 Not
Micha Woodford  179 Not
Michel Averette 173 Not
Elna Condello   168 Not
Brandie Riva    167 Not
Cliff Meints    151 Not
Necole Victory  136 Not
Samuel Racine   134 Not
Dawna Agnew 116 Not
Eugena Esser    116 Not
Debroah Wardle  116 Not
………………
4.
SELECT s.name, COUNT(*), SUM(o.total_amt_usd) total_spent, CASE WHEN COUNT(*) > 200 OR SUM(o.total_amt_usd) > 750000 THEN 'top' WHEN COUNT(*) > 150 OR SUM(o.total_amt_usd) > 500000 THEN 'middle' ELSE 'low' END AS sales_rep_level
FROM orders o
JOIN accounts a
ON o.account_id = a.id
JOIN sales_reps s
ON s.id = a.sales_rep_id
GROUP BY s.name
ORDER BY 3 DESC;
# Result:
50 results
name    count   total_spent sales_rep_level
Earlie Schleusner   335 1098137.72  top
Tia Amato   267 1010690.60  top
Vernita Plump   299 934212.93   top
Georgianna Chisholm 256 886244.12   top
Arica Stoltzfus 186 810353.34   top
Dorotha Seawell 208 766935.04   top
Nelle Meaux 241 749076.16   top
Sibyl Lauria    193 722084.27   middle
Maren Musto 224 702697.29   top
Brandie Riva    167 675917.64   middle
Charles Bidwell 205 675637.19   top
Elwood Shutt    191 662500.24   middle
Maryanna Fiorentino 204 655954.74   top
Moon Torian 250 650393.52   top
Hilma Busick    191 622808.04   middle
Dawna Agnew 116 604519.38   middle
Calvin Ollison  199 594516.08   middle
Cliff Meints    151 556105.34   middle
Gianna Dossey   184 550973.02   middle
Michel Averette 173 523977.06   middle
Delilah Krum    185 512179.11   middle
Elna Condello   168 508913.05   middle
Micha Woodford  179 488448.47   middle
Necole Victory  136 475282.05   low
Samuel Racine   134 470408.98   low
Cordell Rieder  99  447934.80   low
Julia Behrman   115 447712.91   low
Derrick Boggess 102 383933.65   low
Saran Ram   106 362689.34   low
Renetta Carew   83  330188.69   low
Shawanda Selke  100 327828.61   low
Eugena Esser    116 311801.45   low
Marquetta Laycock   90  307940.94   low
Debroah Wardle  116 293374.01   low
Ernestine Pickron   89  283243.25   low
Retha Sears 64  283203.03   low
Silvana Virden  65  262170.64   low
Lavera Oles 73  258316.81   low
Sherlene Wetherington   63  218909.58   low
Ayesha Monica   86  217146.59   low
Babette Soukup  60  215905.27   low
Carletta Kosinski   61  213032.45   low
Soraya Fulton   54  210436.05   low
Chau Rowles 63  184282.60   low
Cara Clarke 54  166138.65   low
Akilah Drinkard 66  136613.99   low
Kathleen Lalonde    35  116307.79   low
Elba Felder 62  114976.59   low
Julie Starr 35  89097.65    low
Nakesha Renn    13  49361.11    low
……………………

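Official answers

We would like to understand 3 different levels of customers based on the amount associated with their purchases. The top level includes customers with a lifetime value (total sales of all orders) greater than 200,000 USD. The second level is between 200,000 and 100,000 USD. The lowest level is customers under 100,000 USD. Provide a table that includes the level associated with each account. It should contain the account name, the total sales of all orders for the customer, and the level. List the top-spending customers first.
SELECT a.name, SUM(total_amt_usd) total_spent, CASE WHEN SUM(total_amt_usd) > 200000 THEN 'top' WHEN SUM(total_amt_usd) > 100000 THEN 'middle' ELSE 'low' END AS customer_level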
FROM orders o
JOIN accounts a
ON o.account_id = a.id
GROUP BY a.name
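ORDER BY 2 DESC;
We now want to perform a similar calculation to the first question, but only for the total amount customers spent in 2016 and 2017. Keep the same levels as in the previous question. List the top-spending customers first.
SELECT a.name, SUM(total_amt_usd) total_spent, CASE WHEN SUM(total_amt_usd) > 200000 THEN 'top' WHEN SUM(total_amt_usd) > 100000 THEN 'middle' ELSE 'low' END AS customer_level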
FROM orders o
JOIN accounts a
ON o.account_id = a.id
WHERE occurred_at > '2015-12-31'
GROUP BY 1
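ORDER BY 2 DESC;
We would like to identify the top-performing sales reps, meaning sales reps associated with more than 200 orders. Create a table with the sales rep name, the total number of orders, and a column labeled top or not depending on whether they have more than 200 orders. List the sales reps with the most orders first.
SELECT s.name, COUNT(*) num_ords, CASE WHEN COUNT(*) > 200 THEN 'top' ELSE 'not' END AS sales_rep_level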
FROM orders o
JOIN accounts a
ON o.account_id = a.id
JOIN sales_reps s
ON s.id = a.sales_rep_id
GROUP BY s.name
ORDER BY 2 DESC;
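It is worth noting that this query assumes each name is unique, an assumption made several times in these answers. Otherwise, the grouping would need to use both the name and the id.
The previous question did not account for a middle level, nor for the dollar amount associated with the sales. Management has decided they want to see these as well. We would like to identify the top-performing sales reps, meaning sales reps associated with more than 200 orders or more than 750000 in total sales. The middle level is any rep with more than 150 orders or more than 500000 in sales. Create a table with the sales rep name, the total number of orders, the total sales across all orders, and a column labeled top, middle, or low depending on these criteria. List the sales reps with the highest total sales first in the final table.
SELECT s.name, COUNT(*), SUM(o.total_amt_usd) total_spent, CASE WHEN COUNT(*) > 200 OR SUM(o.total_amt_usd) > 750000 THEN 'top' WHEN COUNT(*) > 150 OR SUM(o.total_amt_usd) > 500000 THEN 'middle' ELSE 'low' END AS sales_rep_level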
FROM orders o
JOIN accounts a
ON o.account_id = a.id
JOIN sales_reps s
ON s.id = a.sales_rep_id
GROUP BY s.name
ORDER BY 3 DESC;
By these criteria, you may well find a few sales reps who are performing poorly!
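The official answers apply CASE to aggregates (COUNT and SUM) so that each group, not each row, receives a label, and the OR means a rep qualifies for a level through either order count or sales. The sketch below shows the pattern on a flattened, invented table. It assumes Python's built-in sqlite3 as a stand-in database; the rep names and the scaled-down thresholds (3 and 2 orders, 750 and 500 in sales) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened table: one row per order, tagged with its sales rep.
conn.execute("CREATE TABLE orders (rep TEXT, total_amt_usd REAL)")
data = (
    [("A", 10.0)] * 4      # many small orders
    + [("B", 800.0)]       # one very large order
    + [("C", 50.0)] * 3
    + [("D", 600.0)]
    + [("E", 100.0)] * 2
)
conn.executemany("INSERT INTO orders VALUES (?, ?)", data)

# The CASE branches test aggregates, so the label describes the whole group.
levels = conn.execute("""
SELECT rep,
       COUNT(*) AS num_orders,
       SUM(total_amt_usd) AS total_spent,
       CASE WHEN COUNT(*) > 3 OR SUM(total_amt_usd) > 750 THEN 'top'
            WHEN COUNT(*) > 2 OR SUM(total_amt_usd) > 500 THEN 'middle'
            ELSE 'low' END AS sales_rep_level
FROM orders
GROUP BY rep
ORDER BY total_spent DESC
""").fetchall()

for row in levels:
    print(row)
```

Rep A ends up last by total sales yet is still labeled top because of its order count, which mirrors how reps with modest sales can appear as top in the results above.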
