SQLZOO Practice Notes (work in progress)

  • Note
  • Self Join
    • 10. Bus transfer stops
  • Window functions
    • 0. Ranking
    • 1. warming up
    • 2. Who won?
    • 3. PARTITION BY
    • 4. Edinburgh Constituency
    • 5. Winners Only
    • 6. Scottish seats
  • MySQL's DATE_FORMAT() function
  • Window LAG
    • 2. Introducing the LAG function
    • 3. Number of new cases
    • 4. Weekly changes
    • 5. LAG using a JOIN
    • 6. RANK()
    • 7. Infection rate
    • 8. Turning the corner
    • Primary key
    • Composite primary key
    • Foreign key

Note

The default SQL engine in these notes is MySQL; any other engine is called out explicitly.

Self Join

10. Bus transfer stops

Find the routes involving two buses that can go from Craiglockhart to Lochend.
Show the bus no. and company for the first bus, the name of the stop for the transfer,
and the bus no. and company for the second bus.
Hint: Self-join twice to find buses that visit Craiglockhart and Lochend, then join those on matching stops.

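Approach: following the hint, find the first bus (a bus that passes through Craiglockhart) and select three columns for it: the route number, the company, and the names of the stops it visits. Call this derived table bus1. In the same way, build bus2 from the buses that pass through Lochend. Treat the name column of bus1 as one set and the name column of bus2 as another; their intersection gives the transfer stops, that is, the stops served by both a bus1 route and a bus2 route.
  A transfer is needed because the test query below returns no rows: no single bus serves both Craiglockhart and Lochend.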

select distinct r1.num, r1.company
from route r1
join route r2 on r1.company = r2.company and r1.num = r2.num
join stops s1 on s1.id = r1.stop
join stops s2 on s2.id = r2.stop
where s1.name = 'Craiglockhart' and s2.name = 'Lochend'


The following query returns the correct result:

select bus1.num, bus1.company, bus1.name 中转站, bus2.num, bus2.company  -- 中转站 = transfer stop
from (
  select distinct r11.num, r11.company, s12.name
  from route r11
  join route r12 on r11.company = r12.company and r11.num = r12.num
  join stops s11 on s11.id = r11.stop
  join stops s12 on s12.id = r12.stop
  where s11.name = 'Craiglockhart'
) bus1
join (
  select distinct r21.num, r21.company, s22.name
  from route r21
  join route r22 on r21.company = r22.company and r21.num = r22.num
  join stops s21 on s21.id = r21.stop
  join stops s22 on s22.id = r22.stop
  where s21.name = 'Lochend'
) bus2
on bus1.name = bus2.name
order by bus1.num, 中转站, bus2.num
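The set-intersection idea can be tried outside SQLZOO with a small sketch. It assumes Python's built-in sqlite3 module instead of MySQL, and the stops A, X, B, Y and buses 1 to 3 are invented data, not the real Edinburgh tables.

```python
import sqlite3

# Hypothetical miniature of the SQLZOO stops/route tables (invented data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stops (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE route (num TEXT, company TEXT, pos INTEGER, stop INTEGER)")
conn.executemany(
    "INSERT INTO stops VALUES (?, ?)",
    [(1, "A"), (2, "X"), (3, "B"), (4, "Y")],
)
conn.executemany(
    "INSERT INTO route VALUES (?, ?, ?, ?)",
    [
        ("1", "C1", 1, 1), ("1", "C1", 2, 2),  # bus 1 serves A and X
        ("2", "C2", 1, 2), ("2", "C2", 2, 3),  # bus 2 serves X and B
        ("3", "C1", 1, 1), ("3", "C1", 2, 4),  # bus 3 serves A and Y
    ],
)

# One side of the trip: every stop on any bus that serves the given endpoint.
reachable = """
    SELECT DISTINCT r1.num, r1.company, s2.name
    FROM route r1
    JOIN route r2 ON r1.company = r2.company AND r1.num = r2.num
    JOIN stops s1 ON s1.id = r1.stop
    JOIN stops s2 ON s2.id = r2.stop
    WHERE s1.name = ?
"""

# Joining the two sides on the stop name keeps only the shared (transfer) stops.
transfers = conn.execute(
    f"""
    SELECT bus1.num, bus1.company, bus1.name, bus2.num, bus2.company
    FROM ({reachable}) AS bus1
    JOIN ({reachable}) AS bus2 ON bus1.name = bus2.name
    ORDER BY bus1.num, bus1.name, bus2.num
    """,
    ("A", "B"),
).fetchall()
print(transfers)
```

Only stop X is served both by a bus from A and by a bus to B, so it is the single transfer stop in this toy data.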

Window functions

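0. Ranking

(1) rank(): 1, 2, 2, 4, 5, 6, 6, 6, 9, ... Ties share a rank and the ranks after a tie are skipped, as in exam rankings such as the gaokao;
(2) dense_rank(): 1, 1, 2, 3, 4, 4, 4, 5, ... Ties share a rank and no rank is skipped;
(3) row_number(): 1, 2, 3, 4, 5, 6, 7, ... Every row gets a distinct number; mainly used to number rows.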
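The three functions can be compared side by side on a tied score. This is a minimal sketch that assumes Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) rather than MySQL; the scores table and its rows are invented.

```python
import sqlite3

# Hypothetical scores table, invented for illustration (b and c are tied).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("a", 90), ("b", 85), ("c", 85), ("d", 70)],
)

ranking_rows = conn.execute(
    """
    SELECT name,
           RANK()       OVER (ORDER BY score DESC)       AS rnk,
           DENSE_RANK() OVER (ORDER BY score DESC)       AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY score DESC, name) AS row_num
    FROM scores
    ORDER BY score DESC, name
    """
).fetchall()

for row in ranking_rows:
    print(row)  # (name, rank, dense_rank, row_number)
```

For the tied pair, rank() yields 2, 2 and then jumps to 4, dense_rank() yields 2, 2 and continues with 3, and row_number() keeps counting 2, 3, 4.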

1. warming up

 Show the lastName, party and votes for the constituency ‘S14000024’ in 2017.

SELECT lastName, party, votes
FROM ge
WHERE constituency = 'S14000024' AND yr = 2017
ORDER BY votes DESC

2. Who won?

 You can use the RANK function to see the order of the candidates. If you RANK using (ORDER BY votes DESC) then the candidate with the most votes has rank 1.
 Show the party and RANK for constituency S14000024 in 2017. List the output by party

SELECT party, votes,
       RANK() OVER (ORDER BY votes DESC) as posn
FROM ge
WHERE constituency = 'S14000024' AND yr = 2017
order by party

3. PARTITION BY

 The 2015 election is a different PARTITION to the 2017 election. We only care about the order of votes for each year.
 Use PARTITION to show the ranking of each party in S14000021 in each year. Include yr, party, votes and ranking (the party with the most votes is 1).

SELECT yr, party, votes,
       RANK() OVER (PARTITION BY yr ORDER BY votes DESC) as posn
FROM ge
WHERE constituency = 'S14000021'
ORDER BY party, yr

4. Edinburgh Constituency

 Edinburgh constituencies are numbered S14000021 to S14000026.
 Use PARTITION BY constituency to show the ranking of each party in Edinburgh in 2017. Order your results so the winners are shown first, then ordered by constituency.

SELECT constituency, party, votes,
       rank() over (partition by constituency order by votes desc) posn
FROM ge
WHERE constituency BETWEEN 'S14000021' AND 'S14000026'
  AND yr = 2017
ORDER BY posn, constituency

5. Winners Only

 You can use SELECT within SELECT to pick out only the winners in Edinburgh.
 Show the parties that won for each Edinburgh constituency in 2017.

select constituency, party
from (
  SELECT constituency, party, votes,
         rank() over (partition by constituency order by votes desc) as posn
  FROM ge
  WHERE constituency BETWEEN 'S14000021' AND 'S14000026'
    AND yr = 2017
  ORDER BY posn, constituency
) SortedTable /* MySQL requires that "Every derived table must have its own alias"; the name SortedTable is arbitrary */
where posn = 1

[Figures: contents of SortedTable; the correct result]

6. Scottish seats

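Background: Edinburgh (English and Scots: Edinburgh; Scottish Gaelic: Dùn Èideann) is the capital of Scotland in the United Kingdom, on the southern shore of the Firth of Forth in the Central Lowlands.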

 You can use COUNT and GROUP BY to see how each party did in Scotland. Scottish constituencies start with ‘S’
 Show how many seats for each party in Scotland in 2017.

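Approach: within each constituency, the party with the most votes wins one seat. So first partition by constituency (PARTITION BY) and pick the top party in each one, that is, the rows whose rank within the constituency is 1 (the "posn = 1" of the earlier exercises). Then count those winners per party with GROUP BY.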

select party as 政党, count(constituency) as 获得的席位数  -- 政党 = party, 获得的席位数 = seats won
from (
  select party, constituency,
         rank() over (partition by constituency order by votes desc) as 选区内的排名  -- rank within the constituency
  from ge
  where constituency like 'S%' and yr = 2017
  order by 选区内的排名 asc
) 排序后的表格  -- "sorted table"
where 排序后的表格.选区内的排名 = 1
group by party

[Figures: contents of the derived table 排序后的表格; the correct result]
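The rank-then-count pattern can be checked on a toy election. This sketch assumes Python's built-in sqlite3 module (SQLite 3.25 or later) instead of MySQL; the constituencies S1 to S3 and E1 and the parties P1 and P2 are invented, and the aliases are written in English.

```python
import sqlite3

# Hypothetical miniature of the ge table (invented constituencies and votes).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ge (constituency TEXT, party TEXT, votes INTEGER, yr INTEGER)")
conn.executemany(
    "INSERT INTO ge VALUES (?, ?, ?, ?)",
    [
        ("S1", "P1", 100, 2017), ("S1", "P2", 50, 2017),
        ("S2", "P1", 30, 2017), ("S2", "P2", 80, 2017),
        ("S3", "P1", 70, 2017), ("S3", "P2", 60, 2017),
        ("E1", "P2", 999, 2017),  # not Scottish: removed by LIKE 'S%'
    ],
)

seats = conn.execute(
    """
    SELECT party, COUNT(constituency) AS seats
    FROM (
        -- Step 1: rank the parties inside each constituency.
        SELECT party, constituency,
               RANK() OVER (PARTITION BY constituency ORDER BY votes DESC) AS posn
        FROM ge
        WHERE constituency LIKE 'S%' AND yr = 2017
    ) AS ranked
    WHERE posn = 1        -- Step 2: keep only the winners.
    GROUP BY party        -- Step 3: count seats per party.
    ORDER BY party
    """
).fetchall()
print(seats)
```

P1 wins S1 and S3 and P2 wins S2, so the toy result is two seats against one.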

MySQL's DATE_FORMAT() function

DATE_FORMAT()

Window LAG

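Note: for exercises 2, 3, and 6 in this section, click the gear icon at the top right and switch the SQL Engine from MySQL to Microsoft SQL; otherwise the queries raise an error.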

2. Introducing the LAG function

 The LAG function is used to show data from the preceding row or the table. When lining up rows the data is partitioned by country name and ordered by the data whn. That means that only data from Italy is considered.
 Modify the query to show confirmed for the day before.

SELECT name, DAY(whn), confirmed,
       LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn) 昨天确诊人数  -- confirmed count for the day before
FROM covid
WHERE name = 'Italy'
  AND MONTH(whn) = 3
ORDER BY whn

3. Number of new cases

 The number of confirmed case is cumulative - but we can use LAG to recover the number of new cases reported for each day.
 Show the number of new cases for each day, for Italy, for March.

SELECT name 国家, DAY(whn) as '2020年3月__日',  -- 国家 = country; the second column is the day of March 2020
       confirmed - LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn) as 今日新冠肺炎确诊人数  -- new cases that day
FROM covid
WHERE name = 'Italy'
  AND MONTH(whn) = 3
ORDER BY whn
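Subtracting the lagged value from a cumulative column can be seen on three rows. This is a sketch under the assumption that SQLite (via Python's sqlite3, version 3.25 or later) stands in for the SQLZOO engine; the confirmed counts are invented, not real Italian figures.

```python
import sqlite3

# Hypothetical cumulative counts (invented numbers, ISO dates stored as text).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE covid (name TEXT, whn TEXT, confirmed INTEGER)")
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?)",
    [
        ("Italy", "2020-03-01", 100),
        ("Italy", "2020-03-02", 150),
        ("Italy", "2020-03-03", 230),
    ],
)

new_cases = conn.execute(
    """
    SELECT whn,
           confirmed - LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn) AS new_cases
    FROM covid
    WHERE name = 'Italy'
    ORDER BY whn
    """
).fetchall()
print(new_cases)
```

The first row has no preceding row, so LAG returns NULL and the difference is NULL (None in Python); the later rows give the daily increases 50 and 80.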

4. Weekly changes

 The data gathered are necessarily estimates and are inaccurate. However by taking a longer time span we can mitigate some of the effects.
 You can filter the data to view only Monday’s figures WHERE WEEKDAY(whn) = 0.
 Show the number of new cases in Italy for each week - show Monday only.

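SELECT name 国家, DATE_FORMAT(whn,'%Y-%m-%d') 日期,  -- 国家 = country, 日期 = date
       confirmed 截至今日确诊病例,  -- cumulative confirmed cases
       confirmed - lag(confirmed, 1) over (partition by name order by whn) 今日新增确诊病例  -- increase since the previous Monday
FROM covid
WHERE name = 'Italy'
  AND WEEKDAY(whn) = 0
group by name, confirmed, whn /* required on SQLZOO's MySQL engine; omitting it raises an error */
ORDER BY whn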

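  Without the group by name, confirmed, whn clause, the query above fails with: Mixing of GROUP columns (MIN(),MAX(),COUNT(),...) with no GROUP columns is illegal if there is no GROUP BY clause.

  [Figure from a page linked in the original post.] Unfortunately, after SQLZOO moved the window_lag page to a new address, the bug reappeared. It can be ignored; just be aware of it.
  Note that exercises 2 and 3 of this section also return the correct result on the MySQL engine once group by name, confirmed, whn is added. In other words, the MySQL version differs from the Microsoft SQL version only by this extra group by name, confirmed, whn clause.

[Figure: the correct result]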

5. LAG using a JOIN

 You can JOIN a table using DATE arithmetic. This will give different results if data is missing.
 Show the number of new cases in Italy for each week - show Monday only.
 In the sample query we JOIN this week tw with last week lw using the DATE_ADD function.

select tw.name 国家, date_format(tw.whn,'%Y-%m-%d') 日期,  -- 国家 = country, 日期 = date
       tw.confirmed - lw.confirmed 比上周一增加的确诊病例数  -- increase over the previous Monday
from covid as lw
right join covid as tw
  on lw.name = tw.name and date_add(lw.whn, interval 7 day) = tw.whn
where tw.name = 'Italy' and weekday(tw.whn) = 0
order by tw.whn
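The remark that a date-arithmetic JOIN "will give different results if data is missing" can be illustrated as follows. The sketch assumes SQLite through Python's sqlite3 module, where date(x, '+7 days') plays the role of MySQL's DATE_ADD; the Monday figures are invented and the row for 2020-03-16 is deliberately left out.

```python
import sqlite3

# Hypothetical Monday-only data; the Monday 2020-03-16 is deliberately missing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE covid (name TEXT, whn TEXT, confirmed INTEGER)")
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?)",
    [
        ("Italy", "2020-03-02", 100),
        ("Italy", "2020-03-09", 300),
        ("Italy", "2020-03-23", 900),
    ],
)

# JOIN version: "last week" must be exactly 7 days earlier.
weekly_by_join = conn.execute(
    """
    SELECT tw.whn, tw.confirmed - lw.confirmed
    FROM covid AS tw
    LEFT JOIN covid AS lw
      ON lw.name = tw.name AND date(lw.whn, '+7 days') = tw.whn
    WHERE tw.name = 'Italy'
    ORDER BY tw.whn
    """
).fetchall()

# LAG version: "last week" is simply the previous row, whatever its date.
weekly_by_lag = conn.execute(
    """
    SELECT whn,
           confirmed - LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn)
    FROM covid
    WHERE name = 'Italy'
    ORDER BY whn
    """
).fetchall()

print(weekly_by_join)
print(weekly_by_lag)
```

For 2020-03-23 the JOIN finds no row dated seven days earlier and yields NULL, while LAG silently compares against 2020-03-09 and reports a two-week increase of 600.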

6. RANK()

 The query shown shows the number of confirmed cases together with the world ranking for cases.
 United States has the highest number, Spain is number 2…
 Notice that while Spain has the second highest confirmed cases, Italy has the second highest number of deaths due to the virus.
 Include the ranking for the number of deaths in the table.

SELECT name,
       confirmed, RANK() OVER (ORDER BY confirmed DESC) rc,
       deaths, rank() over (order by deaths desc) rd
FROM covid
WHERE whn = '2020-04-20'
ORDER BY confirmed DESC

7. Infection rate

 The query shown includes a JOIN to the world table so we can access the total population of each country and calculate infection rates (in cases per 100,000).
 Show the infect rate ranking for each country. Only include countries with a population of at least 10 million.

SELECT world.name 国家,  -- country
       population 总人口数,  -- total population
       confirmed 感染的人数,  -- confirmed cases
       ROUND(confirmed/(population/100000), 0) 每十万人中感染的人数,  -- cases per 100,000 people
       rank() over (order by 每十万人中感染的人数 desc) 感染率排名  -- infection-rate rank
FROM covid JOIN world ON covid.name = world.name
WHERE whn = '2020-04-20' AND population > 10000000
group by 国家, confirmed, population
ORDER BY 感染率排名 asc

8. Turning the corner

 For each country that has had at least 1000 new cases in a single day, show the date of the peak number of new cases.

select 国家, max(新增确诊病例) 单日新增最多病例数,  -- 国家 = country; peak number of new cases in one day
       rank() over (order by 单日新增最多病例数 desc) 排名  -- rank
from (
  select name 国家,
         confirmed - lag(confirmed, 1) over (partition by name order by whn) 新增确诊病例  -- daily new cases
  from covid
  group by 国家, confirmed
) SortedTable
group by 国家
having 单日新增最多病例数 >= 1000
order by 排名 asc

  The query above nests one select inside another because lag cannot be nested directly inside max. So we first build the derived table SortedTable and then take the max of its column.
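The two-step shape (window function in a derived table, aggregate outside) can be sketched as below. It assumes SQLite via Python's sqlite3 module (3.25 or later) rather than the SQLZOO engine; countries A and B and their counts are invented, and the aliases are written in English.

```python
import sqlite3

# Hypothetical cumulative counts for two invented countries.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE covid (name TEXT, whn TEXT, confirmed INTEGER)")
conn.executemany(
    "INSERT INTO covid VALUES (?, ?, ?)",
    [
        ("A", "2020-03-01", 0), ("A", "2020-03-02", 500),
        ("A", "2020-03-03", 2000), ("A", "2020-03-04", 2600),
        ("B", "2020-03-01", 10), ("B", "2020-03-02", 20),
        ("B", "2020-03-03", 50),
    ],
)

peaks = conn.execute(
    """
    SELECT name, MAX(new_cases) AS peak
    FROM (
        -- Inner query: the window function produces daily new cases.
        SELECT name,
               confirmed - LAG(confirmed, 1) OVER (PARTITION BY name ORDER BY whn) AS new_cases
        FROM covid
    ) AS daily
    GROUP BY name                    -- Outer query: aggregate per country.
    HAVING MAX(new_cases) >= 1000
    ORDER BY peak DESC
    """
).fetchall()
print(peaks)
```

Country A peaks at 1500 new cases in one day and is kept; country B never reaches 1000 and is filtered out by HAVING.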

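Primary key

  Database tools often display a table's records in primary-key order (InnoDB in MySQL, for example, stores rows clustered by primary key), and in insertion order when no primary key is defined. SQL guarantees neither order, so add ORDER BY whenever the order matters.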

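Composite primary key

  A primary key made up of two or more fields (columns of the table). When no single field identifies a row uniquely, a composite primary key is used instead.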
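A short sketch of a composite primary key, assuming SQLite through Python's sqlite3 module; the enrollment table and its columns are hypothetical. Neither student_id nor course_id is unique on its own, but the pair is.

```python
import sqlite3

# Hypothetical table: one row per (student, course) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE enrollment (
        student_id INTEGER,
        course_id  INTEGER,
        grade      TEXT,
        PRIMARY KEY (student_id, course_id)
    )
    """
)
conn.execute("INSERT INTO enrollment VALUES (1, 10, 'A')")
conn.execute("INSERT INTO enrollment VALUES (1, 11, 'B')")  # same student, other course: allowed

try:
    conn.execute("INSERT INTO enrollment VALUES (1, 10, 'C')")  # same pair again
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_rejected, row_count)
```

The third insert repeats an existing (student_id, course_id) pair, so the database rejects it and the table keeps two rows.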

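Foreign key

  If a shared key is the primary key of one relation, that shared key is called a foreign key of the other relation. A foreign key therefore expresses the link between two relations. The table in which the key is the primary key is called the parent (master) table, and the table that holds the foreign key is called its child table.
(See: Foreign key, Baidu Baike)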
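The parent/child relationship can be sketched as follows, assuming SQLite through Python's sqlite3 module; the department and employee tables are hypothetical. SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, which is specific to SQLite and not needed in MySQL's InnoDB.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Parent table: id is the primary key.
conn.execute("CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT)")
# Child table: department_id is a foreign key pointing at the parent.
conn.execute(
    """
    CREATE TABLE employee (
        id            INTEGER PRIMARY KEY,
        name          TEXT,
        department_id INTEGER REFERENCES department(id)
    )
    """
)
conn.execute("INSERT INTO department VALUES (1, 'Sales')")
conn.execute("INSERT INTO employee VALUES (1, 'Ann', 1)")  # parent row exists

try:
    conn.execute("INSERT INTO employee VALUES (2, 'Bob', 99)")  # no department 99
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```

The second employee refers to a department that does not exist in the parent table, so the insert is refused.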
