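Neo4j: The BBC Champions League graph
A couple of weekends ago I started scraping the BBC live text feed of the Bayern Munich/Barcelona match, initially starting out with just the fouls and building a foul graph.
I've spent a bit more time on it since then and have managed to model several other events as well, including attempts, goals, cards and free kicks.
I started doing this just for the Bayern Munich/Barcelona match, but realised it wouldn't be particularly difficult to extend this and graph the events for every match in the 2014/2015 Champions League.
To do that we first need to download the page for each match. I downloaded the BBC results page and wrote a simple Python script to get the appropriate URIs: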
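from bs4 import BeautifulSoup

# "data/results" is the locally saved copy of the results page
with open("data/results", "r") as results_file:
    soup = BeautifulSoup(results_file, "html.parser")

# Each match report is linked through an <a class="report"> element;
# bs4's built-in select() is used here in place of the soupselect helper
for match in soup.select("a.report"):
    print("http://www.bbc.co.uk/%s" % match.get("href"))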
I then piped the output of running this script into wget:
python find_all_matches.py | xargs wget -P data/raw
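Updating the scraping and import code to handle multiple matches was relatively straightforward. The whole process, from start to finish, looks like this:
Most of the code is in the "scraping magic" stage, where I go through all the events and pull out the appropriate elements that can be linked together in the graph.
For example, a free kick event and a foul event are typically adjacent, so we pull out the two players involved, the type of card issued, the time of the event and the match in which the event occurred.
I used Python's Beautiful Soup library for this task, but there's no reason you couldn't use another set of tools.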
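As a rough illustration of the pairing step described above, here is a minimal sketch that links a foul with the free kick reported next to it. The commentary wording matched by the two regular expressions, the `pair_fouls` name and the sample events are all assumptions made for this example; the real feed and the actual scraping code may look different.

```python
import re

# Assumed commentary wording; the real BBC feed may differ.
FOUL = re.compile(r"^Foul by (?P<player>.+?) \((?P<team>.+?)\)\.$")
FREE_KICK = re.compile(r"^(?P<player>.+?) \((?P<team>.+?)\) wins a free kick")


def pair_fouls(events):
    """Pair each foul with the free kick event reported next to it.

    `events` is a list of (time, text) tuples in feed order.
    """
    pairs = []
    i = 0
    while i < len(events) - 1:
        (time_a, text_a), (_, text_b) = events[i], events[i + 1]
        # The two events can appear in either order in the feed.
        foul = FOUL.match(text_a) or FOUL.match(text_b)
        kick = FREE_KICK.match(text_a) or FREE_KICK.match(text_b)
        if foul and kick:
            pairs.append({
                "time": time_a,
                "fouler": foul.group("player"),
                "fouled": kick.group("player"),
            })
            i += 2  # both events have been consumed
        else:
            i += 1
    return pairs


# Hypothetical sample events
sample = [
    ("10:57", "Foul by Player A (Team X)."),
    ("10:57", "Player B (Team Y) wins a free kick in the defensive half."),
    ("12:00", "Attempt missed. Player C (Team X) right footed shot."),
]
print(pair_fouls(sample))
```

Each resulting dictionary holds what is needed to create a foul node and link it to the two players' appearances.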
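The README page shows how to create your own version of the graph, but here is an overview of what the graph looks like, produced with Rik's meta graph query:
Here are a few of my favourite queries so far:
Which player with more than 10 shots has the best conversion rate?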
match (a:Attempt)<-[:HAD_ATTEMPT]-(app)<-[:MADE_APPEARANCE]-(player),(app)-[:FOR_TEAM]-(team)
WITH player, COUNT(*) as times, COLLECT(a) AS attempts, team
WITH player, times, LENGTH([a in attempts WHERE a:Goal]) AS goals, team
WHERE times > 10
RETURN player.name, team.name, goals, times, (goals * 1.0 / times) AS conversionRate
ORDER BY conversionRate DESC
LIMIT 10
==> +------------------------------------------------------------------------------------+
==> | player.name | team.name | goals | times | conversionRate |
==> +------------------------------------------------------------------------------------+
==> | "Luiz Adriano" | "Shakhtar Donetsk" | 9 | 14 | 0.6428571428571429 |
==> | "Yacine Brahimi" | "FC Porto" | 5 | 13 | 0.38461538461538464 |
==> | "Mario Mandzukic" | "Atlético de Madrid" | 5 | 14 | 0.35714285714285715 |
==> | "Sergio Agüero" | "Manchester City" | 6 | 18 | 0.3333333333333333 |
==> | "Karim Benzema" | "Real Madrid" | 6 | 19 | 0.3157894736842105 |
==> | "Klaas-Jan Huntelaar" | "FC Schalke 04" | 5 | 16 | 0.3125 |
==> | "Neymar" | "Barcelona" | 9 | 29 | 0.3103448275862069 |
==> | "Thomas Müller" | "FC Bayern München" | 7 | 24 | 0.2916666666666667 |
==> | "Jackson Martínez" | "FC Porto" | 7 | 24 | 0.2916666666666667 |
==> | "Callum McGregor" | "Celtic" | 3 | 11 | 0.2727272727272727 |
==> +------------------------------------------------------------------------------------+
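For readers less familiar with Cypher, the aggregation in the conversion-rate query can be approximated in plain Python. This is only a sketch: the `conversion_rates` function and its input shape (a list of `(player, is_goal)` tuples) are assumptions made for illustration, and unlike the query it does not group by team.

```python
from collections import defaultdict


def conversion_rates(attempts, min_attempts=10):
    """Return (player, goals, attempts, rate) rows, best rate first.

    `attempts` is a list of (player, is_goal) tuples. Only players with
    more than `min_attempts` attempts are kept, mirroring `times > 10`.
    """
    totals = defaultdict(lambda: [0, 0])  # player -> [goals, attempts]
    for player, is_goal in attempts:
        totals[player][1] += 1
        if is_goal:
            totals[player][0] += 1

    rows = [
        (player, goals, times, goals / times)
        for player, (goals, times) in totals.items()
        if times > min_attempts
    ]
    return sorted(rows, key=lambda row: row[3], reverse=True)


# Hypothetical input: player "A" scores 9 of 14 attempts, "B" only has 5 attempts
attempts = [("A", True)] * 9 + [("A", False)] * 5 + [("B", True)] * 2 + [("B", False)] * 3
print(conversion_rates(attempts))
```

With the numbers from the first table row (9 goals from 14 attempts) the rate is 9 / 14, or roughly 0.643.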
Which players gained immediate revenge for a foul?
match (firstFoul:Foul)-[:COMMITTED_AGAINST]->(app1)<-[:MADE_APPEARANCE]-(revengeFouler),
      (app1)-[:IN_MATCH]->(match),
      (firstFoulerApp)-[:COMMITTED_FOUL]->(firstFoul),
      (app1)-[:COMMITTED_FOUL]->(revengeFoul)-[:COMMITTED_AGAINST]->(firstFoulerApp),
      (firstFouler)-[:MADE_APPEARANCE]->(firstFoulerApp)
WHERE (firstFoul)-[:NEXT]->(revengeFoul)
RETURN firstFouler.name AS firstFouler, revengeFouler.name AS revengeFouler, firstFoul.time, revengeFoul.time, match.home + " vs " + match.away
==> +---------------------------------------------------------------------------------------------------------------------------------+
==> | firstFouler | revengeFouler | firstFoul.time | revengeFoul.time | match.home + " vs " + match.away |
==> +---------------------------------------------------------------------------------------------------------------------------------+
==> | "Derk Boerrigter" | "Jean Philippe Mendy" | "88:48" | "89:42" | "Celtic vs NK Maribor" |
==> | "Mario Suárez" | "Pajtim Kasami" | "27:17" | "32:38" | "Olympiakos vs Atlético de Madrid" |
==> | "Aleksandr Volodko" | "Casemiro" | "39:27" | "44:32" | "FC Porto vs BATE Borisov" |
==> | "Thomas Müller" | "Mario Fernandes" | "87:22" | "88:31" | "CSKA Moscow vs FC Bayern München" |
==> | "Vinicius" | "Marco Verratti" | "56:36" | "58:00" | "APOEL Nicosia vs Paris Saint Germain" |
==> | "Lasse Schöne" | "Dani Alves" | "84:08" | "86:18" | "Barcelona vs Ajax" |
==> | "Nick Viergever" | "Dani Alves" | "57:22" | "60:37" | "Barcelona vs Ajax" |
==> | "Nani" | "Atsuto Uchida" | "6:10" | "8:40" | "FC Schalke 04 vs Sporting Lisbon" |
==> | "Andreas Samaris" | "Yannick Ferreira-Carrasco" | "89:21" | "90:00 +4:21" | "Monaco vs Benfica" |
==> | "Simon Kroon" | "Guillherme Siqueira" | "84:05" | "90:00 +0:29" | "Atlético de Madrid vs Malmö FF" |
==> | "Mario Suárez" | "Isaac Thelin" | "32:02" | "38:47" | "Atlético de Madrid vs Malmö FF" |
==> | "Hakan Balta" | "Henrikh Mkhitaryan" | "62:09" | "64:14" | "Borussia Dortmund vs Galatasaray" |
==> | "Marco Reus" | "Selcuk Inan" | "36:17" | "44:03" | "Borussia Dortmund vs Galatasaray" |
==> | "Hakan Balta" | "Sven Bender" | "10:57" | "12:51" | "Borussia Dortmund vs Galatasaray" |
==> | "Vinicius" | "Edinson Cavani" | "87:56" | "90:00 +1:25" | "Paris Saint Germain vs APOEL Nicosia" |
==> | "Jackson Martínez" | "Carlos Gurpegi" | "64:55" | "66:17" | "Athletic Club vs FC Porto" |
==> | "Nani" | "Chinedu Obasi" | "1:30" | "4:47" | "Sporting Lisbon vs FC Schalke 04" |
==> | "Vitali Rodionov" | "Bruno Martins Indi" | "52:16" | "60:08" | "BATE Borisov vs FC Porto" |
==> | "Raheem Sterling" | "Behrang Safari" | "29:00" | "33:27" | "Liverpool vs FC Basel" |
==> | "Derlis González" | "Fábio Coentrão" | "52:55" | "57:59" | "FC Basel vs Real Madrid" |
==> | "Josip Drmic" | "Lisandro López" | "15:04" | "17:35" | "Benfica vs Bayer 04 Leverkusen" |
==> | "Fred" | "Bastian Schweinsteiger" | "6:04" | "9:28" | "Shakhtar Donetsk vs FC Bayern München" |
==> | "Alex Sandro" | "Derlis González" | "4:07" | "7:28" | "FC Basel vs FC Porto" |
==> | "Luca Zuffi" | "Ruben Neves" | "73:49" | "84:44" | "FC Porto vs FC Basel" |
==> | "Marco Verratti" | "Oscar" | "28:49" | "34:04" | "Chelsea vs Paris Saint Germain" |
==> | "Cristiano Ronaldo" | "Jesús Gámez" | "20:59" | "25:37" | "Real Madrid vs Atlético de Madrid" |
==> | "Bernardo Silva" | "Álvaro Morata" | "49:20" | "62:31" | "Monaco vs Juventus" |
==> | "Arturo Vidal" | "Fabinho" | "38:19" | "45:00" | "Monaco vs Juventus" |
==> +---------------------------------------------------------------------------------------------------------------------------------+
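The immediate-revenge query relies on consecutive fouls being linked by a `NEXT` relationship. The same check can be sketched over an ordered list in Python. The `immediate_revenge` function and the tuple layout are assumptions made for this example, and it expects the fouls of a single match in chronological order. The first two sample fouls come from the Borussia Dortmund vs Galatasaray row in the table above; the third is a hypothetical filler.

```python
def immediate_revenge(fouls):
    """Find fouls that are answered by the very next foul in the match.

    `fouls` is a chronological list of (time, fouler, fouled) tuples.
    Returns (first_fouler, revenge_fouler, first_time, revenge_time) tuples.
    """
    result = []
    # zip(fouls, fouls[1:]) walks over each pair of consecutive fouls,
    # which plays the role of the NEXT relationship in the graph.
    for first, second in zip(fouls, fouls[1:]):
        first_time, first_fouler, first_victim = first
        second_time, second_fouler, second_victim = second
        if second_fouler == first_victim and second_victim == first_fouler:
            result.append((first_fouler, second_fouler, first_time, second_time))
    return result


fouls = [
    ("10:57", "Hakan Balta", "Sven Bender"),
    ("12:51", "Sven Bender", "Hakan Balta"),
    ("20:00", "Player C", "Player D"),  # hypothetical unrelated foul
]
print(immediate_revenge(fouls))
```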
Which players took the longest to gain revenge for a foul?
match (foul1:Foul)-[:COMMITTED_AGAINST]->(app1)-[:COMMITTED_FOUL]->(foul2)-[:COMMITTED_AGAINST]->(app2)-[:COMMITTED_FOUL]->(foul1),
      (player1)-[:MADE_APPEARANCE]->(app1),
      (player2)-[:MADE_APPEARANCE]->(app2),
      (foul1)-[:COMMITTED_IN_MATCH]->(match:Match)<-[:COMMITTED_IN_MATCH]-(foul2)
WHERE (foul1)-[:NEXT*]->(foul2)
WITH match, foul1, player1, player2, foul2 ORDER BY foul1.sortableTime, foul2.sortableTime
WITH match, foul1, player1, player2, COLLECT(foul2) AS revenge
WITH match, foul1, player1,player2, revenge[0] AS revengeFoul
RETURN player1.name, player2.name, foul1.time, revengeFoul.time, revengeFoul.sortableTime - foul1.sortableTime AS secondsWaited, match.home + " vs " + match.away AS match
ORDER BY secondsWaited DESC
LIMIT 5
==> +---------------------------------------------------------------------------------------------------------------------------+
==> | player1.name | player2.name | foul1.time | revengeFoul.time | secondsWaited | match |
==> +---------------------------------------------------------------------------------------------------------------------------+
==> | "Stefan Johansen" | "Ondrej Duda" | "1:30" | "82:11" | 4841 | "Legia Warsaw vs Celtic" |
==> | "Neymar" | "Vinicius" | "2:35" | "80:08" | 4653 | "Barcelona vs APOEL Nicosia" |
==> | "Jérémy Toulalan" | "Stefan Kießling" | "9:19" | "86:37" | 4638 | "Monaco vs Bayer 04 Leverkusen" |
==> | "Nabil Dirar" | "Domenico Criscito" | "6:32" | "82:39" | 4567 | "Zenit St Petersburg vs Monaco" |
==> | "Nabil Dirar" | "Eliseu" | "7:20" | "81:30" | 4450 | "Monaco vs Benfica" |
==> +---------------------------------------------------------------------------------------------------------------------------+
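The `secondsWaited` column is consistent with `sortableTime` being the match time expressed in seconds: "1:30" is 90 seconds, "82:11" is 4931 seconds, and the difference is 4841. A minimal sketch of such a conversion follows. The `to_seconds` name is hypothetical, and adding stoppage time such as "90:00 +4:21" on top of the base time is an assumption about how the property might be derived, not something the queries confirm.

```python
def to_seconds(time_text):
    """Convert a feed time such as "82:11" or "90:00 +4:21" into seconds.

    Stoppage time after the "+" is added to the base time (an assumption).
    """
    total = 0
    for part in time_text.split("+"):
        minutes, seconds = part.strip().split(":")
        total += int(minutes) * 60 + int(seconds)
    return total


# Matches the first row of the table above: 4931 - 90 = 4841
print(to_seconds("82:11") - to_seconds("1:30"))
```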
Who has had the most shots?
match (team)<-[:FOR_TEAM]-(app)<-[appRel:MADE_APPEARANCE]-(player:Player)
optional match (a:Attempt)<-[att:HAD_ATTEMPT]-(app)
WITH player, COUNT( DISTINCT appRel) AS apps, COUNT(att) as times, COLLECT(a) AS attempts, team
WITH player,apps, times, LENGTH([a in attempts WHERE a:Goal]) AS goals, team
WHERE times > 10
RETURN player.name, team.name, apps, goals, times, (goals * 1.0 / times) AS conversionRate
ORDER BY times DESC
LIMIT 10
==> +-------------------------------------------------------------------------------------------+
==> | player.name | team.name | apps | goals | times | conversionRate |
==> +-------------------------------------------------------------------------------------------+
==> | "Cristiano Ronaldo" | "Real Madrid" | 12 | 10 | 69 | 0.14492753623188406 |
==> | "Lionel Messi" | "Barcelona" | 12 | 10 | 51 | 0.19607843137254902 |
==> | "Robert Lewandowski" | "FC Bayern München" | 12 | 6 | 43 | 0.13953488372093023 |
==> | "Carlos Tévez" | "Juventus" | 12 | 7 | 34 | 0.20588235294117646 |
==> | "Gareth Bale" | "Real Madrid" | 10 | 2 | 32 | 0.0625 |
==> | "Luis Suárez" | "Barcelona" | 9 | 6 | 30 | 0.2 |
==> | "Neymar" | "Barcelona" | 11 | 9 | 29 | 0.3103448275862069 |
==> | "Hakan Calhanoglu" | "Bayer 04 Leverkusen" | 8 | 2 | 29 | 0.06896551724137931 |
==> | "Edinson Cavani" | "Paris Saint Germain" | 8 | 6 | 27 | 0.2222222222222222 |
==> | "Alexis Sánchez" | "Arsenal" | 9 | 4 | 25 | 0.16 |
==> +-------------------------------------------------------------------------------------------+
Maybe you can think of some cooler ones? I'd love to see them. Grab the code from GitHub and give it a try.
Translated from: https://www.javacodegeeks.com/2015/06/neo4j-the-bbc-champions-league-graph.html