
CAREERS IN DATA SCIENCE & AI

Alan Turing was 27 years old when the Second World War began and he started working for the British code-breaking organization to help them break German ciphers.

In layman's terms, a cipher is a system for methodically disguising information by converting it into a code. In other words, a cipher is an algorithm for encrypting information. Cipher systems were widely used to protect wartime secrets during the Second World War. Turing and his team focused on cryptanalysis to decode these messages. At its core, this decoding involved building counter-algorithms to deconstruct the workings of the German cipher systems, most notably Nazi Germany’s Enigma cipher machine. Turing and his team made several advances toward this goal.

It is estimated that Turing’s work shortened the war by more than two years and saved over 14 million lives. Talk about having an impact!

The interesting thing is that, in many ways, Turing’s work in cryptanalysis and his subsequent research on early computing systems dealt with some of the initial development of intelligent machines. It therefore represents the beginnings of the impact of AI itself.

If we fast-forward a few decades, it is amply evident that the impact of AI has only increased, going far beyond wartime code-breaking to encompass much of our everyday lives. Let’s take a few examples.

Data science changes LinkedIn’s growth trajectory

In 2006, LinkedIn was still a small company with big ambitions. It was then that a young analyst, soon after completing his PhD in Physics at Stanford, joined LinkedIn as one of its first data scientists. His name was Jonathan Goldman. Little did he know what a profound impact he was going to have, not just on LinkedIn’s future, but also on establishing the importance of data science for companies around the world.

While wading through the rich troves of data that LinkedIn had by then begun acquiring about its users, Goldman came up with an interesting thought. He realized that while users joined the network and invited their friends and colleagues to join, there was still a gap: they were unable to connect with people they knew who were already on the platform.

He dug into the data about the users: who worked where, who studied where and when, who was connected to whom, who was located where, and so on. He formed hypotheses about the probability of knowing someone based on these parameters and tested them. He looked for patterns: for example, if I worked with X and X knows Y, what are the chances that I know Y? Eventually, this became the base from which Goldman built the “People You May Know” product, now a ubiquitous part of the LinkedIn experience. Once released, the feature rapidly catapulted LinkedIn’s growth numbers. The rest, as they say, is history.
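The "if I worked with X and X knows Y" pattern can be illustrated with a simple mutual-connection count. This is only a minimal sketch of that one idea, not Goldman's actual model or LinkedIn's implementation. The `suggest_connections` function and the toy `network` are hypothetical, invented for illustration.

```python
from collections import Counter


def suggest_connections(graph, user, top_n=3):
    """Rank people `user` is not yet connected to by number of mutual connections.

    `graph` maps each person to the set of people they are connected to.
    This is an illustrative heuristic, not LinkedIn's real algorithm.
    """
    direct = graph.get(user, set())
    mutual_counts = Counter()
    for friend in direct:
        for candidate in graph.get(friend, set()):
            # Skip the user themselves and anyone already connected.
            if candidate != user and candidate not in direct:
                mutual_counts[candidate] += 1
    # Highest mutual count first; ties broken alphabetically for stable output.
    ranked = sorted(mutual_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


# Hypothetical toy network (undirected, so each edge appears in both directions).
network = {
    "me": {"X", "Z"},
    "X": {"me", "Y", "W"},
    "Z": {"me", "Y"},
    "Y": {"X", "Z"},
    "W": {"X"},
}

suggestions = suggest_connections(network, "me")
print(suggestions)
```

Here Y shares two connections with "me" and W shares one, so Y ranks first. A production system would weigh many more signals, such as shared employers, schools, and locations.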

Netflix ushers in a new era of data analytics

Netflix has always been a company deeply rooted in data. Some of you might remember that Netflix originally started as an online DVD rental company (before pivoting into online video streaming in the mid-2000s). A key factor in Netflix's early success lay in its ability to put in front of customers the movies they were most likely to rent, so that they did not have to search through the entire catalog to find something of interest. In essence, this was one of the world’s first large-scale recommender systems.

But then Netflix did something that changed the game. For Netflix, and for the world at large.

In October 2006, Netflix launched a competition that started humbly enough but very soon snowballed into the largest machine learning competition ever held.


The rules were simple enough. Netflix offered a grand prize of $1,000,000 to the team that managed to beat Netflix’s in-house movie recommender (called Cinematch) by more than 10%. To be technically correct, the requirement was to reduce the error rate, measured as root mean squared error (RMSE), by over 10%. Participants were given a sparsely populated training dataset of ~100 million ratings to build their models on.
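To make the error metric concrete, here is a minimal sketch of how RMSE and the relative improvement over a baseline could be computed. The ratings and predictions are made-up toy values, not Netflix Prize data, and the helper names `rmse` and `relative_improvement` are chosen only for illustration.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_improvement(baseline_rmse, candidate_rmse):
    """Fraction by which the candidate reduces the baseline's RMSE."""
    return (baseline_rmse - candidate_rmse) / baseline_rmse


# Made-up toy ratings on a 1-5 scale (illustrative only).
actual = [4, 3, 5, 2]
baseline_predictions = [3, 3, 4, 4]   # stands in for the incumbent recommender
candidate_predictions = [4, 3, 4, 3]  # stands in for a challenger model

baseline_error = rmse(baseline_predictions, actual)
candidate_error = rmse(candidate_predictions, actual)
improvement = relative_improvement(baseline_error, candidate_error)

print(f"baseline RMSE:  {baseline_error:.4f}")
print(f"candidate RMSE: {candidate_error:.4f}")
print(f"improvement:    {improvement:.1%}")
```

Under the contest rule described above, a challenger would qualify for the grand prize only when this improvement figure, measured on Netflix's held-out data, crossed the 10% threshold.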

The competition ended up running for 3 years and saw participation from over 40,000 teams from 186 countries.


The Netflix competition led to profound insights, and commensurate improvements in performance, in the world of analytics as applied to optimizing machine-based recommendations. Not only did it revolutionize recommender systems, whose applications go far beyond movies, it also forced practitioners to really push the envelope when it came to machine learning applications. Today, you’ll be hard-pressed to find a business-to-consumer company that does not leverage recommender systems in some way, shape, or form, including matchmaking! (Read my article on this topic for more: “Can Artificial Intelligence Help You Find Love: Understanding The Business of Matchmaking”)

Now that we’ve established the undeniable impact of data science and AI on humanity and business alike, let’s see whether this influence is episodic or has seen sustained growth. The short answer is sustained, exponential growth: yes, a thousand times yes! But don’t take my word for it. Let’s instead look at a few hard data points.

Level of interest within the general public

Gone are the days when one had to do extensive primary research or surveys to get a sense of the pulse of the public. Google Trends can now provide these insights with a few clicks.

It is ironic that, in this case, data helps us establish the increasing importance of data!


Fig 1 shows the steady and significant growth from 2008 to 2020 in worldwide Google searches for three key terms: “Machine Learning”, “Data Science” and “Artificial Intelligence”.


(Jan 2008–Jan 2020)

Students specializing in AI and Machine Learning

One of the leading indicators of a discipline gaining traction is the number of students aspiring to pursue their education in it. The AI Index Report for 2019 mapped these trends across a few top universities. Fig 2a and 2b show the growing enrollments in courses on ‘Introduction to Machine Learning’ and ‘Introduction to Artificial Intelligence’ respectively. Note that university enrollments are also limited by the number of seats available, so these charts, though skyrocketing in recent years, likely under-represent the actual interest in these disciplines!

(Source: 2019 AI Index Report)

Another interesting student statistic is the growing number of doctoral candidates. According to the 2019 AI Index Report, AI has quickly become the most desired specialization among computer science PhD students in the USA.

There are over twice as many PhD students for AI compared to the second most popular specialization (security/information assurance)!


Investments into development of AI capabilities

We live in a decidedly capitalistic world, so the direction in which money is flowing is almost always one of the best indicators of the hottest thing around. We therefore look at the total funding attracted by AI startups.

From humble beginnings of just over $300 million invested in the AI space in 2009, the landscape changed rapidly in less than a decade. In 2018, a total of $40.4 billion was invested in AI startups globally. This is a mind-boggling increase, at a compound annual growth rate (CAGR) of more than 70%!
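The growth rate quoted above can be checked with the standard CAGR formula, (end / start)^(1 / years) - 1. The sketch below plugs in the two funding figures from the text. Note the assumption that "just over $300 million" is approximated as $0.3 billion, so the result is a rough check rather than an exact figure.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate over a given number of years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the text, in USD billions. 0.3 approximates "just over $300 million".
investment_2009 = 0.3
investment_2018 = 40.4
years_elapsed = 2018 - 2009  # nine annual compounding periods

growth = cagr(investment_2009, investment_2018, years_elapsed)
print(f"CAGR 2009-2018: {growth:.1%}")  # a little over 70%
```

Because the 2009 figure was slightly above $0.3 billion, the true rate is marginally lower than what this prints, but it still lands above the 70% mark cited.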

Total PE-VC investment in AI in USD Billions (Source: CAPIQ, Quid, Crunchbase, 2019)

To sum up: AI, data science and machine learning have truly arrived. Throughout recent history, this emerging branch of study has had a disproportionate impact on how people live their lives and how companies run their businesses. This trend is not just here to stay but is likely to see growth accelerate further, as is evident from the best talent and big money gravitating towards these fields.

If ever there was a perfect time to enter a new career field, it is now!

Translated from: https://towardsdatascience.com/is-being-a-data-scientist-really-the-sexiest-job-around-hell-yeah-b652a20b302



