Becoming a Machine Learning Engineer

by Sudharsan Asaithambi

Become the Rafael Nadal of Machine Learning

One year back, I was a newbie to the world of Machine Learning. I used to get overwhelmed by small decisions, like choosing the language to code with, choosing the right online courses, or choosing the correct algorithms.

So I set out to make it easier for others to get into Machine Learning.

I’ll assume that many of us are starting from scratch on our Machine Learning journey. Let’s find out how current professionals in the field reached their destination, and how we can emulate them on our journey.

I will illustrate the path by drawing a parallel between how Rafael Nadal learned to play tennis and how you can learn Machine Learning.

Commit Yourself: Stage 1

Nadal had sports talent all around him in his family. Inspired by them, he began his tennis journey at the age of 3.

For anyone starting out in Machine Learning, it’s important to surround yourself with people who are also learning, teaching and practicing Machine Learning.

Learning the ropes is not easy if you do it alone. So commit yourself to learning Machine Learning, and find data science communities to help make your entry less painful.

Learn the Ecosystem: Stage 2

Rafael Nadal learned not only the rules of tennis, but also the surrounding ‘ecosystem’.

He learned about the different types of rackets, balls, and court surfaces. He learned how scoring works in tennis. He enrolled in tennis coaching.

Discover the Machine Learning ecosystem

Data Science is a field which has embraced and made full use of open source platforms. While data analysis can be conducted in a number of languages, using the right tools can make or break projects.

Data Science libraries are flourishing in the Python and R ecosystems. See here for an infographic on Python vs R for data analysis.

Whichever language you choose, Jupyter Notebook and RStudio make our lives much easier. They allow us to visualize data while manipulating it. Follow this link to read more on the features of Jupyter Notebook.

Kaggle, Analytics Vidhya, MachineLearningMastery and KDnuggets are some of the active communities where data scientists all over the world enrich each other’s learning.

Machine Learning has been democratized by online courses, or MOOCs, from Coursera, edX and others, where we learn from amazing professors at world-class universities. Here’s a list of the top MOOCs on data science available right now.

Cement the Foundation: Stage 3

Rafael Nadal learned the basic shots

Nadal’s coach taught him the forehand and backhand shots. These are the main foundation of tennis. With these basic shots, Rafael could play a match competently.

Learn to manipulate data

Data scientists, according to interviews and expert estimates, spend 50 percent to 80 percent of their time mired in the mundane labor of collecting and preparing unruly digital data, before it can be explored for useful nuggets. - Steve Lohr, The New York Times

‘Data Crunching’ is the soul of the whole Machine Learning workflow. To help with this process, the pandas library in Python and data frames in R let you manipulate and analyze data. They provide data structures for relational or labeled data.
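
To make this concrete, here is a minimal sketch of three everyday data-crunching steps in pandas: filling a missing value, filtering rows, and aggregating by group. The match records and column names are hypothetical, invented purely for illustration.

```python
import pandas as pd

# Hypothetical match records, invented purely for illustration.
matches = pd.DataFrame(
    {
        "player": ["Nadal", "Nadal", "Federer", "Federer", "Nadal"],
        "surface": ["clay", "grass", "grass", "clay", "clay"],
        "aces": [4, 7, 12, None, 5],
    }
)

# Clean: fill the missing ace count with the column median.
matches["aces"] = matches["aces"].fillna(matches["aces"].median())

# Filter: keep only the clay-court matches.
clay = matches[matches["surface"] == "clay"]

# Aggregate: average number of aces per player.
aces_per_player = matches.groupby("player")["aces"].mean()
print(aces_per_player)
```

Real datasets are far messier, but most preparation work is some combination of these three moves.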

Data science is more than just building machine learning models. It’s also about explaining the models and using them to drive decisions. In the journey from analysis to data-driven outcomes, data visualization plays an important role in presenting data in a powerful and credible way.

The Matplotlib library in Python and ggplot2 in R offer extensive 2D graphics support, with the flexibility to create high-quality data visualizations.
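
As a rough sketch of what a basic Matplotlib chart looks like, the snippet below draws a labeled line plot and saves it to a file. The numbers and the output filename are made up for illustration, and the `Agg` backend is selected only so the script runs without a display; inside Jupyter you would normally leave that line out.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical numbers, invented purely for illustration.
hours_practiced = [10, 20, 30, 40, 50]
model_accuracy = [0.62, 0.70, 0.76, 0.80, 0.83]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(hours_practiced, model_accuracy, marker="o", label="validation accuracy")
ax.set_xlabel("Hours practiced")
ax.set_ylabel("Accuracy")
ax.set_title("Practice versus accuracy (made-up data)")
ax.legend()

# Write the chart to a PNG file in the current directory.
fig.savefig("practice_vs_accuracy.png", dpi=150)
```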

These are some of the libraries you will be spending most of your time on when conducting the analysis.

Practice day in and day out: Stage 4

Rafael Nadal, when asked how much he trained:

“I train four hours a day, 210 days a year. If we add to that I play around 80 matches per year, each one lasting an average of two hours. That is 1000 hours playing tennis per year — and that is without counting the training days during tournaments.”

Learn Machine Learning algorithms and practice them

Once the foundation is set, you get to apply Machine Learning algorithms to make predictions and do all the cool stuff.

The scikit-learn library in Python and the caret and e1071 packages in R provide a range of supervised and unsupervised learning algorithms via a consistent interface.

These let you apply an algorithm without worrying about its inner workings or nitty-gritty details.
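
The sketch below shows what that consistent interface looks like in scikit-learn: two quite different models are trained and scored with the same `fit` and `score` calls. The choice of the built-in iris dataset, the two models, and the split settings are my own assumptions for illustration, not a recommendation from any particular course.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset and hold out a quarter of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Two different algorithms, one shared interface.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                 # train on the training split
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out split
    print(f"{name}: {scores[name]:.2f}")
```

Swapping in another classifier usually means changing only the entry in `models`.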

Apply these machine learning algorithms to the use cases you find all around you. This could be in your work, or you can practice in Kaggle competitions, where data scientists from all around the world compete at building models to solve problems.

At the same time, work through the inner workings of one algorithm after another. Start with Linear Regression, the ‘Hello World!’ of Machine Learning, then move on to Logistic Regression, Decision Trees, and Support Vector Machines. This will require you to brush up on your statistics and linear algebra.
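
To give a feel for what "inner workings" means, here is a minimal sketch of simple linear regression written from scratch using the ordinary least squares formulas: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the two means. The function name `fit_line` and the sample points are hypothetical, chosen so the answer is easy to check by hand.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Covariance of x and y, and variance of x (both left unnormalized,
    # since the 1/n factors cancel in the ratio).
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)

    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Sample points that lie exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)

# Use the fitted line to predict a new point.
prediction = slope * 5.0 + intercept
print(prediction)
```

Once these few lines make sense, the same idea extends to many features, which is where the linear algebra comes in.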

Coursera co-founder Andrew Ng, a pioneer in AI, has developed a Machine Learning course that gives you a good starting point for understanding the inner workings of Machine Learning algorithms.

Learn the advanced skills: Stage 5

Rafael Nadal learned to play advanced shots

While concentrating on fundamental play, Nadal was also introduced to the advanced shots, the kind that only professionals who play tennis day in and day out are able to pull off.

Learn complex Machine Learning Algorithms and Deep Learning architectures

While Machine Learning was established as a field long ago, the recent hype and media attention are primarily due to Machine Learning applications in AI fields like Computer Vision, Speech Recognition, and Language Processing. Many of these have been pioneered by tech giants like Google, Facebook, and Microsoft.

These recent advances can be credited to the progress made in cheap computation, the availability of large scale data, and the development of novel Deep Learning architectures.

To work in Deep Learning, you will need to learn how to process unstructured data — be it free text, images, or sounds.

You will learn to use frameworks like TensorFlow or Torch, which let us apply Deep Learning without worrying about low-level hardware details. You will also learn Reinforcement Learning, which has made possible modern AI wonders like AlphaGo Zero.

Take your first step towards learning Machine Learning now!

1. Install Anaconda and use Jupyter to write Python.

Go through some Python tutorials and learn its fundamental data structures and syntax.

2. Surround yourself with Data Science. Create an account at:

● Kaggle: check out the kernels written by top data scientists. Kaggle helps you establish a standard workflow that you can apply to any data science problem.

● Analytics Vidhya: This website is a go-to place for many data scientists. The site boasts 4 million unique visitors per month and has a very active community.

● Check out the PyData channel on YouTube. PyData is a conference organized by the open source community to educate analysts on the latest developments in Data Science. This gives you

● Use podcasts to learn about the latest tools and technology in AI. Podcasts are a great way to make use of the time you spend on daily chores, be it jogging, arranging your closet, or commuting. If you are new to podcasts, download the Podcast Addict app onto your phone.

Machine Learning - Software Engineering Daily | Every week Jeff interviews people from the heart of Data Science. It gives you a rare early peek into what’s going on in Silicon Valley, helping you pick up new techniques and technologies. It gives you many new ideas to apply in your own work. Can’t recommend this enough.

● Medium

Follow some of the Machine Learning publications here on Medium:

  • Towards Data Science

  • Artificial Intelligence.

● Go to Coursera and edX, and check out the various Machine Learning courses available.

I will end this post with this quote by Robin Sharma:

Every Pro was Once an Amateur.

Every Expert was Once a Beginner.

So Dream Big.

And Start Now.

Please comment below to tell us why you are planning to start your Machine Learning journey, and how you plan to do so.

And for all you Machine Learning pros, give us the nuances of what works and what doesn’t. Please comment below on how you started your Machine Learning journey and what expedited and hindered your learning process.

Translated from: https://www.freecodecamp.org/news/baby-steps-to-learn-machine-learning-from-a-tennis-fan-d4171f51c23f/
