How to Become a Data Scientist

Table of Contents

  1. Introduction
  2. Variety of Skills
  3. Uniqueness
  4. Impact
  5. Remote
  6. Pay
  7. Summary

Introduction

As we inch further into the year, I have seen more and more postings for data science positions, especially on LinkedIn and similar job-posting sites. After an expected lull due to current events, companies have figured out their budgets and focus. Some of those companies have newer data science positions that they need to fill as soon as possible or in the near future.

There are several reasons for becoming a data scientist. I am going to highlight the five main reasons I became one, and hopefully they align with some of the reasons why you would become one as well.

Variety of Skills

Like many positions, data science comes with a general set of expected skills, which I outline below. Of course, there are others, but I will focus on the skills I have come across most often as a data scientist at various companies.

  • Python (R)

The Python versus R debate is often heated, but ultimately the choice depends on which language the company already uses as its main programming language. Data scientists who work alone, building models and delivering results directly to a stakeholder, usually lean toward R. In my experience, however, working cross-functionally with data engineers and software engineers has been easier with Python. Python is often used for deployment, so it can be easier to use it from the start. The benefit is that, in the process of learning data science, you will learn Python or R, which gives you skills that can support you down the road if you choose a different career path, such as software development.

  • SQL

Another popular skill for data scientists is SQL. Online courses and universities sometimes neglect to stress how widely data scientists use this language. I use it on nearly every project I work on, because the dataset is rarely just handed to you. You have to build your own dataset, and that involves querying your database tables with SQL. Like Python (and, to some extent, R), SQL is useful not only for data science but also for data engineering and data analytics.

  • Business

While this skill is not a programming language, it is still important. Business, more a concept than a skill, is something every data scientist learns. Like SQL, it is not taught in educational settings nearly as much as it should be. What I mean by business is that you need to get used to jumping into situations that are not strictly data science. The business uses data scientists either to make a process more efficient or to find insights that will change the business in the future. Data science education often focuses heavily on obtaining the highest accuracy for, say, segmenting different types of customers. Achieving 98% accuracy can be great, but if you cannot come up with a plan for how you would implement the model and use its results, then your model is useless.

You need to know that stakeholders, such as CEOs and other C-suite or senior leaders, will ask what you will do with your results to change the business. In turn, you might apply those customer segments to a marketing campaign through targeted emails. Then you would create a test of some sort, say an A/B test, to see how the emails performed. As you can see, an extremely accurate model is just one part of the data science and business process. Practicing this business process over and over again is extremely beneficial.

  • Statistics

School placed more focus on statistics, and statistics can solve many problems for a data scientist. Knowing it is critical because it is the foundation of machine learning models. Practicing analysis of variance, population sampling, and similar techniques is useful in several parts of the business, such as marketing campaigns, again, or A/B testing.

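To make the SQL and A/B-testing skills above a little more concrete, here is a minimal sketch in Python. It is not taken from any real project: the tables `customers` and `email_sends`, their columns, the segment name `loyal`, and all of the data are made up for illustration. It builds a small dataset with a SQL query and then compares two email variants with a two-proportion z-test, which is one common way (among several) to evaluate an A/B test.

```python
import math
import sqlite3


def make_demo_database():
    """Create an in-memory database with hypothetical tables and toy data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, segment TEXT)"
    )
    conn.execute(
        "CREATE TABLE email_sends (customer_id INTEGER, variant TEXT, converted INTEGER)"
    )
    for customer_id in range(1, 1001):
        segment = "loyal" if customer_id <= 400 else "new"
        # Toy assignment: odd ids get email A, even ids get email B.
        # A real A/B test would assign variants at random.
        variant = "A" if customer_id % 2 == 1 else "B"
        # Toy conversion rule: ids ending in 0, 1, or 2 convert,
        # which gives 20% for A (odd ids) and 40% for B (even ids).
        converted = int(customer_id % 10 in (0, 1, 2))
        conn.execute("INSERT INTO customers VALUES (?, ?)", (customer_id, segment))
        conn.execute(
            "INSERT INTO email_sends VALUES (?, ?, ?)",
            (customer_id, variant, converted),
        )
    return conn


def build_campaign_dataset(conn, segment):
    """Query the tables to get one row per email variant for one segment."""
    query = """
        SELECT s.variant,
               COUNT(*)         AS emails_sent,
               SUM(s.converted) AS conversions
        FROM email_sends AS s
        JOIN customers AS c ON c.customer_id = s.customer_id
        WHERE c.segment = ?
        GROUP BY s.variant
        ORDER BY s.variant
    """
    return conn.execute(query, (segment,)).fetchall()


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return the z statistic and two-sided p-value for rate B versus rate A."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


conn = make_demo_database()
(_, n_a, conv_a), (_, n_b, conv_b) = build_campaign_dataset(conn, "loyal")
z, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
print(f"A: {conv_a}/{n_a}, B: {conv_b}/{n_b}, z = {z:.2f}, p = {p_value:.2g}")
```

A small p-value here would suggest that the difference between the two emails is unlikely to be chance alone, which is the kind of evidence a stakeholder may ask for before a change is rolled out. In practice the query, the metric, and the test would all depend on the company's own data and goals.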

Uniqueness

As the field of data science grows, the position may at first seem less unique than it used to be. However, it is still just as unique, and even more so at the specific company where you will be working. Other roles, such as security engineer, may be even more unique, but data science is one of a kind.

  • Small Headcount

To expound on small headcount: even a small tech company can have nearly 30 software developers, but it will usually have only one to four data scientists. When your role is this unique, you can learn valuable skills, touch multiple departments, and impact your company significantly. That is not to say that other fields lack these benefits, but I do believe you are more likely to encounter various parts of the business in data science. Ultimately, you will feel great about your everyday work. This benefit leads me to my next point: impact.

Impact

After working as a data scientist at multiple companies, it has become clear to me that even a single project can benefit a business significantly and indefinitely.

The impact a data scientist can make is outstanding. You can automate previously manual processes, saving the company thousands or even millions of dollars. You can also save your company time that can be better spent elsewhere. The projects you will work on vary in nature and importance.

For example, I worked on a project that automated a large portion of a manual process, with high accuracy. It was truly amazing to feel how impactful you can be on the business. The best feeling, however, is the impact you can make on society, health, etc. There are countless ways to have a positive impact on something with data science, and your day-to-day work is no exception.

Remote

Before the current state of the world, remote work was already a prevalent benefit of tech roles, especially those in data science. Unfortunately, several types of careers cannot benefit from remote work; I admire the people in them greatly and am thankful for them.

If you like to work from home, then data science will be an excellent opportunity for you. There are several tools and platforms that help create a successful work environment without a physical office. You can use video conferencing, messaging, project management, and version control tools. Tools include, but are not limited to:

  • Zoom
  • Slack
  • GitHub
  • Jira
  • Confluence

Personally, working from home is a huge benefit for me. I see it as an opportunity to enjoy my day more. Living in a city where you can sit in traffic for hours is not the best feeling, so being able to eliminate that completely is an enormous positive.

Pay

Yes, data science pays well. I wanted to list this as the last benefit, both because it is already well known and because it is not the most important factor in deciding on a career. While more money is great, if you do not like your field, you will be miserable. However, if you enjoy data science and expect to build your brand and career in it, then you can expect to be paid well.

According to Glassdoor [2], the average base pay for a data scientist is $113,309 / yr.

Of course, pay varies between states and even between cities within those states, so you can expect different ranges depending on where you live. Some companies offer large annual bonuses as well. Because your role is incredibly impactful, some companies may also offer shares or stock.

Additionally, depending on the job description or job functionality, you can expect variations in salary. Points to consider when negotiating for a data scientist salary include, but are not limited to:

  • skills (SQL, Python, R, etc.)

  • seniority

  • who you report to

  • whether an undergraduate degree or a master's/Ph.D. is required

  • years of experience

  • whether machine learning and deployment are expected

  • whether data engineering is expected

Summary

Photo by Patrick Perkins on Unsplash [3].

As you can see, there are several reasons for becoming a data scientist, especially in 2020. The top five reasons are: the variety of skills you will learn along the way, uniqueness in your company, impact on your company, remote work from home, and pay. Data science is unlikely to go away anytime soon and could very well become an even more popular career. It is important to keep in mind that related fields such as business intelligence, software engineering, and machine learning are also great careers. Hopefully, you will become a data scientist and experience at least these five benefits for yourself.

I hope you found this article interesting and useful. Thank you for reading! Feel free to share your experience in the comments below or reach out to me!

Translated from: https://towardsdatascience.com/the-top-5-reasons-to-become-a-data-scientist-cc5492e8cdd7
