by Richard Freeman, PhD

How to work in Data Science, AI, or Big Data based on my experience

In summer 2013, I interviewed for a lead role in the data science and analytics team at tech-for-good company JustGiving. During the interview, I said I planned to deliver batch machine learning, graph analytics and streaming analytics systems, both in-house and in the cloud.

A few years later, my former boss Mike Bugembe and I were both presenting at international conferences, winning awards and becoming authors!

Here is my story, and what I learnt on the journey — plus my recommendations for you.

Why Big Data Engineering and Data Science?

I’ve always been interested in artificial intelligence (AI), machine learning (ML) and natural language processing (NLP). In particular, I’ve been interested in scalable systems, and making robots more intelligent and responsive.

My interest in data engineering comes from my background as a solutions architect. In that role, I enjoyed building cloud-based systems to store and process data to derive new insight and knowledge.

I also develop big data and ML pipelines to automate the whole ML process. This helps data scientists and analysts save time preparing data for training and testing their algorithms, running metrics and deriving key performance indicators at scale.

Data preparation is particularly important. Data scientists typically spend about 80% of their time on it. Having access to data shaped in the right way makes them more productive and happier.

My previous background

I previously earned a Master's degree in computer systems engineering, and a PhD in ML and NLP. I completed both at the University of Manchester.

Rather than join a specialised vendor in my PhD area of expertise, I decided to broaden my skills and gain more client exposure by joining Capgemini. Capgemini is a large global consulting, technology and outsourcing services company.

I worked my way up from developer to solution architect. There, I helped deliver large-scale projects for Fortune Global 500 companies in sectors including insurance, retail banking, financial services, and central government.

I then joined PageGroup. There, I worked as a lead developer and architect on a global transformation programme across 34 countries. I led the technical delivery of search, multi-channel communication, business intelligence, text analytics, job board integration, and advertising solutions.

Current roles

Now I am a lead big data and machine learning engineer at JustGiving. JustGiving is a tech-for-good company that’s helped 26 million users in 164 countries raise $5 billion for good causes. It was acquired in 2017 by Blackbaud — the world’s leading software company powering social good.

I currently lead the delivery and architecture of our in-house data science platform RAVEN and production ML systems. These were initially deployed with Azure, but later hosted in AWS. I also dive in as a data scientist specialising in scalable streaming analytics, ML and NLP algorithms.

I share my technical experience and knowledge internally and externally relating to AWS, stream processing, serverless stacks, ML and NLP. I also present regularly at industry conferences, open-source my code, and write technical blog posts on Medium and for AWS, such as "Analyze a Time Series in Real Time".

I’m also an independent freelance advisor and consultant helping organisations with cloud architecture, serverless computing and ML at Starwolf.

A typical day in the office

JustGiving is still a start-up at heart, so there is no typical day. I get involved in various tasks, such as capturing data and report requirements, engineering new data pipelines, investigating operational issues, running data experiments, analysing unstructured data for useful patterns, exploring new ways to use the data to answer questions, presenting a data story, and sharing my knowledge and experience. This means that I work closely with marketing, product managers and product analysts to understand their data needs and which metrics and predictions are important to them.

Speaking to others outside your specialist area helps to broaden your views, gives you a new perspective, and reveals new areas where you can apply your skills.

On the technical side, I work with engineers, data analysts, developers, business intelligence analysts, operations, and data scientists to support their data and platform requirements.

Things I enjoy about work

I am passionate about working with huge data sets. You face different kinds of performance, cost and operational issues that require you to think differently in order to scale your data warehouse, ETL processes and algorithms, and to present your results. A lot of what you know about data warehousing with millions of records goes out the window when you hit hundreds of billions of rows and need to iterate or do complex joins to run ML data preparation queries.

Building and running large-scale data infrastructure and distributed model training are active areas in academia and industry. They are evolving at a fast pace, with new tooling being introduced every few months. I like to use cloud solutions in an innovative way to improve our in-house data science platform, enhance our business processes, and make data insights available to internal and external users.

I’ve found that a lot of companies give their power away by using third parties for their web analytics solutions, rather than building their own. That data is then siloed in marketing or sales departments, is difficult if not impossible to get back in its raw form, and cannot be streamed back. This prevents you, for example, from making real-time ML recommendations or predictions directly in your product.

At JustGiving we built an in-house web analytics product called KOALA, and have this data available in real time as an AWS serverless stack. This allowed us to have a full suite of data pipelines for ML training and analytics in-house, along with the likes of MAGPIE, which allows us to create real-time metrics and insights that we can serve back to users.
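
To make the idea of a real-time metric concrete, here is a minimal sketch of a rolling total over a stream of events. It is not how KOALA or MAGPIE are implemented; the `RollingTotal` class, the five-minute window and the sample amounts are all hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque
from datetime import datetime, timedelta


class RollingTotal:
    """Running total of event amounts seen within the last `window`."""

    def __init__(self, window: timedelta):
        self.window = window
        self.events = deque()  # (timestamp, amount) pairs, oldest first
        self.total = 0.0

    def add(self, timestamp: datetime, amount: float) -> float:
        """Record one event and return the total for the current window."""
        self.events.append((timestamp, amount))
        self.total += amount
        # Evict events that have fallen out of the window.
        cutoff = timestamp - self.window
        while self.events and self.events[0][0] <= cutoff:
            _, old_amount = self.events.popleft()
            self.total -= old_amount
        return self.total


# Hypothetical event stream: three amounts arriving over six minutes.
start = datetime(2020, 1, 1, 12, 0, 0)
metric = RollingTotal(window=timedelta(minutes=5))
totals = [
    metric.add(start, 10.0),
    metric.add(start + timedelta(minutes=2), 25.0),
    metric.add(start + timedelta(minutes=6), 5.0),  # the first event has expired
]
print(totals)  # [10.0, 35.0, 30.0]
```

Each update only touches the events entering or leaving the window, which is what makes this kind of metric cheap enough to recompute on every incoming event.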

For example, an early version is shown in this Tweet during a crowdfunding campaign for the families of the Manchester attack victims in May 2017.

In addition, KOALA allows us to make predictions from streaming data. It is an extremely cost-effective solution compared to paying for a vendor product. Compared to a vendor solution handling the same web traffic, KOALA is 10x cheaper and more developer friendly, and we get the raw streamed data back in real time, rather than in batches or through a proprietary, locked-down querying or reporting system.

I am also a big fan of Python and have successfully encouraged its uptake in the company and the wider community for data pipelines, ML and serverless computing. Why Python? It has extensive ML libraries, scales with the likes of PySpark, and is easy to read and write.

I also enjoy working with different organisations, charities and universities, and giving back to the wider technical community with my experience and time, such as at the recent AWS and British Heart Foundation Hackathon.

The Future of Big Data, Data Science and AI

I see more people using ML, real-time analytics, graph analytics and NLP in their products and applications, not just offline on their laptops. This is accelerating as the cloud providers offer ML and NLP application program interfaces (APIs).

For real-time analytics, there is a growing demand from consumers who are much more data aware and impatient. For example, they want to know what is happening right now, see the results of their actions, and use more intelligent applications and websites that adapt as they interact with them.

On the infrastructure side, I see serverless computing and Platform as a Service (PaaS) infrastructure in public clouds such as AWS and Azure becoming more prominent. Functions in serverless computing are particularly interesting to me, as they can auto-scale in less than 100 milliseconds, are highly available and are low cost. They are low cost because you only pay for the time your code is executing, rather than for an always-on machine or container as in more traditional cloud infrastructure. I’ve even shown that you can implement most of the existing container-based microservices patterns using a serverless stack.
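
The pay-per-execution argument can be sketched as simple arithmetic. Every number below is a made-up illustration rather than a real price list; the function names and the billing model (a charge per GB-second of compute plus a charge per million requests) are assumptions, so check your provider's current pricing before relying on any of it.

```python
def always_on_cost(hourly_rate: float, hours: float) -> float:
    """Cost of a machine or container billed for every hour it is up."""
    return hourly_rate * hours


def per_execution_cost(
    invocations: int,
    avg_seconds: float,
    memory_gb: float,
    price_per_gb_second: float,
    price_per_million_requests: float,
) -> float:
    """Cost of a function billed only while its code is running."""
    compute = invocations * avg_seconds * memory_gb * price_per_gb_second
    requests = invocations / 1_000_000 * price_per_million_requests
    return compute + requests


# Hypothetical month: 2 million requests of 0.5 s each at 0.5 GB of memory.
serverless = per_execution_cost(
    invocations=2_000_000,
    avg_seconds=0.5,
    memory_gb=0.5,
    price_per_gb_second=0.00002,      # illustrative price only
    price_per_million_requests=0.20,  # illustrative price only
)
always_on = always_on_cost(hourly_rate=0.10, hours=720)  # illustrative price only

print(f"{serverless:.2f}")  # 10.40
print(f"{always_on:.2f}")   # 72.00
```

The comparison flips once traffic is high and steady enough to keep a machine busy around the clock, so it is worth rerunning with your own workload figures.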

Open source frameworks and programming languages, e.g. the Apache Spark framework, Python, R and SQL, will also continue to grow compared to closed, vendor-specific products and languages. The same goes for data storage and access: cloud storage, data warehouses and data lakes will store data in more open rather than proprietary formats, and this data will be more accessible over standard APIs or open protocols.

There will also be growing requirements to analyse unstructured and multimedia data sources, and again the cloud providers will have a growing role to play.

We will also see more companies making the transition from using strategies decided by a few people at the top on gut instinct, to becoming more experiment-based, evidence-based and data-driven, as described by my former CAO Mike Bugembe in his book. For example, testing new products or features, identifying new opportunities, and making strategic decisions will rely more and more on data analysis, insight and predictions.
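
As one illustration of what experiment-based decision making can look like, the sketch below compares the conversion rates of two product variants with a two-proportion z-test. The choice of test, the function name and the visitor counts are assumptions made for the example; they are not taken from JustGiving or from the book.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_error
    # Standard normal CDF expressed with the error function.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - normal_cdf)


# Hypothetical experiment: 200 of 2,000 visitors convert on variant A,
# 260 of 2,000 convert on variant B.
z, p_value = two_proportion_z_test(200, 2000, 260, 2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be chance alone, but the sample size and the success metric should be fixed before the experiment starts.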

This will require more staff to get involved in data capture, data preparation, running experiments using algorithms, data visualisation and presenting results.

As such, new data-oriented jobs based on creating and training data models will emerge, disrupting some of the existing specialist fields such as health care, accountancy and law. AI, the Internet of Things (IoT) and robotics will also replace some existing blue- and white-collar jobs, so we will need to think about training and upskilling people for the changing landscape, and possibly introduce some kind of universal basic income.

You can draw parallels with the shift from agrarian or pre-industrial times seen during the industrial revolution. For AI to take off, we need two things to happen: the cost of human workers must become higher than the AI alternative, and AI must be deployable in a scalable way.

In the much longer term, quantum computing will also disrupt the field again in terms of how we process, analyse and store data, and will transform areas like cyber security, banking and existing AI.

How to inspire people to pursue careers in data science

I think it’s a lot easier to get people interested in big data and data science than it used to be, thanks to the likes of Google and Facebook that make it fashionable to be smart and work within technology.

In addition, a growing number of young and flexible startup companies with infrastructure in the public cloud are successfully competing with large established companies and winning market share from them. Employers need to be willing to educate and upskill existing staff or graduates rather than solely recruit people with existing data engineering or data science skills.

To inspire existing staff, we need to show the benefits, use cases and data sources most relevant to them, and how these make them more productive and their jobs easier. With more data exploration tools available, staff in departments outside IT or finance, such as customer support, marketing and product management, will be able to self-serve data and insights.

For people who have not worked in industry, I think we need to start early in schools and then universities. Teachers and lecturers in non-computer science subjects could make data more visual and interactive in their respective fields.

I think that almost any subject can benefit. For example, even in English literature you can draw a relationship graph of the characters and their connections, linked to the main themes, events and locations. In history classes, you could have interactive visual maps and time-evolving graph representations of key events and their dependencies.
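
A character relationship graph of this kind needs very little code. The sketch below counts how often pairs of characters appear in the same scene and treats each pair as a weighted edge. The scene list and character names are invented placeholders, and co-occurrence within a scene is just one assumed definition of a "connection".

```python
from collections import Counter
from itertools import combinations


def build_graph(scenes):
    """Count how often each pair of characters shares a scene."""
    edges = Counter()
    for characters in scenes:
        # Sort so that (Alice, Bob) and (Bob, Alice) map to the same edge.
        for a, b in combinations(sorted(set(characters)), 2):
            edges[(a, b)] += 1
    return edges


# Placeholder data: each inner list holds the characters present in one scene.
scenes = [
    ["Alice", "Bob"],
    ["Alice", "Bob", "Carol"],
    ["Carol", "Dan"],
]
edges = build_graph(scenes)

# Number of distinct characters each character is connected to.
degree = Counter()
for a, b in edges:
    degree[a] += 1
    degree[b] += 1

print(edges[("Alice", "Bob")])  # 2
print(degree.most_common(1))    # [('Carol', 3)]
```

The resulting edge weights can be handed to any graph drawing tool, so students can see which characters hold the story together.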

Advice I would give to someone considering a career in Big Data and Data Science

Whether you are a graduate, are already working in an organisation, or do not come from a technical background, you can benefit from analysing and understanding data. For example, data journalists are typically not from a technical or scientific background, yet are able to do simple analysis and create an interesting data story for the general public.

It’s about self-motivation: when things move at such a fast pace, you can look broadly across the sector to gain a general understanding. But you also need to focus your energy on one specific course or project and complete it. The industry also tends to repackage old technologies with some improvements as new trending ones, like cyber security, cognitive computing, chatbots, virtual reality and deep learning at the moment. So I would recommend following your heart towards the areas you are truly interested in and want to focus on, rather than the latest trend.

Behind each viral trend there have usually been early explorers who have worked and struggled in that area for years!

In terms of gaining the knowledge, it is a lot easier than it used to be. For example, in the past you had to pay for vendor-specific training as well as for the product itself. You can now access the learning materials, data sources and tools all for free, so there is no excuse not to get started today!

For the learning materials, a lot of the content is available for free in massive open online courses, forums, blogs, and source code repositories. Equally, there are numerous free data sources you can use, like ML datasets, open data, news feeds and social media.

There are many tools out there. Some are graphical, but in my view you should learn to program in SQL, Python, or R. All three have the ability to do data science at scale thanks to frameworks like Apache Spark. I particularly like Python as it benefits from being an efficient development language with a solid test framework and numerous data science packages.

As an ML engineer or data scientist, expect to spend a lot of time on data preparation. This is an important process to master, which involves cleaning, parsing, enriching and shaping the data so that it can be used in ML algorithms and experiments. Overall, remember that the processes, tools and data sources are always evolving, so there is no one-off unicorn training course you can do. You will need to be self-motivated and open to constantly learning and adapting to the data ecosystem.
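
The four data preparation steps named above can be sketched with nothing but the Python standard library. The column names, the sample rows and the weekend feature are hypothetical; real pipelines typically use pandas or PySpark and keep a record of rejected rows rather than silently dropping them.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw export with stray whitespace and bad values.
RAW = """user_id,amount,donated_at
 1, 10.50 ,2024-01-01
2,,2024-01-02
3,abc,2024-01-02
4,25,2024-01-06
"""


def prepare(raw_csv: str):
    """Turn raw CSV text into numeric feature rows: [amount, is_weekend]."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw_csv)):
        # Cleaning: trim stray whitespace from every field.
        cleaned = {key: value.strip() for key, value in record.items()}
        # Parsing: convert text to typed values, skipping rows that fail.
        try:
            amount = float(cleaned["amount"])
            donated_at = datetime.strptime(cleaned["donated_at"], "%Y-%m-%d")
        except ValueError:
            continue
        # Enriching: derive a new feature from an existing field.
        is_weekend = donated_at.weekday() >= 5
        # Shaping: emit a flat numeric row an ML algorithm can consume.
        rows.append([amount, 1.0 if is_weekend else 0.0])
    return rows


print(prepare(RAW))  # [[10.5, 0.0], [25.0, 1.0]]
```

Two of the four input rows are rejected here, which hints at why preparation takes up so much of the working day.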

I would recommend that you learn another language such as Mandarin (1.1 billion speakers) or Spanish (0.5 billion speakers), to remain mobile, get more career opportunities, and be competitive within this interconnected world. This will also open your mind and give you an insight into other cultures and values, and how they use their data.

Cloud computing also means that you no longer need a physical presence in a country to operate in it, so you need to be open to building systems across regions and analysing data from many countries. Start using collaborative tools and participate in tech-for-good communities.

Some jobs and professions will be replaced, and some human expertise will be lost, but we will still rely on the data and algorithms. For example, once driverless transportation is widely adopted and considered safer, cheaper and more convenient than human drivers, future generations may not wish to drive a car or even hold a driving licence. However, humans will still be involved in the systems that automate the driving, the creative analysis of the telemetry and IoT data, the supervision and monitoring of the ecosystem, and the wider participation in the transport industry and sharing economy.

Summary

If you want to have a career in data science, ML, or data engineering, the business needs still drive the software development and analysis. Think about the metrics you want to calculate that will benefit your business decisions, or the hypothesis you want to validate with an experiment.

What actions will your audience take with your results? What growth or cost savings opportunities for a business are out there? Then work back to see what data, models and infrastructure you need for the task. I think that being curious, inquisitive, and having an experimental mind are important qualities.

Feel free to connect with me on LinkedIn, follow me on Twitter, or message me with comments and questions. If you want a more personalised chat with me, I’m offering short 30-minute Skype calls on career advice or mentoring for a small fee. I also do short-term consultancy, and provide expert advice and audit services to organisations building and running big data and data science platforms in the cloud.

Translated from: https://www.freecodecamp.org/news/recommendations-for-working-in-data-science-ai-and-big-data-based-on-my-personal-experience-8dbc24be368c/
