Moving AI to the Real World

If you fit one of these profiles, this article is for you:

● You are a data science manager. You’d like to improve your team’s productivity with some best practices.

● You are a data scientist. You’d like to learn what happens downstream: How your model turns into a product.

● You are a software architect. You are designing or expanding a platform to support data science use cases.

I recently completed an online course that I think you should check out. It’s called Full Stack Deep Learning. It covers the full lifecycle of an AI application, from ideation through deployment, but it does not cover theory or model fitting. If you are an intermediate data scientist and want to “zoom out” from your niche, this course will show you how the sausage is made, tracking it from one station to the next.

The course started out as a pricey SF-based bootcamp in 2018 but is now available for free. It features some industry heavyweights, including Tesla’s Andrej Karpathy and fast.ai’s Jeremy Howard. I took it because I wanted to compare my own practices against what the celebrities do. By way of background, I am a partner at Genpact, a consulting company. I help clients transform their processes and sometimes their business using AI. In practice, this means I create a proof of concept (POC) to demonstrate potential value and then run the team that implements the solution to capture that value.

“Full Stack Deep Learning” exceeded my expectations. It is organized into six content areas as well as hands-on labs and guest lectures from AI luminaries. Here is what I found most new and useful:

1. Setting up ML Projects

This is an “executive” module that discusses planning, prioritizing, staffing and scheduling AI projects.

I did not find much new here, but it is a good, concise executive overview. Since the course focuses on deep learning as opposed to traditional ML, it brings up three important points:

● Deep learning (DL), unlike more traditional machine learning, is “still research.” You should not plan for a 100% success rate.

● If you are “graduating” from “classical” ML to DL, plan on spending a lot more time and money on labeling than you are used to…

● …but don’t throw out your playbook. In both cases, you are looking for settings where cheap prediction will have a large business impact.

2. Infrastructure and Tooling

Photo by Kyle Head on Unsplash

This is the module I found most helpful. It sets up a comprehensive framework for developing an AI/ML application, from the lab through production. At each layer or category, it covers key functionality, how it fits with other layers and the major tool choices.

Let me emphasize: what makes this course different is how comprehensive the framework is. Most public AI/ML content is focused on model development. Some sources cover just data management or just deployment. Commercial vendors often understate the complexity of the process and skip steps. This is the most “panoramic” picture I’ve seen if you are trying to understand the AI/ML pipeline from alpha to omega.

The course is “opinionated”: it sometimes calls “category winners,” which is helpful if you’re placing bets. For example, it names Kubernetes the winner in the “resource management” category. I agree with most of these calls, but not all. For example, among cloud providers it picks AWS and pans Azure as having a “bad user experience.” While AWS is excellent, several of our clients (rightly) chose Azure, particularly those that already have a Microsoft stack (Excel, MS SQL, etc.).

After setting up the overall framework, this module digs into Development and Training/Evaluation. I found three areas particularly interesting:

● Prototyping: I’m always looking for quick and easy ways to create proofs of concept (POCs) for clients. I need to produce a visually attractive, interactive POC that is easily accessible over a public or semi-public URL. My ideal solution would give me code-level control over the model while not making me code a lot of HTML or JavaScript. One-click deployment is a plus. I’ve been using Shiny but would like to do something similar with Python. The course introduced me to Streamlit, which I will be investigating further. Also interesting is Dash, which is curiously not covered.

● Experiment Management is an interesting category: It keeps track of how well your model performs under a variety of configuration options (experiments). I coded my own version of this for competing on Kaggle; I didn’t know this was a category with a name. I will be checking out a few of the tools recommended by this course, including Weights & Biases.

● All-in-one: There was a nice, informative comparison between all the all-in-one platforms available. AWS SageMaker and GCP AI look like the best choices at the moment. If pressed, I would bet others will be acquired or copied by the cloud providers.

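To make the experiment-management idea concrete, here is a minimal sketch of a home-grown tracker of the kind such tools replace. The `ExperimentLog` class, the JSON-lines file format, and the `val_auc` metric are assumptions made for illustration; they are not taken from the course or from any specific product.

```python
import json
import os
import tempfile


class ExperimentLog:
    """Append-only log of (config, metrics) records stored as JSON lines."""

    def __init__(self, path):
        self.path = path

    def record(self, config, metrics):
        # One JSON object per line keeps the log easy to append to and to diff.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"config": config, "metrics": metrics}) + "\n")

    def runs(self):
        # Read back every recorded run; an absent file means no runs yet.
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def best(self, metric, higher_is_better=True):
        # Return the run with the best value of the chosen metric.
        runs = self.runs()
        if not runs:
            return None
        pick = max if higher_is_better else min
        return pick(runs, key=lambda r: r["metrics"][metric])


# Hypothetical usage: three runs with made-up settings and scores.
log_path = os.path.join(tempfile.mkdtemp(), "experiments.jsonl")
log = ExperimentLog(log_path)
log.record({"lr": 0.1, "depth": 3}, {"val_auc": 0.81})
log.record({"lr": 0.01, "depth": 5}, {"val_auc": 0.86})
log.record({"lr": 0.001, "depth": 5}, {"val_auc": 0.84})
best_run = log.best("val_auc")
print(best_run["config"])
```

A dedicated tool adds what this sketch lacks: a shared dashboard, code and data versioning, and comparison plots across runs.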

3. Data Management

This module discusses how to store and manage datasets related to your pipeline. I did not find much new here. The material on data augmentation was interesting, but mostly applies to computer vision, which I have not done much of.

4. Machine Learning Teams

This module discusses the HR portion of the project: roles, team structure, managing projects, etc. In my view, this content belongs in module 1 above — Setting up ML Projects.

There were some interesting points about how to get a job in the field — for hiring managers and candidates. There is also a good summary of the typical roles in an ML project:

5. Training and Debugging

This module discusses the process of getting a model to work in the lab. It should really be required reading for every data scientist, and is similar to the workflow I used to win several Kaggle contests. You can also get this content in many other places, but this is a well-organized and succinct presentation:

The discussion around debugging DL models was particularly good: Get your model to run, overfit a single batch and compare to a known result. Just to illustrate the depth, here is the subsection on overfitting a single batch:

More tips on overfitting are at the end of the article.

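The “overfit a single batch” check can be sketched in a few lines. This is only an illustration under loud assumptions: a toy linear model trained with plain gradient descent stands in for a deep network, and the function name `overfit_single_batch`, the four-example batch, and the `1e-6` threshold are all made up for the example. The idea carries over: if a model with enough capacity cannot drive the training loss on one small, fixed batch close to zero, the training code probably has a bug.

```python
def overfit_single_batch(xs, ys, steps=2000, lr=0.1):
    """Train a linear model on ONE fixed batch and return the loss at every step."""
    n = len(xs)
    w = [0.0] * len(xs[0])
    b = 0.0
    losses = []
    for _ in range(steps):
        preds = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in xs]
        errors = [p - y for p, y in zip(preds, ys)]
        losses.append(sum(e * e for e in errors) / n)  # mean squared error
        # Gradient of the mean squared error with respect to weights and bias.
        grad_w = [2.0 * sum(e * x[j] for e, x in zip(errors, xs)) / n
                  for j in range(len(w))]
        grad_b = 2.0 * sum(errors) / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return losses


# A tiny batch that the model has enough parameters to fit exactly.
batch_x = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]
batch_y = [0.5, -1.0, 2.0, 0.3]

losses = overfit_single_batch(batch_x, batch_y)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.2e}")
if losses[-1] > 1e-6:
    print("Loss did not approach zero: suspect a bug before blaming the data.")
```

With a real network the same check applies unchanged: feed the identical batch at every step and watch whether the loss collapses.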

6. Testing and Deployment

This module discusses how to get your model from the lab to the real world. It’s the module I was originally looking for when I took the class. I found several useful nuggets here:

Testing an ML system is very different from testing traditional software because its behavior is driven by the data as well as the algorithm:

You need to adjust your test suite accordingly. The course provides an excellent checklist for doing just that, taken from the now-famous paper Hidden Technical Debt in Machine Learning Systems.

The course recommends you check that training-time and production-time variables have approximately consistent distributions (Monitoring Test 3, above). This is a critical test. It can help you detect a runtime error, such as blanks in the data feed. It can also tell you it may be time to re-train the model because the input is different than what you expected. A simple way to accomplish this is to plot training data vs production-time data, variable by variable. The Domino Data Lab tool does this.

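Besides plotting, the same variable-by-variable comparison can be automated. The sketch below is one possible approach, not the course’s or Domino Data Lab’s method: it compares each variable’s training and production values with a two-sample Kolmogorov-Smirnov statistic and also compares the share of blanks. The function names, the use of `None` to represent a blank, and both thresholds are assumptions chosen for illustration.

```python
def missing_rate(values):
    """Share of entries that are None (a stand-in for blanks in a data feed)."""
    return sum(v is None for v in values) / len(values)


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both samples past every value <= x, then compare the CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def drift_report(train, prod, ks_threshold=0.2, missing_threshold=0.05):
    """Compare train vs production data column by column (dict: name -> list)."""
    report = {}
    for name in train:
        t = [v for v in train[name] if v is not None]
        p = [v for v in prod[name] if v is not None]
        ks = ks_statistic(t, p)
        gap = abs(missing_rate(prod[name]) - missing_rate(train[name]))
        report[name] = {
            "ks": ks,
            "missing_gap": gap,
            "flag": ks > ks_threshold or gap > missing_threshold,
        }
    return report


# Made-up data: "age" barely moves, "income" shifts and starts arriving blank.
train = {"age": list(range(20, 60)), "income": [30 + i for i in range(40)]}
prod = {"age": list(range(21, 61)),
        "income": [60 + i for i in range(30)] + [None] * 10}

report = drift_report(train, prod)
for name, row in report.items():
    print(name, row)
```

A flagged variable is a prompt to look at the plot, not a verdict; sensible thresholds depend on sample size and on how much drift the model tolerates.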

A better way, which is not covered in the course, is to use adversarial validation: Train an auxiliary model (in production) which tries to classify an observation as belonging to train or prod data. If this model is successful at distinguishing the two, you have a significant distribution shift. You can then inspect the model to find the most important variables that drive that shift.

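Adversarial validation as described above can be sketched without any ML library. The example below is a simplified illustration under stated assumptions: the data is synthetic, the auxiliary model is a small logistic regression written by hand, and the helper names (`standardize`, `fit_logistic`, `auc`) are invented for this example. An AUC near 0.5 means the model cannot tell training rows from production rows; an AUC well above 0.5 signals a shift, and the largest weight points at the variable driving it.

```python
import math
import random


def standardize(rows):
    """Scale every column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, steps=300, lr=0.5):
    """Logistic regression fitted with batch gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(steps):
        gw, gb = [0.0] * len(w), 0.0
        for x, t in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            for j, xj in enumerate(x):
                gw[j] += err * xj
            gb += err
        w = [wi - lr * g / len(X) for wi, g in zip(w, gw)]
        b -= lr * gb / len(X)
    return w, b


def auc(scores, labels):
    """Probability that a random 'prod' row scores higher than a random 'train' row."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic data: feature 0 is stable, feature 1 has shifted in production.
rng = random.Random(0)
train_rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
prod_rows = [[rng.gauss(0, 1), rng.gauss(2, 1)] for _ in range(200)]

X = standardize(train_rows + prod_rows)
y = [0] * len(train_rows) + [1] * len(prod_rows)  # 0 = train, 1 = prod

# Fit on even-indexed rows, score on odd-indexed rows to avoid judging on seen data.
w, b = fit_logistic(X[0::2], y[0::2])
held_out_scores = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in X[1::2]]
shift_auc = auc(held_out_scores, y[1::2])
print(f"adversarial AUC: {shift_auc:.2f}, weights: {[round(v, 2) for v in w]}")
```

In practice a tree-based classifier is a common choice for the auxiliary model because its feature importances make the inspection step easy.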

Deployment is covered with a good introduction to Kubernetes and Docker, as well as GPU-based model serving.

Guest Lectures

The course includes guest lectures from industry heavyweights. The quality is highly variable. Some speakers are polished and prepared, others… not so much. I was most impressed with two guests:

● Jeremy Howard of fast.ai: This talk provided lots of “news you can use” in terms of improving model performance.

o The fast.ai library is designed to use fewer resources (human and machine) to get good results, for example training ImageNet in 3 hours for $25. This focus on efficiency is very much aligned with what our clients are looking for.

o Howard asks “Why are people trying to automate machine learning?” The idea is we can get much better results working together. He calls this “AugmentML” vs. “AutoML.” Platform.ai is a case in point. It is a labeling product that allows the labeler to have an interactive “conversation” with a neural network. Each iteration improves both the labels and the model. I’ve never seen anything like it, and it seems to work, at least on the video he shared.

o Howard shares a box of tricks for improving model performance, particularly for computer vision tasks. I found Test Time Augmentation (TTA) particularly eye-opening. Will have to try it in my next project.

● Andrej Karpathy of Tesla: This talk was interesting as well, although the audio wasn’t great. Karpathy discussed his Software 2.0 concept, the idea that we will increasingly use optimization methods like gradient descent to solve problems probabilistically rather than devising fixed software rules or heuristics to solve them. Like many others, I found this mental model compelling.

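For readers who have not met Test Time Augmentation, mentioned in the Howard talk above, here is a minimal sketch of the idea: run the model on several augmented copies of the same input and average the predictions. Everything in it is an illustrative assumption, including the `toy_model` stand-in for a trained classifier, the 2×2 “image,” and the choice of a horizontal flip as the only augmentation; it is not the fast.ai implementation.

```python
def identity(image):
    return image


def horizontal_flip(image):
    # Mirror each row of a 2D image stored as a list of lists.
    return [list(reversed(row)) for row in image]


def tta_predict(model, image, augmentations):
    """Average the model's class probabilities over augmented copies of one input."""
    predictions = [model(augment(image)) for augment in augmentations]
    n_classes = len(predictions[0])
    return [sum(p[k] for p in predictions) / len(predictions)
            for k in range(n_classes)]


def toy_model(image):
    # Hypothetical classifier for "dark" vs "bright" images. It wrongly weights
    # the left column more than the right, so its output changes under a flip.
    left = sum(row[0] for row in image) / len(image)
    right = sum(row[-1] for row in image) / len(image)
    p_bright = 0.7 * left + 0.3 * right
    return [1.0 - p_bright, p_bright]


image = [[0.9, 0.5],
         [0.7, 0.5]]

single = toy_model(image)
averaged = tta_predict(toy_model, image, [identity, horizontal_flip])
print("single pass:", single)
print("with TTA:   ", averaged)
```

Averaging cancels the toy model’s left-right bias here; with a real network the gain is usually smaller and costs one extra forward pass per augmentation.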

Parting Thoughts

The course is not perfect. A lot of this material was created in 2018 and is starting to show its age. Three examples:

● Richard Socher, chief scientist at Salesforce.com, argues for a unified NLP model with something called decaNLP. BERT has since taken over this niche, and GPT-3 is an exciting recent development.

● Model explainability has developed rapidly over the past few years, but is not well represented in the course.

● As mentioned above, Microsoft Azure has been making strides since then and, in my view, does not get a fair shake.

Despite these nits, I think the course packs a lot of value into a compact and well-organized frame. The price is right, and I recommend it to anyone interested in understanding how AI/ML applications are built.

Lastly, and for no particular reason, I hope you will enjoy this thrilling conclusion:

Translated from: https://towardsdatascience.com/moving-ai-to-the-real-world-e5f9d4d0f8e8
