Here is a very abstract question: what does an AI or data science model look like? We all use data science models in our day-to-day lives. Most people who aren't data scientists have experienced a data science model but have never seen one. So, let me reveal the secret. It may look scary. Here is what a data science model looks like:

It is a mathematical formula encoded as alphanumeric characters. But make no mistake, this strange-looking thing is the secret sauce for making your enterprise successful and blowing away the competition. It can help you run your business operations with cutting-edge advanced analytics. Diverse business cases, such as product recommendation to increase revenue, fraud detection to prevent revenue loss, and asset failure prediction to safeguard your asset value, all have predictive models behind them.

Because models are so crucial in creating business value, we need to handle them with care. Let us look at the different ways these models can be handled.

The worst care possible: the model left on a laptop

The worst type of care is to leave these models on a laptop, usually the one where they were originally created. Imagine treating your enterprise's secret sauce like a person abandoned on an island. The situation is somewhat comparable.

Unfortunately, this happens a lot. Models created by data scientists using analytic tools on a laptop or PC remain there. A large amount of effort and brain power went into creating them, and they contain elements critical to your enterprise's success. However, because they remain on the local machine and are never operationalised, they never deliver that value. This is the worst thing that can happen to such beautiful pieces of data science work.

Getting better: putting models in containers

A better approach is to put models in Docker containers. This takes you one step closer to treating the model the way it deserves. Putting models in containers means that they are secured and isolated within the container, and also easier to operationalise.

Though the model is in a safe container, it is still isolated. This means that if you want to use the model, you need to send data to the Docker container and use an API to get the results back. Data movement therefore increases, which may not be desirable for all business operations.
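
To make the data movement concrete, here is a minimal sketch of the round trip. A tiny local HTTP server stands in for the containerised model; the `/score` path, the JSON field names and the threshold rule are all hypothetical and chosen only for illustration.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ScoreHandler(BaseHTTPRequestHandler):
    """Stand-in for a model served from inside a container (hypothetical)."""

    def do_POST(self):
        length = int(self.headers["Content-Length"])
        payload = json.loads(self.rfile.read(length))
        # Placeholder "model": flag amounts above an arbitrary threshold.
        result = {"fraud": payload["amount"] > 1000}
        body = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example quiet


def score(url, record):
    """Send one record to the model endpoint and return its JSON answer."""
    request = urllib.request.Request(
        url,
        data=json.dumps(record).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    # Skip any system proxy, since the stand-in server is local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())


# Port 0 lets the OS pick a free port; a real container would publish a fixed one.
server = HTTPServer(("127.0.0.1", 0), ScoreHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/score"

# Every prediction means one record travelling out and one answer travelling back.
answer = score(url, {"customer": "X", "amount": 2500})
print(answer)

server.shutdown()
server.server_close()
```

Each call to `score` moves a record out of wherever the data lives and moves an answer back, which is the extra data movement described above.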

Strategic approach: treating models like data

In recent times, data has become a valuable asset for any company. Many advances in technology have been about managing data as a valuable asset. For example, data warehousing and big data storage platforms all revolve around keeping data safe and managed, and making it easily available to benefit the business.

So if we start thinking of models as data, we can take all the benefits of data management and apply them to models. By treating models like data, we ensure that models become as strategic to business operations as the data is.

Here are some reasons why treating models as data is an interesting proposition:


Models are made from data

Models are not created from thin air or with a magic wand. They are created by applying an algorithm to data. You can consider a model a mathematical projection of the data. So it makes sense to consider models as part of the data.
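
As a small illustration of a model being a projection of data, the sketch below fits a straight line to a handful of made-up points using ordinary least squares. The numbers and variable names are invented; the point is only that the resulting "model" is nothing more than two values derived from the data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: weekly advertising spend versus units sold.
spend = [1.0, 2.0, 3.0, 4.0]
units = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(spend, units)
model = {"slope": slope, "intercept": intercept}  # the whole "model"
print(model)
```

Because the fitted model is itself just a small set of values, it can be stored, versioned and queried in the same way as the data it came from.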

Model results need data to make sense of them

Say your model alerts you to a critical asset failure in the coming days. Before any action can be taken, you need to know more about this asset, such as its location and its value. You will also need to assess whether it makes sense to carry out an urgent repair or to take the risk of waiting until the next scheduled maintenance is due.

As you will have realised by now, the output of the model was just an alert trigger. The real action still needs to be taken, and converting the model output into something tangible requires data about the asset in question. So, if your model is part of the data, that is, stored in the system in tables just like your data, you can easily integrate the output of the model with other data. This makes sense of the model output and also makes it more actionable.
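
The asset example can be sketched with an in-memory SQLite database. The table names, columns, sample rows and the 0.5 probability threshold are all assumptions made for illustration; the idea is simply that when model output sits in a table next to the asset data, one join turns an alert into something actionable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE assets (
        asset_id TEXT PRIMARY KEY,
        location TEXT,
        value REAL,
        next_maintenance TEXT
    );
    CREATE TABLE failure_alerts (
        asset_id TEXT,
        failure_probability REAL
    );
""")

# Made-up asset master data.
conn.executemany("INSERT INTO assets VALUES (?, ?, ?, ?)", [
    ("pump-17", "Plant A", 120000.0, "2030-06-01"),
    ("fan-03", "Plant B", 8000.0, "2030-02-15"),
])
# Made-up model output, stored as ordinary rows.
conn.executemany("INSERT INTO failure_alerts VALUES (?, ?)", [
    ("pump-17", 0.91),
    ("fan-03", 0.12),
])

# Enrich the alerts with what is needed to decide on a repair.
rows = conn.execute("""
    SELECT a.asset_id, a.location, a.value, a.next_maintenance,
           f.failure_probability
    FROM failure_alerts AS f
    JOIN assets AS a ON a.asset_id = f.asset_id
    WHERE f.failure_probability > 0.5
    ORDER BY a.value DESC
""").fetchall()

for row in rows:
    print(row)
```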

Managing millions of models

In the book "Prediction Machines", the authors write that AI predictions are becoming cheaper, which means we will use more of them. This also means that there will be more and more models.

Use cases where millions of models are required are not science fiction. Accurate retail stock forecasting requires a model for each product in each store. Fraud detection requires modelling normal customer behaviour in order to detect any deviation from it. As normal behaviour for customer X may differ from normal behaviour for customer Y, you will need as many models as customers.
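
A minimal sketch of the "one model per product per store" idea follows. The sales figures are made up, and each "model" is just the mean of past sales, used here as a placeholder for a real forecasting algorithm.

```python
from collections import defaultdict

# Made-up sales history: (store, product, units sold), oldest first.
sales = [
    ("store-1", "milk", 10), ("store-1", "milk", 12), ("store-1", "milk", 14),
    ("store-1", "bread", 5), ("store-1", "bread", 7),
    ("store-2", "milk", 30), ("store-2", "milk", 20),
]

# Group the history by (store, product).
history = defaultdict(list)
for store, product, units in sales:
    history[(store, product)].append(units)

# One "model" per (store, product) pair: the mean of past sales.
models = {key: sum(units) / len(units) for key, units in history.items()}

for key, forecast in sorted(models.items()):
    print(key, forecast)
```

The number of models grows with the number of store and product combinations, so a retailer with thousands of stores and thousands of products quickly reaches millions of entries in `models`.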

With enterprises managing millions of products and millions of customers, suddenly the need to have millions of models becomes inevitable.


In such a scenario, it is better to treat models like data and apply big data management principles to models as well.


Models are the intellectual property of your enterprise: keep them safe

Models are made from data, and they encode how your enterprise works. For example, a fraud detection model encodes how you intend to detect fraud. It is your company's intellectual property and should therefore be managed and kept safe.

Imagine the fraud detection model is stolen and decrypted or, even worse, the decrypted model is put on the internet for everyone to see how you detect fraud. Suddenly you are left vulnerable to fraud attacks.

However, managing models like data and applying the security principles of data to models as well will help keep your intellectual property safe.


Managing the economics of your model

There is a cost to develop a model, and there is a cost to manage your models and keep them operational. If you invest in specialised systems to manage the models, you increase the cost of each model. So you need to think carefully about the total costs involved in creating and managing a model.

As good models come from good, integrated data, if you have some good models, you already have a data management platform. So if you use the same data management platform to manage your models, you keep the overall cost of the models low. In the long run, this helps keep your models economical and profitable.

Now that you have seen why it makes sense to manage models like data, let me briefly describe what goes into it. These are some of the building blocks you need if you would like to treat models as data:

Model repository: This is the place where your models are stored as data. Generally, it is a table with specialised fields that hold the encrypted model definition.

Model metadata: Models look strange and are hard for humans to read. You need some kind of metadata that describes what the model is about. This is where model metadata comes in. It holds information such as the purpose of the model, the kind of algorithm it uses, and its accuracy.

Model lineage: As with data, you also need to know how the model was built and how it is used. You need to capture information about the data that went into building the model. This is very useful in traceability or audit situations.

Design patterns for bringing external models inside the database: Models are like data. Most of them originate outside a data management platform. If you want to manage models like data, you need to bring them inside the database. This requires design patterns that describe the different ways in which an external model can be brought inside the database.
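
The four building blocks above can be sketched together with an in-memory SQLite database. Everything here is an assumption made for illustration: the table and column names, the `import_model` and `load_model` helpers, and the use of `pickle` to serialise the model. Encryption of the stored definition is left out to keep the sketch short.

```python
import pickle
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE model_repository (
        model_id TEXT PRIMARY KEY,
        definition BLOB NOT NULL          -- serialised model
    );
    CREATE TABLE model_metadata (
        model_id TEXT REFERENCES model_repository(model_id),
        purpose TEXT,
        algorithm TEXT,
        accuracy REAL
    );
    CREATE TABLE model_lineage (
        model_id TEXT REFERENCES model_repository(model_id),
        training_table TEXT,
        trained_on TEXT
    );
""")


def import_model(conn, model_id, model, metadata, lineage):
    """Bring an externally built model into the database with its context."""
    with conn:  # one transaction: all three rows are written, or none
        conn.execute("INSERT INTO model_repository VALUES (?, ?)",
                     (model_id, pickle.dumps(model)))
        conn.execute("INSERT INTO model_metadata VALUES (?, ?, ?, ?)",
                     (model_id, metadata["purpose"], metadata["algorithm"],
                      metadata["accuracy"]))
        conn.execute("INSERT INTO model_lineage VALUES (?, ?, ?)",
                     (model_id, lineage["training_table"],
                      lineage["trained_on"]))


def load_model(conn, model_id):
    """Read a model definition back out of the repository table."""
    row = conn.execute(
        "SELECT definition FROM model_repository WHERE model_id = ?",
        (model_id,)).fetchone()
    # Only unpickle data from a source you trust.
    return pickle.loads(row[0])


# A made-up model built "outside" the database, here just two coefficients.
external_model = {"slope": 2.0, "intercept": 1.0}
import_model(
    conn, "sales-forecast-v1", external_model,
    metadata={"purpose": "weekly sales forecast",
              "algorithm": "linear regression", "accuracy": 0.87},
    lineage={"training_table": "sales_history", "trained_on": "2030-01-01"},
)

restored = load_model(conn, "sales-forecast-v1")
print(restored)
```

Once the model, its metadata and its lineage are rows in tables, they can be backed up, secured and queried with the same tools used for the rest of the data.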

In conclusion, if you treat models like data, they will be managed like the valuable assets that they are.


