How We Reduced Food Waste and Saved Money Using Machine Learning

by Soundarya Balasubramani

Welcome to a story of five simple students with one big goal: reducing food waste. In the US alone, the food pitched each year weighs in at over 100 Empire State Buildings. Just how do five students dream of tackling this monumental task, you ask? Well, this is our story of using data for good.

In Columbia Business School’s Analytics in Action course, we partnered with an innovative food delivery startup to minimize their waste and cut expenses. The course pairs teams of 4–6 students with real companies to solve problems through analytics.

Our diverse team consisted of three MBAs and two Data Scientists from the School of Engineering and Applied Sciences. Our backgrounds include finance, venture capital, engineering, and submarining. We paired with Good Uncle, an innovative, tech-enabled startup that brings the best food in the country onto college campuses nationwide.

The Problem

All of Good Uncle’s food prep starts in a large central kitchen in Delaware, nearly a week before a customer places their order. This business model leaves no time for the company to adjust to demand; put simply, food waste is extra sensitive to the accuracy of their demand forecast.

Other food businesses monitor their inventory and can order replenishments that arrive before the restaurant runs out. Good Uncle needs to accurately order tomatoes and mozzarella several days before the thought of ordering pizza-rolls crosses the patron’s mind.

Our Journey

We first met with Matt, the CEO and Founder of Good Uncle, at his HQ office in Midtown Manhattan. After discussing the ins and outs of the business, we left with the Spring 2018 data for Syracuse University and put on our cleaning gloves.

We added every external feature we could imagine, including weather from DarkSky, events from StubHub, and, of course, the academic calendar from the school’s website. Armed with an arsenal of descriptive features, we began fitting models right away. Lots of models.

Our process started with the ambitious goal of modeling demand at the most granular level. When model after model failed miserably, we bottled our frustration and sought help from our invaluable professors and brilliant TA. We realized we had waged battle with a formidable foe: sparse-demand time-series forecasting.

We dove into the data and searched for sensible ways to group sales points together. We needed to eliminate this sparsity by aggregating sales on a spatiotemporal basis. Because the food trucks rove through the drop points throughout the day, we needed to look at several methods of clustering.
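
To make the idea of spatiotemporal aggregation concrete, here is a minimal sketch in plain Python. The drop-point names, the cluster mapping, the 15:00 cut-off, and the order records are all made up for illustration; they are not Good Uncle's data or the grouping the team finally chose.

```python
from collections import defaultdict

# Hypothetical order records: (drop_point, hour_of_day, units_sold).
orders = [
    ("dorm_a", 11, 1), ("dorm_b", 12, 0), ("dorm_a", 12, 2),
    ("library", 18, 1), ("gym", 19, 0), ("library", 19, 3),
]

# Assumed mapping from individual drop points to coarser spatial clusters.
CLUSTER_OF = {
    "dorm_a": "north", "dorm_b": "north",
    "library": "south", "gym": "south",
}


def time_block(hour):
    """Bucket an hour of the day into a coarse block (assumed cut-off at 15:00)."""
    return "lunch" if hour < 15 else "dinner"


def aggregate(records):
    """Sum units sold per (cluster, time block) so that sparse counts pool together."""
    totals = defaultdict(int)
    for drop_point, hour, units in records:
        totals[(CLUSTER_OF[drop_point], time_block(hour))] += units
    return dict(totals)


demand = aggregate(orders)
```

Six mostly tiny observations collapse into two denser ones, one per cluster and time block, which is the kind of series a count model can fit more reliably.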

With high double-digit combinations of modeling techniques and data clusters, we turned to benchmarking in order to home in on our model of choice and eventual product for Good Uncle.

Although our goal all along was demand prediction, we realized our real-life target was the bottom line. We quantified the monetary value of ordering too much or too little of a given item on the menu, and used that to set a target equation. To compare models, we optimized for profit and found XGBoosted Trees and Poisson Regression to be the obvious leaders of the pack. With some restored dignity and much more confidence, we made the shift to real-time data.
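
One way to turn "too much or too little" into a single number for comparing models is an asymmetric mismatch cost. The sketch below assumes a per-unit underage cost (lost margin) and a per-unit overage cost (wasted food); the dollar values and the two sets of forecasts are hypothetical, and the team's actual target equation may have looked different.

```python
def mismatch_cost(supplied, demanded, underage_cost, overage_cost):
    """Cost of one decision: pay for every unit short and every unit left over."""
    shortfall = max(demanded - supplied, 0)
    surplus = max(supplied - demanded, 0)
    return underage_cost * shortfall + overage_cost * surplus


def total_cost(forecasts, actuals, underage_cost, overage_cost):
    """Sum the mismatch cost over a validation period."""
    return sum(
        mismatch_cost(f, a, underage_cost, overage_cost)
        for f, a in zip(forecasts, actuals)
    )


# Hypothetical per-unit costs in dollars.
UNDERAGE_COST = 6.0  # assumed margin lost on a sale that could not be served
OVERAGE_COST = 3.0   # assumed cost of a prepared item that goes unsold

# Made-up demand and two made-up sets of supply decisions.
actuals = [4, 7, 5, 6]
model_a = [5, 6, 5, 8]
model_b = [4, 9, 7, 6]

cost_a = total_cost(model_a, actuals, UNDERAGE_COST, OVERAGE_COST)
cost_b = total_cost(model_b, actuals, UNDERAGE_COST, OVERAGE_COST)
better = "model_a" if cost_a < cost_b else "model_b"
```

Scoring on cost rather than on forecast error means a model that misses in the cheap direction can beat one with a smaller but more expensive error.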

About halfway through the Fall 2018 semester, we pulled a data-dump from the company and started optimizing models in real time. The results speak for themselves in the section below.

The Solution: **CAUTION: Technical jargon ahead**

We battled between more than half a dozen modeling techniques, constantly pivoting as new data and insights came into play. We worked with linear regression, auto-regressive modeling, Poisson regression, random forest, extreme gradient boosted decision trees, and so on. In the end, the perfect model was not one, but a combination of two different models.

We realized that this was not just a problem involving forecasting demand, but also forecasting inventory, so we combined the above machine learning models with the famous Newsvendor model used for inventory management.

First, we fed the input data into a Poisson Generalized Linear Model (GLM) and a Gradient Boosted Tree model. The outputs of both models were fed as inputs to the Newsvendor model, transforming the above equation into:

The final output gave the demand forecast, and, by training the model and validating it with various service levels (ranging from 0.1 to 0.99), we were able to find the optimal service level.
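
As a rough illustration of how a count forecast can feed a newsvendor decision, the sketch below treats each forecast as the mean of a Poisson distribution (an assumption made here for simplicity), orders the quantile of that distribution at a chosen service level, and sweeps service levels on validation data to find the cheapest one. The costs, forecasts, and function names are hypothetical, and this is not the team's actual equation.

```python
import math


def poisson_quantile(mean, service_level):
    """Smallest q with P(D <= q) >= service_level for D ~ Poisson(mean)."""
    pmf = math.exp(-mean)  # P(D = 0)
    cdf = pmf
    q = 0
    while cdf < service_level:
        q += 1
        pmf *= mean / q    # P(D = q) from P(D = q - 1)
        cdf += pmf
    return q


def critical_fractile(underage_cost, overage_cost):
    """Textbook newsvendor service level: cu / (cu + co)."""
    return underage_cost / (underage_cost + overage_cost)


def best_service_level(forecast_means, actuals, levels, underage_cost, overage_cost):
    """Pick the service level whose order quantities cost least on validation data."""
    def cost(level):
        total = 0.0
        for mean, actual in zip(forecast_means, actuals):
            q = poisson_quantile(mean, level)
            total += underage_cost * max(actual - q, 0)
            total += overage_cost * max(q - actual, 0)
        return total
    return min(levels, key=cost)


# Hypothetical inputs: forecast means from some demand model, observed demand,
# and a grid of service levels spanning the 0.1 to 0.99 range mentioned above.
forecast_means = [5.0, 3.0, 6.5]
actuals = [6, 2, 8]
levels = [0.1, 0.25, 0.5, 0.68, 0.9, 0.99]
chosen = best_service_level(forecast_means, actuals, levels, 6.0, 3.0)
```

The textbook critical fractile gives a starting point, while the empirical sweep lets the validation data decide, which matters when the forecast distribution is only an approximation.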

Result:

The graph below gives a glimpse into how our model outperforms the current method (let’s call it GU’s model). The best way to compare our new method to the old was to measure the underage (supply less than demand) and overage (supply greater than demand), which are plotted below.

From this graph, we can see two major takeaways.

  • We can be flexible in setting our underage and overage levels, whereas this flexibility is not possible for GU’s model (which takes a constant value).
  • We can achieve lower overage as well as lower underage than Good Uncle’s model for service levels between 0.67 and 0.91.

We realized that by setting the optimal service level at 0.68, our model was able to save ~$70 compared to GU’s model for a single food item per route per 10 days. But we wanted to go further. So we ran the model for the top 10 most bought food items across both routes and clusters, and got this handy table shown below:

Our model was able to save money on all items except for one (it just doesn’t like the BBQ Pulled Pork Plate!). Finally, to clearly show the power of the model, we extrapolated the dollar value to an entire semester by running it on all routes and clusters for the top 10 items.

We observed a potential savings of $29,256 for the top 10 most bought food items over all drop-points (route wise) in just 1 semester, at just 1 campus.

In Closing

This has been the greatest academic opportunity of our tenure, reaching far beyond the walls of the classroom. We had such a great time working with new friends and we learned so much from the professors, and, of course, the wonderful people of Good Uncle. Not only did we drink from the fire-hose of data analytics, but we shared the journey of an innovative, fast-moving startup and learned from the best entrepreneurs in NYC.

The Team

The team consisted of 5 members: Bowen Bao, Don Holder, Jack Spitsin, Nicolai Mouhin and yours truly. This article was written as a team effort.

******************************************************************

If you found this useful, do follow me for more articles. I love writing about social issues, products, the technology sector and my graduate school experience in the US. Here is my personal blog. If you’re a curious soul looking to learn every day, here’s a Slack Group that I created for you to join.

The best way to get in touch with me is via Instagram and Facebook. I share some interesting content there. To know more about my professional life, check out my LinkedIn. Happy reading!

Translated from: https://www.freecodecamp.org/news/how-we-reduced-food-wastage-and-saved-money-using-machine-learning-c462aa5a3b30/
