Extending TVM with Dynamic Execution
Outline
● Motivation for Dynamism
● Representing Dynamism
● Executing Dynamism
● Evaluation
Dynamic Neural Networks
● Networks are exhibiting more and more dynamism
○ Dynamic inputs: batch size, image size, sequence length, etc.
○ Control-flow, recursion, conditionals and loops (in Relay today).
○ Dynamically sized tensors
■ Output shapes of some ops are data-dependent: arange, nms, etc.
■ Control flow: concatenation within a while loop
● A central challenge is how to both represent and execute these networks.
```
fn network(input: Tensor) -> … { … }
%t1 : Tensor
%t2 : Tensor
if (%cond) { … } else { … } : Tensor
%start, %stop, %step : i32
arange(%start, %stop, %step) : Tensor
```
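To see why such shapes cannot be resolved at compile time, here is a minimal plain-Python sketch (no TVM involved) of how arange's output shape depends on the runtime *values* of its inputs, not just their shapes:

```python
import math

def arange_shape(start, stop, step):
    # The output length depends on runtime values:
    # len = max(0, ceil((stop - start) / step))
    return (max(0, math.ceil((stop - start) / step)),)

# The same operator signature yields different output shapes at runtime.
print(arange_shape(0.0, 10.0, 1.0))  # (10,)
print(arange_shape(0.0, 10.0, 2.5))  # (4,)
```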
Dynamic Neural Networks
● A central challenge is how to both represent and execute these networks.
● We will address these two challenges at various levels of the TVM stack and share promising initial results.
Outline
● Motivation for Dynamism
● Representing Dynamism
● Executing Dynamism
● Evaluation
Representing dynamics in TVM
● Add Relay support for dynamic dimension (Any-dim)
● Use shape functions to compute runtime shapes.
● Supporting Any in Tensor Expression (TE) IR.
Any: typing dynamic dimension in Relay
Any: represent an unknown dimension at compilation time.
Define a tensor type: Tensor
Define a type relation:
arange: fn(start: fp32, stop: fp32, step: fp32) -> Tensor
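As an illustration only (the names below are hypothetical, not the actual Relay API), the idea of `Any` and a type relation for arange can be modeled in a few lines: the relation checks the scalar input types statically and returns a rank-1 tensor type whose length is left unknown.

```python
class Any:
    """Stands for a tensor dimension unknown until runtime."""
    def __repr__(self):
        return "?"

# Hypothetical type relation for arange: given three f32 scalars, the result
# is a rank-1 tensor whose length cannot be known at compile time.
def arange_type_rel(start_ty, stop_ty, step_ty):
    assert start_ty == stop_ty == step_ty == "f32"
    return ("Tensor", (Any(),), "f32")

ty = arange_type_rel("f32", "f32", "f32")
print(ty)  # ('Tensor', (?,), 'f32')
```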

How to compute and check shape dynamically?
Challenges
● Static type checking cannot eliminate all errors
● The type checking system is too heavyweight for runtime
Approach
● Instrument shape computing functions into the program

Shape function
● Register a shape function for each operator to check the type and compute the output shape
● The shape function has two modes:
○ Data independent: (op_attrs, input_shapes, out_ndims) -> out_shape_tensors
○ Data dependent: (op_attrs, input_data, out_ndims) -> out_shape_tensors
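A minimal sketch of the two modes, using a hypothetical registry (the real TVM registration API differs): a data-independent shape function needs only the input shapes, while a data-dependent one must read the input values.

```python
import math

# Hypothetical registry mirroring the two shape-function modes above.
SHAPE_FUNCS = {}

def register_shape_func(op, data_dependent):
    def wrap(fn):
        SHAPE_FUNCS[op] = (fn, data_dependent)
        return fn
    return wrap

# Data independent: output shape computed from the input *shapes* only.
@register_shape_func("add", data_dependent=False)
def add_shape_func(attrs, input_shapes, out_ndims):
    a, b = input_shapes
    # Simple broadcast of equal-rank shapes; a dim of 1 broadcasts.
    return [tuple(max(x, y) for x, y in zip(a, b))]

# Data dependent: output shape needs the input *values*.
@register_shape_func("arange", data_dependent=True)
def arange_shape_func(attrs, input_data, out_ndims):
    start, stop, step = input_data
    return [(max(0, math.ceil((stop - start) / step)),)]

print(add_shape_func(None, [(8, 1), (1, 4)], [2]))    # [(8, 4)]
print(arange_shape_func(None, (0.0, 5.0, 2.0), [1]))  # [(3,)]
```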




Outline
● Motivation for Dynamism
● Representing Dynamism
● Executing Dynamism
● Evaluation
Executing dynamics in TVM
● By extending the IR we can now represent dynamic programs, but how do we execute them?
● To flexibly execute dynamic programs, we introduce the Relay virtual machine (VM).
● We must also generate code which handles dynamic shapes in kernels (work-in-progress):
○ Kernel dispatch for a single op
○ Dispatch for a (sub-)expression
Previous approach: Graph Runtime
● Existing executors are based on a graph-traversal style of execution.
● Set up a graph of operators, push data along every edge, compute each operation, and flow forward until finished.
● The simple design enables simple memory allocation and a simple executor.
● The design is complicated by control flow and dynamic shapes.
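The push-forward style above can be sketched as a toy executor: walk the operator graph in a fixed topological order and flow values along the edges. This is illustrative only, not the actual graph runtime; note that the fixed order (and statically known buffer sizes) is exactly what control flow and dynamic shapes break.

```python
# Toy graph executor: nodes are (name, fn, arg_names), already topologically
# sorted, so execution is a single forward sweep over the graph.
def run_graph(nodes, inputs):
    env = dict(inputs)
    for name, fn, arg_names in nodes:
        env[name] = fn(*(env[a] for a in arg_names))
    return env

graph = [
    ("b", lambda x: [v * 2 for v in x], ("a",)),  # b = a * 2 (elementwise)
    ("c", lambda x: sum(x), ("b",)),              # c = sum(b)
]
print(run_graph(graph, {"a": [1, 2, 3]})["c"])  # 12
```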
Enter the virtual machine
● Instead we take inspiration from full programming languages and design a VM.
● The VM has special considerations
○ Primitives are tensors, and instructions operate on tensors (CISC-style, no scalar instructions)
○ Instructions normally built in (+, -, etc.) are realized by code generated via TVM.
○ Control flow is handled in the standard way in the VM.
○ In contrast to AoT compilation, the VM is flexible
■ graph dispatch and bucketing can be easily implemented.
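A toy register-based VM in this spirit: registers hold whole tensors (plain lists here), `invoke` calls a coarse-grained kernel (standing in for TVM-generated code), and control flow is an ordinary conditional jump. The instruction set and names are made up for illustration and do not match the Relay VM's actual ISA.

```python
def run_vm(code, regs):
    pc = 0
    while pc < len(code):
        op, *args = code[pc]
        if op == "invoke":      # run a generated kernel over tensor registers
            dst, fn, srcs = args
            regs[dst] = fn(*(regs[s] for s in srcs))
            pc += 1
        elif op == "if":        # fall through when true, else jump to target
            cond, target = args
            pc = pc + 1 if regs[cond] else target
        elif op == "ret":
            return regs[args[0]]

# abs(sum(t)): negate the sum when it is negative
code = [
    ("invoke", 1, lambda t: sum(t), (0,)),    # r1 = sum(r0)
    ("invoke", 2, lambda s: s >= 0, (1,)),    # r2 = (r1 >= 0)
    ("if", 2, 4),                             # if r2: pc=3 else pc=4
    ("ret", 1),
    ("invoke", 3, lambda s: -s, (1,)),        # r3 = -r1
    ("ret", 3),
]
print(run_vm(code, {0: [-1, -2, -3]}))  # 6
```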



Generating code for dynamic shapes
● We must now solve the final problem: generating kernels with compelling performance for non-static shapes.
● The VM provides a framework for experimenting with different strategies; we will discuss in-progress approaches:
○ Dynamic operator dispatch (WIP)
○ Graph Dispatch (https://github.com/apache/incubator-tvm/pull/4241)
● We believe there exists lots of future work in this area.
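Dynamic operator dispatch can be sketched as shape bucketing: compile one kernel per shape bucket ahead of time, then select a bucket from the runtime shape. The buckets and kernels below are placeholders, not TVM's actual mechanism.

```python
import bisect

BUCKETS = [64, 256, 1024]            # upper bounds on the dynamic dimension

def make_kernel(bound):
    # Stand-in for a TVM-generated kernel specialized to `bound`.
    def kernel(xs):
        return [v + 1 for v in xs]
    kernel.bound = bound
    return kernel

KERNELS = [make_kernel(b) for b in BUCKETS]

def dispatch(xs):
    # Pick the smallest bucket that fits the runtime length.
    i = bisect.bisect_left(BUCKETS, len(xs))
    i = min(i, len(KERNELS) - 1)     # fall back to the largest bucket
    return KERNELS[i]

print(dispatch([0] * 100).bound)  # 256
```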
Outline
● Motivation for Dynamism
● Representing Dynamism
● Executing Dynamism
● Evaluation


Dynamic model performance

BERT model performance

Conclusions
● We have extended Relay/TVM with support for dynamic shapes.
● To support the increased expressivity of Relay we have built a new execution mechanism, the Relay VM.
● We have begun exploring strategies for generating efficient kernels that support dynamic shapes with promising results.
● We believe the VM infrastructure can serve as a foundation for exploring future research into dynamic execution and code generation.
Outline
● Dynamic motivations
○ NLP, NMS, control, data structures
○ Integration with external code and runtimes
● Existing solution: graph runtime
○ Challenges with graph runtime
● Enter VM
○ Designed to be scaffold to build new dynamic functionality consisting of compiler and runtime improvements
● VM design
● Extensions
● Results
● Future Work
○ Dispatch, strategies?
Existing solution: graph runtime
Challenges:
● Control flow (if, loop, etc)
● Dynamic shapes
○ Dynamic inputs: batch size, image size, sequence length, etc.
○ Output shape of some ops are data dependent: arange, nms, etc.
○ Control flow: concatenate within a while loop
Limitation of TVM/graph runtime
● Cannot compile and run dynamic models
Dynamic codegen: op dispatch (proposal)
● Goal: support codegen for dynamic shape
● Challenges
○ A single kernel performs poorly across different shapes
○ Different templates for the same op
○ TVM compute and schedule are coupled together




Why do we need a graph dispatcher?

  1. Minimal overhead: only one dispatching operation is required per inference.
  2. Fits operators such as conv2d_NCHWc: graph tuning is well defined for each subgraph.
  3. Avoids a runtime layout-tracking system for operators that require layout transformation to optimize.
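A sketch of the idea, with made-up names: specialize the whole subgraph per shape bucket, so a single dispatch at the graph entry selects a fully compiled graph whose layout choices (e.g. for conv2d_NCHWc) were baked in at compile time.

```python
def specialize_graph(bucket):
    # Stand-in for a subgraph compiled and graph-tuned for this bucket,
    # with any layout transformations decided at compile time.
    def compiled(x):
        return {"bucket": bucket, "out": [v * 2 for v in x]}
    return compiled

GRAPHS = {64: specialize_graph(64), 256: specialize_graph(256)}

def run(x):
    # Single dispatch at graph entry; no per-op layout tracking needed.
    bucket = min((b for b in GRAPHS if len(x) <= b), default=256)
    return GRAPHS[bucket](x)

print(run([1, 2, 3])["bucket"])  # 64
```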

Reference:
https://sampl.cs.washington.edu/tvmconf/slides/2019/Jared-Roesch-Haichen-Shen-RelayVM.pdf
