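Compilers = Translators

A compiler is a tool that every software developer uses several times a day. This is because most of the web relies on client-side code execution, and many of these client-side programs are delivered to the browser as source code.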

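Here we arrive at an important notion: while source code is usually human-readable, it looks like garbage to a CPU. Machine code, on the other hand, is machine-readable but almost never human-readable. Therefore a translation step is needed.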

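The simplest compilers perform a single-step translation: from source code to machine code. In practice, however, most compilers go through at least two stages: from source code to an abstract syntax tree (AST), and then from the AST to machine code. The AST acts as an intermediate representation (IR) here; it is just another way of organizing the source code.

There is no fixed limit on the number of stages. Each new stage brings the source program closer to machine code.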
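To make the two-stage idea concrete, here is a minimal sketch, assuming a toy language of integers joined by `+` and a hypothetical stack machine as the target. The functions `parse` and `emit` are illustrative names, not part of any real compiler.

```javascript
// Stage 1: source text -> AST. Toy grammar (an assumption for this sketch):
// integers joined by '+', e.g. "1 + 2 + 3".
function parse(source) {
  const numbers = source.split("+").map((part) => Number(part.trim()));
  // Build a left-leaning tree of BinaryExpression nodes.
  return numbers
    .map((value) => ({ type: "Literal", value }))
    .reduce((left, right) => ({
      type: "BinaryExpression",
      operator: "+",
      left,
      right,
    }));
}

// Stage 2: AST -> instructions for a hypothetical stack machine.
function emit(node, out = []) {
  if (node.type === "Literal") {
    out.push(`push ${node.value}`);
  } else {
    emit(node.left, out);
    emit(node.right, out);
    out.push("add");
  }
  return out;
}

const program = emit(parse("1 + 2 + 3"));
console.log(program.join("\n"));
```

The AST is the only thing the second stage sees, which is why it counts as an IR: the original text could be discarded after parsing.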

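Optimization stages

However, not all stages are used solely for translation. Many compilers also try to optimize the code we write. (The code we write is usually a trade-off between performance and elegance.)

Take the following JavaScript code as an example: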

for (var i = 0, acc = 0; i < arr.length; i++)
  acc += arr[i];

If the compiler translated it straight from the AST to machine code, the result might resemble the following (in a very abstract instruction set that is detached from reality):

acc = 0;
i = 0;
loop {
  // Load `.length` field of arr
  tmp = loadArrayLength(arr);
  if (i >= tmp)
    break;

  // Check that `i` is between 0 and `arr.length`
  // (NOTE: This is necessary for fast loads and
  // stores).
  checkIndex(arr, i);

  // Load value
  acc += load(arr, i);

  // Increment index
  i += 1;
}

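This code still has plenty of room for optimization: the array length does not change during the loop, and the range check is unnecessary. Ideally, the code should look like this: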

acc = 0;
i = 0;
len = loadArrayLength(arr);
loop {
  if (i >= len)
    break;

  acc += load(arr, i);
  i += 1;
}
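The difference between the two forms can be sketched in plain JavaScript. The helpers `loadArrayLength`, `checkIndex` and `load` below are hypothetical stand-ins for the abstract instructions, and the `lengthLoads` counter exists only to make the saving visible.

```javascript
// Hypothetical helpers standing in for the abstract instructions above.
let lengthLoads = 0;
function loadArrayLength(arr) {
  lengthLoads += 1;
  return arr.length;
}
function checkIndex(arr, i) {
  if (i < 0 || i >= arr.length) throw new RangeError("index out of bounds");
}
function load(arr, i) {
  return arr[i];
}

// Direct translation: length reload and bounds check on every iteration.
function sumNaive(arr) {
  let acc = 0;
  for (let i = 0; ; i += 1) {
    const tmp = loadArrayLength(arr);
    if (i >= tmp) break;
    checkIndex(arr, i);
    acc += load(arr, i);
  }
  return acc;
}

// Optimized form: length loaded once, bounds check removed.
function sumOptimized(arr) {
  let acc = 0;
  const len = loadArrayLength(arr);
  for (let i = 0; ; i += 1) {
    if (i >= len) break;
    acc += load(arr, i);
  }
  return acc;
}

const data = [1, 2, 3, 4];
lengthLoads = 0;
const naiveResult = sumNaive(data);
const naiveLoads = lengthLoads;
lengthLoads = 0;
const optimizedResult = sumOptimized(data);
const optimizedLoads = lengthLoads;
console.log(naiveResult, naiveLoads, optimizedResult, optimizedLoads);
```

Both functions return the same sum, but the naive version loads the length once per loop test while the optimized version loads it once in total. The rewrite is only valid because nothing in the loop changes the array's length.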

Suppose we have an AST at hand and want to generate machine code straight out of it. (Note: the AST below was generated by esprima.)

{ type: 'ForStatement',

  //
  // This is `var i = 0, acc = 0;`
  //
  init:
   { type: 'VariableDeclaration',
     declarations:
      [ { type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'i' },
          init: { type: 'Literal', value: 0, raw: '0' } },
        { type: 'VariableDeclarator',
          id: { type: 'Identifier', name: 'acc' },
          init: { type: 'Literal', value: 0, raw: '0' } } ],
     kind: 'var' },

  //
  // `i < arr.length`
  //
  test:
   { type: 'BinaryExpression',
     operator: '<',
     left: { type: 'Identifier', name: 'i' },
     right:
      { type: 'MemberExpression',
        computed: false,
        object: { type: 'Identifier', name: 'arr' },
        property: { type: 'Identifier', name: 'length' } } },

  //
  // `i++`
  //
  update:
   { type: 'UpdateExpression',
     operator: '++',
     argument: { type: 'Identifier', name: 'i' },
     prefix: false },

  //
  // `acc += arr[i];`
  //
  body:
   { type: 'ExpressionStatement',
     expression:
      { type: 'AssignmentExpression',
        operator: '+=',
        left: { type: 'Identifier', name: 'acc' },
        right:
         { type: 'MemberExpression',
           computed: true,
           object: { type: 'Identifier', name: 'arr' },
           property: { type: 'Identifier', name: 'i' } } } } }

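The JSON above can also be visualized as follows:

It is a tree, so we could naturally traverse it from top to bottom, generating machine code as we visit the AST nodes. The problem with this approach is that the information about variables is very sparse and is spread across the different tree nodes.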
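A small sketch of such a top-to-bottom walk shows how scattered the variable information is. The `ast` object is a trimmed-down copy of two subtrees from the AST above, and `collectIdentifiers` is a hypothetical helper written for this illustration.

```javascript
// A trimmed-down copy of the `test` and `body` subtrees from the AST above.
const ast = {
  type: "ForStatement",
  test: {
    type: "BinaryExpression",
    operator: "<",
    left: { type: "Identifier", name: "i" },
    right: {
      type: "MemberExpression",
      computed: false,
      object: { type: "Identifier", name: "arr" },
      property: { type: "Identifier", name: "length" },
    },
  },
  body: {
    type: "ExpressionStatement",
    expression: {
      type: "AssignmentExpression",
      operator: "+=",
      left: { type: "Identifier", name: "acc" },
      right: {
        type: "MemberExpression",
        computed: true,
        object: { type: "Identifier", name: "arr" },
        property: { type: "Identifier", name: "i" },
      },
    },
  },
};

// Depth-first, top-to-bottom walk that records where each identifier occurs.
function collectIdentifiers(node, path, found) {
  if (node === null || typeof node !== "object") return found;
  if (node.type === "Identifier") {
    found.push(`${node.name} @ ${path}`);
    return found;
  }
  for (const key of Object.keys(node)) {
    collectIdentifiers(node[key], `${path}.${key}`, found);
  }
  return found;
}

const occurrences = collectIdentifiers(ast, "for", []);
const arrUses = occurrences.filter((entry) => entry.startsWith("arr @"));
console.log(arrUses.join("\n"));
```

The two uses of `arr` sit in unrelated subtrees (`test` and `body`), and nothing in the tree links them to each other or tells us whether either one modifies the array.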

Again, to safely move the length lookup out of the loop we need to know that the array length does not change between the loop’s iterations. Humans can do it easily just by looking at the source code, but the compiler needs to do quite a lot of work to confidently extract those facts directly from the AST.

Like many other compiler problems, this is often solved by lifting the data into a more appropriate abstraction layer, i.e. an intermediate representation. In this particular case, that choice of IR is known as a data-flow graph (DFG). Instead of talking about syntax entities (like for loops, expressions, …), we should talk about the data itself (read: variable values) and how it changes through the program.

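Data-flow Graph

In our example, the data we are interested in is the value of the variable arr. We want to be able to easily observe all of its uses, so that we can prove there are no out-of-bounds accesses and no operations that change its length.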

This is accomplished by introducing a “def-use” (definition and uses) relationship between the different data values. Concretely, it means that each value is declared once (a node) and is used somewhere to create new values (an edge for every use). Connecting the different values together forms a data-flow graph like this:


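Note the red part of the graph. The solid lines going out of it represent uses of this value. By iterating over those edges, the compiler can derive that the value of array is used at: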

  • loadArrayLength
  • checkIndex
  • load
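As a minimal sketch of the def-use idea, each value can be a node that records an edge for every use. The `ValueNode` class and the `index` node are hypothetical, and control dependencies are left out for simplicity.

```javascript
// Minimal value node: defined once, with an edge recorded for every use.
class ValueNode {
  constructor(opcode, inputs = []) {
    this.opcode = opcode;
    this.inputs = inputs; // values this node is computed from
    this.uses = []; // nodes that consume this value
    for (const input of inputs) input.uses.push(this);
  }
}

const array = new ValueNode("array");
const index = new ValueNode("index"); // stands in for the loop counter
const length = new ValueNode("loadArrayLength", [array]);
const check = new ValueNode("checkIndex", [array, index]);
const value = new ValueNode("load", [array, index]);

// Walking the outgoing edges answers "where is this value used?" directly.
const arrayUses = array.uses.map((node) => node.opcode);
console.log(arrayUses.join(", "));
```

Compared with the AST walk, no search is needed: the list of uses is stored on the value itself.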

Such graphs are constructed in a way that explicitly “clones” the array node if its value is accessed in a destructive manner (i.e. stores or length changes). Whenever we see the array node and observe its uses, we can be certain that its value does not change.

It may sound complicated, but this property of the graph is quite easy to achieve. The graph should follow Static Single Assignment (SSA) rules. In short, to convert any program to SSA the compiler needs to rename all assignments and later uses of the variables, to make sure that each variable is assigned only once.

Example, before SSA:

var a = 1;
console.log(a);
a = 2;
console.log(a);

After SSA:

var a0 = 1;
console.log(a0);
var a1 = 2;
console.log(a1);

This way, we can be sure that when we talk about a0, we are actually talking about a single assignment to it. This is really close to how people do things in functional languages!
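The renaming step can be sketched for straight-line code as follows. This is a simplified illustration: `toSSA` and its statement format are hypothetical, and code with branches or loops would additionally need phi nodes, which this sketch does not handle.

```javascript
// Straight-line SSA renaming only; branches and loops would also need phi nodes.
function toSSA(statements) {
  const version = new Map(); // variable name -> latest version number
  return statements.map((stmt) => {
    if (stmt.kind === "assign") {
      // Every assignment creates a fresh, never-reassigned name.
      const next = version.has(stmt.name) ? version.get(stmt.name) + 1 : 0;
      version.set(stmt.name, next);
      return `var ${stmt.name}${next} = ${stmt.value};`;
    }
    // A use refers to the most recent definition of the variable.
    if (!version.has(stmt.name)) {
      throw new Error(`use of undefined variable ${stmt.name}`);
    }
    return `console.log(${stmt.name}${version.get(stmt.name)});`;
  });
}

const ssa = toSSA([
  { kind: "assign", name: "a", value: 1 },
  { kind: "use", name: "a" },
  { kind: "assign", name: "a", value: 2 },
  { kind: "use", name: "a" },
]);
console.log(ssa.join("\n"));
```

Run on the "before SSA" example, this produces the same four lines as the "after SSA" listing.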

Seeing that loadArrayLength has no control dependency (i.e. no dashed lines; we will talk about them in a bit), the compiler may conclude that this node is free to move anywhere it wants to be and can be placed outside of the loop. By going through the graph further, we may observe that the value of the ssa:phi node is always between 0 and arr.length, so the checkIndex may be removed altogether.

Control Flow Graph (CFG)

We just used some form of data-flow analysis to extract information from the program. This allows us to make safe assumptions about how it could be optimized.

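The data-flow representation is very useful in many other ways too. The only problem is that by turning the code into this kind of graph, we took a step backwards on the way to machine code. This IR is even less suitable for generating machine code than the AST was.

Usually this problem is solved by grouping the nodes of the graph into blocks. This representation is called a control flow graph. Here is an example: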

b0 {
  i0 = literal 0
  i1 = literal 0

  i3 = array
  i4 = jump ^b0
}
b0 -> b1

b1 {
  i5 = ssa:phi ^b1 i0, i12
  i6 = ssa:phi ^i5, i1, i14

  i7 = loadArrayLength i3
  i8 = cmp "<", i6, i7
  i9 = if ^i6, i8
}
b1 -> b2, b3

b2 {
  i10 = checkIndex ^b2, i3, i6
  i11 = load ^i10, i3, i6
  i12 = add i5, i11
  i13 = literal 1
  i14 = add i6, i13
  i15 = jump ^b2
}
b2 -> b1

b3 {
  i16 = exit ^b3
}

It is called a graph for a reason: the bXX blocks are its nodes, and the bXX -> bYY arrows are its edges. Let’s visualize it:


As you can see, there is code before the loop in block b0, the loop header and loop test in b1, the loop body in b2, and the exit node in b3.

Translation to machine code is very easy from this form. We just replace the iXX identifiers with CPU register names and generate machine code for each instruction, line by line. (In some sense, CPU registers are a kind of variable. The CPU has a limited number of them, so we need to be careful not to run out.)
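The identifier-to-register replacement can be sketched as below. This is a deliberately naive illustration: the `REGISTERS` list is hypothetical, and `assignRegisters` never reuses a register once its value is dead, which a real register allocator would do.

```javascript
// Hypothetical register file; real CPUs and allocators are far more involved.
const REGISTERS = ["r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7"];

function assignRegisters(lines) {
  const mapping = new Map(); // iXX identifier -> register name
  return lines.map((line) =>
    line.replace(/\bi(\d+)\b/g, (id) => {
      if (!mapping.has(id)) {
        // Naive policy: hand out the next free register and never reclaim it.
        if (mapping.size === REGISTERS.length) {
          throw new Error("out of registers");
        }
        mapping.set(id, REGISTERS[mapping.size]);
      }
      return mapping.get(id);
    })
  );
}

// A few instructions from the loop body, with control inputs omitted.
const body = [
  "i11 = load i3, i6",
  "i12 = add i5, i11",
  "i14 = add i6, i13",
];
const assigned = assignRegisters(body);
console.log(assigned.join("\n"));
```

Because the instructions are already ordered inside their blocks, a single pass over the lines is enough. The `out of registers` error is the limit mentioned above.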

To recap, CFG has data-flow relations and also ordering. This allows us to utilize it for both data-flow analysis and machine code generation. However, attempting to optimize the CFG, by manipulating the blocks and their contents contained within it, can quickly become complex and error-prone.

Instead, Clifford Click and Keith D. Cooper proposed to use an approach called sea-of-nodes, the very topic of this blog post!

Sea-of-Nodes

Remember our fancy data-flow graph with dashed lines? Those dashed-lines are actually what make that graph a sea-of-nodes graph.

Instead of grouping nodes in blocks and ordering them, we choose to declare the control dependencies as the dashed edges in a graph. If we take that graph, remove everything non-dashed, and group things a bit, we will get:


With a bit of imagination and node reordering, we can see that this graph is the same as the simplified CFG graphs that we have just seen above:


Let’s take another look at the sea-of-nodes representation:

The striking difference between this graph and CFG is that there is no ordering of the nodes, except the ones that have control dependencies (in other words, the nodes participating in the control flow).

This representation is a very powerful way to look at the code. It has all the insights of the general data-flow graph, and it can be changed easily without constantly removing and replacing nodes in blocks.
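A minimal sketch of such a node, assuming a hypothetical `Node` class with data inputs (the solid edges) and an optional control input (the dashed edge), could look like this. The `start` and `loop` nodes and the exact wiring are simplifications for illustration, not the layout of any real compiler.

```javascript
// A node has data inputs (solid edges) and an optional control input (dashed edge).
class Node {
  constructor(opcode, { control = null, inputs = [] } = {}) {
    this.opcode = opcode;
    this.control = control;
    this.inputs = inputs;
  }
  // Nodes without a control dependency are not pinned to any place in the program.
  isFloating() {
    return this.control === null;
  }
}

const start = new Node("start");
const array = new Node("array");
const loop = new Node("loop", { control: start });
const index = new Node("ssa:phi", { control: loop });
const length = new Node("loadArrayLength", { inputs: [array] });
const check = new Node("checkIndex", { control: loop, inputs: [array, index] });
const load = new Node("load", { control: check, inputs: [array, index] });

// Only nodes without a control input are free to be scheduled anywhere.
const floating = [array, length, check, load]
  .filter((node) => node.isFloating())
  .map((node) => node.opcode);
console.log(floating.join(", "));
```

Here loadArrayLength comes out as floating, which matches the earlier observation that it can be placed outside of the loop, while checkIndex and load stay tied to the control flow.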
