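In a CPU, micro-operations (also known as micro-ops or μops) are detailed low-level instructions used in some designs to implement complex machine instructions.

Usually, micro-operations perform basic operations on data stored in one or more registers, including transferring data between registers or between registers and the CPU's external buses, and performing arithmetic or logical operations on registers. In a typical fetch-decode-execute cycle, each step of a macro-instruction is decomposed during its execution, so the CPU determines and steps through a series of micro-operations. The execution of micro-operations is performed under the control of the CPU's control unit, which decides on their execution while applying optimizations such as reordering, fusion, and caching.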

Optimizations

Original Wikipedia text:

Micro-operation

Figure: A high-level illustration showing the decomposition of machine instructions into micro-operations, performed during typical fetch-decode-execute cycles.[1]:1
In computer central processing units, micro-operations (also known as micro-ops or μops) are detailed low-level instructions used in some designs to implement complex machine instructions (sometimes termed macro-instructions in this context).[2]:8–9

Usually, micro-operations perform basic operations on data stored in one or more registers, including transferring data between registers or between registers and external buses of the central processing unit (CPU), and performing arithmetic or logical operations on registers. In a typical fetch-decode-execute cycle, each step of a macro-instruction is decomposed during its execution so the CPU determines and steps through a series of micro-operations. The execution of micro-operations is performed under control of the CPU’s control unit, which decides on their execution while performing various optimizations such as reordering, fusion and caching.[1]
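The decomposition described above can be sketched in a few lines of Python. This is only a toy model, not how any real CPU encodes its micro-operations: the memory-destination instruction `ADD [addr], reg`, the scratch register `tmp`, and the function names `decompose_add_mem_reg` and `run_uops` are all assumptions made for illustration.

```python
# Toy model: a hypothetical "ADD [addr], reg" macro-instruction is
# decomposed into three register-level micro-operations.

def decompose_add_mem_reg(addr, reg):
    """Return the micro-op sequence for the hypothetical ADD [addr], reg."""
    return [
        ("load", "tmp", addr),        # tmp <- memory[addr]
        ("add", "tmp", "tmp", reg),   # tmp <- tmp + reg
        ("store", addr, "tmp"),       # memory[addr] <- tmp
    ]


def run_uops(uops, regs, memory):
    """Step through the micro-ops one at a time, as a control unit would."""
    for uop in uops:
        kind = uop[0]
        if kind == "load":
            _, dst, addr = uop
            regs[dst] = memory[addr]
        elif kind == "add":
            _, dst, a, b = uop
            regs[dst] = regs[a] + regs[b]
        elif kind == "store":
            _, addr, src = uop
            memory[addr] = regs[src]
        else:
            raise ValueError(f"unknown micro-op: {kind}")
    return regs, memory


# Usage: memory[0x10] starts at 7 and r1 holds 5.
regs, memory = run_uops(decompose_add_mem_reg(0x10, "r1"), {"r1": 5}, {0x10: 7})
```

Each tuple touches only registers or one memory location, which mirrors the "basic operations on registers and buses" the paragraph describes.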

Optimizations

Various forms of μops have long been the basis for traditional microcode routines, which are used to simplify the implementation of a particular CPU design or just the sequencing of certain multi-step operations or addressing modes. More recently, μops have also been employed in a different way to let modern CISC processors more easily handle asynchronous parallel and speculative execution. As with traditional microcode, one or more table lookups (or equivalent) are done to locate the appropriate μop sequence based on the encoding and semantics of the machine instruction (the decoding or translation step). However, instead of rigid μop sequences controlling the CPU directly from a microcode ROM, the μops are dynamically buffered for rescheduling before being executed.[3]:6–7, 9–11
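A minimal sketch of the decode step described above, assuming a made-up two-entry lookup table. The opcodes `INC_MEM` and `MOV_REG`, the micro-op names, and `decode_into_buffer` are hypothetical. The point is only that decoded μops land in a buffer instead of driving the execution units directly.

```python
from collections import deque

# Hypothetical decode table: opcode -> ordered micro-op names.
UOP_TABLE = {
    "INC_MEM": ["load", "add1", "store"],
    "MOV_REG": ["move"],
}


def decode_into_buffer(instructions, buffer):
    """Look up each opcode and append its micro-ops to the buffer.

    Each buffered entry is tagged with the index of the macro-instruction
    it came from, so a later scheduling stage can still tell them apart.
    """
    for index, opcode in enumerate(instructions):
        for uop in UOP_TABLE[opcode]:   # the table lookup (decode step)
            buffer.append((index, uop))
    return buffer


# Usage: decode two macro-instructions into an initially empty buffer.
uop_buffer = decode_into_buffer(["INC_MEM", "MOV_REG"], deque())
```

A `deque` is used here simply because entries are appended at one end and later consumed; a real design would use fixed-size hardware queues.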

This buffering means that the fetch and decode stages can be more detached from the execution units than is feasible in a more traditional microcoded (or hard-wired) design. Because this allows a degree of freedom regarding execution order, it makes it possible to extract some instruction-level parallelism from a normal single-threaded program (provided that dependencies are checked). It also allows more analysis, and therefore reordering of code sequences, to dynamically optimize the mapping and scheduling of μops onto machine resources such as ALUs and load/store units. Because this happens at the μop level, sub-operations of different machine (macro) instructions may often intermix in a particular μop sequence, forming partially reordered machine instructions as a direct consequence of the out-of-order dispatching of microinstructions from several macro-instructions. This is not the same as micro-op fusion, in which a more complex microinstruction replaces a few simpler microinstructions in certain cases, typically to minimize state changes and usage of queue and reorder buffer space, thereby reducing power consumption. Micro-op fusion is used in some modern CPU designs.[2]:89–91, 105–106[3]:6–7, 9–15
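The dependency-checked reordering described above can be illustrated with a small greedy scheduler. This is a sketch under strong assumptions, not a model of any real scheduler: registers are assumed to be already renamed (each destination is written once), every μop takes one cycle, and the function name `schedule` and its `width` parameter are invented for the example.

```python
def schedule(uops, width=2):
    """Greedy out-of-order issue; returns the uop names issued in each cycle.

    Each uop is (name, dst, srcs). Assumes registers are already renamed:
    every dst is written once, and a src is either an initial value or the
    dst of an earlier uop. All uops take one cycle.
    """
    produced = {dst for _, dst, _ in uops}   # registers written by some uop
    available = set()                        # results completed so far
    pending = list(uops)                     # the buffer, in program order
    cycles = []
    while pending:
        issued = []
        for uop in pending:
            _, _, srcs = uop
            # Dependency check: every source must be an initial value or
            # a result completed in an earlier cycle.
            ready = all(s in available or s not in produced for s in srcs)
            if ready and len(issued) < width:
                issued.append(uop)
        if not issued:
            raise RuntimeError("no uop is ready; dependency cycle in input")
        for uop in issued:
            pending.remove(uop)
            available.add(uop[1])            # visible from the next cycle on
        cycles.append([name for name, _, _ in issued])
    return cycles


# Two macro-instructions, A and B, each decoded into a load and an add.
program = [
    ("A.load", "t1", ["addr_a"]),
    ("A.add", "t2", ["t1", "r1"]),
    ("B.load", "t3", ["addr_b"]),
    ("B.add", "t4", ["t3", "r2"]),
]
plan = schedule(program, width=2)
```

With two issue slots, `B.load` is issued alongside `A.load` and ahead of `A.add`, so the μops of the two macro-instructions intermix, as the paragraph describes.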

Execution optimization has gone even further: processors not only translate many machine instructions into a series of μops but also do the opposite when appropriate, combining certain machine instruction sequences (such as a compare followed by a conditional jump) into a more complex μop that fits the execution model better and can thus be executed faster or with fewer machine resources. This is known as macro-op fusion.[2]:106–107[3]:12–13
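As a rough illustration of the compare-plus-jump case mentioned above, the sketch below fuses an adjacent `cmp`/`jne` pair into a single entry. The tuple encoding, the fused name `cmp_jne`, and the function `decode_with_macro_fusion` are assumptions for this example. Real processors apply additional conditions before fusing that are not modeled here.

```python
def decode_with_macro_fusion(instructions):
    """Decode a list of instruction tuples, fusing cmp + jne pairs.

    A ("cmp", a, b) immediately followed by ("jne", target) becomes one
    ("cmp_jne", a, b, target) micro-op; everything else passes through.
    """
    uops = []
    i = 0
    while i < len(instructions):
        cur = instructions[i]
        nxt = instructions[i + 1] if i + 1 < len(instructions) else None
        if cur[0] == "cmp" and nxt is not None and nxt[0] == "jne":
            uops.append(("cmp_jne", cur[1], cur[2], nxt[1]))
            i += 2   # both macro-instructions were consumed
        else:
            uops.append(cur)
            i += 1
    return uops


# Usage: the last two instructions collapse into one fused micro-op.
fused = decode_with_macro_fusion(
    [("mov", "r1", "r2"), ("cmp", "r1", "r3"), ("jne", "loop")]
)
```

Three macro-instructions go in and two entries come out, so the fused pair occupies one slot in later stages instead of two.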

Another way to try to improve performance is to cache the decoded micro-operations, so that if the same macroinstruction is executed again, the processor can directly access the decoded micro-operations from a special cache instead of decoding them again. The Execution Trace Cache found in the Intel NetBurst microarchitecture (Pentium 4) is a widespread example of this technique.[4] The size of this cache may be stated in terms of how many thousands of micro-operations it can store: kμops.[5]
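The caching idea above can be sketched as a lookup keyed by instruction address. This is a deliberately simplified assumption: the class `UopCache`, the helper `toy_decode`, and the unbounded dictionary are illustrative only. The sketch does not model how the NetBurst trace cache organizes its contents, and it ignores capacity and eviction.

```python
def toy_decode(instruction):
    """Hypothetical decoder: pretend every instruction yields two uops."""
    return [f"{instruction}.uop0", f"{instruction}.uop1"]


class UopCache:
    """Remember decoded micro-ops by instruction address."""

    def __init__(self, decode):
        self._decode = decode
        self._lines = {}      # address -> list of decoded micro-ops
        self.hits = 0
        self.misses = 0

    def fetch(self, address, instruction):
        """Return the micro-ops for an address, decoding only on a miss."""
        if address in self._lines:
            self.hits += 1
        else:
            self.misses += 1
            self._lines[address] = self._decode(instruction)
        return self._lines[address]


# Usage: the second fetch of the same address skips the decoder.
cache = UopCache(toy_decode)
first = cache.fetch(0x40, "inc_mem")
second = cache.fetch(0x40, "inc_mem")
```

In a loop that runs the same instructions repeatedly, almost every fetch after the first iteration would be a hit, which is where the decode work is saved.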
