Read this alongside the following figure: https://dl.dropboxusercontent.com/u/32077444/nsight.pdf

1) The bars on the Summary Page of the Profiler show how bottlenecked each unit was, as a percentage, for the selected draw call(s). This gives you a feel for which part of the pipeline to target for optimization, rather than just trying things and seeing whether the FPS changes. In your case you are showing ~75-80%, which means you can try to improve your shader source, and that should help the performance of the 5 draw calls in your selected Draw Call Group. Note that a unit doesn’t have to be a 100% bottleneck to be worth investigating. Even if it is a bottleneck only 10% of the time, it still prevented you from achieving the optimal throughput for a given call, so if you are, say, 20% texture bound, you can still try the standard optimizations like filtering and mipmapping to see how they affect performance.
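To get a rough feel for why even a small bottleneck percentage is worth chasing, here is a back-of-the-envelope sketch in Python. It assumes a simple Amdahl's-law-style model: the draw call is limited by one unit for a fraction of its time, and only that fraction gets faster when the unit's work is optimized. This model, the function name, and the sample numbers are illustrative assumptions. They are not how Nsight computes or interprets its bars.

```python
def estimated_draw_time(time_ms, bound_fraction, unit_speedup):
    """Rough estimate of a draw call's time after optimizing one unit.

    Assumption (not from Nsight): the call is limited by the unit for
    `bound_fraction` of its time, and only that part gets `unit_speedup`
    times faster.
    """
    if not 0.0 <= bound_fraction <= 1.0:
        raise ValueError("bound_fraction must be between 0 and 1")
    if unit_speedup <= 0:
        raise ValueError("unit_speedup must be positive")
    unaffected = 1.0 - bound_fraction
    return time_ms * (unaffected + bound_fraction / unit_speedup)


# Hypothetical draw call: 2.0 ms, 20% texture bound, and texture work
# made twice as fast (for example through better mipmapping).
before = 2.0
after = estimated_draw_time(before, bound_fraction=0.20, unit_speedup=2.0)
print(f"before: {before:.2f} ms, after: {after:.2f} ms")
```

Under these assumptions the call drops by roughly a tenth of its time, which is small per draw but can add up across a frame.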

2) The gaps in the Frame Timings graph are sometimes outside your control. It can help to run an analysis session on your frames to get a feel for how full your command buffers are and what might be causing a gap (resource uploads, for example). We don’t expose more detail than that on this screen, and without a repro it is hard to tell exactly what caused the gap.

3) You asked about the 3 timing values in the Frame Timings and what might be considered “good”. The 3 values represent 2 ways to measure the draw call timing and 1 calculated value:

a. EPC/Empty Pipeline Cost: This measures each draw call, one at a time, as it flows from the top to the bottom of the pipe. We add a flush before and after each call, so you can consider this an absolute cost for the draw call that ignores everything else, such as pipeline width and resource contention (both positive and negative). It is helpful for knowing how much each draw call costs in isolation.

b. FPC/Full Pipeline Cost: We measure this value with all draw calls in flight, but bookended by pipeline reports that give us the time from the start of each draw call (first vertex being processed) to its end (last fragment retired to the frame buffer). This means that any resource contention is taken into account, such as hitting the texture unit and either warming or dirtying the cache, or having so many threads in flight that the shader units are fully occupied and cannot start on new work. This gives you a “real world” cost for every draw call.

c. IDC/Incremental Draw Cost: This is a calculated value that takes into account any overlap between draw calls. Say you have 2 identical draw calls, each of which takes up about ½ of the full pipeline width. Each one’s EPC and FPC are likely to be very close, but because each only takes up ½ of the width, the incremental (additional) cost of the second call might actually be 0: it can execute fully in parallel with the first call. So the FPS would be the same with 1 draw or 2, and the IDC would be the full cost for the first call and 0 for the second.
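The overlap idea behind IDC can be sketched in Python. The exact calculation Nsight uses is not given here, so this is only an illustrative model under one assumption: each draw call has a measured start and end time, and its incremental cost is the part of that interval that extends past all earlier draw calls. The function name and the timestamps are hypothetical.

```python
def incremental_draw_costs(intervals):
    """Return the time each draw adds beyond the work already in flight.

    `intervals` is a list of (start, end) times per draw call, in
    submission order. This is an illustrative model, not Nsight's
    documented formula.
    """
    costs = []
    busy_until = float("-inf")  # latest end time among earlier draws
    for start, end in intervals:
        # Only the part of this draw that runs past earlier draws counts.
        costs.append(max(0.0, end - max(start, busy_until)))
        busy_until = max(busy_until, end)
    return costs


# Two identical draws that each use half the pipe and run side by side.
parallel = incremental_draw_costs([(0.0, 10.0), (0.0, 10.0)])
# The same two draws forced to run back to back.
serial = incremental_draw_costs([(0.0, 10.0), (10.0, 20.0)])
print("parallel:", parallel)
print("serial:  ", serial)
```

In the parallel case the second draw adds nothing, which matches the example above: full cost for the first call and 0 for the second. In the serial case each draw pays its full cost.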

4) On the Memory Screen, you asked if there was a breakdown per shader or draw request. This is what the state buckets are for. By pressing the button on the toolbar, you can group draw calls by shared state (in this case, the shader in question), and you will then see the stats for just those draw calls. You can also group by performance markers, so you can organize them pretty much however you want.
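Conceptually, a state bucket is a group-by over draw calls. The Python sketch below shows the idea on made-up data: the record fields (`shader`, `marker`, `bytes`), the draw call IDs, and the numbers are all hypothetical, and the tool does this grouping for you in its UI.

```python
from collections import defaultdict

# Hypothetical per-draw records; every name and number here is made up.
draws = [
    {"id": 112, "shader": "water_ps", "marker": "Scene", "bytes": 120_000},
    {"id": 113, "shader": "water_ps", "marker": "Scene", "bytes": 90_000},
    {"id": 114, "shader": "terrain_ps", "marker": "Scene", "bytes": 60_000},
    {"id": 116, "shader": "blur_ps", "marker": "PostFX", "bytes": 60_000},
]


def bucket_totals(draws, key):
    """Group draw calls by a shared attribute and total their stats."""
    totals = defaultdict(lambda: {"draws": 0, "bytes": 0})
    for draw in draws:
        bucket = totals[draw[key]]
        bucket["draws"] += 1
        bucket["bytes"] += draw["bytes"]
    return dict(totals)


by_shader = bucket_totals(draws, "shader")   # bucket by shared shader
by_marker = bucket_totals(draws, "marker")   # bucket by performance marker
print(by_shader["water_ps"])
print(by_marker["Scene"])
```

Switching the `key` argument is the analogue of choosing a different piece of shared state, or a performance marker, as the grouping criterion.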

5) On the Memory Screen, you asked whether the 330k was reads or writes: it is the sum of both. We don’t yet break out reads vs. writes, but we could consider it as a future enhancement.

6) Your other question on the Memory screen was what the 3.6GB of bandwidth between L2 and Memory was: that is the number of bytes written. I must confess that I am puzzled by the number, because it should basically be the sum of the write operations that go through the L2, and most of them should come via the Framebuffer unit. If I can get access to your app, it would help me understand whether we have a bug there or just a number that isn’t reported.

7) On the Bottleneck screen, you asked about drilling into the shader bottleneck information. We don’t currently support this, but it is a feature we have considered, and we have already laid some of the groundwork for it in our CUDA tools. I will add you to the list of requesters for that capability.

8) You asked how the Framebuffer could be a bottleneck when rendering a full screen quad. That is because, in NVIDIA language, the Framebuffer basically represents the memory controller. All requests for memory, whether from the blending unit, texture unit, shader, etc., go through the Framebuffer unit. Are you doing lots of lookups in draw call 116?

9) Utilization is generally trying to show you how much of the available horsepower you used during the time the draw call took. To give more detail I would need to know what your workload was and possibly sample additional data, but it is possible that the shader unit is underutilized because it was bottlenecked waiting for data inside the shader unit, such as L1 values to return, local memory, or other resource contention.

Reposted from: https://www.cnblogs.com/Baesky/p/hellNsight.html

Original title: What do the colors and abbreviations in the Nsight statistics mean? A fairly official explanation from NVIDIA's Jeff Kiel.
