SiSoftware theoretical performance test: 10980XE vs 3950X vs 7900X vs 9900K
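This post was last edited by jerrytsao on 2019-11-23 19:14
Zhuan-zong is taking time away to recharge and can't easily get on the forum, so I'm posting this on his behalf.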
https://www.sisoftware.co.uk/2019/11/20/intel-core-i9-10980x-cascade-lake-review-benchmarks-cpu-18-core-36-thread-avx512/
10980XE 18C vs 3950X 16C vs 7900X 10C vs 9900K 8C
Bracketed figures show the 10980XE (CSL-X) result relative to the 3950X (Ryzen3).

| Benchmark | 10980XE 18C | 9900K 8C | 7900X 10C | 3950X 16C | Comment |
|---|---|---|---|---|---|
| Native Dhrystone Integer (GIPS) | 779 [+3%] | 400 | 455 | 753 | CSL-X is just 3% faster than Ryzen3. |
| Native Dhrystone Long (GIPS) | 835 [+11%] | 393 | 448 | 750 | With a 64-bit integer workload, the gain is 11%. |
| Native FP32 (Float) Whetstone (GFLOPS) | 459 [-1%] | 236 | 262 | 464 | With a floating-point workload we have a tie. |
| Native FP64 (Double) Whetstone (GFLOPS) | 379 [-4%] | 196 | 223 | 393 | With FP64 it is 4% slower than Ryzen3. |
| Native Integer (Int32) Multi-Media (Mpix/s) | 2,341 [+25%] | 985 | 1,430 | 1,873 | In this vectorised integer test, AVX512 gives CSL-X a 25% win. |
| Native Long (Int64) Multi-Media (Mpix/s) | 913 [+23%] | 414 | 550 | 744 | With a 64-bit AVX2 integer workload the gain is 23%. |
| Native Quad-Int (Int128) Multi-Media (Mpix/s) | 12.92 [=] | 6.75 | 9.58 | 12.98 | A tough test using Long integers to emulate Int128 without SIMD; it is a tie. |
| Native Float/FP32 Multi-Media (Mpix/s) | 2,676 [+36%] | 914 | 1,740 | 1,970 | In this floating-point vectorised test, CSL-X is 36% faster. |
| Native Double/FP64 Multi-Media (Mpix/s) | 1,738 [+45%] | 535 | 1,140 | 1,200 | Switching to FP64 SIMD code, the gain is 45%. |
| Native Quad-Float/FP128 Multi-Media (Mpix/s) | 56.4 [+21%] | 23 | 38.7 | 46.5 | In this heavy algorithm, which uses FP64 to mantissa-extend FP128, CSL-X is 21% faster. |
| Crypto AES-256 (GB/s) | 33.9 [2.6x] | 17.6 | 34 | 13 | With AES/HWA support CSL-X wins thanks to its 4 memory channels. |
| Crypto AES-128 (GB/s) | 33.9 [2.6x] | 17.6 | 34 | 13 | No change with AES-128. |
| Crypto SHA2-256 (GB/s) | 33.5 [+17%] | 12 | 26 | 28.6 | Without SHA/HWA CSL-X still wins. |
| Crypto SHA1 (GB/s) | | 22.9 | 38 | | Less compute-intensive SHA1. |
| Crypto SHA2-512 (GB/s) | | 9 | 21 | | SHA2-512 is not accelerated by SHA/HWA. |
| Black-Scholes float/FP32 (MOPT/s) | | 276 | 344 | | With a non-vectorised workload. |
| Black-Scholes double/FP64 (MOPT/s) | 497 [+61%] | 238 | 277 | 308 | Using FP64, CSL-X is 61% faster than Ryzen3. |
| Binomial float/FP32 (kOPT/s) | | 59.9 | 68.3 | | Binomial uses thread-shared data and thus stresses the cache and memory system. |
| Binomial double/FP64 (kOPT/s) | 128 [+3%] | 61.6 | 68 | 124 | With FP64 code CSL-X is just 3% faster. |
| Monte-Carlo float/FP32 (kOPT/s) | | 56.5 | 257 | | Monte-Carlo also uses thread-shared data, but read-only, which reduces modify pressure on the caches. |
| Monte-Carlo double/FP64 (kOPT/s) | 178 [-16%] | 44.5 | 103 | 212 | Switching to FP64, CSL-X is 16% slower. |
| SGEMM (GFLOPS) float/FP32 | | 375 | 413 | | A tough vectorised AVX2/FMA algorithm. |
| DGEMM (GFLOPS) double/FP64 | 240 [+45%] | 209 | 212 | 165 | With FP64 vectorised code, CSL-X is 45% faster. |
| SFFT (GFLOPS) float/FP32 | | 22.3 | 28.6 | | FFT is also heavily vectorised but stresses the memory sub-system more. |
| DFFT (GFLOPS) double/FP64 | 22.07 [2.6x] | 11.21 | 14.6 | 8.56 | With FP64 code, CSL-X is over 2x faster. |
| SNBODY (GFLOPS) float/FP32 | | 557 | 638 | | N-Body simulation is vectorised but with more memory accesses. |
| DNBODY (GFLOPS) double/FP64 | 292 [-25%] | 171 | 195 | 388 | With FP64 code CSL-X is 25% slower. |
| Blur (3×3) Filter (MPix/s) | 7,295 [2.53x] | 2,560 | 4,880 | 2,883 | In this vectorised integer workload CSL-X is 2.5x faster. |
| Sharpen (5×5) Filter (MPix/s) | 2,868 [+54%] | 1,000 | 1,880 | 1,857 | Same algorithm but more shared data; still 54% faster. |
| Motion-Blur (7×7) Filter (MPix/s) | 1,724 [+80%] | 519 | 1,000 | 959 | Same algorithm but even more shared data; 80% faster. |
| Edge Detection (2*5×5) Sobel Filter (MPix/s) | 2,285 [+44%] | 827 | 1,500 | 1,589 | Different algorithm but still a vectorised workload; 44% faster. |
| Noise Removal (5×5) Median Filter (MPix/s) | 332 [+91%] | 78 | 221 | 174 | Still vectorised code; again almost 2x faster. |
| Oil Painting Quantise Filter (MPix/s) | 112 [2.25x] | 42.2 | 66.7 | 49.7 | An even better improvement here: 2.25x. |
| Diffusion Randomise (XorShift) Filter (MPix/s) | 3,573 [2.37x] | 4,000 | 3,084 | 1,505 | With an integer workload, about 2.4x faster. |
| Marbling Perlin Noise 2D Filter (MPix/s) | 1,162 [+85%] | 596 | 776 | 627 | In this final test, again with an integer workload, CSL-X is 85% faster. |
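
The bracketed figures in the results above can be reproduced from the raw scores. The following is a minimal sketch, assuming the first score in each row is the 10980XE and the last is the 3950X (which matches the commentary). The helper name `relative_delta` and the rule of switching to an "x" multiplier at 2x or more are illustrative assumptions, not SiSoftware's own method.

```python
def relative_delta(candidate: float, baseline: float) -> str:
    """Format candidate vs baseline in the style of the bracketed figures.

    Assumption: ratios of 2x or more are shown as a multiplier,
    anything smaller as a signed percentage, and a 0% difference as "=".
    """
    ratio = candidate / baseline
    if ratio >= 2.0:
        return f"{ratio:.2f}x"
    pct = round((ratio - 1.0) * 100)
    if pct == 0:
        return "="
    return f"{pct:+d}%"


# (10980XE score, 3950X score) copied from the results above.
samples = {
    "Dhrystone Integer (GIPS)": (779, 753),
    "Double/FP64 Multi-Media (Mpix/s)": (1738, 1200),
    "DNBODY (GFLOPS)": (292, 388),
    "Oil Painting Quantise Filter (MPix/s)": (112, 49.7),
}

for name, (csl_x, ryzen) in samples.items():
    print(f"{name}: {relative_delta(csl_x, ryzen)}")
```

Running the same helper over any other row is a quick way to spot a comment that disagrees with its own numbers.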
Conclusion
Thanks to AVX512, CSL-X easily beats Ryzen3 in heavily vectorised algorithms (up to 50% faster) and also in memory-bandwidth-heavy algorithms (thanks to its 4-channel memory sub-system). But despite having 2 extra cores, on older AVX2/FMA code we pretty much have a tie, which is not something we are used to seeing from Intel.
Also, the improvement over the older SKL-X is in line with the increase in cores (18 vs 10 here), so there are no appreciable per-core improvements to boost performance. Without specific VNNI-accelerated algorithms, there is no point for SKL-X users to upgrade: you do get more cores for a lot less money, but your existing hardware is also worth a lot less.
This shows how much Ryzen3 has improved (especially thanks to its 256-bit-wide AVX2/FMA units); ThreadRipper, with 4 memory channels and even more cores (up to 32) and threads (up to 64), should nullify Intel's AVX512 benefit.
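
The core-scaling claim can be checked with simple arithmetic: divide the observed speed-up over the 7900X by the 18/10 core ratio. This is a rough sketch that assumes the third score in each result row belongs to the 7900X; `scaling_efficiency` is a hypothetical helper name, and only four rows are sampled.

```python
CORE_RATIO = 18 / 10  # 10980XE cores divided by 7900X cores


def scaling_efficiency(new: float, old: float, core_ratio: float = CORE_RATIO) -> float:
    """Observed speed-up divided by the core-count ratio (1.0 = perfect scaling)."""
    return (new / old) / core_ratio


# (10980XE score, 7900X score) copied from the results above.
scores = {
    "Dhrystone Integer": (779, 455),
    "Dhrystone Long": (835, 448),
    "Whetstone FP32": (459, 262),
    "Whetstone FP64": (379, 223),
}

for name, (csl_x, skl_x) in scores.items():
    speedup = csl_x / skl_x
    efficiency = scaling_efficiency(csl_x, skl_x)
    print(f"{name}: {speedup:.2f}x speed-up, {efficiency:.0%} of the core ratio")
```

Values close to 100% are consistent with gains that come from the extra cores rather than from per-core changes.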