SiSoftware theoretical performance benchmarks: 10980XE vs 3950X vs 7900X vs 9900K

This post was last edited by jerrytsao on 2019-11-23 19:14

转总 is lying low these days and can't easily get on the forum, so I'm posting this on his behalf.

https://www.sisoftware.co.uk/2019/11/20/intel-core-i9-10980x-cascade-lake-review-benchmarks-cpu-18-core-36-thread-avx512/

10980XE 18C vs 3950X 16C vs 7900X 10C vs 9900K 8C. The bracketed figure is the 10980XE's gain over the 3950X.

| Test | 10980XE 18C (CSL-X) | 9900K 8C | 7900X 10C (SKL-X) | 3950X 16C (Ryzen3) | Comment |
|---|---|---|---|---|---|
| Native Dhrystone Integer (GIPS) | 779 [+3%] | 400 | 455 | 753 | CSL-X is just 3% faster than Ryzen3. |
| Native Dhrystone Long (GIPS) | 835 [+11%] | 393 | 448 | 750 | With a 64-bit integer workload the gain is 11%. |
| Native FP32 (Float) Whetstone (GFLOPS) | 459 [-1%] | 236 | 262 | 464 | With a floating-point workload we have a tie. |
| Native FP64 (Double) Whetstone (GFLOPS) | 379 [-4%] | 196 | 223 | 393 | With FP64 it is 4% slower than Ryzen3. |
| Native Integer (Int32) Multi-Media (Mpix/s) | 2,341 [+25%] | 985 | 1,430 | 1,873 | In this vectorised integer test, AVX512 gives CSL-X a 25% win. |
| Native Long (Int64) Multi-Media (Mpix/s) | 913 [+23%] | 414 | 550 | 744 | With a 64-bit AVX2 integer workload the gain is 23%. |
| Native Quad-Int (Int128) Multi-Media (Mpix/s) | 12.92 [=] | 6.75 | 9.58 | 12.98 | A tough test that uses Long integers to emulate Int128 without SIMD; it is a tie. |
| Native Float/FP32 Multi-Media (Mpix/s) | 2,676 [+36%] | 914 | 1,740 | 1,970 | In this floating-point vectorised test, CSL-X is 36% faster. |
| Native Double/FP64 Multi-Media (Mpix/s) | 1,738 [+45%] | 535 | 1,140 | 1,200 | Switching to FP64 SIMD code, the gain is 45%. |
| Native Quad-Float/FP128 Multi-Media (Mpix/s) | 56.4 [+21%] | 23 | 38.7 | 46.5 | In this heavy algorithm, which uses FP64 to mantissa-extend FP128, CSL-X is 21% faster. |
| Crypto AES-256 (GB/s) | 33.9 [2.6x] | 17.6 | 34 | 13 | With AES/HWA support CSL-X wins thanks to its 4 memory channels. |
| Crypto AES-128 (GB/s) | 33.9 [2.6x] | 17.6 | 34 | 13 | No change with AES128. |
| Crypto SHA2-256 (GB/s) | 33.5 [+17%] | 12 | 26 | 28.6 | Without SHA/HWA CSL-X still wins. |
| Crypto SHA1 (GB/s) | n/a | 22.9 | 38 | n/a | Less compute-intensive SHA1. |
| Crypto SHA2-512 (GB/s) | n/a | 9 | 21 | n/a | SHA2-512 is not accelerated by SHA/HWA. |
| Black-Scholes float/FP32 (MOPT/s) | n/a | 276 | 344 | n/a | A non-vectorised workload. |
| Black-Scholes double/FP64 (MOPT/s) | 497 [+61%] | 238 | 277 | 308 | Using FP64 CSL-X is 61% faster than Ryzen3. |
| Binomial float/FP32 (kOPT/s) | n/a | 59.9 | 68.3 | n/a | Binomial uses thread-shared data and thus stresses the cache & memory system. |
| Binomial double/FP64 (kOPT/s) | 128 [+3%] | 61.6 | 68 | 124 | With FP64 code CSL-X is just 3% faster. |
| Monte-Carlo float/FP32 (kOPT/s) | n/a | 56.5 | 257 | n/a | Monte-Carlo also uses thread-shared data, but read-only, which reduces modify pressure on the caches. |
| Monte-Carlo double/FP64 (kOPT/s) | 178 [-16%] | 44.5 | 103 | 212 | Switching to FP64, CSL-X is 16% slower. |
| SGEMM (GFLOPS) float/FP32 | n/a | 375 | 413 | n/a | A tough vectorised AVX2/FMA algorithm. |
| DGEMM (GFLOPS) double/FP64 | 240 [+45%] | 209 | 212 | 165 | With FP64 vectorised code, CSL-X is 45% faster. |
| SFFT (GFLOPS) float/FP32 | n/a | 22.3 | 28.6 | n/a | FFT is also heavily vectorised but stresses the memory sub-system more. |
| DFFT (GFLOPS) double/FP64 | 22.07 [+2.6x] | 11.21 | 14.6 | 8.56 | With FP64 code, CSL-X is about 2.6x faster. |
| SNBODY (GFLOPS) float/FP32 | n/a | 557 | 638 | n/a | N-Body simulation is vectorised but with more memory accesses. |
| DNBODY (GFLOPS) double/FP64 | 292 [-25%] | 171 | 195 | 388 | With FP64 code CSL-X is 25% slower. |
| Blur (3×3) Filter (MPix/s) | 7,295 [+2.53x] | 2,560 | 4,880 | 2,883 | In this vectorised integer workload CSL-X is 2.5x faster. |
| Sharpen (5×5) Filter (MPix/s) | 2,868 [+54%] | 1,000 | 1,880 | 1,857 | Same algorithm but more shared data; still 54% faster. |
| Motion-Blur (7×7) Filter (MPix/s) | 1,724 [+80%] | 519 | 1,000 | 959 | Same algorithm but even more shared data; 80% faster. |
| Edge Detection (2*5×5) Sobel Filter (MPix/s) | 2,285 [+44%] | 827 | 1,500 | 1,589 | A different algorithm but still a vectorised workload; 44% faster. |
| Noise Removal (5×5) Median Filter (MPix/s) | 332 [+91%] | 78 | 221 | 174 | Still vectorised code; again almost 2x faster. |
| Oil Painting Quantise Filter (MPix/s) | 112 [+2.25x] | 42.2 | 66.7 | 49.7 | An even better improvement here: 2.25x. |
| Diffusion Randomise (XorShift) Filter (MPix/s) | 3,573 [2.37x] | 4,000 | 3,084 | 1,505 | With this integer workload CSL-X is about 2.4x faster. |
| Marbling Perlin Noise 2D Filter (MPix/s) | 1,162 [+85%] | 596 | 776 | 627 | In this final test, again an integer workload, CSL-X is 85% faster. |
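
The bracketed figures in the rows above compare the 10980XE with the 3950X (the last score in each row). Below is a minimal Python sketch of that arithmetic, using a few scores copied from those rows. The function names and the switch from percentages to multipliers at 2x are assumptions made for illustration, not SiSoftware's own method.

```python
# (10980XE, 3950X) scores copied from the rows above.
scores = {
    "Dhrystone Integer (GIPS)": (779, 753),
    "Double/FP64 Multi-Media (Mpix/s)": (1738, 1200),
    "Monte-Carlo FP64 (kOPT/s)": (178, 212),
    "DFFT FP64 (GFLOPS)": (22.07, 8.56),
}


def relative_gain(csl_x, ryzen):
    """Fractional gain of the 10980XE over the 3950X (0.03 means +3%)."""
    return csl_x / ryzen - 1.0


def format_gain(gain):
    """Show a signed percentage below 2x and a multiplier from 2x upward
    (an assumed convention that mimics the bracketed figures)."""
    ratio = 1.0 + gain
    if ratio >= 2.0:
        return f"{ratio:.2f}x"
    return f"{gain * 100:+.0f}%"


for name, (csl_x, ryzen) in scores.items():
    print(f"{name}: {format_gain(relative_gain(csl_x, ryzen))}")
```

A negative result means the 3950X is ahead, as in the Monte-Carlo FP64 row.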

Conclusion

Thanks to AVX512, CSL-X easily beats Ryzen3 in heavily vectorised algorithms (up to 50% faster) and in memory-bandwidth-heavy algorithms (thanks to its 4-channel memory sub-system). But despite its 2 extra cores, on older AVX2/FMA code we pretty much have a tie, which is not something we are used to seeing from Intel.

Also, the improvement over the older SKL-X is roughly in line with the increase in cores (18 vs 10 here), so there are no appreciable per-core improvements to boost performance. Without specific VNNI-accelerated algorithms, there is no point for SKL-X users to upgrade: you do get more cores for a lot less money, but your hardware is also worth a lot less.
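
One way to check the core-scaling claim is to divide each 10980XE score by the matching 7900X score and compare the result with the 18/10 = 1.8x core ratio. The sketch below does this for a handful of rows. It assumes the third score in each row is the 7900X's and ignores clock-speed differences, so treat it as a rough illustration rather than the review's own analysis.

```python
CORE_RATIO = 18 / 10  # 10980XE cores divided by 7900X cores

# (10980XE, 7900X) scores copied from the benchmark rows above.
scores = {
    "Dhrystone Integer": (779, 455),
    "Whetstone FP32": (459, 262),
    "Int32 Multi-Media": (2341, 1430),
    "Black-Scholes FP64": (497, 277),
    "DGEMM FP64": (240, 212),
}


def scaling_efficiency(new, old, core_ratio=CORE_RATIO):
    """Speed-up divided by the core ratio; 1.0 means the score grew
    exactly as fast as the core count."""
    return (new / old) / core_ratio


for name, (csl_x, skl_x) in scores.items():
    speedup = csl_x / skl_x
    efficiency = scaling_efficiency(csl_x, skl_x)
    print(f"{name}: {speedup:.2f}x speed-up, {efficiency:.0%} of the core ratio")
```

Values near 100% support the "scales with cores" reading. Rows well below it, such as DGEMM, suggest another limit such as memory bandwidth.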

It shows how much Ryzen3 has improved (especially thanks to its 256-bit wide AVX2/FMA units). ThreadRipper, with 4 memory channels and even more cores (up to 32) and threads (up to 64), should nullify Intel's AVX512 benefit.
