env: MANJARO ARM aarch64
commit id: fff16a025d21feb11ae51e86365abd8bfd86e900
2022 Apr 28

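A note up front:

I also ran the same tests on 64-bit Raspberry Pi OS. It took a lot of effort for very little gain, and a working Vulkan driver was hard to find. Vulkan is easier to set up on Manjaro, so here are the ncnn benchmark results on a Raspberry Pi 4. The conclusion first: with 512 MB of memory allocated to the GPU, the Raspberry Pi GPU with Vulkan performs noticeably worse than the CPU. The likely cause is that Vulkan support on the Raspberry Pi is still immature.
Another Vulkan guide: https://qengineering.eu/install-vulkan-on-raspberry-pi.html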
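
Each run below prints five settings (loop_count, num_threads, powersave, gpu_device, cooling_down). These are the positional arguments of ncnn's benchncnn tool, in that order. The following is a minimal sketch of how the four runs map to command lines. It assumes the binary sits at ./benchncnn in the working directory; that path and the helper name build_benchncnn_command are illustrative and do not come from the original runs.

```python
def build_benchncnn_command(loop_count, num_threads, powersave, gpu_device,
                            cooling_down, binary="./benchncnn"):
    """Build the argv list for one benchncnn run.

    Positional order: loop count, num threads, powersave, gpu device,
    cooling down. gpu_device = -1 means CPU only; 0 selects the first
    Vulkan device. The default binary path is an assumption.
    """
    values = (loop_count, num_threads, powersave, gpu_device, cooling_down)
    return [binary] + [str(v) for v in values]


# The four configurations reported in this post.
CONFIGS = {
    "1 thread, CPU": build_benchncnn_command(10, 1, 0, -1, 1),
    "1 thread, GPU with Vulkan": build_benchncnn_command(10, 1, 0, 0, 1),
    "4 threads, CPU": build_benchncnn_command(10, 4, 0, -1, 1),
    "4 threads, GPU with Vulkan": build_benchncnn_command(10, 4, 0, 0, 1),
}

for label, argv in CONFIGS.items():
    # Print the command line; pass argv to subprocess.run to execute it.
    print(f"{label}: {' '.join(argv)}")
```

The script only prints the commands. Actually running them requires a built benchncnn binary and the benchmark model files in the same directory.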

thread 1 cpu:

loop_count = 10
num_threads = 1
powersave = 0
gpu_device = -1
cooling_down = 1
squeezenet  min =   87.23  max =   88.53  avg =   87.95
squeezenet_int8  min =   76.35  max =   77.11  avg =   76.71
mobilenet  min =  140.10  max =  142.05  avg =  140.97
mobilenet_int8  min =   95.47  max =   95.79  avg =   95.63
mobilenet_v2  min =  101.54  max =  102.40  avg =  101.99
mobilenet_v3  min =   82.33  max =   83.70  avg =   82.99
shufflenet  min =   50.17  max =   51.85  avg =   50.94
shufflenet_v2  min =   48.87  max =   49.48  avg =   49.10
mnasnet  min =   92.76  max =   93.17  avg =   92.96
proxylessnasnet  min =  111.36  max =  112.04  avg =  111.67
efficientnet_b0  min =  178.04  max =  178.42  avg =  178.24
efficientnetv2_b0  min =  202.40  max =  203.09  avg =  202.80
regnety_400m  min =  122.74  max =  123.09  avg =  122.91
blazeface  min =   15.64  max =   15.91  avg =   15.79
googlenet  min =  271.19  max =  272.28  avg =  271.75
googlenet_int8  min =  239.69  max =  241.40  avg =  240.20
resnet18  min =  216.32  max =  217.22  avg =  216.87
resnet18_int8  min =  179.47  max =  179.86  avg =  179.68
alexnet  min =  202.26  max =  202.81  avg =  202.54
vgg16  min = 1286.14  max = 1291.54  avg = 1287.90
vgg16_int8  min =  994.59  max = 1002.22  avg =  999.48
resnet50  min =  613.59  max =  628.67  avg =  618.64
resnet50_int8  min =  487.12  max =  489.30  avg =  488.22
squeezenet_ssd  min =  201.68  max =  202.58  avg =  202.05
squeezenet_ssd_int8  min =  174.25  max =  176.63  avg =  175.01
mobilenet_ssd  min =  280.41  max =  281.18  avg =  280.76
mobilenet_ssd_int8  min =  192.00  max =  192.72  avg =  192.36
mobilenet_yolo  min =  631.44  max =  642.08  avg =  635.85
mobilenetv2_yolov3  min =  346.23  max =  347.11  avg =  346.83
yolov4-tiny  min =  430.36  max =  432.57  avg =  431.51
nanodet_m  min =  118.20  max =  118.70  avg =  118.47
yolo-fastest-1.1  min =   59.48  max =   60.90  avg =   60.00
yolo-fastestv2  min =   49.94  max =   50.71  avg =   50.22

thread 1 gpu with vulkan

[0 V3D 4.2]  queueC=0[1]  queueG=0[1]  queueT=0[1]
[0 V3D 4.2]  bugsbn1=0  bugbilz=0  bugcopc=0  bugihfa=0
[0 V3D 4.2]  fp16-p/s/a=1/1/0  int8-p/s/a=1/1/0
[0 V3D 4.2]  subgroup=16  basic=1  vote=0  ballot=0  shuffle=0
loop_count = 10
num_threads = 1
powersave = 0
gpu_device = 0
cooling_down = 1
squeezenet  min =  308.47  max =  309.27  avg =  308.87
squeezenet_int8  min =   83.22  max =   83.79  avg =   83.47
mobilenet  min =  345.16  max =  345.46  avg =  345.27
mobilenet_int8  min =   99.83  max =  101.45  avg =  100.58
mobilenet_v2  min =  244.97  max =  245.23  avg =  245.08
mobilenet_v3  min =  231.20  max =  231.39  avg =  231.28
shufflenet  min =  143.88  max =  144.13  avg =  144.01
shufflenet_v2  min =  192.31  max =  192.94  avg =  192.42
mnasnet  min =  249.60  max =  249.76  avg =  249.70
proxylessnasnet  min =  265.37  max =  265.52  avg =  265.44
efficientnet_b0  min =  374.82  max =  375.16  avg =  374.99
efficientnetv2_b0  min =  625.12  max =  626.87  avg =  625.78
regnety_400m  min =  318.96  max =  319.88  avg =  319.25
blazeface  min =   52.95  max =   53.52  avg =   53.13
googlenet  min =  803.00  max =  803.40  avg =  803.20
googlenet_int8  min =  245.09  max =  247.60  avg =  246.22
resnet18  min =  895.02  max =  896.33  avg =  895.97
resnet18_int8  min =  181.80  max =  183.03  avg =  182.34
alexnet  min =  499.71  max =  500.70  avg =  500.26
vgg16  min = 4311.12  max = 4312.92  avg = 4312.02
vgg16_int8  min =  998.62  max = 1003.33  avg = 1001.12
resnet50  min = 2022.53  max = 2023.51  avg = 2023.05
resnet50_int8  min =  490.57  max =  494.13  avg =  492.33
squeezenet_ssd  min = 1143.39  max = 1144.38  avg = 1143.78
squeezenet_ssd_int8  min =  177.60  max =  180.83  avg =  179.01
mobilenet_ssd  min =  816.07  max =  816.63  avg =  816.43
mobilenet_ssd_int8  min =  195.79  max =  196.89  avg =  196.37
mobilenet_yolo  min = 1622.98  max = 1623.30  avg = 1623.19
mobilenetv2_yolov3  min =  808.26  max =  808.45  avg =  808.37
yolov4-tiny  min = 1704.12  max = 1704.79  avg = 1704.52
nanodet_m  min =  389.94  max =  390.18  avg =  390.07
yolo-fastest-1.1  min =  200.23  max =  200.50  avg =  200.36
yolo-fastestv2  min =  164.09  max =  164.36  avg =  164.19

thread 4 cpu:

loop_count = 10
num_threads = 4
powersave = 0
gpu_device = -1
cooling_down = 1
squeezenet  min =   51.93  max =   52.47  avg =   52.14
squeezenet_int8  min =   42.81  max =   43.33  avg =   43.07
mobilenet  min =   62.97  max =   66.78  avg =   63.72
mobilenet_int8  min =   36.24  max =   39.39  avg =   36.69
mobilenet_v2  min =   61.20  max =   62.30  avg =   61.62
mobilenet_v3  min =   48.56  max =   75.32  avg =   51.63
shufflenet  min =   34.52  max =   54.34  avg =   36.62
shufflenet_v2  min =   27.39  max =   27.79  avg =   27.52
mnasnet  min =   52.07  max =   54.51  avg =   52.62
proxylessnasnet  min =   54.93  max =   56.66  avg =   55.43
efficientnet_b0  min =   81.97  max =   82.88  avg =   82.32
efficientnetv2_b0  min =   89.38  max =   90.46  avg =   89.88
regnety_400m  min =   75.65  max =   76.17  avg =   75.81
blazeface  min =   10.88  max =   11.08  avg =   10.98
googlenet  min =  129.04  max =  131.39  avg =  129.72
googlenet_int8  min =  106.56  max =  107.41  avg =  106.93
resnet18  min =  152.15  max =  166.36  avg =  158.47
resnet18_int8  min =   85.29  max =   86.35  avg =   85.82
alexnet  min =  130.07  max =  132.20  avg =  130.82
vgg16  min =  812.36  max = 1004.54  avg =  903.46
vgg16_int8  min =  437.49  max = 1657.19  avg =  726.93
resnet50  min =  315.49  max =  391.08  avg =  348.88
resnet50_int8  min =  258.68  max =  396.38  avg =  286.31
squeezenet_ssd  min =  177.35  max =  242.16  avg =  199.35
squeezenet_ssd_int8  min =  119.77  max =  123.66  avg =  122.09
mobilenet_ssd  min =  151.96  max =  176.89  avg =  162.62
mobilenet_ssd_int8  min =   82.95  max =   98.27  avg =   87.34
mobilenet_yolo  min =  336.06  max =  364.58  avg =  347.83
mobilenetv2_yolov3  min =  194.39  max =  254.23  avg =  208.75
yolov4-tiny  min =  250.72  max =  263.13  avg =  254.51
nanodet_m  min =   71.37  max =   72.80  avg =   71.86
yolo-fastest-1.1  min =   47.95  max =   57.21  avg =   49.25
yolo-fastestv2  min =   38.46  max =   38.71  avg =   38.57

thread 4 gpu with vulkan

[0 V3D 4.2]  queueC=0[1]  queueG=0[1]  queueT=0[1]
[0 V3D 4.2]  bugsbn1=0  bugbilz=0  bugcopc=0  bugihfa=0
[0 V3D 4.2]  fp16-p/s/a=1/1/0  int8-p/s/a=1/1/0
[0 V3D 4.2]  subgroup=16  basic=1  vote=0  ballot=0  shuffle=0
loop_count = 10
num_threads = 4
powersave = 0
gpu_device = 0
cooling_down = 1
squeezenet  min =  305.52  max =  306.38  avg =  305.92
squeezenet_int8  min =   44.80  max =   54.47  avg =   46.48
mobilenet  min =  342.60  max =  342.87  avg =  342.69
mobilenet_int8  min =   37.23  max =   37.81  avg =   37.42
mobilenet_v2  min =  245.07  max =  247.07  avg =  245.34
mobilenet_v3  min =  230.95  max =  231.27  avg =  231.07
shufflenet  min =  143.72  max =  144.89  avg =  144.03
shufflenet_v2  min =  192.21  max =  192.48  avg =  192.31
mnasnet  min =  249.26  max =  250.15  avg =  249.53
proxylessnasnet  min =  265.09  max =  265.51  avg =  265.26
efficientnet_b0  min =  374.56  max =  376.18  avg =  374.95
efficientnetv2_b0  min =  624.94  max =  637.84  avg =  627.64
regnety_400m  min =  318.95  max =  319.99  avg =  319.19
blazeface  min =   53.02  max =   53.14  avg =   53.06
googlenet  min =  803.82  max =  804.77  avg =  804.07
googlenet_int8  min =  107.19  max =  119.33  avg =  109.47
resnet18  min =  895.64  max =  897.14  avg =  896.53
resnet18_int8  min =   86.94  max =   87.82  avg =   87.40
alexnet  min =  499.26  max =  501.15  avg =  500.33
vgg16  min = 4315.99  max = 4317.85  avg = 4316.88
vgg16_int8  min =  412.25  max =  438.12  avg =  418.59
resnet50  min = 2024.29  max = 2025.05  avg = 2024.64
resnet50_int8  min =  223.42  max =  272.70  avg =  230.76
squeezenet_ssd  min = 1144.16  max = 1144.95  avg = 1144.46
squeezenet_ssd_int8  min =  112.33  max =  122.58  avg =  114.04
mobilenet_ssd  min =  816.72  max =  817.11  avg =  816.89
mobilenet_ssd_int8  min =   77.19  max =   77.76  avg =   77.53
mobilenet_yolo  min = 1623.28  max = 1623.88  avg = 1623.53
mobilenetv2_yolov3  min =  808.65  max =  808.88  avg =  808.77
yolov4-tiny  min = 1704.79  max = 1706.01  avg = 1705.30
nanodet_m  min =  389.68  max =  390.61  avg =  389.88
yolo-fastest-1.1  min =  199.75  max =  200.16  avg =  199.87
yolo-fastestv2  min =  163.96  max =  164.44  avg =  164.05
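
To put a number on the CPU versus GPU gap, the avg column of two runs can be divided model by model. The following is a rough sketch, assuming the output has one result per line in the `name  min = …  max = …  avg = …` form. The function names are illustrative, and the sample text is three rows copied from the 1-thread CPU and GPU runs above.

```python
import re

# Matches one benchncnn result line, e.g.
# "squeezenet  min =   87.23  max =   88.53  avg =   87.95"
RESULT_RE = re.compile(
    r"^\s*(?P<name>\S+)\s+min\s*=\s*(?P<min>[\d.]+)"
    r"\s+max\s*=\s*(?P<max>[\d.]+)\s+avg\s*=\s*(?P<avg>[\d.]+)\s*$"
)


def parse_results(text):
    """Return {model_name: avg_ms} for every result line found in text."""
    results = {}
    for line in text.splitlines():
        match = RESULT_RE.match(line)
        if match:  # parameter and device-info lines do not match and are skipped
            results[match.group("name")] = float(match.group("avg"))
    return results


def slowdown(cpu, gpu):
    """Return {model: gpu_avg / cpu_avg} for models present in both runs."""
    return {name: gpu[name] / cpu[name] for name in cpu if name in gpu}


# Three rows copied from the 1-thread runs in this post.
CPU_1T = """\
num_threads = 1
squeezenet  min =   87.23  max =   88.53  avg =   87.95
squeezenet_int8  min =   76.35  max =   77.11  avg =   76.71
resnet50  min =  613.59  max =  628.67  avg =  618.64
"""
GPU_1T = """\
num_threads = 1
squeezenet  min =  308.47  max =  309.27  avg =  308.87
squeezenet_int8  min =   83.22  max =   83.79  avg =   83.47
resnet50  min = 2022.53  max = 2023.51  avg = 2023.05
"""

ratios = slowdown(parse_results(CPU_1T), parse_results(GPU_1T))
for name, ratio in ratios.items():
    print(f"{name}: GPU run takes {ratio:.2f}x the CPU time")
```

On these rows the fp32 models take roughly three to three and a half times longer with gpu_device = 0, while the int8 row is nearly unchanged. The int8 timings also track num_threads in the GPU runs, which suggests those models are still being executed on the CPU.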
