VGG: Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan and Andrew Zisserman, ICLR 2015 (first posted on arXiv in 2014) (PDF) (Citations 73354)

Contribution

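  • Stacks several 3x3 convolution kernels in place of one large kernel. This needs fewer parameters for the same coverage: two stacked 3x3 convolutions have the same receptive field as one 5x5 convolution, and three stacked 3x3 convolutions have the same receptive field as one 7x7 convolution.
  • The LRN (local response normalization) introduced by AlexNet is of little practical use (BN can be used instead).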
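
The receptive-field and parameter claims in the first bullet can be checked with a few lines of arithmetic. The sketch below assumes stride-1 convolutions with the same number of input and output channels; the channel width `C = 256` is a hypothetical value chosen only for illustration, and biases are ignored.

```python
def stacked_receptive_field(num_layers, kernel=3):
    """Receptive field of `num_layers` stacked stride-1 convolutions."""
    rf = 1
    for _ in range(num_layers):
        rf += kernel - 1  # each stride-1 layer widens the field by (kernel - 1)
    return rf


def conv_weights(kernel, channels, num_layers=1):
    """Weights (no biases) of `num_layers` convs with `channels` in and out."""
    return num_layers * kernel * kernel * channels * channels


C = 256  # hypothetical channel width, for illustration only

for n, big in [(2, 5), (3, 7)]:
    stacked = conv_weights(3, C, num_layers=n)
    single = conv_weights(big, C)
    print(f"{n} x 3x3: receptive field {stacked_receptive_field(n)}, "
          f"{stacked:,} weights vs one {big}x{big}: {single:,} weights")
```

Under these assumptions three 3x3 layers cost 27·C² weights against 49·C² for a single 7x7 layer, and the stack also applies a non-linearity after each layer.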

Details

  • ILSVRC’14 2nd in classification, 1st in localization
  • Use VGG16 or VGG19 (VGG19 only slightly better, more memory)
  • Use ensembles for best results
  • FC7 features generalize well to other tasks
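
One common way to build the ensemble mentioned above is to average the softmax class posteriors of several models and take the top class. The sketch below is a minimal pure-Python illustration of that idea, not the authors' code; the three score vectors are made-up values for hypothetical models.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(per_model_logits):
    """Average the softmax posteriors of several models and pick the top class."""
    posteriors = [softmax(logits) for logits in per_model_logits]
    num_models = len(posteriors)
    averaged = [sum(col) / num_models for col in zip(*posteriors)]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return best, averaged


# Made-up scores from three hypothetical models over four classes.
logits = [
    [2.0, 1.0, 0.1, 0.0],
    [0.5, 2.5, 0.3, 0.1],
    [1.8, 1.7, 0.2, 0.0],
]
label, probs = ensemble_predict(logits)
print(label, [round(p, 3) for p in probs])
```

Two of the three models here prefer class 0 individually, so averaging posteriors is not the same as majority voting: a single confident model can change the outcome.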

Parameter count (VGG16, not counting biases)

| Layer | Output size | Memory (activations) | Params (weights) |
|---|---|---|---|
| INPUT | [224×224×3] | 224×224×3=150K | 0 |
| CONV3-64 | [224×224×64] | 224×224×64=3.2M | (3×3×3)×64=1,728 |
| CONV3-64 | [224×224×64] | 224×224×64=3.2M | (3×3×64)×64=36,864 |
| POOL2 | [112×112×64] | 112×112×64=800K | 0 |
| CONV3-128 | [112×112×128] | 112×112×128=1.6M | (3×3×64)×128=73,728 |
| CONV3-128 | [112×112×128] | 112×112×128=1.6M | (3×3×128)×128=147,456 |
| POOL2 | [56×56×128] | 56×56×128=400K | 0 |
| CONV3-256 | [56×56×256] | 56×56×256=800K | (3×3×128)×256=294,912 |
| CONV3-256 | [56×56×256] | 56×56×256=800K | (3×3×256)×256=589,824 |
| CONV3-256 | [56×56×256] | 56×56×256=800K | (3×3×256)×256=589,824 |
| POOL2 | [28×28×256] | 28×28×256=200K | 0 |
| CONV3-512 | [28×28×512] | 28×28×512=400K | (3×3×256)×512=1,179,648 |
| CONV3-512 | [28×28×512] | 28×28×512=400K | (3×3×512)×512=2,359,296 |
| CONV3-512 | [28×28×512] | 28×28×512=400K | (3×3×512)×512=2,359,296 |
| POOL2 | [14×14×512] | 14×14×512=100K | 0 |
| CONV3-512 | [14×14×512] | 14×14×512=100K | (3×3×512)×512=2,359,296 |
| CONV3-512 | [14×14×512] | 14×14×512=100K | (3×3×512)×512=2,359,296 |
| CONV3-512 | [14×14×512] | 14×14×512=100K | (3×3×512)×512=2,359,296 |
| POOL2 | [7×7×512] | 7×7×512=25K | 0 |
| FC | [1×1×4096] | 4096 | 7×7×512×4096=102,760,448 |
| FC | [1×1×4096] | 4096 | 4096×4096=16,777,216 |
| FC | [1×1×1000] | 1000 | 4096×1000=4,096,000 |

TOTAL memory: ≈15.2M activations × 4 bytes ≈ 61MB / image (forward pass only; the rows above sum to about 15.2M values, not the 24M often quoted from cs231n)

TOTAL params: 138M parameters
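
The per-layer figures and the parameter total can be reproduced from the layer configuration alone. The sketch below assumes 3x3 convolutions with padding 1 (spatial size unchanged), 2x2 max pooling with stride 2, float32 activations, and no biases; the names `CFG`, `FC_SIZES`, and `vgg16_table` are illustrative. The activation total it prints counts each layer output in the table exactly once.

```python
# VGG16 layout: integers are 3x3 conv output channels, "M" is 2x2 max pooling.
CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
       512, 512, 512, "M", 512, 512, 512, "M"]
FC_SIZES = [4096, 4096, 1000]


def vgg16_table(size=224, channels=3):
    """Return (layer, activations, weights) rows; biases are not counted."""
    rows = [("INPUT", size * size * channels, 0)]
    for item in CFG:
        if item == "M":
            size //= 2  # 2x2 pooling with stride 2 halves height and width
            rows.append(("POOL2", size * size * channels, 0))
        else:
            weights = 3 * 3 * channels * item
            channels = item  # padding 1 keeps the spatial size unchanged
            rows.append((f"CONV3-{item}", size * size * channels, weights))
    features = size * size * channels  # flattened 7x7x512 input to the first FC
    for out in FC_SIZES:
        rows.append(("FC", out, features * out))
        features = out
    return rows


rows = vgg16_table()
total_activations = sum(r[1] for r in rows)
total_weights = sum(r[2] for r in rows)

for name, acts, weights in rows:
    print(f"{name:<10} {acts:>10,} {weights:>12,}")
print(f"activations: {total_activations:,} values, "
      f"{total_activations * 4 / 1e6:.1f} MB at 4 bytes each")
print(f"weights: {total_weights:,}")
```

Training needs more memory than this forward-pass figure, since gradients and optimizer state are stored as well.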

Notes:

  • Most memory is in early CONV
  • Most params are in late FC
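
Both notes can be quantified from the table above. The sketch below groups the convolutional layers into five stages and labels the fully connected layers `fc6`, `fc7`, `fc8`; these stage and layer names follow a common convention and are assumptions here, not labels taken from the table.

```python
# Activations per conv stage (layer outputs only), copied from the table above.
conv_activations = {
    "conv1": 2 * 224 * 224 * 64,
    "conv2": 2 * 112 * 112 * 128,
    "conv3": 3 * 56 * 56 * 256,
    "conv4": 3 * 28 * 28 * 512,
    "conv5": 3 * 14 * 14 * 512,
}

# Weights per group, biases not counted.
conv_weights_total = (
    3 * 3 * 3 * 64 + 3 * 3 * 64 * 64
    + 3 * 3 * 64 * 128 + 3 * 3 * 128 * 128
    + 3 * 3 * 128 * 256 + 2 * (3 * 3 * 256 * 256)
    + 3 * 3 * 256 * 512 + 5 * (3 * 3 * 512 * 512)
)
fc_weights = {
    "fc6": 7 * 7 * 512 * 4096,
    "fc7": 4096 * 4096,
    "fc8": 4096 * 1000,
}

all_conv_acts = sum(conv_activations.values())
all_weights = conv_weights_total + sum(fc_weights.values())

early_share = (conv_activations["conv1"] + conv_activations["conv2"]) / all_conv_acts
fc_share = sum(fc_weights.values()) / all_weights
fc6_share = fc_weights["fc6"] / all_weights

print(f"conv1+conv2 share of conv activations: {early_share:.1%}")
print(f"FC share of all weights: {fc_share:.1%}")
print(f"first FC layer alone: {fc6_share:.1%}")
```

The first fully connected layer dominates because it connects the full 7×7×512 feature map to 4096 units.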

References

  • cs231n
