Calculating the total number of weights in LeNet


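The bottom-left corner of the figure above shows how the number of weights in each layer is computed.
Note that the subsampling (pooling) layers have no weights to train.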

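Conv1: 1 x 6 x 5 x 5 + 6 = 156
Conv3: 6 x 16 x 5 x 5 + 16 = 2416
    Here 6 is the number of feature maps in the previous layer, 16 is the number of feature maps this layer produces, 5 x 5 is the kernel size, and the trailing + 16 is one bias per output feature map.
FC1: 400 x 120 + 120 = 48120
FC2: 120 x 84 + 84 = 10164
FC3: 84 x 10 + 10 = 850
Total: 61706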
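
The per-layer arithmetic above can be checked with a few lines of plain Python. This is only a sketch: the helper names conv2d_params and dense_params are made up for illustration and are not part of TensorFlow.

```python
def conv2d_params(in_channels, out_channels, kernel_h, kernel_w, bias=True):
    """One kernel per (input channel, output channel) pair, plus one bias per output channel."""
    weights = in_channels * out_channels * kernel_h * kernel_w
    return weights + (out_channels if bias else 0)


def dense_params(in_features, out_features, bias=True):
    """One weight per (input, output) pair, plus one bias per output neuron."""
    return in_features * out_features + (out_features if bias else 0)


# Layer sizes taken from the list above; pooling layers contribute nothing.
lenet_layers = {
    "Conv1": conv2d_params(1, 6, 5, 5),
    "Conv3": conv2d_params(6, 16, 5, 5),
    "FC1": dense_params(400, 120),
    "FC2": dense_params(120, 84),
    "FC3": dense_params(84, 10),
}

for name, count in lenet_layers.items():
    print(f"{name}: {count}")

total_params = sum(lenet_layers.values())
print("Total:", total_params)
```

The printed values should match the hand calculation, including the total of 61706.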

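The figure below may help in understanding how the number of weights in a convolutional layer is computed.

The TensorFlow code below defines the LeNet model.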

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Reshape((32, 32, 1), input_shape=(32, 32)),
    tf.keras.layers.Conv2D(filters=6, kernel_size=(5, 5), padding='valid', activation='relu'),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    tf.keras.layers.Conv2D(filters=16, kernel_size=(5, 5), padding='valid'),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation='relu'),
    tf.keras.layers.Dense(84, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.summary()

Calling the summary() method prints the number of weights (parameters) to be trained in each layer.

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
reshape (Reshape)            (None, 32, 32, 1)         0
_________________________________________________________________
conv2d (Conv2D)              (None, 28, 28, 6)         156
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 6)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 10, 10, 16)        2416
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 5, 5, 16)          0
_________________________________________________________________
flatten (Flatten)            (None, 400)               0
_________________________________________________________________
dense (Dense)                (None, 120)               48120
_________________________________________________________________
dense_1 (Dense)              (None, 84)                10164
_________________________________________________________________
dense_2 (Dense)              (None, 10)                850
=================================================================
Total params: 61,706
Trainable params: 61,706
Non-trainable params: 0

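Below is a complete LeNet-5 program. The MNIST dataset shipped with TensorFlow is 28x28, so the input size here is 28x28 rather than the 32x32 used above.
I also added BatchNormalization. After a convolution it needs 4 parameters per channel; after a fully connected layer it needs 4 parameters per neuron.
The 4 parameters are gamma, beta, moving_mean (non-trainable), and moving_variance (non-trainable).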
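
As a rough check of the "4 parameters per channel or neuron" rule, the sketch below counts batch normalization parameters for widths 6, 16, and 120, which are the two convolution widths and the first dense layer width of this LeNet. The helper batchnorm_params is a hypothetical name used only for illustration.

```python
def batchnorm_params(num_features):
    """Return (trainable, non_trainable) parameter counts for one batch norm layer."""
    trainable = 2 * num_features      # gamma and beta
    non_trainable = 2 * num_features  # moving_mean and moving_variance
    return trainable, non_trainable


# Widths of the layers that are followed by batch normalization.
bn_counts = {width: batchnorm_params(width) for width in (6, 16, 120)}

for width, (trainable, non_trainable) in bn_counts.items():
    print(f"{width} features: {trainable + non_trainable} params "
          f"({trainable} trainable, {non_trainable} non-trainable)")

total_non_trainable = sum(non_trainable for _, non_trainable in bn_counts.values())
print("Non-trainable params:", total_non_trainable)
```

Under these assumptions the three layers hold 24, 64, and 480 parameters, and the non-trainable half adds up to 284.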

import os
import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
print(x_train.shape)

model = tf.keras.models.Sequential([
    # Reshape into "channels last" setup.
    tf.keras.layers.Reshape((28, 28, 1), input_shape=(28, 28)),
    tf.keras.layers.Conv2D(filters=6, kernel_size=(5, 5), padding='valid'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    # Second convolution block, again followed by batch normalization.
    tf.keras.layers.Conv2D(filters=16, kernel_size=(5, 5), padding='valid'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Activation('relu'),
    tf.keras.layers.MaxPool2D(pool_size=(2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation='relu'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(84, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.summary()

# Make sure the output directory exists before the callback writes to it.
os.makedirs('./weights', exist_ok=True)

def save_weight(epoch, logs):
    print('save_weight', epoch, logs)
    # Newer Keras releases may require the filename to end in '.weights.h5'.
    model.save_weights('./weights/model.h5')

batch_print_callback = tf.keras.callbacks.LambdaCallback(on_epoch_end=save_weight)

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

callbacks = [
    tf.keras.callbacks.EarlyStopping(patience=4, monitor='val_loss'),
    batch_print_callback,
    tf.keras.callbacks.TensorBoard(log_dir='logs'),
]

# validation_split supplies the val_loss that EarlyStopping monitors.
model.fit(x_train, y_train, epochs=2, callbacks=callbacks, batch_size=128,
          validation_split=0.1)
# model.evaluate(x_test, y_test)

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
reshape (Reshape)            (None, 28, 28, 1)         0
_________________________________________________________________
conv2d (Conv2D)              (None, 24, 24, 6)         156
_________________________________________________________________
batch_normalization (BatchNo (None, 24, 24, 6)         24
_________________________________________________________________
activation (Activation)      (None, 24, 24, 6)         0
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 12, 12, 6)         0
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 8, 8, 16)          2416
_________________________________________________________________
batch_normalization_1 (Batch (None, 8, 8, 16)          64
_________________________________________________________________
activation_1 (Activation)    (None, 8, 8, 16)          0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 4, 4, 16)          0
_________________________________________________________________
flatten (Flatten)            (None, 256)               0
_________________________________________________________________
dense (Dense)                (None, 120)               30840
_________________________________________________________________
batch_normalization_2 (Batch (None, 120)               480
_________________________________________________________________
dense_1 (Dense)              (None, 84)                10164
_________________________________________________________________
dense_2 (Dense)              (None, 10)                850
=================================================================
Total params: 44,994
Trainable params: 44,710
Non-trainable params: 284
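
The Output Shape column, and the drop of the first dense layer from 48120 to 30840 parameters, follow from simple size arithmetic: a 'valid' 5x5 convolution shrinks each side by 4, and 2x2 pooling halves it. The sketch below works that out for both input sizes; the helper names are hypothetical and it assumes square inputs with the layer settings used in this post.

```python
def valid_conv_out(size, kernel):
    """Output side length of a stride-1 convolution with padding='valid'."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output side length of non-overlapping pooling (stride equal to pool size)."""
    return size // pool


def lenet_flatten_size(input_size, kernel=5, pool=2, channels=16):
    """Number of features entering the first dense layer after two conv + pool stages."""
    size = input_size
    for _ in range(2):
        size = pool_out(valid_conv_out(size, kernel), pool)
    return size * size * channels


for input_size in (32, 28):
    flat = lenet_flatten_size(input_size)
    fc1_params = flat * 120 + 120  # weights plus one bias per neuron
    print(f"input {input_size}x{input_size}: flatten = {flat}, FC1 params = {fc1_params}")
```

For a 32x32 input the flattened size is 5 x 5 x 16 = 400, and for a 28x28 input it is 4 x 4 x 16 = 256, which is why only the first dense layer changes between the two summaries.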

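A float (float32) takes 4 bytes.

Size of the trainable weights = 44710 x 4 = 178840 bytes
Size of the non-trainable weights = 284 x 4 = 1136 bytes
Total weight size = 178840 + 1136 = 179976 bytes

Size of the whole model = total weight size + size of the model itself = 204800 bytes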
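
The same byte arithmetic as a small sketch, assuming every parameter is stored as a 4-byte float. The 204800 figure is simply the whole-model size quoted above, and weights_size_bytes is a hypothetical helper name.

```python
BYTES_PER_FLOAT32 = 4


def weights_size_bytes(num_params, bytes_per_param=BYTES_PER_FLOAT32):
    """Storage needed for num_params values at a fixed width per value."""
    return num_params * bytes_per_param


trainable_bytes = weights_size_bytes(44710)
non_trainable_bytes = weights_size_bytes(284)
total_weight_bytes = trainable_bytes + non_trainable_bytes

reported_model_bytes = 204800  # whole-model size quoted in the text
overhead_bytes = reported_model_bytes - total_weight_bytes

print("trainable:", trainable_bytes)
print("non-trainable:", non_trainable_bytes)
print("all weights:", total_weight_bytes)
print("everything else in the model:", overhead_bytes)
```

With these numbers the part of the model that is not weights comes to 24824 bytes.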

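Everything so far has been two-dimensional. How is the count computed in the one-dimensional case?

For example, if the input has 3 channels of size 1x7, it is convolved with 3 kernels of size 1x2.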

from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Embedding
from tensorflow.keras.models import Sequential

max_word = 400
vocab_size = 88587

model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=64, input_length=max_word))
model.add(Conv1D(filters=64, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.25))
model.add(Conv1D(filters=128, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPooling1D(pool_size=2))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
embedding_1 (Embedding)      (None, 400, 64)           5669568   ## 88587 x 64
_________________________________________________________________
conv1d_1 (Conv1D)            (None, 400, 64)           12352     # 64 x (1x3) x 64 + 64
_________________________________________________________________
max_pooling1d_1 (MaxPooling1 (None, 200, 64)           0
_________________________________________________________________
dropout_1 (Dropout)          (None, 200, 64)           0
_________________________________________________________________
conv1d_2 (Conv1D)            (None, 200, 128)          24704     # 64 x (1x3) x 128 + 128
_________________________________________________________________
max_pooling1d_2 (MaxPooling1 (None, 100, 128)          0
_________________________________________________________________
dropout_2 (Dropout)          (None, 100, 128)          0
_________________________________________________________________
flatten_1 (Flatten)          (None, 12800)             0
_________________________________________________________________
dense_1 (Dense)              (None, 64)                819264     # 12800 x 64 + 64
_________________________________________________________________
dense_2 (Dense)              (None, 32)                2080      # 64 x 32 + 32
_________________________________________________________________
dense_3 (Dense)              (None, 1)                 33        # 32 x 1 + 1
=================================================================
Total params: 6,528,001
Trainable params: 6,528,001
Non-trainable params: 0

Embedding: 88587 x 64 = 5669568
Conv1D (1): 64 x (1x3) x 64 + 64 = 12352
Conv1D (2): 64 x (1x3) x 128 + 128 = 24704
Dense (1): 12800 x 64 + 64 = 819264
Dense (2): 64 x 32 + 32 = 2080
Dense (3): 32 x 1 + 1 = 33
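
The one-dimensional rule is the same as the two-dimensional one with a kernel of width k instead of k x k. A minimal sketch that reproduces the totals above, using hypothetical helper names rather than any Keras API:

```python
def embedding_params(vocab_size, output_dim):
    """One vector of length output_dim per vocabulary entry, no bias."""
    return vocab_size * output_dim


def conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Each output channel has one kernel of width kernel_size per input channel."""
    weights = in_channels * kernel_size * out_channels
    return weights + (out_channels if bias else 0)


def dense_params(in_features, out_features, bias=True):
    return in_features * out_features + (out_features if bias else 0)


# Layer sizes taken from the summary above; pooling, dropout and flatten have no parameters.
text_cnn_layers = [
    ("embedding", embedding_params(88587, 64)),
    ("conv1d_1", conv1d_params(64, 64, 3)),
    ("conv1d_2", conv1d_params(64, 128, 3)),
    ("dense_1", dense_params(12800, 64)),
    ("dense_2", dense_params(64, 32)),
    ("dense_3", dense_params(32, 1)),
]

for name, count in text_cnn_layers:
    print(f"{name}: {count}")

text_cnn_total = sum(count for _, count in text_cnn_layers)
print("Total params:", text_cnn_total)
```

The total should agree with the 6,528,001 reported by the summary.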
