https://blog.csdn.net/notHeadache/article/details/81164264

The figure below, from Stack Overflow, explains this clearly.

An easy-to-understand example:

https://www.knowledgemapper.com/knowmap/knowbook/jasdeepchhabra94@gmail.comUnderstandingLSTMinTensorflow(MNISTdataset)

https://i.stack.imgur.com/0Poch.png


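The other answers supplement this quite well. One of them says that the LSTM cell does not change the dimensionality of its input, implying that input and output both have dimension num_units. Strictly, only the hidden state and the output are fixed at num_units; the input feature size may differ, as the comments quoted below point out.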

Good answer. You usually have embeddings for your input data, so for simplicity assume one embedding per word. Say each word has a distributed representation of 150 dimensions; these are the features in the diagram above. Then num_units acts as the dimensionality of the RNN/LSTM cell (say 128), so the mapping is 150 -> 128 and the output dimension is 128. Batch size and time_steps remain as they are. – HARSH PATHAK Dec 27 '19 at 16:15
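The shape arithmetic in this comment can be checked with a small bookkeeping sketch. The helper name `lstm_shapes` is hypothetical, the batch size and number of time steps are made-up values, and the parameter layout (one fused kernel for the four gates, applied to the concatenation of the input and the previous hidden state) is an assumption based on the usual LSTM formulation rather than a copy of any library's code.

```python
def lstm_shapes(batch_size, time_steps, input_dim, num_units):
    """Return the tensor shapes implied by a single LSTM layer.

    Assumption: the four gates (input, forget, cell candidate, output)
    share one fused kernel applied to the concatenation [x_t, h_{t-1}].
    """
    return {
        "inputs": (batch_size, time_steps, input_dim),
        "kernel": (input_dim + num_units, 4 * num_units),
        "bias": (4 * num_units,),
        "hidden_state": (batch_size, num_units),
        "cell_state": (batch_size, num_units),
        "outputs": (batch_size, time_steps, num_units),
    }


# 150-dimensional embeddings and num_units = 128, as in the comment above;
# batch_size and time_steps are hypothetical.
shapes = lstm_shapes(batch_size=32, time_steps=10, input_dim=150, num_units=128)
for name, shape in shapes.items():
    print(name, shape)
```

Only the last axis changes between `inputs` and `outputs`: the feature size goes from 150 to 128, while the batch and time axes are untouched.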


The term num_units or num_hidden_units, sometimes written as the variable nhid in implementations, means that the input to the LSTM cell is a vector of dimension nhid (or, for a batched implementation, a matrix of shape batch_size x nhid). As a result, the output of the LSTM cell would have the same dimensionality, since an RNN/LSTM/GRU cell doesn't alter the dimensionality of the input vector or matrix.

As pointed out earlier, this term was borrowed from the Feed-Forward Neural Network (FFN) literature and has caused confusion when used in the context of RNNs. The idea is that even an RNN can be viewed as an FFN at each time step. In this view, the hidden layer does indeed contain num_hidden units, as depicted in this figure:

Source: Understanding LSTM


More concretely, in the example below, num_hidden_units or nhid would be 3, since the hidden state (middle layer) is a 3-dimensional vector.

– kmario23, answered Sep 28 '18 at 20:01 (edited Sep 28 '18 at 20:11)
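To make the "RNN as a feed-forward network at each time step" view concrete, here is a minimal sketch of a vanilla RNN step with nhid = 3, written with the Python standard library only. The function name `rnn_step`, the input size of 4, the random weights, and the two one-hot inputs are all assumptions chosen for illustration; they do not come from the answer or its figure.

```python
import math
import random


def rnn_step(x, h_prev, W_xh, W_hh, bias):
    """One vanilla RNN step: h_t = tanh(W_xh x_t + W_hh h_{t-1} + bias)."""
    nhid = len(bias)
    h = []
    for j in range(nhid):
        total = bias[j]
        total += sum(W_xh[j][i] * x[i] for i in range(len(x)))
        total += sum(W_hh[j][k] * h_prev[k] for k in range(nhid))
        h.append(math.tanh(total))
    return h


random.seed(0)
input_dim, nhid = 4, 3  # input_dim is hypothetical; nhid = 3 as in the text
W_xh = [[random.uniform(-1, 1) for _ in range(input_dim)] for _ in range(nhid)]
W_hh = [[random.uniform(-1, 1) for _ in range(nhid)] for _ in range(nhid)]
bias = [0.0] * nhid

# Apply the same "hidden layer" at each of two time steps.
h = [0.0] * nhid
for x_t in ([1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]):
    h = rnn_step(x_t, h, W_xh, W_hh, bias)

print(len(h))  # size of the hidden state
```

In this sketch the hidden state keeps size nhid at every step even though the input has a different size.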

You say "the input to the LSTM cell is a vector of dimension nhid". But the input generally has shape [batch, T, input], where the last dimension can be any size. So when the input is dynamically unrolled, we have an input of shape [b, t, input], and the RNN transforms it to [b, t, nhid]. So it is the output, not the input, that has dimension nhid. – Vedanshu Oct 27 '18 at 11:41
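The transformation [b, t, input] -> [b, t, nhid] described in this comment can be reproduced with a small pure-Python LSTM. This is a sketch under stated assumptions, not TensorFlow's implementation: the names `lstm_step` and `run_lstm`, the gate ordering inside the fused kernel, the random weights, and the sizes b = 2, t = 5, input = 7, nhid = 3 are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, kernel, bias, nhid):
    """One LSTM step for a single example.

    kernel has shape (len(x) + nhid, 4 * nhid). The gate order
    (input, candidate, forget, output) is an assumption of this sketch.
    """
    xh = x + h_prev  # list concatenation: [x_t, h_{t-1}]
    z = [bias[j] + sum(xh[i] * kernel[i][j] for i in range(len(xh)))
         for j in range(4 * nhid)]
    i_gate = [sigmoid(v) for v in z[0:nhid]]
    g_cand = [math.tanh(v) for v in z[nhid:2 * nhid]]
    f_gate = [sigmoid(v) for v in z[2 * nhid:3 * nhid]]
    o_gate = [sigmoid(v) for v in z[3 * nhid:4 * nhid]]
    c = [f * cp + i * g for f, cp, i, g in zip(f_gate, c_prev, i_gate, g_cand)]
    h = [o * math.tanh(cv) for o, cv in zip(o_gate, c)]
    return h, c


def run_lstm(inputs, nhid, seed=0):
    """Unroll the cell over inputs of shape [b][t][input]; return [b][t][nhid]."""
    rng = random.Random(seed)
    input_dim = len(inputs[0][0])
    kernel = [[rng.uniform(-0.1, 0.1) for _ in range(4 * nhid)]
              for _ in range(input_dim + nhid)]
    bias = [0.0] * (4 * nhid)
    outputs = []
    for sequence in inputs:
        h, c = [0.0] * nhid, [0.0] * nhid  # zero initial state per sequence
        seq_out = []
        for x_t in sequence:
            h, c = lstm_step(list(x_t), h, c, kernel, bias, nhid)
            seq_out.append(h)
        outputs.append(seq_out)
    return outputs


b, t, input_dim, nhid = 2, 5, 7, 3  # hypothetical sizes
data_rng = random.Random(1)
inputs = [[[data_rng.random() for _ in range(input_dim)] for _ in range(t)]
          for _ in range(b)]
outputs = run_lstm(inputs, nhid)
print(len(outputs), len(outputs[0]), len(outputs[0][0]))
```

The batch and time axes pass through unchanged; only the last axis is replaced by nhid, regardless of the input feature size.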


Most LSTM/RNN diagrams show only the hidden cells, never the units inside those cells; hence the confusion. Each hidden layer has as many hidden cells as there are time steps, and each hidden cell is made up of multiple hidden units, as in the diagram below. Therefore, the hidden-layer matrix in an RNN has shape (number of time steps, number of hidden units).

– answered Jan 30 '19 at 10:05

BasicLSTMCell(): the num_units parameter gives the dimensionality of the LSTM output vector.
