Brief: A test of autograd.backward in paddle, along with tests of several commonly used functions.

Keywords: gradient


Automatic Gradient Computation
Contents
The backward function
Usage notes
Hands-on tests
More function examples
square
exp
log
Softmax
Summary

§01 Automatic Gradient Computation


  For the paddle.autograd.backward function in paddle, see the documentation for paddle.autograd.backward.

1.1 The backward function

1.1.1 Usage notes

  Function signature: def backward(tensors, grad_tensors=None, retain_graph=False)

  Computes the backward gradients of the given tensors.

(1) Parameters

Args:
  tensors (list of Tensors): the tensors for which gradients are to be computed. The list must not contain the same tensor more than once.
  grad_tensors (list of Tensors or None, optional): the initial gradients of the tensors. If not None, it must have the same length as tensors, and any element that is None defaults to an initial gradient filled with 1.0. If None, all initial gradients of the tensors are filled with 1.0. Defaults to None.
  retain_graph (bool, optional): if False, the graph used to compute the gradients is freed after the call. If you would like to add more ops to the built graph after calling backward, set retain_graph to True so the graph is retained; setting it to False is much more memory-efficient. Defaults to False.

Returns:
  None

(2) Example

import paddle

x = paddle.to_tensor([[1, 2], [3, 4]], dtype='float32', stop_gradient=False)
y = paddle.to_tensor([[3, 2], [3, 4]], dtype='float32')
grad_tensor1 = paddle.to_tensor([[1, 2], [2, 3]], dtype='float32')
grad_tensor2 = paddle.to_tensor([[1, 1], [1, 1]], dtype='float32')
z1 = paddle.matmul(x, y)
z2 = paddle.matmul(x, y)

paddle.autograd.backward([z1, z2], [grad_tensor1, grad_tensor2], True)
print(x.grad)
# [[12. 18.]
#  [17. 25.]]

x.clear_grad()
paddle.autograd.backward([z1, z2], [grad_tensor1, None], True)
print(x.grad)
# [[12. 18.]
#  [17. 25.]]

x.clear_grad()
paddle.autograd.backward([z1, z2])
print(x.grad)
# [[10. 14.]
#  [10. 14.]]

1.1.2 Hands-on tests

(1) Single-vector case

 Ⅰ. Define the variables and function
import sys, os, math, time
import matplotlib.pyplot as plt
from numpy import *
import paddle
from paddle import to_tensor as TT

x = TT([1], dtype='float32', stop_gradient=False)
y = TT([2], dtype='float32')
z = paddle.matmul(x, y)
print("x: {}".format(x), "y: {}".format(y), "z: {}".format(z))
x: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[1.])
y: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=True,[2.])
z: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[2.])
 Ⅱ. Before calling backward
print("x.grad: {}".format(x.grad), "y.grad: {}".format(y.grad), "z.grad: {}".format(z.grad))
x.grad: None
y.grad: None
z.grad: None
 Ⅲ. After calling backward
paddle.autograd.backward(z, TT([3], dtype='float32'))
print("x.grad: {}".format(x.grad), "y.grad: {}".format(y.grad), "z.grad: {}".format(z.grad))
x.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[6.])
y.grad: None
z.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[3.])

  From the code above, $z = x \cdot y$, so $\partial x = \partial z \cdot y$. With $\partial z = 3$ and $y = 2$, this gives $\partial x = 6$.
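
  A quick numerical check of this chain rule, reusing the tensors defined above:

# single-element tensors, so ∂x = ∂z · y is an elementwise product
print(z.grad.numpy() * y.numpy())   # expected: [6.], equal to x.grad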

 Ⅳ. Calling backward again

  Calling backward a second time makes the runtime raise an error.

RuntimeError: (Unavailable) auto_0_ trying to backward through the same graph a second time, but this graph have already been freed. Please specify Tensor.backward(retain_graph=True) when calling backward at the first time.[Hint: Expected var->GradVarBase()->GraphIsFreed() == false, but received var->GradVarBase()->GraphIsFreed():1 != false:0.] (at /paddle/paddle/fluid/imperative/basic_engine.cc:74)

  Even calling clear_grad() does not make it possible to run backward() again:

x.clear_grad()
y.clear_grad()
z.clear_grad()

  Set retain_graph to True in the autograd.backward call.

paddle.autograd.backward(z, TT([3], dtype='float32'), retain_graph=True)

  Now backward can be called repeatedly.

  After the first call:

print("x.grad: {}".format(x.grad), "y.grad: {}".format(y.grad), "z.grad: {}".format(z.grad))
x.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[6.])
y.grad: None
z.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[3.])

  The second call then yields the following; note that gradients accumulate across calls, so x.grad doubles from 6 to 12:

x.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[12.])
y.grad: None
z.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[3.])
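
  A minimal sketch of this accumulate-and-reset behavior, reusing x, y, z from above:

# gradients accumulate across backward calls; clear_grad() resets the accumulator
x.clear_grad()
paddle.autograd.backward(z, TT([3], dtype='float32'), retain_graph=True)
print("x.grad: {}".format(x.grad))   # expected: [6.] again after clearing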

(2) Matrix multiplication

 Ⅰ. A single matrix product
  • Define the matrix product:

$z = x \cdot y$

  • Backward gradient:

$\partial x = \partial z \cdot y^{\mathsf T}$

x = TT([[1,2],[3,4]], dtype='float32', stop_gradient=False)
y = TT([[3,2],[3,4]], dtype='float32')
z = paddle.matmul(x, y)
print("x: {}".format(x), "y: {}".format(y), "z: {}".format(z))
paddle.autograd.backward(z, TT([[1,2],[2,3]], dtype='float32'), retain_graph=True)
print("x.grad: {}".format(x.grad), "y.grad: {}".format(y.grad), "z.grad: {}".format(z.grad))
x: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[1., 2.],[3., 4.]])
y: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=True,[[3., 2.],[3., 4.]])
z: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[9. , 10.],[21., 22.]])
x.grad: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[7. , 11.],[12., 18.]])
y.grad: None
z.grad: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[1., 2.],[2., 3.]])

  As we can see, the automatic backward gradient for matrices follows the same chain rule as the scalar case. The check below prints $y \cdot \partial z$, which here equals $(\partial x)^{\mathsf T}$ because this $\partial z$ is symmetric, so $y \cdot \partial z = y \cdot \partial z^{\mathsf T} = (\partial z \cdot y^{\mathsf T})^{\mathsf T}$:

print("y*z.grad: {}".format(y.numpy().dot(z.grad.numpy())))
y*z.grad: [[ 7. 12.][11. 18.]]
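
  A direct check of $\partial x = \partial z \cdot y^{\mathsf T}$, without the transpose detour:

print(z.grad.numpy().dot(y.numpy().T))
# expected: [[ 7. 11.]
#            [12. 18.]]  -- equal to x.grad
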
 Ⅱ. Two matrix products

  Define two matrix operations:

$z_1 = x \cdot y, \quad z_2 = x \cdot y$

  Then the gradient is:

$\partial x = \left( \partial z_1 + \partial z_2 \right) \cdot y^{\mathsf T}$

x = TT([[1,2],[3,4]], dtype='float32', stop_gradient=False)
y = TT([[3,2],[3,4]], dtype='float32')
z1 = paddle.matmul(x, y)
z2 = paddle.matmul(x, y)
print("x: {}".format(x), "y: {}".format(y), "z1: {}".format(z1))
paddle.autograd.backward(z1, TT([[1,2],[2,3]], dtype='float32'))
paddle.autograd.backward(z2, TT([[1,1],[1,1]], dtype='float32'))
print("x.grad: {}".format(x.grad), "y.grad: {}".format(y.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[1., 2.],[3., 4.]])
y: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=True,[[3., 2.],[3., 4.]])
z1: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[9. , 10.],[21., 22.]])
x.grad: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[12., 18.],[17., 25.]])
y.grad: None
z1.grad: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[1., 2.],[2., 3.]])

  Verify the gradient matrix of x:

print("y*(z1+z2).grad: {}".format(y.numpy().dot((z1.grad+z2.grad).numpy())))
y*(z1+z2).grad: [[12. 17.][18. 25.]]

  Up to a transpose, this equals the value given by the formula: since $\partial z_1 + \partial z_2$ is symmetric here, $y \cdot (\partial z_1 + \partial z_2) = \left( (\partial z_1 + \partial z_2) \cdot y^{\mathsf T} \right)^{\mathsf T} = (\partial x)^{\mathsf T}$.
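
  A direct check without the transpose detour:

print((z1.grad + z2.grad).numpy().dot(y.numpy().T))
# expected: [[12. 18.]
#            [17. 25.]]  -- equal to x.grad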

1.2 More function examples

1.2.1 square

x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.square(x)
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1)
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 4.])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[2., 4.])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 1.])
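
  For $z_1 = x^2$ the backward rule gives $\partial x = 2x \cdot \partial z_1$; with the default $\partial z_1 = [1., 1.]$, this is $[2., 4.]$, matching x.grad.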

1.2.2 exp

x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.exp(x)
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1)
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[2.71828175, 7.38905621])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[2.71828175, 7.38905621])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 1.])
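
  For $z_1 = e^x$, $\partial x = e^x \cdot \partial z_1 = z_1$ under the default all-ones $\partial z_1$, which is why x.grad coincides with z1.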

1.2.3 log
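
  The test mirrors the square and exp examples above; the following sketch reproduces the printed output below. For $z_1 = \ln x$, $\partial x = \partial z_1 / x = [1., 0.5]$.

x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.log(x)
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1)
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))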

x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[0.        , 0.69314718])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1.        , 0.50000000])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 1.])

1.2.4 Softmax

(1) Theoretical derivation

  For convenience, consider a softmax over two variables:

$z_1 = \dfrac{e^{x_1}}{e^{x_1} + e^{x_2}}, \quad z_2 = \dfrac{e^{x_2}}{e^{x_1} + e^{x_2}}$

Then:
$\partial x_1 = \partial z_1 \cdot \dfrac{e^{x_1} \left( e^{x_1} + e^{x_2} \right) - e^{x_1} \cdot e^{x_1}}{\left( e^{x_1} + e^{x_2} \right)^2} + \partial z_2 \cdot \dfrac{-e^{x_2} \cdot e^{x_1}}{\left( e^{x_1} + e^{x_2} \right)^2}$

  By symmetry, $\partial x_2$ can be obtained in the same way and is omitted here.

  Combining the two terms gives $\partial x_1 = \left( \partial z_1 - \partial z_2 \right) \cdot \dfrac{e^{x_1} e^{x_2}}{\left( e^{x_1} + e^{x_2} \right)^2}$, so if $\partial z_1 = \partial z_2$, then $\partial x_1 = \partial x_2 = 0$.

(2) Experimental verification

 Ⅰ. backward without initializing the gradient of z
x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.exp(x) / paddle.sum(paddle.exp(x))
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1)
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[0.26894140, 0.73105860])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[0., 0.])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 1.])
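
  This matches the derivation: the default initial gradient fills $\partial z_1$ with ones, so $\partial z_1 = \partial z_2$ and both input gradients vanish.
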
 Ⅱ. Initializing the gradient of z
x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.exp(x) / paddle.sum(paddle.exp(x))
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1, TT([1,2], dtype='float32'))
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[0.26894140, 0.73105860])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[-0.19661194,  0.19661200])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])

  Plugging the derived formula in directly verifies the result: with $\partial z_1 = 1$ and $\partial z_2 = 2$, we get $\partial x_1 = (1 - 2) \cdot a = -a$, where $a = e^{x_1 + x_2} / \left( e^{x_1} + e^{x_2} \right)^2$.

x = array([1,2])
a = exp(sum(x))/(sum(exp(x)))**2
print("a: {}".format(a))
a: 0.19661193324148188
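
  By the same formula, $\partial x_2 = \left( \partial z_2 - \partial z_1 \right) \cdot a = +0.19661$, matching the second component of x.grad.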

※ Summary ※


  This post tested autograd.backward in paddle and applied it to several commonly used functions.


■ References:

  • paddle.autograd.backward
