Summary: Tests of paddle.autograd.backward, along with gradient checks for several commonly used functions.
Keywords: gradient
Automatic Gradient Computation
Table of Contents
The backward function
Function usage
Hands-on tests
More function examples
square
exp
log
Softmax
Summary
§01 Automatic Gradient Computation
For the paddle.autograd.backward function in paddle, see the official documentation of paddle.autograd.backward.
1.1 The backward function
1.1.1 Function usage
Function signature: def backward(tensors, grad_tensors=None, retain_graph=False)
Computes the backward gradients of the given tensors.
(1) Parameters
- tensors (list of Tensors): the tensors whose gradients are to be computed; the list must not contain the same tensor twice.
- grad_tensors (list of Tensors or None, optional): the initial gradients of tensors. If not None, it must have the same length as tensors, and any element that is None defaults to an initial gradient filled with 1.0. If None, all initial gradients of tensors default to values filled with 1.0. Defaults to None.
- retain_graph (bool, optional): if False, the graph used to compute the gradients is freed after the call. If you would like to add more ops to the built graph after calling backward, set retain_graph to True so the graph is retained. Setting it to False is much more memory-efficient. Defaults to False.
Returns: None
(2) Example

import paddle

x = paddle.to_tensor([[1, 2], [3, 4]], dtype='float32', stop_gradient=False)
y = paddle.to_tensor([[3, 2], [3, 4]], dtype='float32')
grad_tensor1 = paddle.to_tensor([[1, 2], [2, 3]], dtype='float32')
grad_tensor2 = paddle.to_tensor([[1, 1], [1, 1]], dtype='float32')
z1 = paddle.matmul(x, y)
z2 = paddle.matmul(x, y)

paddle.autograd.backward([z1, z2], [grad_tensor1, grad_tensor2], True)
print(x.grad)
# [[12. 18.]
#  [17. 25.]]

x.clear_grad()
paddle.autograd.backward([z1, z2], [grad_tensor1, None], True)
print(x.grad)
# [[12. 18.]
#  [17. 25.]]

x.clear_grad()
paddle.autograd.backward([z1, z2])
print(x.grad)
# [[10. 14.]
#  [10. 14.]]
1.1.2 Hands-on tests
(1) Single-vector case
Ⅰ. Define the variables and function
import sys,os,math,time
import matplotlib.pyplot as plt
from numpy import *
import paddle
from paddle import to_tensor as TT

x = TT([1], dtype='float32', stop_gradient=False)
y = TT([2], dtype='float32')
z = paddle.matmul(x, y)
print("x: {}".format(x), "y: {}".format(y), "z: {}".format(z))
x: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[1.])
y: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=True,[2.])
z: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[2.])
Ⅱ. Before calling backward
print("x.grad: {}".format(x.grad), "y.grad: {}".format(y.grad), "z.grad: {}".format(z.grad))
x.grad: None
y.grad: None
z.grad: None
Ⅲ. After running backward
paddle.autograd.backward(z, TT([3], dtype='float32'))
x.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[6.])
y.grad: None
z.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[3.])
The code above shows that $z = x \cdot y$, so $\partial x = \partial z \cdot y$. With $\partial z = 3$ and $y = 2$, we get $\partial x = 6$.
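The same number can be reproduced without paddle by a central finite difference (a pure-Python sketch; f, eps, and x0 are illustrative names, and the factor 3 mirrors the seed gradient passed to backward above):

```python
# Finite-difference estimate of the gradient of z = x * y at x = 1,
# scaled by the seed gradient 3 that was passed to backward.
y = 2.0
eps = 1e-6
f = lambda x: x * y
x0 = 1.0
grad_x = 3.0 * (f(x0 + eps) - f(x0 - eps)) / (2 * eps)
print(grad_x)  # ~6.0
```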
Ⅳ. Calling backward again
Calling backward a second time makes the runtime raise an error:
RuntimeError: (Unavailable) auto_0_ trying to backward through the same graph a second time, but this graph have already been freed. Please specify Tensor.backward(retain_graph=True) when calling backward at the first time.[Hint: Expected var->GradVarBase()->GraphIsFreed() == false, but received var->GradVarBase()->GraphIsFreed():1 != false:0.] (at /paddle/paddle/fluid/imperative/basic_engine.cc:74)
Even calling clear_grad() does not make it possible to run backward() again:
x.clear_grad()
y.clear_grad()
z.clear_grad()
Now set retain_graph to True in the autograd.backward call:
paddle.autograd.backward(z, TT([3], dtype='float32'), retain_graph=True)
With this setting, backward can be called repeatedly.
Result of the first call:
print("x.grad: {}".format(x.grad), "y.grad: {}".format(y.grad), "z.grad: {}".format(z.grad))
x.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[6.])
y.grad: None
z.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[3.])
And the second call gives:
x.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[12.])
y.grad: None
z.grad: Tensor(shape=[1], dtype=float32, place=CPUPlace, stop_gradient=False,[3.])
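The jump of x.grad from 6 to 12 shows that backward accumulates gradients into .grad rather than overwriting them; schematically:

```python
# Each backward call adds seed * dz/dx into the accumulated gradient.
grad_x = 0.0
for _ in range(2):        # two backward calls, no clear_grad() in between
    grad_x += 3.0 * 2.0   # seed gradient (3) times dz/dx (y = 2)
print(grad_x)  # 12.0
```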
(2) Matrix multiplication
Ⅰ. A single matrix product
$z = x \cdot y$
$\partial x = \partial z \cdot y^T$
x = TT([[1,2],[3,4]], dtype='float32', stop_gradient=False)
y = TT([[3,2],[3,4]], dtype='float32')
z = paddle.matmul(x, y)
print("x: {}".format(x), "y: {}".format(y), "z: {}".format(z))
paddle.autograd.backward(z, TT([[1,2],[2,3]], dtype='float32'), retain_graph=True)
print("x.grad: {}".format(x.grad), "y.grad: {}".format(y.grad), "z.grad: {}".format(z.grad))
x: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[1., 2.],[3., 4.]])
y: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=True,[[3., 2.],[3., 4.]])
z: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[9. , 10.],[21., 22.]])
x.grad: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[7. , 11.],[12., 18.]])
y.grad: None
z.grad: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[1., 2.],[2., 3.]])
This shows that automatic backward gradients for matrices follow essentially the same chain rule as for scalars, with the Jacobian contraction using the transpose of y:
print("z.grad*y^T: {}".format(z.grad.numpy().dot(y.numpy().T)))
z.grad*y^T: [[ 7. 11.]
 [12. 18.]]
Ⅱ. Two matrix products
Define two matrix products:
$z_1 = x \cdot y,\quad z_2 = x \cdot y$
Then the gradient is:
$\partial x = \left( \partial z_1 + \partial z_2 \right) \cdot y^T$
x = TT([[1,2],[3,4]], dtype='float32', stop_gradient=False)
y = TT([[3,2],[3,4]], dtype='float32')
z1 = paddle.matmul(x, y)
z2 = paddle.matmul(x, y)
print("x: {}".format(x), "y: {}".format(y), "z1: {}".format(z1))
paddle.autograd.backward(z1, TT([[1,2],[2,3]], dtype='float32'))
paddle.autograd.backward(z2, TT([[1,1],[1,1]], dtype='float32'))
print("x.grad: {}".format(x.grad), "y.grad: {}".format(y.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[1., 2.],[3., 4.]])
y: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=True,[[3., 2.],[3., 4.]])
z1: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[9. , 10.],[21., 22.]])
x.grad: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[12., 18.],[17., 25.]])
y.grad: None
z1.grad: Tensor(shape=[2, 2], dtype=float32, place=CPUPlace, stop_gradient=False,[[1., 2.],[2., 3.]])
Check the gradient matrix of x:
print("(z1.grad+z2.grad)*y^T: {}".format((z1.grad + z2.grad).numpy().dot(y.numpy().T)))
(z1.grad+z2.grad)*y^T: [[12. 18.]
 [17. 25.]]
It equals the value given by the formula above.
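The same check can be done in plain NumPy, writing the matmul vector-Jacobian rule dx = gz @ y.T explicitly (gz1 and gz2 stand for the seed gradients used above):

```python
import numpy as np

x = np.array([[1., 2.], [3., 4.]])
y = np.array([[3., 2.], [3., 4.]])
gz1 = np.array([[1., 2.], [2., 3.]])   # seed gradient for z1
gz2 = np.array([[1., 1.], [1., 1.]])   # seed gradient for z2

# For z = x @ y, the vector-Jacobian product is dx = gz @ y.T;
# with two outputs sharing x, the contributions add up.
gx = (gz1 + gz2) @ y.T
print(gx)
# [[12. 18.]
#  [17. 25.]]
```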
1.2 More function examples
1.2.1 square
x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.square(x)
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1)
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 4.])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[2., 4.])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 1.])
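This matches d(x²)/dx = 2x evaluated at x = [1, 2] (the ones in z1.grad are the default seed gradient):

```python
x = [1.0, 2.0]
grad = [2.0 * v for v in x]   # d(x^2)/dx = 2x, seeded with 1.0 by default
print(grad)  # [2.0, 4.0]
```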
1.2.2 exp
x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.exp(x)
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1)
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[2.71828175, 7.38905621])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[2.71828175, 7.38905621])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 1.])
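Here x.grad reproduces z1 itself, since d(eˣ)/dx = eˣ; a quick check against the printed values:

```python
import math

x = [1.0, 2.0]
grad = [math.exp(v) for v in x]   # d(e^x)/dx = e^x
print(grad)  # [2.718..., 7.389...]
```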
1.2.3 log
x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.log(x)
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1)
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[0. , 0.69314718])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1. , 0.50000000])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 1.])
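The gradient agrees with d(ln x)/dx = 1/x at x = [1, 2]:

```python
x = [1.0, 2.0]
grad = [1.0 / v for v in x]   # d(ln x)/dx = 1/x
print(grad)  # [1.0, 0.5]
```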
1.2.4 Softmax
(1) Theoretical derivation
For simplicity, consider a two-variable softmax:
$z_1 = \frac{e^{x_1}}{e^{x_1} + e^{x_2}},\quad z_2 = \frac{e^{x_2}}{e^{x_1} + e^{x_2}}$
Then:
$\partial x_1 = \partial z_1 \cdot \frac{e^{x_1}\left( e^{x_1} + e^{x_2} \right) - e^{x_1} \cdot e^{x_1}}{\left( e^{x_1} + e^{x_2} \right)^2} + \partial z_2 \cdot \frac{- e^{x_2} \cdot e^{x_1}}{\left( e^{x_1} + e^{x_2} \right)^2}$
The expression for $\partial x_2$ follows by symmetry and is omitted here.
If $\partial z_1 = \partial z_2$, then $\partial x_1 = \partial x_2 = 0$.
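With x = (1, 2) and seed gradients ∂z = (1, 2), the formula above reduces to ∂x₁ = z₁z₂(∂z₁ − ∂z₂) = −0.19661. A pure-Python evaluation of the full softmax Jacobian (variable names are illustrative):

```python
import math

x = [1.0, 2.0]
dz = [1.0, 2.0]                       # seed gradients passed to backward
e = [math.exp(v) for v in x]
s = sum(e)
z = [v / s for v in e]                # softmax outputs
# Softmax Jacobian: dz_i/dx_j = z_i * ((i == j) - z_j)
gx = [sum(dz[i] * z[i] * ((i == j) - z[j]) for i in range(2))
      for j in range(2)]
print(gx)  # [-0.19661..., 0.19661...]
```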
(2) Numerical experiment
Ⅰ. backward without initializing the gradient of z
x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.exp(x) / paddle.sum(paddle.exp(x))
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1)
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[0.26894140, 0.73105860])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[0., 0.])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 1.])
Ⅱ. Initializing the gradient of z
x = TT([1,2], dtype='float32', stop_gradient=False)
z1 = paddle.exp(x) / paddle.sum(paddle.exp(x))
print("x: {}".format(x), "z1: {}".format(z1))
paddle.autograd.backward(z1, TT([1,2], dtype='float32'))
print("x.grad: {}".format(x.grad), "z1.grad: {}".format(z1.grad))
x: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
z1: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[0.26894140, 0.73105860])
x.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[-0.19661194, 0.19661200])
z1.grad: Tensor(shape=[2], dtype=float32, place=CPUPlace, stop_gradient=False,[1., 2.])
Plugging the numbers into the formula derived above confirms that the result is correct:
x = array([1,2])
a = exp(sum(x))/(sum(exp(x)))**2
print("a: {}".format(a))
a: 0.19661193324148188
※ Summary ※
This post tested paddle's autograd.backward and verified the gradients of several commonly used functions against hand-derived formulas.
■ Related link:
paddle.autograd.backward (PaddlePaddle automatic gradient documentation)