Ⅰ Paper Information
Ⅱ Paper Outline
- 1 Introduction
- 2 The Challenges of Computation on Graphs
- 3 Problem Setting and Notation
- 4 Extending Convolutions to Graphs
- 5 Polynomial Filters on Graphs
  - 5.1 The Graph Laplacian
  - 5.2 Polynomials of the Laplacian
  - 5.3 ChebNet
  - 5.4 Polynomial Filters are Node-Order Equivariant
  - 5.5 Embedding Computation
- 6 Modern Graph Neural Networks
  - 6.1 Embedding Computation
  - 6.2 Thoughts
- 7 Interactive Graph Neural Networks
- 8 From Local to Global Convolutions
  - 8.1 Spectral Convolutions
  - 8.2 Spectral Representations of Natural Images
  - 8.3 Embedding Computation
  - 8.4 Spectral Convolutions are Node-Order Equivariant
  - 8.5 Global Propagation via Graph Embeddings
- 9 Learning GNN Parameters
- 10 Conclusion and Further Reading

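Ⅰ Paper Information

Paper link: Understanding Convolutions on Graphs

This article on graph convolutional networks was published on Distill in September 2021. I had already watched Mu Li's (沐神) walkthrough on Bilibili of its companion piece on graph neural networks, A Gentle Introduction to Graph Neural Networks, so I decided to read this one on graph convolutions as well to deepen my understanding of graphs.

Ⅱ Paper Outline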

1 Introduction

2 The Challenges of Computation on Graphs

3 Problem Setting and Notation

4 Extending Convolutions to Graphs

5 Polynomial Filters on Graphs

5.1 The Graph Laplacian

The graph Laplacian $L$ is the square $n \times n$ matrix defined as: $L = D - A$ .
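- where $D$ is the diagonal degree matrix with $D_v=\sum_{u}A_{vu}$ , and $A$ is the 0-1 adjacency matrix
- The Laplacian $L$ depends only on the structure of the graph, not on the node features
The graph Laplacian has many interesting properties and shows up in many graph-related mathematical problems; some of them are covered in later sections.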
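
As a small illustration of the definition $L = D - A$, here is a minimal NumPy sketch. The 4-node path graph and the helper name `graph_laplacian` are assumptions made up for this example, not something from the article.

```python
import numpy as np

def graph_laplacian(A):
    """Return L = D - A for a 0-1 adjacency matrix A (hypothetical helper)."""
    A = np.asarray(A, dtype=float)
    D = np.diag(A.sum(axis=1))  # D_v = sum_u A_vu, placed on the diagonal
    return D - A

# Assumed toy graph: the path 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
L = graph_laplacian(A)
print(L)
```

Each row of $L$ sums to zero, and $L$ is symmetric because the graph is undirected; no node feature is involved anywhere in the computation.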

5.2 Polynomials of the Laplacian

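polynomials of the Laplacian: $p_w(L) = w_0I_n+w_1L+w_2L^2+...+w_dL^d=\sum_{i=0}^dw_iL^i$
- Every polynomial of this form can be represented by its vector of coefficients $w=[w_0,w_1,...,w_d]$
These polynomials can be thought of as the equivalent of ‘filters’ in CNNs, and the coefficients $w$ as the weights of the ‘filters’.
Once we have constructed the feature vector $x$ , we can define its convolution with a polynomial filter $p_w$ as: $x'=p_w(L)x$
According to the derivation, when only $w_1 = 1$ and all other coefficients are 0, $x'$ is simply $Lx$:
$x'_v = (Lx)_v = L_v x$
$= \sum_{u \in G} L_{vu} x_u$
$= \sum_{u \in G} (D_{vu} - A_{vu}) x_u$
$= D_v x_v - \sum_{u \in \mathcal{N}(v)} x_u$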

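Q: Why does $L_v x$ lead straight to the neighbors $u$ on the second line, and where does $x_u$ come from?
A: $L_v x$ is the dot product of row $v$ of $L$ with $x$, so on the second line $u$ ranges over all nodes in the graph. Only after the computation on the third line, because $D$ is diagonal and $A$ is nonzero only for adjacent nodes, does $u$ on the fourth line become restricted to the neighbors of $v$.
The feature of each node $v$ is a combination of its own feature and the features of its neighbors $u$.
- The maximum distance between a node and the neighbors that influence it is the degree of localization, which is given by the polynomial degree $d$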
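
To make the filter and its localization concrete, here is a minimal sketch assuming a toy 4-node path graph and a hypothetical helper `poly_filter`. It applies $p_w(L)$ with $w = [0, 1]$ and then checks which entries of a degree-2 filter are nonzero.

```python
import numpy as np

def poly_filter(L, w):
    """Build p_w(L) = sum_i w[i] * L^i (hypothetical helper)."""
    n = L.shape[0]
    P = np.zeros((n, n))
    L_power = np.eye(n)              # L^0 = I_n
    for w_i in w:
        P += w_i * L_power
        L_power = L_power @ L        # next power of L
    return P

# Assumed toy graph: the path 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
L = np.diag(A.sum(axis=1)) - A
x = np.array([1.0, 2.0, 4.0, 8.0])

# w_0 = 0, w_1 = 1, so x' = Lx
x1 = poly_filter(L, [0.0, 1.0]) @ x
# For node 1: D_1 * x_1 - (x_0 + x_2) = 2*2 - (1 + 4) = -1
print(x1)

# A degree-2 filter reaches nodes at distance <= 2 only
P2 = poly_filter(L, [0.5, 0.2, 0.1])
print(P2[0, 2], P2[0, 3])   # node 2 is 2 hops from node 0, node 3 is 3 hops away
```

The entry of $p_w(L)$ between two nodes is zero whenever they are more than $d$ hops apart, which is what "degree of localization" refers to.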

5.3 ChebNet

ChebNet redefines the polynomial filters as:
$p_w(L) = \sum_{i=1}^dw_iT_i(\widetilde{L})$

$T_i$ is the degree-$i$ Chebyshev polynomial of the first kind
$\widetilde{L}$ is the normalized Laplacian: $\widetilde{L}=\frac{2L}{\lambda_{max}(L)}-I_n$

Why redefine the filters this way:

$L$ is positive semi-definite, so all of its eigenvalues are at least 0. $\widetilde{L}$ is a scaled-down version of $L$ whose eigenvalues lie between -1 and 1, which keeps the entries of the powers of $\widetilde{L}$ from blowing up
Chebyshev polynomials have some interesting properties that make interpolation numerically more stable
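
The ChebNet filter can be sketched with the standard Chebyshev recurrence $T_0 = I$, $T_1 = \widetilde{L}$, $T_i = 2\widetilde{L}T_{i-1} - T_{i-2}$. This is only a minimal sketch: the helper names and the toy path graph are assumptions, and the sum starts at $i = 1$ as in the formula above.

```python
import numpy as np

def scaled_laplacian(L):
    """L~ = 2L / lambda_max(L) - I_n (L is assumed symmetric)."""
    lam_max = np.linalg.eigvalsh(L).max()
    return 2.0 * L / lam_max - np.eye(L.shape[0])

def cheb_filter(L, w):
    """p_w(L) = sum_{i=1}^{d} w[i-1] * T_i(L~), built with the recurrence."""
    L_t = scaled_laplacian(L)
    n = L.shape[0]
    T_prev, T_cur = np.eye(n), L_t          # T_0 and T_1
    P = np.zeros((n, n))
    for w_i in w:
        P += w_i * T_cur
        # T_{i+1} = 2 L~ T_i - T_{i-1}
        T_prev, T_cur = T_cur, 2.0 * L_t @ T_cur - T_prev
    return P

# Assumed toy graph: the path 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
L = np.diag(A.sum(axis=1)) - A
L_t = scaled_laplacian(L)
print(np.linalg.eigvalsh(L_t))              # all eigenvalues lie in [-1, 1]
print(cheb_filter(L, [0.3, 0.7]))
```

Because the spectrum of $\widetilde{L}$ stays inside $[-1, 1]$, every $T_i(\widetilde{L})$ has eigenvalues in the same range, however large $i$ gets.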

5.4 Polynomial Filters are Node-Order Equivariant

The polynomial filters we considered here are actually independent of the ordering of the nodes.
A similar proof follows for higher degree polynomials: the entries in the powers of $L$ are equivariant to the ordering of the nodes.
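
The equivariance claim can also be checked numerically. In this minimal sketch (the graph, the coefficients, and the permutation are all made up for illustration), relabelling the nodes with a permutation matrix $P$ and then filtering gives the same result as filtering first and relabelling afterwards.

```python
import numpy as np

def poly_filter(L, w):
    """p_w(L) = sum_i w[i] * L^i (hypothetical helper)."""
    return sum(w_i * np.linalg.matrix_power(L, i) for i, w_i in enumerate(w))

# Assumed toy graph with edges 0-1, 0-2, 1-2, 2-3
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
x = np.array([0.5, -1.0, 2.0, 3.0])
w = [0.3, -1.0, 0.5]

perm = [2, 0, 3, 1]
P = np.eye(4)[perm]              # permutation matrix: (P x)[i] = x[perm[i]]

L_perm = P @ L @ P.T             # Laplacian of the relabelled graph
lhs = poly_filter(L_perm, w) @ (P @ x)   # relabel, then filter
rhs = P @ (poly_filter(L, w) @ x)        # filter, then relabel
print(np.allclose(lhs, rhs))
```

This works because $(PLP^T)^i = PL^iP^T$ for any permutation matrix $P$, so $p_w(PLP^T)Px = P\,p_w(L)x$.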

5.5 Embedding Computation

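ChebNet layers can now be stacked just like in a CNN, with non-linearities in between.

In this network, the same filter weights are used for every node, which mirrors the weight sharing of convolutional kernels in CNNs.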

6 Modern Graph Neural Networks

As the derivation in 5.2 shows, we can think of this convolution as consisting of two steps:

Aggregating over immediate neighbour features $x_u$
Combining with the node’s own feature $x_v$

KEY IDEA: What if we consider different kinds of ‘aggregation’ and ‘combination’ steps, beyond what are possible using polynomial filters?

These convolutions can be thought of as message-passing between adjacent nodes: after each step, every node receives some information from its neighbors.

6.1 Embedding Computation

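Message-passing forms the backbone of many of today's GNN architectures. We describe several popular ones here:

Graph Convolutional Networks (GCN)
Graph Attention Networks (GAT)
Graph Sample and Aggregate (GraphSAGE)
Graph Isomorphism Network (GIN)

1. GCN

At every step $k$ , the learnable parameters $W, B$ and the function $f$ are shared across all nodes. This lets the GCN model scale well, because the number of parameters does not depend on the size of the graph.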
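
The update formula itself is not reproduced in these notes, so the following is only a sketch under an assumption: the GCN step takes the form $h_v^{(k)} = f\big(W \cdot \mathrm{mean}_{u \in \mathcal{N}(v)} h_u^{(k-1)} + B \cdot h_v^{(k-1)}\big)$ with $f$ = ReLU. The function name `gcn_layer` and the toy inputs are hypothetical.

```python
import numpy as np

def gcn_layer(A, H, W, B):
    """One assumed GCN step.

    A: (n, n) 0-1 adjacency, H: (n, d_in) node features,
    W, B: (d_out, d_in) weights shared by every node.
    """
    deg = A.sum(axis=1, keepdims=True)
    neigh_mean = (A @ H) / np.maximum(deg, 1)   # guard against isolated nodes
    return np.maximum(0.0, neigh_mean @ W.T + H @ B.T)   # f = ReLU

# Assumed toy graph: the path 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])
W = np.eye(2)
B = np.eye(2)
out = gcn_layer(A, H, W, B)
print(out)
```

`W` and `B` have shapes that depend only on the feature dimensions, so the same pair can be applied to a graph with any number of nodes.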
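
2. GAT

At every step $k$ , the learnable parameters $W, B$ and the function $f$ are shared across all nodes. This lets the GAT model scale well, because the number of parameters does not depend on the size of the graph.

Here we only use single-head attention; multi-head attention works analogously.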

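3. GraphSAGE

The original GraphSAGE paper offers three choices for $\mathrm{AGG}_{u\in\mathcal{N}_v}(h_u^{(k-1)})$ :

mean (similar to GCN)
dimension-wise maximum
LSTM (after ordering the neighbors)

Here we choose an RNN aggregator, because it is easier to explain than an LSTM while the idea behind the two is similar.

In addition, the original paper uses ‘neighbourhood sampling’: however large a node's neighbourhood is, a fixed-size random sample is drawn from it. This increases the variance of the computed embeddings, but it lets the algorithm run on large graphs.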
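
Neighbourhood sampling and the two simplest aggregators can be sketched as follows. The helper names, the adjacency list, and the choice to sample with replacement when a node has fewer than `k` neighbours are assumptions made for this example.

```python
import numpy as np

def sample_neighbours(adj_list, v, k, rng):
    """Draw a fixed-size sample of k neighbours of v (v must have at least one)."""
    neigh = adj_list[v]
    # Assumption: fall back to sampling with replacement for small neighbourhoods
    replace = len(neigh) < k
    return rng.choice(neigh, size=k, replace=replace)

def aggregate(H, idx, how="mean"):
    """Aggregate the feature rows H[idx] with a mean or a dimension-wise max."""
    if how == "mean":
        return H[idx].mean(axis=0)
    if how == "max":
        return H[idx].max(axis=0)
    raise ValueError(f"unknown aggregator: {how}")

# Assumed toy graph as an adjacency list, with 2-dimensional node features
adj_list = {0: [1, 2, 3, 4, 5], 1: [0], 2: [0], 3: [0], 4: [0], 5: [0]}
H = np.arange(12).reshape(6, 2)
rng = np.random.default_rng(0)

sampled = sample_neighbours(adj_list, 0, 3, rng)   # 3 of node 0's 5 neighbours
print(sampled, aggregate(H, sampled, "mean"))
```

The cost per node is bounded by `k` regardless of how many neighbours it has, which is what makes the approach usable on large graphs.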

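The learnable parameters are shared across all nodes. This lets the GraphSAGE model scale well, because the number of parameters does not depend on the size of the graph.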

4. GIN

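The learnable parameters are shared across all nodes. This lets the GIN model scale well, because the number of parameters does not depend on the size of the graph.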
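
The GIN update is not written out in these notes. As a sketch, assume the usual form $h_v^{(k)} = f\big(\sum_{u \in \mathcal{N}(v)} h_u^{(k-1)} + (1+\epsilon)\,h_v^{(k-1)}\big)$, with a single linear map followed by ReLU standing in for the MLP $f$. The names `gin_layer`, `W`, and `eps` are hypothetical.

```python
import numpy as np

def gin_layer(A, H, W, eps=0.0):
    """One assumed GIN step: sum over neighbours plus (1 + eps) times the node itself."""
    agg = A @ H + (1.0 + eps) * H
    return np.maximum(0.0, agg @ W)   # stand-in for the MLP f: linear + ReLU

# Assumed toy graph: the path 0 - 1 - 2 - 3, all nodes with feature 1
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
H = np.ones((4, 1))
W = np.array([[1.0]])
out = gin_layer(A, H, W)
print(out.ravel())
```

With identical input features, a mean aggregator would give every node the same output, whereas the sum separates the end nodes from the interior nodes by their degree.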

6.2 Thoughts

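Evaluating different aggregation functions is an interesting question; the paper How Powerful are Graph Neural Networks compares them by how well they preserve the features of neighboring nodes.

We have only discussed GNNs that compute on nodes. More recent GNNs also compute on edges, but the idea of message passing stays the same.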

7 Interactive Graph Neural Networks

8 From Local to Global Convolutions

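The methods covered so far all perform ‘local’ convolutions: the feature of every node is updated using a function of the features of its local neighbors.

Although enough message-passing steps will eventually carry information between all nodes in the graph, we would like a more direct way to perform ‘global’ convolutions.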

8.1 Spectral Convolutions

KEY IDEA: Given a feature vector $x$ , the Laplacian $L$ allows us to quantify how smooth $x$ is, with respect to $G$ .
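
One common way to quantify this smoothness, assumed here since the notes do not give the formula, is the normalized quadratic form $\frac{x^TLx}{x^Tx}$. For $L = D - A$ the numerator equals $\sum_{(u,v) \in E}(x_u - x_v)^2$, so smaller values mean that adjacent nodes carry similar features. The function names and the toy graph are made up for this sketch.

```python
import numpy as np

def smoothness(L, x):
    """Assumed smoothness measure x^T L x / x^T x (smaller = smoother)."""
    return float(x @ L @ x) / float(x @ x)

def edge_difference_sum(A, x):
    """Sum of (x_u - x_v)^2 over the undirected edges of A."""
    us, vs = np.nonzero(np.triu(A))      # each edge counted once
    return float(((x[us] - x[vs]) ** 2).sum())

# Assumed toy graph: the path 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
L = np.diag(A.sum(axis=1)) - A

x_smooth = np.array([1.0, 1.0, 1.0, 1.0])    # constant over the graph
x_rough = np.array([1.0, -1.0, 1.0, -1.0])   # flips sign across every edge
print(smoothness(L, x_smooth), smoothness(L, x_rough))
```

The constant vector scores 0 (perfectly smooth), while the alternating vector scores much higher.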

8.2 Spectral Representations of Natural Images

These visualizations should convince you that the first eigenvectors are indeed smooth, and the smoothness correspondingly decreases as we consider later eigenvectors.

For any image $x$ , we can think of the initial entries of the spectral representation $\hat{x}$ as capturing ‘global’ image-wide trends, which are the low-frequency components, and of the later entries as capturing ‘local’ details, which are the high-frequency components.

8.3 Embedding Computation

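Convolution in the spectral domain needs far fewer parameters than direct convolution in the natural domain.
In addition, because the Laplacian eigenvectors are smooth over the graph, using the spectral representation automatically enforces an inductive bias for neighboring nodes to get similar representations.

How a spectral convolution is carried out: the features are converted to their spectral representation with $U_m^T$ , filtered there, and converted back to the natural domain with $U_m$ .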
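
A minimal sketch of these steps for a single feature channel, assuming the filter is an element-wise weight vector $\hat{w}$ on the first $m$ spectral coefficients and leaving out the non-linearity that would follow. The function name and the toy graph are hypothetical.

```python
import numpy as np

def spectral_conv(L, x, w_hat):
    """Filter x in the spectral domain using the first m eigenvectors of L."""
    w_hat = np.asarray(w_hat, dtype=float)
    m = len(w_hat)
    _, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    U_m = eigvecs[:, :m]             # the m smoothest eigenvectors
    x_hat = U_m.T @ x                # natural domain -> spectral domain
    y_hat = w_hat * x_hat            # element-wise spectral filter
    return U_m @ y_hat               # spectral domain -> natural domain

# Assumed toy graph: the path 0 - 1 - 2 - 3
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
L = np.diag(A.sum(axis=1)) - A
x = np.array([1.0, 2.0, 4.0, 8.0])

# Keeping only the smoothest component replaces every entry by the mean of x
print(spectral_conv(L, x, [1.0]))
```

Only $m$ weights are learned per channel, and each call needs both the eigendecomposition and two multiplications with $U_m$, which is where the cost comes from.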

8.4 Spectral Convolutions are Node-Order Equivariant

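Like the Laplacian polynomial filters, spectral convolutions are independent of the ordering of the nodes.

Drawbacks of spectral convolutions:

The eigenvector matrix $U_m$ has to be computed from $L$ , which is infeasible for large graphs
Even once $U_m$ has been computed, the computation is inefficient because of the repeated multiplications with $U_m$ and $U_m^T$
The learned filters are specific to the input graph, so they do not transfer to new graphs with a different structure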

8.5 Global Propagation via Graph Embeddings

A simpler way to incorporate graph-level information is to compute embeddings of the entire graph by pooling node (and possibly edge) embeddings, and then to use the graph embedding to update node embeddings, following an iterative scheme. However, this approach ignores the underlying topology of the graph.

9 Learning GNN Parameters

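The embedding computations we have discussed, whether spectral or spatial, are all fully differentiable. This allows GNNs to be trained end to end, given a suitable loss function $\mathcal{L}$ :

Node Classification
The usual categorical cross-entropy, where $y_v$ is the one-hot label of node $v$ and $\hat{y}_v$ its predicted class probabilities: $\mathcal{L}(y_v, \hat{y}_v) = -\sum_c y_{vc} \log \hat{y}_{vc}$

GNNs also work in the semi-supervised setting, where the loss is computed over the labelled nodes only.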
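
A minimal sketch of the semi-supervised node classification loss, assuming it is the cross-entropy averaged over the labelled nodes. The function name, the mask convention, and the toy numbers are made up for this example.

```python
import numpy as np

def masked_cross_entropy(probs, labels, mask):
    """Mean cross-entropy over labelled nodes only.

    probs:  (n, C) predicted class probabilities per node
    labels: (n,) integer class ids (any valid placeholder for unlabelled nodes)
    mask:   (n,) boolean, True where the node is labelled
    """
    picked = probs[np.arange(len(labels)), labels]   # probability of the true class
    losses = -np.log(picked)
    return float(losses[mask].mean())

probs = np.array([[0.5, 0.5],
                  [0.9, 0.1],
                  [0.2, 0.8]])
labels = np.array([0, 0, 1])            # node 1 is unlabelled, 0 is a placeholder
mask = np.array([True, False, True])
loss = masked_cross_entropy(probs, labels, mask)
print(loss)
```

The unlabelled node still takes part in message passing, but it contributes nothing to the loss.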
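Graph Classification
By aggregating node representations, one can build a vector representation of the entire graph. This graph representation can be used for any graph-level task, including classification.
Link Prediction
Sample pairs of adjacent and non-adjacent nodes, and use these pairs as input to predict whether an edge is present, with a loss similar to that of logistic regression.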
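
A logistic-regression-style loss for link prediction could look like the sketch below. Scoring a node pair by the dot product of its two embeddings is an assumption of this example, as are the function name and the toy vectors.

```python
import numpy as np

def link_prediction_loss(z_v, z_u, edge):
    """Binary cross-entropy on p = sigmoid(z_v . z_u); edge is 1 if (v, u) exists, else 0."""
    p = 1.0 / (1.0 + np.exp(-np.dot(z_v, z_u)))
    return float(-(edge * np.log(p) + (1 - edge) * np.log(1 - p)))

z_a = np.array([1.0, 1.0])
z_b = np.array([1.0, 1.0])

# Similar embeddings are cheap if the edge exists and expensive if it does not
print(link_prediction_loss(z_a, z_b, 1), link_prediction_loss(z_a, z_b, 0))
```

Minimizing this loss over sampled pairs pushes the embeddings of adjacent nodes together and those of non-adjacent nodes apart.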
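Node Clustering
Simply cluster the learned node representations.

Another self-supervised approach is to force neighboring nodes to get similar embeddings, mimicking random-walk methods such as node2vec and DeepWalk.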

10 Conclusion and Further Reading

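Two surveys of graph neural networks are recommended: [29] [30]

GNNs in Practice
GCN proposes rewriting the update formula in matrix form, so that GNNs can be implemented efficiently in vectorized form on GPUs.

Regularization techniques such as Dropout apply directly to GNNs, and there are also graph-specific regularization techniques such as DropEdge.
Different Kinds of Graphs
This article focuses on undirected graphs, but simple variants of spatial convolutions also work for directed, temporal, and heterogeneous graphs.
Pooling
For graph-level tasks, pooling can be used to learn graph representations.
The simple approach is to aggregate the final node representations and pass the result through a predict function: $h_G = \mathrm{PREDICT}_G(\mathrm{AGG}_{v\in G}(h_v))$
There are also some more powerful pooling methods: SortPool, DiffPool, SAGPool.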
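
The simple pooling approach can be sketched as follows, with a mean or sum over nodes as the aggregation and a single linear map standing in for the predict function. Both choices, and the names `graph_readout` and `W_pred`, are assumptions of this example.

```python
import numpy as np

def graph_readout(H, W_pred, agg="mean"):
    """Pool the final node embeddings H (n, d) and apply a linear predict step."""
    if agg == "mean":
        pooled = H.mean(axis=0)
    elif agg == "sum":
        pooled = H.sum(axis=0)
    else:
        raise ValueError(f"unknown aggregator: {agg}")
    return pooled @ W_pred

H = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
W_pred = np.array([[1.0],
                   [1.0]])
h_G = graph_readout(H, W_pred)
print(h_G)
```

Mean and sum do not depend on the order of the rows of `H`, so the graph representation stays the same under any relabelling of the nodes.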

[Paper Notes (2)] An Introduction to Graph Convolutional Networks: Understanding Convolutions on Graphs