A translation of a TNNLS brief, for learning purposes only.
Link to the original article.
The translation (done with the help of Youdao) keeps the first person of the original. Wherever the translation and the original disagree, the original is right.

Title: Adaptive Propagation Graph Convolutional Network

Abstract:

Graph convolutional networks (GCNs) are a family of neural network models that perform inference on graph data by interleaving vertexwise operations and message-passing exchanges across nodes. Concerning the latter, two key questions arise: 1) how to design a differentiable exchange protocol (e.g., a one-hop Laplacian smoothing in the original GCN) and 2) how to characterize the tradeoff in complexity with respect to the local updates. In this brief, we show that state-of-the-art results can be achieved by adapting the number of communication steps independently at every node. In particular, we endow each node with a halting unit (inspired by Graves' adaptive computation time [1]) that after every exchange decides whether to continue communicating or not. We show that the proposed adaptive propagation GCN (AP-GCN) achieves superior or similar results to the best proposed models so far on a number of benchmarks, while requiring a small overhead in terms of additional parameters. We also investigate a regularization term to enforce an explicit tradeoff between communication and accuracy. The code for the AP-GCN experiments is released as an open-source library.

Introduction:

Deep learning has achieved remarkable success on a number of high-dimensional inputs by properly designing architectural biases that can exploit their properties. This includes images (through convolutional filters) [2], text [3], biomedical sequences [4], and videos [5]. A major research question, then, is how to replicate this success on other types of data through the implementation of novel differentiable blocks adequate to them. Among the possibilities, graphs represent one of the largest sources of data in the world, ranging from recommender systems [6] to biomedical applications [7], social networks [8], computer programs [9], knowledge bases [10], and many others.

In its most general form, a graph is composed of a set of vertices connected by a series of edges representing, e.g., social connections, citations, or any form of relation. Graph neural networks (GNNs) [11]-[13] can then be designed by interleaving local operations (defined on either individual nodes or edges) with communication steps, exploiting the graph topology to combine the local outputs. These architectures can be exploited for a variety of tasks, ranging from node classification to edge prediction and path computation.

Among the different families of GNN models proposed over the last years, graph convolutional networks (GCNs) [14] have become a sort of de facto standard for node and graph classification, representing one of the simplest (yet efficient) building blocks in the context of graph processing. GCNs are built by interleaving vertexwise operations, implemented via a single fully connected layer, with a communication step exploiting the so-called Laplacian matrix of the graph. In practice, a single GCN layer provides a weighted combination of information across neighbors, representing a localized one-hop exchange of information.
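As a concrete illustration, a minimal sketch of such a layer (not the authors' released code; `A_hat` denotes the renormalized adjacency matrix used in [14]) could look as follows:

```python
import torch

def gcn_layer(A_hat: torch.Tensor, X: torch.Tensor, W: torch.Tensor) -> torch.Tensor:
    """One GCN layer [14]: a vertexwise fully connected map (X @ W)
    interleaved with a one-hop communication step (A_hat @ ...), where
    A_hat = D~^{-1/2} (A + I) D~^{-1/2} replaces each node's representation
    with a weighted combination over its neighborhood."""
    return torch.relu(A_hat @ X @ W)
```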

Taking the GCN layer as a fundamental building block, several research questions have received vast attention lately, most notably: 1) how to design more effective communication protocols, able to improve the accuracy of the GCN and potentially better leverage the structure of the graph [15]-[17], and 2) how to trade off the amount of local (vertexwise) operations with the communication steps [18]. While we defer a complete overview of related works to Section II, we briefly mention two key results here. First, Li et al. [19] showed that the use of the Laplacian (a smoothing operator) causes a repeated application of standard GCN layers to oversmooth the data, disallowing the possibility of naively stacking GCN layers to obtain extremely deep networks. Second, Klicpera et al. [18] showed that state-of-the-art results can be obtained by replacing the Laplacian communication step with a PageRank variation, provided that the communication between nodes is completely separated from the vertexwise operations. We exploit both of these key results later on.

Contributions of This Brief:

We note that the vast majority of proposals to improve point 1) mentioned before consists in selecting a certain maximum number of communication steps T and iterating a simple protocol for T steps in order to diffuse the information across T-hop neighbors. In this brief, we ask the following research question: can the performance of GCN layers be improved if the number of communication steps is allowed to vary independently for each vertex?

To answer this question, we propose a variation of GCN that we call adaptive propagation GCN (AP-GCN). In the AP-GCN (see Fig. 1), every vertex is endowed with an additional unit that outputs a value controlling whether the communication should continue for another step (hence combining the information from neighbors farther away) or should stop, keeping the final value for further processing. In order to implement this adaptive unit, we leverage previous work on adaptive computation time in recurrent neural networks [1] to design a differentiable method to learn this propagation strategy. On an extensive set of comparisons and benchmarks, we show that AP-GCN can reach state-of-the-art results, while the number of communication steps can vary significantly not only across data sets but also across individual vertexes. This is achieved with an extremely small overhead in terms of computational time and additional trainable parameters. In addition, we perform a large hyperparameter analysis, showing that our method can provide a simple way to balance the accuracy of the GCN with the number of propagation steps.

Related Work:

GCNs belong to the class of spectral GNNs, which are based on graph signal processing (GSP) tools [20]-[23]. GSP allows one to define a Fourier transform over graphs by exploiting the eigendecomposition of the so-called graph Laplacian. The first application of this theory to graph NNs was in [24]. This approach, however, was both computationally heavy and not spatially localized, meaning that each nodewise update depended on the entire graph structure. Later proposals [25] showed that by properly restricting the class of filters applied in the frequency domain, one could obtain a simpler formulation that was also spatially localized in the graph domain. Polynomial filters [25] can be implemented via T-hop exchanges on the graph, but they require selecting a priori a valid T for all the vertices. The GCN, introduced in [14], showed that state-of-the-art results could be obtained even with simpler linear (i.e., one-hop) operations. However, it failed to scale to deeper architectures (i.e., >2 GCN layers) in practice.

Li et al. [19] formally analyzed the properties of the GCN, showing that the difficulty of building deeper networks could depend on the oversmoothing of the data due to a repeated application of the Laplacian operator. Further analyses and the need to consider higher-order structures in GNNs were provided in [26], showing that GCNs are equivalent to the so-called 1-D Weisfeiler-Leman graph isomorphism heuristic. Several recent papers have proposed to avoid some of these shortcomings by using different types of propagation methods, most notably PageRank variations [17], [18].

In this brief, we explore an orthogonal idea: we hypothesize that performance can be improved not only by modifying the existing propagation method but also by allowing each node to vary the amount of communication independently from the others, in an adaptive fashion. Jumping knowledge (JK) networks [27] and GeniePath [28] achieve something similar by exploiting an additional network aggregation component (e.g., an LSTM network); after multiple diffusion steps, however, they fail to reach state-of-the-art results [17].

Finally, we underline that we focus on the GCN in this brief, but alternative models for GNNs have been devised, including those from [29], graph attention networks [30], graph embeddings, and others. We refer to multiple recent surveys on the topic for more information [12], [13].

Graph Convolutional Networks:
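For reference, a sketch of the basic GCN layer that (1) refers to throughout this section, assuming the standard form from [14] (the displayed equations did not survive into this translation):

$$ H^{(l+1)} = \mathrm{ReLU}\big(\hat{A}\, H^{(l)}\, W^{(l)}\big), \qquad \hat{A} = \tilde{D}^{-1/2}(A + I)\,\tilde{D}^{-1/2} \quad (1) $$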

More generally, the Laplacian matrix can be renormalized in different ways (see [14]) or replaced by any appropriate shift operator defined on the graph. As mentioned in Section II, the name GCN derives from a GSP [20] interpretation of (1). A Fourier transform for graphs can be defined by considering the eigendecomposition of the Laplacian matrix [23]. In this case, (1) can be shown to be equivalent to a graph convolution implemented with a linear filter [14]. Because its implementation only requires a one-hop exchange among neighbors, the GCN is also an example of a message-passing neural network (MPNN) [13].

These two interpretations suggest two classes of extensions of the basic model in (1), which we comment on because of their relation to our proposed method. First, under the GSP interpretation, it makes sense to replace the linear filtering operation with more complex filters. In particular, polynomial filters can be implemented by combining information from higher-order neighborhoods of each node, up to an order that depends on the degree of the polynomial [32]. For example, Chebyshev filters [25] lead to the following layer (bias omitted for simplicity):
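A hedged reconstruction of this layer (presumably (2) in the original brief), in the standard Chebyshev-polynomial form of [25]:

$$ H' = \mathrm{ReLU}\!\left(\sum_{k=0}^{K} T_k(\tilde{L})\, X\, W_k\right), \quad (2) $$

where $T_k(\cdot)$ is the Chebyshev polynomial of order $k$ and $\tilde{L}$ is a rescaled graph Laplacian; a degree-$K$ filter combines information from up to $K$-hop neighborhoods.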

However, interpreting (1) more generally as an MPNN, we are not restricted to considering filtering operations. In fact, the most general extension of (1) is given by (simplifying the notation for a single node i) [13]:
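A sketch of this generic extension (presumably (3)), written with a nodewise operation $\psi$ and a propagation step $\Psi$ to match the notation used later in this brief (a reconstruction, not copied from the paper):

$$ h_i = \Psi\big(\psi(x_i),\ \{\psi(x_j)\}_{j \in \mathcal{N}(i)}\big), \quad (3) $$

where $\mathcal{N}(i)$ denotes the set of neighbors of node $i$.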

The functions $\psi$ and $\Psi$ can be implemented as generic neural networks or any other differentiable mechanism. Most notably, Klicpera et al. [18] proposed to use the propagation steps of an (approximate) personalized PageRank protocol to counteract the oversmoothing effect of a repeated application of the Laplacian matrix [19], although the maximum number of propagation steps must still be chosen a priori by the user.
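As an illustration, a minimal sketch of this PageRank-style propagation in its iterative (power-iteration) form, with the teleport probability `alpha` and the step count `K` as user-chosen hyperparameters:

```python
import torch

def pagerank_propagate(A_hat: torch.Tensor, H: torch.Tensor,
                       alpha: float = 0.1, K: int = 10) -> torch.Tensor:
    """Approximate personalized-PageRank propagation in the spirit of [18]:
    Z_{k+1} = (1 - alpha) * A_hat @ Z_k + alpha * H.
    Note that K, the number of communication steps, is fixed a priori and
    shared by all nodes -- exactly the limitation that AP-GCN removes."""
    Z = H
    for _ in range(K):
        Z = (1.0 - alpha) * (A_hat @ Z) + alpha * H
    return Z
```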

Interestingly, PageRank propagation [18] and the closely related ARMA models [16] can be understood as approximating rational filters on the graph [33], which are in general more expressive than linear or polynomial filters.

Designing and Training Deep GCNs:

In the spirit of classical deep networks, the basic building blocks described in Section III-B can be composed to design deeper architectures. For example, a binary classification network with a single hidden layer and one output layer, implemented according to (1), is defined as follows.
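A hedged reconstruction of this network (presumably (4)), assuming a ReLU hidden layer and a sigmoid output, and consistent with the weights W, v, b, and c named below:

$$ f(X) = \sigma\big(\hat{A}\,\mathrm{ReLU}(\hat{A}\,X\,W + b)\,v + c\big) \quad (4) $$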

where the adaptable weights are W, v, b, and c. A more recent line of reasoning, popularized in [17], is to implement architectures in the form (3), making both $\psi$ and $\Psi$ deeper networks, but without interleaving multiple nodewise and propagation steps. We follow this design principle here, as we have found it to perform better empirically.
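A minimal sketch of this decoupled design, assuming a two-layer MLP for $\psi$ and the PageRank-style propagation above for $\Psi$ (names and sizes are illustrative, not the authors' implementation):

```python
import torch
import torch.nn as nn

class DecoupledGCN(nn.Module):
    """'Predict then propagate' design principle popularized in [17]:
    a deeper nodewise network psi, followed by a separate propagation
    step Psi, with no interleaving between the two."""
    def __init__(self, in_dim: int, hidden: int, n_classes: int,
                 alpha: float = 0.1, K: int = 10):
        super().__init__()
        self.psi = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, n_classes))
        self.alpha, self.K = alpha, K

    def forward(self, X: torch.Tensor, A_hat: torch.Tensor) -> torch.Tensor:
        H = self.psi(X)                      # vertexwise operations only
        Z = H
        for _ in range(self.K):              # communication steps only
            Z = (1.0 - self.alpha) * (A_hat @ Z) + self.alpha * H
        return Z
```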

Once a specific network f has been designed, its optimization follows the same strategies as for other deep networks. For example, for node classification (as described in Section III-A), we optimize the network with a cross-entropy loss on the known node labels.
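A hedged reconstruction of this objective (presumably (5)), with $\mathcal{T}$ the set of nodes whose one-hot labels $y_i$ are known:

$$ \mathcal{L} = -\sum_{i \in \mathcal{T}} y_i^\top \log f(x_i) \quad (5) $$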

Note, however, that differently from standard neural networks, the output f(x_i) will depend on several other nodes, depending on the specific architecture. For this reason, (5) is harder to solve efficiently in a stochastic fashion [34].
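To make this concrete, here is a sketch of a full-batch training step (a hypothetical helper, assuming a model with the `forward(X, A_hat)` signature used above): because each output depends on other nodes, the loss is computed over the whole graph and masked to the labeled training nodes, rather than over independent minibatches.

```python
import torch
import torch.nn.functional as F

def train_step(model: torch.nn.Module, optimizer: torch.optim.Optimizer,
               X: torch.Tensor, A_hat: torch.Tensor,
               y: torch.Tensor, train_mask: torch.Tensor) -> float:
    """One full-batch gradient step with cross-entropy on the known labels."""
    model.train()
    optimizer.zero_grad()
    logits = model(X, A_hat)          # every output depends on the whole graph
    loss = F.cross_entropy(logits[train_mask], y[train_mask])
    loss.backward()
    optimizer.step()
    return loss.item()
```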

Proposed Adaptive Propagation Protocol:

In Sections II and III, we analyzed the motivation for having graph modules with complex diffusion steps across the graph. However, the vast majority of proposals have considered a single, maximum number of communication steps shared by all the nodes in the graph [e.g., the number K in (2)]. In this section, we introduce a novel variation of the GCN in which the number of communication steps is selected independently for every node, and this number is adapted and computed on-the-fly during training. To the best of our knowledge, our proposed AP-GCN is the only model in the literature combining these two properties.

Our AP-GCN framework is shown in Fig. 1. Considering the notation in (3), we separate the nodewise operation $\psi$ from the propagation step $\Psi$. The former is implemented by applying a generic neural network on the individual nodes, $z_j = \psi(x_j)$, as shown on the left-hand side of Fig. 1. This embedding is then used as the starting seed for an iterative execution of the propagation step $\Psi$.

The key element of the scheme is that the number of propagation steps depends on the node index i and is computed adaptively during the propagation itself. The mechanism takes inspiration from adaptive computation time for recurrent neural networks (RNNs) [1].
First, we endow each node with a linear binary classifier that acts as a halting unit for the propagation process. After a generic iteration k of the propagation, we compute, nodewise:
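Following the ACT formulation of [1], a hedged reconstruction of this halting value, where $z_i^k$ is the embedding of node $i$ after $k$ propagation steps and $\mathbf{q}$, $b$ are the halting unit's weights:

$$ h_i^k = \sigma\big(\mathbf{q}^\top z_i^k + b\big) $$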

To ensure that the number of propagation steps remains reasonable, following [1], we adopt two techniques. First, we fix a maximum number of iterations T. Second, we use the running sum of the halting values to define a budget for the propagation process.
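A sketch of this budget, again in the ACT style of [1], with a small hyperparameter $\varepsilon$:

$$ K_i = \min\Big\{ k' \le T \,:\, \sum_{k=1}^{k'} h_i^k \ge 1 - \varepsilon \Big\} $$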

When k = K_i, the budget is reached, and node i stops its propagation at iteration k. We combine the halting probabilities as follows.
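A hedged reconstruction of this combination, where the last step receives the remainder $R_i$ so that the values sum to one:

$$ p_i^k = \begin{cases} h_i^k & \text{if } k < K_i \\ R_i = 1 - \sum_{k=1}^{K_i - 1} h_i^k & \text{if } k = K_i \\ 0 & \text{if } k > K_i \end{cases} $$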


In this way, the sequence $\{p_i^k\}$ forms a valid cumulative distribution from the halting probabilities $\{h_i^k\}$. Exploiting it, instead of using only the latest value of the propagation, we can adaptively combine the information from every step at no additional cost.
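The per-node output is then the halting-weighted combination of the intermediate embeddings (a reconstruction consistent with the definitions above):

$$ \tilde{z}_i = \sum_{k=1}^{K_i} p_i^k\, z_i^k $$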

The number of propagation steps can be controlled by defining a propagation cost $S_i$, similar to [1], which represents the number of propagation steps needed to update the i-th node.
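In ACT [1], this cost takes the form

$$ S_i = K_i + R_i, $$

which can be added to the classification loss with a propagation penalty weight, e.g., $\mathcal{L} = \mathcal{L}_{\mathrm{CE}} + \alpha \sum_i S_i$ (the symbol $\alpha$ here is illustrative).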

The propagation penalty is responsible for the tradeoff between computation time and accuracy. In addition, it regulates how easily information propagates over the graph. In practice, we alternate the optimization of the halting units with that of the main network every L steps (L = 5 in our experiments).
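Putting the pieces together, here is a minimal sketch of the adaptive propagation step (my reading of the mechanism above, not the authors' released implementation; the tensor layout and the shared linear halting unit are assumptions):

```python
import torch
import torch.nn as nn

class AdaptivePropagation(nn.Module):
    """Adaptive propagation with a halting unit, in the ACT style of [1]:
    each node keeps exchanging messages until the running sum of its
    halting values reaches 1 - eps (or the cap T), and its output is the
    p_i^k-weighted combination of the intermediate embeddings."""

    def __init__(self, n_features: int, T: int = 10, eps: float = 0.05):
        super().__init__()
        self.halt = nn.Linear(n_features, 1)   # shared linear halting unit
        self.T, self.eps = T, eps

    def forward(self, A_hat: torch.Tensor, Z0: torch.Tensor):
        n = Z0.shape[0]
        Z, out = Z0, torch.zeros_like(Z0)
        budget = torch.zeros(n, device=Z0.device)     # running sum of h_i^k
        remainder = torch.zeros(n, device=Z0.device)  # R_i
        steps = torch.zeros(n, device=Z0.device)      # K_i
        active = torch.ones(n, dtype=torch.bool, device=Z0.device)
        for t in range(self.T):
            Z = A_hat @ Z                             # one communication step
            h = torch.sigmoid(self.halt(Z)).squeeze(-1)
            halting = active & (budget + h >= 1.0 - self.eps)
            if t == self.T - 1:                       # force halting at the cap T
                halting = active.clone()
            # halting nodes receive the remainder R_i = 1 - sum_{k<K_i} h_i^k
            p = torch.where(halting, 1.0 - budget, h) * active.float()
            out = out + p.unsqueeze(-1) * Z
            remainder = torch.where(halting, 1.0 - budget, remainder)
            budget = budget + h * active.float()
            steps = steps + active.float()
            active = active & ~halting
            if not active.any():
                break
        cost = steps + remainder                      # S_i = K_i + R_i
        return out, cost
```

The returned `cost` can be averaged and added to the cross-entropy loss to penalize long propagations, while `out` feeds the final classifier; under the alternating scheme described above, the halting unit's parameters would be updated only every L steps of the main network.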
