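Building a Graph from Scratch

DGL represents a directed graph with a DGLGraph object. You can create one directly by specifying the number of nodes together with the source and destination node of every edge. Node IDs start at 0.

For example, the code below builds a directed star graph with 6 nodes. The center node has ID 0, and every edge points from the center to a leaf.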

import dgl
import numpy as np
import torch

# Edges are given as (source list, destination list).
g = dgl.graph(([0, 0, 0, 0, 0], [1, 2, 3, 4, 5]), num_nodes=6)
# Equivalently, PyTorch LongTensors also work.
g = dgl.graph((torch.LongTensor([0, 0, 0, 0, 0]), torch.LongTensor([1, 2, 3, 4, 5])), num_nodes=6)

# You can omit the number of nodes argument if you can tell the number of nodes from the edge list alone.
g = dgl.graph(([0, 0, 0, 0, 0], [1, 2, 3, 4, 5]))

# Print the source and destination node IDs of every edge.
print(g.edges())

Output:

(tensor([0, 0, 0, 0, 0]), tensor([1, 2, 3, 4, 5]))

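Note that a DGLGraph is always directed, which keeps the computation simple. To represent an undirected graph, create a bidirectional graph instead.

Assigning Node and Edge Features

Many graphs carry attributes on their nodes and edges. Real-world attributes can be of any type, but DGLGraph only accepts attributes stored as tensors (numerical contents). As a result, a given attribute must have the same shape for every node (or for every edge). We call these node and edge attributes features.

Features are assigned and retrieved through the ndata and edata interfaces: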

# Assign a 3-dimensional node feature vector for each node.
g.ndata['x'] = torch.randn(6, 3)
# Assign a 4-dimensional edge feature vector for each edge.
g.edata['a'] = torch.randn(5, 4)
# A feature can also be multi-dimensional, e.g. a 5x4 matrix for each node.
g.ndata['y'] = torch.randn(6, 5, 4)

print(g.edata['a'])
print(g.ndata)

Output (the values are random and will differ on each run):

tensor([[-0.0667,  0.4955, -0.7605,  1.0864],
        [ 1.3225, -0.0638,  0.4249, -0.9549],
        [ 1.5967,  0.6893,  0.0693, -0.2029],
        [ 0.9383, -0.9211, -0.3480, -1.2578],
        [-0.2237, -0.3213,  0.2844, -2.1222]])
{'x': tensor([[-1.5809,  0.5587, -0.2951],
        [-0.6097, -0.6469,  1.1301],
        [-0.2151, -0.4234,  0.1623],
        [ 0.0742,  1.0390,  0.4727],
        [-0.9907,  0.0712, -1.9446],
        [-0.8176, -1.3116, -0.0609]]), 'y': tensor([[[ 0.5657,  1.5601, -0.5912, -0.9166],
         [-2.1264, -1.0500, -1.5964,  0.8197],
         [-0.1032,  0.8735,  0.5557,  0.2568],
         [-0.2407,  0.5534, -0.4418, -0.8438],
         [-1.3463, -1.3163,  0.4165,  0.4069]],

        [[ 0.5147,  0.7456,  0.5775, -0.8002],
         [-0.7700, -0.9576, -0.4264,  0.5365],
         [-0.2953, -1.0986, -0.7701,  0.6752],
         [-0.8701,  0.0455, -0.0241,  1.4218],
         [ 0.8420,  1.5854,  0.2167, -0.3292]],

        [[-1.9216, -1.4101, -0.8027, -0.1626],
         [ 0.8344,  0.6824, -0.2703, -0.6369],
         [-0.8784, -1.3154,  2.5829, -0.6084],
         [ 1.0764,  0.6415, -0.0548,  2.0256],
         [-0.2596,  0.9234, -0.7495,  0.4572]],

        [[ 0.4695,  0.4599,  1.0253, -1.6217],
         [-0.9483, -1.1822,  0.6945,  0.2053],
         [-1.6246, -0.2697,  0.3077, -0.1492],
         [ 0.1044,  1.3403, -2.2207,  0.7767],
         [-1.0187,  0.9309,  1.3097, -0.8092]],

        [[ 0.3566, -0.9140, -0.0288,  0.8432],
         [ 1.4894, -1.0284,  1.1628,  1.6677],
         [ 0.9602,  0.8019,  0.2072,  0.9472],
         [-0.8926,  0.6656,  0.2531,  0.1492],
         [ 1.2601,  0.5075,  0.6341, -0.6500]],

        [[-0.3595,  0.8457,  0.0679, -0.2255],
         [ 0.0466, -0.5428, -1.8730,  1.7333],
         [-1.2157,  0.6068, -0.4385, -1.6794],
         [ 0.2981,  1.2320, -0.1630,  2.4952],
         [-1.3436, -2.1708, -0.3203, -0.9315]]])}

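There are many well-established ways to encode attributes as numerical features:
For categorical attributes (e.g. gender, occupation), use one-hot encoding.
For string attributes (e.g. article text or quotations), consider an NLP model.
For image attributes, consider a vision model such as a CNN.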
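
As an illustration of the first option, here is a minimal plain-Python sketch of one-hot encoding. The occupation labels are made up for this example and are not part of the original tutorial; only the standard library is used so the snippet runs without DGL installed.

```python
# Hypothetical categorical attribute: one occupation label per node (6 nodes).
occupations = ["student", "engineer", "teacher", "engineer", "student", "teacher"]

# Build a stable mapping from category name to column index.
categories = sorted(set(occupations))
index = {name: col for col, name in enumerate(categories)}

# One row per node, one column per category; exactly one entry per row is 1.0.
one_hot = [
    [1.0 if index[occ] == col else 0.0 for col in range(len(categories))]
    for occ in occupations
]

print(categories)
for row in one_hot:
    print(row)
```

Every row has the same length, which is the uniform shape that DGLGraph requires. Under that assumption, the nested list could be wrapped with torch.tensor(one_hot) and assigned to an entry of g.ndata.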

Querying Graph Structure

The DGLGraph object provides many methods for querying the structure of a graph:

print(g.num_nodes())  # 6
print(g.num_edges())  # 5
# Out degrees of the center node
print(g.out_degrees(0))  # 5: the out-degree of node 0 (number of edges leaving it)
# In degrees of the center node - note that the graph is directed so the in degree should be 0.
print(g.in_degrees(0))  # 0: the in-degree of node 0 (number of edges entering it)
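
To make the degree values concrete, the following plain-Python sketch recomputes them from the star graph's edge list. It only illustrates the definitions and does not show how DGL computes degrees internally.

```python
from collections import Counter

# Edge list of the star graph used throughout this tutorial.
src = [0, 0, 0, 0, 0]
dst = [1, 2, 3, 4, 5]
num_nodes = 6

# Out-degree counts how often a node appears as a source,
# in-degree counts how often it appears as a destination.
out_count = Counter(src)
in_count = Counter(dst)

out_degrees = [out_count[v] for v in range(num_nodes)]
in_degrees = [in_count[v] for v in range(num_nodes)]

print(out_degrees)  # [5, 0, 0, 0, 0, 0]
print(in_degrees)   # [0, 1, 1, 1, 1, 1]
```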

Graph Transformations

DGL provides many transformation APIs, for example extracting a subgraph from a graph.

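# Induce a subgraph from nodes 0, 1 and 3 of the original graph
sg1 = g.subgraph([0, 1, 3])
# Induce a subgraph from edges 0, 1 and 3 of the original graph
sg2 = g.edge_subgraph([0, 1, 3])

Nodes and edges are renumbered in the subgraph. The features dgl.NID and dgl.EID map them back to their IDs in the parent graph.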

# The original IDs of each node in sg1
print(sg1.ndata[dgl.NID])
# The original IDs of each edge in sg1
print(sg1.edata[dgl.EID])
# The original IDs of each node in sg2
print(sg2.ndata[dgl.NID])
# The original IDs of each edge in sg2
print(sg2.edata[dgl.EID])

Output:

tensor([0, 1, 3])
tensor([0, 2])
tensor([0, 1, 2, 4])
tensor([0, 1, 3])
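
The four ID lists above can be reproduced by hand. The sketch below is a plain-Python approximation of node-induced and edge-induced subgraphs on the same star graph. The helper names node_subgraph and edge_subgraph_ids are hypothetical, and listing the touched nodes in sorted order is an assumption chosen to match the output shown; this is not DGL's implementation.

```python
# Edge list of the star graph; the edge ID is the position in these lists.
src = [0, 0, 0, 0, 0]
dst = [1, 2, 3, 4, 5]


def node_subgraph(src, dst, nodes):
    """Keep the edges whose two endpoints are both in `nodes`.

    Returns (original node IDs, original edge IDs).
    """
    keep = set(nodes)
    edge_ids = [e for e, (u, v) in enumerate(zip(src, dst)) if u in keep and v in keep]
    return list(nodes), edge_ids


def edge_subgraph_ids(src, dst, edges):
    """Keep the given edges plus every node they touch.

    Returns (original node IDs, original edge IDs).
    """
    touched = {src[e] for e in edges} | {dst[e] for e in edges}
    return sorted(touched), list(edges)


sg1_nodes, sg1_edges = node_subgraph(src, dst, [0, 1, 3])
sg2_nodes, sg2_edges = edge_subgraph_ids(src, dst, [0, 1, 3])

print(sg1_nodes)  # [0, 1, 3]
print(sg1_edges)  # [0, 2]
print(sg2_nodes)  # [0, 1, 2, 4]
print(sg2_edges)  # [0, 1, 3]
```

Edge 1 (0 to 2) is dropped from the node-induced subgraph because node 2 was not selected, while the edge-induced subgraph pulls in nodes 2 and 4 because the chosen edges end there.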

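subgraph and edge_subgraph also copy the parent graph's features to the subgraph:

# Features of the parent-graph nodes kept in sg1
print(sg1.ndata['x'])
# Features of the parent-graph edges kept in sg1
print(sg1.edata['a'])
# Features of the parent-graph nodes kept in sg2
print(sg2.ndata['x'])
# Features of the parent-graph edges kept in sg2
print(sg2.edata['a'])

Another common transformation is adding reverse edges to the original graph with dgl.add_reverse_edges: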

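newg = dgl.add_reverse_edges(g)
print(newg.edges())

If your data is an undirected graph, this is how to turn it into a bidirectional graph: the function adds a reverse edge for every edge in the original graph.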
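
The effect on the edge list can be sketched in plain Python as follows. The helper with_reverse_edges is a hypothetical name used only for this illustration, and appending the reversed edges after the original ones is an assumption of the sketch rather than a statement about DGL's internals.

```python
# Edge list of the directed star graph.
src = [0, 0, 0, 0, 0]
dst = [1, 2, 3, 4, 5]


def with_reverse_edges(src, dst):
    """Return a new edge list holding every edge (u, v) and its reverse (v, u)."""
    return src + dst, dst + src


new_src, new_dst = with_reverse_edges(src, dst)

print(new_src)  # [0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
print(new_dst)  # [1, 2, 3, 4, 5, 0, 0, 0, 0, 0]
```

After this step every leaf also has an edge back to the center, so messages can flow in both directions.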

Saving and Loading Graphs

Use dgl.save_graphs to save a single graph or a list of graphs, and dgl.load_graphs to load them back.

# Save a single graph, then a list of graphs
dgl.save_graphs('graph.dgl', g)
dgl.save_graphs('graphs.dgl', [g, sg1, sg2])
# Load the graphs back; load_graphs returns (list of graphs, label dict)
(g,), _ = dgl.load_graphs('graph.dgl')
print(g)
(g, sg1, sg2), _ = dgl.load_graphs('graphs.dgl')
print(g)
print(sg1)
print(sg2)

Output:

Graph(num_nodes=6, num_edges=5,
      ndata_schemes={'y': Scheme(shape=(5, 4), dtype=torch.float32), 'x': Scheme(shape=(3,), dtype=torch.float32)}
      edata_schemes={'a': Scheme(shape=(4,), dtype=torch.float32)})
Graph(num_nodes=6, num_edges=5,
      ndata_schemes={'y': Scheme(shape=(5, 4), dtype=torch.float32), 'x': Scheme(shape=(3,), dtype=torch.float32)}
      edata_schemes={'a': Scheme(shape=(4,), dtype=torch.float32)})
Graph(num_nodes=3, num_edges=2,
      ndata_schemes={'_ID': Scheme(shape=(), dtype=torch.int64), 'x': Scheme(shape=(3,), dtype=torch.float32), 'y': Scheme(shape=(5, 4), dtype=torch.float32)}
      edata_schemes={'_ID': Scheme(shape=(), dtype=torch.int64), 'a': Scheme(shape=(4,), dtype=torch.float32)})
Graph(num_nodes=4, num_edges=3,
      ndata_schemes={'_ID': Scheme(shape=(), dtype=torch.int64), 'x': Scheme(shape=(3,), dtype=torch.float32), 'y': Scheme(shape=(5, 4), dtype=torch.float32)}
      edata_schemes={'_ID': Scheme(shape=(), dtype=torch.int64), 'a': Scheme(shape=(4,), dtype=torch.float32)})
