1. Background


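I want to use deep learning to predict the predecessor (also called the parent or father) of a node on the shortest path between two points. There may of course be more than one shortest path, so the predecessor is not necessarily unique; more on that later.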

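Shortest-path algorithms usually recover the final path by storing the predecessor of each node. So one idea is to train one model per node and then rebuild the path step by step by following the predicted predecessors. That is admittedly a rather naive approach, but it is worth a try as a first step.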
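A minimal sketch of that step-by-step reconstruction, assuming the predecessors are already known and stored in a dictionary. The names `reconstruct_path` and `pred` are illustrative and do not appear in the original code.

```python
def reconstruct_path(pred, source, target):
    """Follow predecessor links from target back to source.

    pred maps each node to its predecessor; the source maps to None.
    Returns the path from source to target, or None if no chain exists.
    """
    path = [target]
    node = target
    # A valid predecessor chain visits each node at most once,
    # so the loop bound also protects against cycles.
    for _ in range(len(pred) + 1):
        if node == source:
            return path[::-1]
        node = pred.get(node)
        if node is None:
            return None  # chain broke before reaching the source
        path.append(node)
    return None


# Hypothetical predecessors on a tiny grid, nodes given as (row, col).
pred = {(0, 0): None, (0, 1): (0, 0), (1, 1): (0, 1), (2, 1): (1, 1)}
path = reconstruct_path(pred, (0, 0), (2, 1))
print(path)
```

With one model per node, `pred` would be filled by the predictions instead of by BFS, and a wrong prediction shows up as a broken or cyclic chain.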

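After discussing the idea with my advisor, I decided to start with a grid, which is a fairly regular shape (think of it as a chessboard). Each node has at most four edges (up, down, left, right, or north, east, south, west), and these can be treated as four channels, similar to the three RGB channels of an image. I use a grid with 400 nodes, so the input has shape 20 × 20 × 4.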

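For the 4 channels, north is dimension 0 and the rest follow clockwise: east is dimension 1, south is dimension 2, and west is dimension 3. A direction in which an edge exists has value 1; a direction in which no edge exists has value inf.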
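As a small illustration of this encoding, here is a sketch on a 2 x 2 grid. The constant names `NORTH`, `EAST`, `SOUTH`, `WEST` and `INF` are chosen for this example; the sentinel value is the one used for `inf` in the data-generation script of this post.

```python
import numpy as np

INF = 0x3fffffff  # sentinel for "no edge"
NORTH, EAST, SOUTH, WEST = 0, 1, 2, 3  # channel index of each direction

# A 2 x 2 grid where every direction starts out as "no edge".
grid = np.full((2, 2, 4), INF, dtype=np.int64)

# Add an edge leaving (0, 0) to the east and one leaving (0, 1) to the south.
grid[0, 0, EAST] = 1
grid[0, 1, SOUTH] = 1

print(grid[0, 0])        # the four channel values of node (0, 0)
has_edge = grid != INF   # boolean mask of existing edges
print(has_edge.sum())    # number of stored edges
```

Each node stores its own outgoing directions, so an undirected edge would need to be written twice, once from each endpoint.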

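The source is (0, 0). Picture a chessboard in front of you on which some edges are missing, but on which all nodes are still connected. Take the top-left corner as the source, traverse the board with BFS, and record the distance and the predecessor of every node. The predecessors then form a shortest-path tree from the source to every other node.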
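The BFS described here can be sketched on a 3 x 3 grid as follows. This is an illustrative version, not the generation script itself: the function name `bfs_tree` and the use of -1 for unreached nodes are assumptions of the sketch.

```python
from collections import deque

import numpy as np

INF = 0x3fffffff
# Row/column offsets for north, east, south, west (channels 0..3).
DX = [-1, 0, 1, 0]
DY = [0, 1, 0, -1]


def bfs_tree(grid, source=(0, 0)):
    """Return (dist, pred) for a grid of shape (rows, cols, 4)."""
    rows, cols, _ = grid.shape
    dist = np.full((rows, cols), -1, dtype=np.int64)  # -1 = not reached
    pred = {source: None}
    dist[source] = 0
    queue = deque([source])
    while queue:
        x, y = queue.popleft()
        for k in range(4):
            if grid[x, y, k] == INF:
                continue  # no edge in this direction
            nx, ny = x + DX[k], y + DY[k]
            if 0 <= nx < rows and 0 <= ny < cols and dist[nx, ny] == -1:
                dist[nx, ny] = dist[x, y] + 1
                pred[(nx, ny)] = (x, y)
                queue.append((nx, ny))
    return dist, pred


# Fully connected 3 x 3 grid with the border directions removed.
grid = np.ones((3, 3, 4), dtype=np.int64)
grid[0, :, 0] = INF   # no north edges on the top row
grid[:, 2, 1] = INF   # no east edges on the right column
grid[2, :, 2] = INF   # no south edges on the bottom row
grid[:, 0, 3] = INF   # no west edges on the left column
grid[0, 0, 1] = INF   # also drop the edge from (0, 0) to (0, 1)

dist, pred = bfs_tree(grid)
print(dist)
print(pred[(0, 1)])
```

Because the direct edge to (0, 1) is missing, that node is reached through (1, 0) and (1, 1), which is exactly the kind of detour the model has to learn from the edge channels.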

2. Experiment

step1: generate the data

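Randomly generate 1 million grids (20*20*4), with value 1 where an edge exists and inf where it does not, then traverse each grid with BFS to get the predecessor of every node. See the code for details.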

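step2: train a CNN model with 4 convolutional layers and one fully connected layer. The label is the predecessor of a randomly chosen end node; here I picked node 151. After about three minibatches the accuracy had basically stopped improving, and from then on it only fluctuated.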
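To make the label concrete: the training script below turns the predecessor of node 151 into one of four direction classes. The sketch here does the same mapping through row/column coordinates rather than id offsets; the helper name `father_direction` is hypothetical.

```python
N = 20  # grid side length used in this post


def father_direction(father_id, node_id, n=N):
    """Return 0/1/2/3 if the father lies north/east/south/west of the node.

    Ids are row * n + col. Returns None if the two nodes are not neighbours.
    """
    fr, fc = divmod(father_id, n)
    r, c = divmod(node_id, n)
    offsets = {(-1, 0): 0, (0, 1): 1, (1, 0): 2, (0, -1): 3}
    return offsets.get((fr - r, fc - c))


end = 151
print(divmod(end, N))              # row and column of node 151
print(father_direction(131, end))  # father directly above the node
print(father_direction(150, end))  # father directly to the left
```

Working in coordinates avoids treating the last node of one row and the first node of the next row as east/west neighbours, which plain id arithmetic would do.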


3. Summary


4. Follow-up

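After another discussion with my advisor, we think deconvolution should be used. The network could then output either one 20*20 channel (nodes labelled 0~399) or two 20×20 channels (nodes given as coordinates). The output is the predecessor of every node, which gives a whole shortest-path tree at once.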
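The two output encodings mentioned here can be sketched as target tensors built from a father matrix of shape (N, N, 2), the format the generation script below produces. The 3 x 3 values are made up for illustration, and marking the source with -1 in the id map is an assumption of this sketch.

```python
import numpy as np

N = 3  # small grid for illustration; the post uses N = 20

# Hypothetical predecessor coordinates, shape (N, N, 2).
# The source (0, 0) holds (-1, -1) because it has no predecessor.
father = np.array([
    [[-1, -1], [0, 0], [0, 1]],
    [[0, 0], [0, 1], [0, 2]],
    [[1, 0], [1, 1], [1, 2]],
])

# Option 1: one channel, each cell holds the predecessor id row * N + col.
father_id = father[..., 0] * N + father[..., 1]
father_id[0, 0] = -1  # assumption: mark the source explicitly

# Option 2: two channels, row and column of the predecessor, channels last.
father_rc = father.astype(np.float32)

print(father_id)
print(father_rc.shape)
```

The id map suits a per-node classification loss, while the coordinate map would be trained as a regression; which one works better is an open question in this post.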





experiment 1
Try to use a CNN to predict the parents in SSSP (single-source shortest paths).
Generate 1000000 grids of shape 20*20*4 (4 means each node may have up to 4 edges, 20 is the number of nodes per side).
The grid is laid out as below, with the start at the top-left corner:

                 north
    the start <- ####################
                 ####################
            west ####################  east
                 ####################
                 south

Check by BFS whether each grid is connected, and save the connected ones along with the parents in the single-source shortest paths.
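from collections import deque

import numpy as np

np.set_printoptions(threshold=np.inf)

# Sentinel meaning "no edge in this direction"
inf = 0x3fffffff
# Grid side length
N = 20
# Channels [north, east, south, west]; an existing edge stores its weight
D = 4
# All existing edges have weight 1 in this experiment
weight = 1
# Probability that an edge exists
density = 0.8


def generate_mp(size=N, dime=D, dens=density):
    """Generate one random grid.

    Returns {"map": edges, "father": BFS fathers}, or None if some node
    cannot be reached from the source (0, 0).
    """
    # Random integers in [0, 9]; values below 10 * (1 - dens) become missing edges.
    # Each direction of each node is drawn independently, so an edge u -> v
    # can exist without v -> u.
    e = np.random.randint(0, 10, (size, size, dime))
    missing = e < 10 * (1 - dens)
    e[missing] = inf
    e[~missing] = weight
    # No edges leave the grid
    e[0, :, 0] = inf         # north of the top row
    e[:, size - 1, 1] = inf  # east of the right column
    e[size - 1, :, 2] = inf  # south of the bottom row
    e[:, 0, 3] = inf         # west of the left column

    # Father coordinates; (-2, -2) means "not reached yet"
    fa = np.full((size, size, 2), -2.0)
    visited = np.zeros((size, size))
    # Distance map
    dis = np.full((size, size), float(inf))

    # Source node
    s = [0, 0]
    # Moves for north, east, south, west respectively
    dx = [-1, 0, 1, 0]
    dy = [0, 1, 0, -1]
    q = deque([s])
    visited[0, 0] = 1
    dis[0, 0] = 0
    fa[0, 0, :] = [-1, -1]
    while q:
        u = q.popleft()
        for k in range(dime):
            nx = u[0] + dx[k]
            ny = u[1] + dy[k]
            if e[u[0], u[1], k] != inf and visited[nx, ny] == 0:
                visited[nx, ny] = 1
                dis[nx, ny] = dis[u[0], u[1]] + 1
                fa[nx, ny, :] = u
                q.append([nx, ny])
    # Keep the grid only if every node was reached
    if np.sum(visited) == size * size:
        return {"map": e, "father": fa}
    return None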
# Generate connected grids and save them in files of 10000 samples each.
total = 1000000
chunk = 10000
num = 0
X_rows = []
y_rows = []
while num < total:
    new_out = generate_mp(N, D, density)
    if new_out is None:
        continue
    # Flatten the map to N*N*D values and encode each father as row * N + col
    fathers = new_out["father"].reshape(N * N, 2)
    X_rows.append(new_out["map"].reshape(-1))
    y_rows.append(fathers[:, 0] * N + fathers[:, 1])
    num += 1
    if num % 1000 == 0:
        print(num)
    if num % chunk == 0:
        X = np.vstack(X_rows).astype(float)
        y = np.vstack(y_rows).astype(float)
        np.save("X" + str(num // chunk) + ".npy", X)
        np.save("y" + str(num // chunk) + ".npy", y)
        print(X.shape, y.shape)
        X_rows = []
        y_rows = []


A convolutional network is trained on the single-source shortest path (SSSP) problem in a grid.
We want to use a CNN to predict the father of a node on the shortest path from the start to that node, and covering every node
may need N*N - 1 models. Sounds terrible!
The training data was generated randomly by running the Python file called "generate_data.py". We can
get one million grids along with plenty of fathers.
Author: Line290
from __future__ import print_function

import numpy as np
import tensorflow as tf  # uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session)

# Import data: files X1.npy..X20.npy and y1.npy..y20.npy
X = np.vstack([np.load("X" + str(i) + ".npy") for i in range(1, 21)])
y = np.vstack([np.load("y" + str(i) + ".npy").astype(np.int32) for i in range(1, 21)])
# print(np.shape(X), np.shape(y))
# This is the node that we want to reach. Node ids are row * 20 + col and run
# from 0 to 399; node 0 is the source, so there are N*N - 1 possible end nodes.
end = 151
end_ = [7, 11]
# The number of grids
num_grids = X.shape[0]

# Extract the column of node `end` and encode its father as a direction:
# 0 = father to the north, 1 = east, 2 = south, 3 = west
y_end = np.zeros((num_grids, 4))
y_temp = y[:, end]
for i in range(num_grids):
    if y_temp[i] + 20 == end:
        y_end[i][0] = 1
    elif y_temp[i] - 1 == end:
        y_end[i][1] = 1
    elif y_temp[i] - 20 == end:
        y_end[i][2] = 1
    elif y_temp[i] + 1 == end:
        y_end[i][3] = 1
# # one-hot, 400 nodes
# y_end = np.zeros((num_grids, 4))
# y_end[range(num_grids), list(y_temp)] = 1

# Partitioning the data into data_train and data_test
X_train = X[:190000]
X_test = X[190000:]
y_train = y_end[:190000]
y_test = y_end[190000:]

# Parameters
learning_rate = 0.001
training_iters = 1000000
capacity = X_train.shape[0]
batch_size = 256
display_step = 1

# Network Parameters
n_input = 1600  # grid shape: 20*20*4
n_classes = 4  # direction of the father: north, east, south, west
dropout = 0.75  # keep probability used during training

# tf Graph input
x = tf.placeholder(tf.float32, [None, n_input])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)  # dropout (keep probability)

# Create some wrappers for simplicity
def conv2d(x, W, b, strides=1):
    # Conv2D wrapper, with bias and relu activation
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)


def maxpool2d(x, k=2):
    # MaxPool2D wrapper
    return tf.nn.max_pool(x, ksize=[1, k, k, 1], strides=[1, k, k, 1],
                          padding='SAME')


# Create model
def conv_net(x, weights, biases, dropout):
    # Reshape the flat input back to a 20x20 grid with 4 channels
    x = tf.reshape(x, shape=[-1, 20, 20, 4])

    # Convolution Layer
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    # Max Pooling (down-sampling)
    conv1 = maxpool2d(conv1, k=2)

    # Convolution Layer
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    # Max Pooling (down-sampling)
    conv2 = maxpool2d(conv2, k=2)

    # Convolution Layer
    conv3 = conv2d(conv2, weights['wc3'], biases['bc3'])
    # Max Pooling (down-sampling)
    conv3 = maxpool2d(conv3, k=2)

    # Convolution Layer
    conv4 = conv2d(conv3, weights['wc4'], biases['bc4'])
    # Max Pooling (down-sampling)
    conv4 = maxpool2d(conv4, k=2)

    # Fully connected layer
    # Reshape conv4 output to fit fully connected layer input
    fc1 = tf.reshape(conv4, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    # Apply Dropout
    fc1 = tf.nn.dropout(fc1, dropout)

    # Output, class prediction
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out


# Store layers weight & bias
weights = {
    # 3x3 conv, 4 inputs, 32 outputs
    'wc1': tf.Variable(tf.random_normal([3, 3, 4, 32])),
    # 3x3 conv, 32 inputs, 64 outputs
    'wc2': tf.Variable(tf.random_normal([3, 3, 32, 64])),
    # 3x3 conv, 64 inputs, 128 outputs
    'wc3': tf.Variable(tf.random_normal([3, 3, 64, 128])),
    # 1x1 conv, 128 inputs, 256 outputs
    'wc4': tf.Variable(tf.random_normal([1, 1, 128, 256])),
    # fully connected, 2*2*256 inputs, 2048 outputs
    'wd1': tf.Variable(tf.random_normal([2*2*256, 2048])),
    # 2048 inputs, 4 outputs (class prediction)
    'out': tf.Variable(tf.random_normal([2048, n_classes]))
}

biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bc3': tf.Variable(tf.random_normal([128])),
    'bc4': tf.Variable(tf.random_normal([256])),
    'bd1': tf.Variable(tf.random_normal([2048])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# Construct model
pred = conv_net(x, weights, biases, keep_prob)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    step = 1
    # start = 0
    # end = start + batch_size
    # Keep training until reach max iterations
    while step * batch_size < training_iters:
        # Run optimization op (backprop) on a random minibatch
        # batch_x = X_train[start:end,:]
        # batch_y = y_train[start:end,:]
        indices = np.random.choice(capacity, batch_size)
        batch_x = X_train[indices, :]
        batch_y = y_train[indices, :]
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y,
                                       keep_prob: dropout})
        if step % display_step == 0:
            # Calculate batch loss and accuracy
            loss, acc = sess.run([cost, accuracy], feed_dict={x: batch_x,
                                                              y: batch_y,
                                                              keep_prob: 1.})
            print("Iter " + str(step * batch_size) + ", Minibatch Loss= " +
                  "{:.6f}".format(loss) + ", Training Accuracy= " +
                  "{:.5f}".format(acc))
        step += 1
        # start = end
        # end = end + batch_size
    print("Optimization Finished!")

    # Calculate accuracy for 10000 tests
    print("Testing Accuracy:",
          sess.run(accuracy, feed_dict={x: X_test,
                                        y: y_test,
                                        keep_prob: 1.}))
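The fully connected layer in the script expects 2*2*256 inputs. A short check of where that number comes from, assuming the four 2x2 max-pooling layers with stride 2 and SAME padding used above (the helper name `same_pool_output` is made up for this sketch):

```python
import math


def same_pool_output(size, k=2, layers=4):
    """Spatial sizes after repeated k x k pooling with stride k and SAME padding."""
    sizes = [size]
    for _ in range(layers):
        size = math.ceil(size / k)  # SAME padding rounds the output size up
        sizes.append(size)
    return sizes


sizes = same_pool_output(20)
flat = sizes[-1] ** 2 * 256  # 256 feature maps come out of the last conv layer
print(sizes)
print(flat)
```

The 20x20 grid shrinks to 2x2 before the fully connected layer, so most of the spatial information about which edges surround node 151 has been pooled away by then.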


