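A Deep Learning-Based Music Recommendation System (Part 3): Extracting Spectrogram Features with a Trained Convolutional Neural Network and Computing Similarity Between Images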

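This module consists of the following parts:

Load the trained and saved CNN model (feature-extraction part only: the four convolutional layers plus the 128-unit fully connected layer)
Read the elements of the tfrecords files one by one, feed them into the trained CNN, and extract 128 features for each image
Each song consists of 11 images, i.e. 11 × 128 features; compute the cosine similarity between the 11 × 128 features of different songs
Process the songs one by one and return, as a list, the three most similar songs for each song (by song index)

Load the trained and saved CNN model (feature-extraction part only: the four convolutional layers plus the 128-unit fully connected layer)
Define the parameters of the CNN model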

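import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (placeholders and sessions)

lr = tf.Variable(0.001, dtype=tf.float32)
x = tf.placeholder(tf.float32, [None, 256, 256, 1], name='x')  # 256x256 single-channel spectrograms
y_ = tf.placeholder(tf.float32, [None], name='y_')
keep_prob = tf.placeholder(tf.float32)  # dropout keep probability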

Define the structure of the CNN model

def weight_variable(shape, name):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial, name=name)

def bias_variable(shape, name):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial, name=name)

def conv2d(x, W):
    # strides = [1, x_movement, y_movement, 1]; strides[0] and strides[3] must be 1
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    # strides = [1, x_movement, y_movement, 1]
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')

def max_pool_4x4(x):
    # strides = [1, x_movement, y_movement, 1]
    return tf.nn.max_pool(x, ksize=[1, 4, 4, 1], strides=[1, 4, 4, 1], padding='SAME')

def define_predict_y(x):
    with tf.variable_scope("conv1"):
        # conv1 layer: 3x3 patch, 1 input channel, 64 output channels
        W_conv1 = weight_variable([3, 3, 1, 64], 'W_conv1')
        b_conv1 = bias_variable([64], 'b_conv1')
        h_conv1 = tf.nn.elu(conv2d(x, W_conv1) + b_conv1)  # output size 256x256x64
        h_pool1 = max_pool_2x2(h_conv1)                    # output size 128x128x64
    with tf.variable_scope("conv2"):
        # conv2 layer: 3x3 patch, 64 input channels, 128 output channels
        W_conv2 = weight_variable([3, 3, 64, 128], 'W_conv2')
        b_conv2 = bias_variable([128], 'b_conv2')
        h_conv2 = tf.nn.elu(conv2d(h_pool1, W_conv2) + b_conv2)  # output size 128x128x128
        h_pool2 = max_pool_4x4(h_conv2)                          # output size 32x32x128
    with tf.variable_scope("conv3"):
        # conv3 layer: 3x3 patch, 128 input channels, 256 output channels
        W_conv3 = weight_variable([3, 3, 128, 256], 'W_conv3')
        b_conv3 = bias_variable([256], 'b_conv3')
        h_conv3 = tf.nn.elu(conv2d(h_pool2, W_conv3) + b_conv3)  # output size 32x32x256
        h_pool3 = max_pool_4x4(h_conv3)                          # output size 8x8x256
    with tf.variable_scope("conv4"):
        # conv4 layer: 3x3 patch, 256 input channels, 512 output channels
        W_conv4 = weight_variable([3, 3, 256, 512], 'W_conv4')
        b_conv4 = bias_variable([512], 'b_conv4')
        h_conv4 = tf.nn.elu(conv2d(h_pool3, W_conv4) + b_conv4)  # output size 8x8x512
        h_pool4 = max_pool_4x4(h_conv4)                          # output size 2x2x512
    with tf.variable_scope("fc1"):
        # fc1 layer: maps the flattened feature map to 128 features
        W_fc1 = weight_variable([2 * 2 * 512, 128], 'W_fc1')
        b_fc1 = bias_variable([128], 'b_fc1')
        # [n_samples, 2, 2, 512] -> [n_samples, 2*2*512]
        h_pool4_flat = tf.reshape(h_pool4, [-1, 2 * 2 * 512])
        h_fc1 = tf.nn.elu(tf.matmul(h_pool4_flat, W_fc1) + b_fc1)
        h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    # The fc2 classification layer used during training is left out here:
    # with tf.variable_scope("fc2"):
    #     W_fc2 = weight_variable([128, 10], 'W_fc2')
    #     b_fc2 = bias_variable([10], 'b_fc2')
    #     predict_y = tf.matmul(h_fc1_drop, W_fc2) + b_fc2
    return h_fc1_drop

prediction = define_predict_y(x)

# Used to save and restore the model
new_saver = tf.train.Saver()
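The shape arithmetic behind the `2*2*512` input of `fc1` can be checked without TensorFlow. The following is a minimal sketch, assuming only what the model definition above states: a 256x256 input, 3x3 convolutions with stride 1 and `SAME` padding, and pooling strides of 2, 4, 4 and 4. The helper names `same_pool_out` and `trace_shapes` are made up for this illustration.

```python
import math

def same_pool_out(size, stride):
    """Spatial size after a SAME-padded pooling layer with the given stride."""
    return math.ceil(size / stride)

def trace_shapes(input_size=256, channels=(64, 128, 256, 512), pool_strides=(2, 4, 4, 4)):
    """Return (height, width, channels) after each conv + pool stage."""
    shapes = []
    size = input_size
    for out_channels, stride in zip(channels, pool_strides):
        # A stride-1 SAME convolution keeps the spatial size unchanged;
        # only the pooling layer shrinks it.
        size = same_pool_out(size, stride)
        shapes.append((size, size, out_channels))
    return shapes

shapes = trace_shapes()
for stage, shape in enumerate(shapes, start=1):
    print(f"after conv{stage} + pool: {shape}")

# Number of values per image that reach the fully connected layer
flat_features = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print("flattened size fed to fc1:", flat_features)
```

The last stage comes out as 2x2x512, which matches the reshape to `2*2*512` in `fc1`.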

Restore the saved model parameters

sess = tf.Session()
new_saver.restore(sess, tf.train.latest_checkpoint('C:/Users/Administrator/Desktop/ckpt/'))
print("Parameters restored successfully!")

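Read the elements of the tfrecords files one by one, feed them into the trained CNN, and extract 128 features for each image

1. Read the elements of the tfrecords files one by one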

def _parse_record(example_proto):
    features = {
        'encoded': tf.FixedLenFeature((), tf.string),
        'fname': tf.FixedLenFeature((), tf.string),
        'width': tf.FixedLenFeature((), tf.int64),
        'height': tf.FixedLenFeature((), tf.int64),
        'label': tf.FixedLenFeature((), tf.int64),
    }
    parsed_features = tf.parse_single_example(example_proto, features=features)
    return parsed_features

img_vec_list = []  # [image array, file name, label] for every image, kept sorted by file name

def read_test(input_file):
    # Read the tfrecord file with the Dataset API and parse every record with map()
    dataset = tf.data.TFRecordDataset(input_file)
    dataset = dataset.map(_parse_record)
    # Build the get_next op once; calling get_next() inside the loop would add a new op per iteration
    next_element = dataset.make_one_shot_iterator().get_next()
    with tf.Session() as sess:
        try:
            while True:
                features = sess.run(next_element)
                img_fname = features['fname'].decode()
                # Decode the raw uint8 bytes and scale the pixel values to the range 0-1
                img = np.frombuffer(features['encoded'], dtype=np.uint8)
                img = img.reshape(256, 256, 1).astype(np.float32) / 255.0
                label = int(features['label'])
                print(img_fname)
                img_vec_list.append([img, img_fname, label])
        except tf.errors.OutOfRangeError:
            # Raised by sess.run once every record in the file has been read
            print("Finished reading", input_file)
    print("Images loaded so far:", len(img_vec_list))
    # Sort by file name; the later steps treat every 11 consecutive entries as one song
    img_vec_list.sort(key=lambda x: x[1])

read_test('F:/data/test0.tfrecords')
read_test('F:/data/train0.tfrecords')
read_test('F:/data/test1.tfrecords')
read_test('F:/data/train1.tfrecords')
read_test('F:/data/test2.tfrecords')
read_test('F:/data/train2.tfrecords')
read_test('F:/data/test3.tfrecords')
read_test('F:/data/train3.tfrecords')
read_test('F:/data/test4.tfrecords')
read_test('F:/data/train4.tfrecords')
read_test('F:/data/test5.tfrecords')
read_test('F:/data/train5.tfrecords')
read_test('F:/data/test6.tfrecords')
read_test('F:/data/train6.tfrecords')
read_test('F:/data/test7.tfrecords')
read_test('F:/data/train7.tfrecords')
read_test('F:/data/test8.tfrecords')
read_test('F:/data/train8.tfrecords')
read_test('F:/data/test9.tfrecords')
read_test('F:/data/train9.tfrecords')
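Each record stores its image as raw `uint8` bytes that are reshaped to 256x256x1 and scaled to the range 0-1. The snippet below is a small NumPy-only sketch of that decoding step, useful for checking it outside a TensorFlow session. The function name `decode_spectrogram` and the synthetic test bytes are assumptions made for this illustration, not part of the original code.

```python
import numpy as np

def decode_spectrogram(raw_bytes, height=256, width=256):
    """Turn raw uint8 bytes into a float32 image of shape (height, width, 1) in [0, 1]."""
    img = np.frombuffer(raw_bytes, dtype=np.uint8)
    img = img.reshape(height, width, 1)
    return img.astype(np.float32) / 255.0

# Synthetic stand-in for the 'encoded' field: a repeating 0..255 ramp
fake_bytes = (np.arange(256 * 256) % 256).astype(np.uint8).tobytes()

img = decode_spectrogram(fake_bytes)
print(img.shape, img.dtype, float(img.min()), float(img.max()))
```

The result has the same shape as one row of the `x` placeholder, so `np.expand_dims(img, 0)` yields a batch of one image.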

2. Feed them into the trained CNN

vector_list = []  # one (1, 128) feature vector per image, in the same order as img_vec_list

def get_vector():
    with tf.Session() as sess:
        # Restore the trained weights from the latest checkpoint
        new_saver.restore(sess, tf.train.latest_checkpoint('C:/Users/Administrator/Desktop/ckpt/'))
        print("Parameters restored successfully!")
        for img, _, _ in img_vec_list:
            # keep_prob=1.0 disables dropout so that the extracted features are deterministic
            vector = sess.run(prediction,
                              feed_dict={x: np.expand_dims(img, 0), keep_prob: 1.0})
            vector_list.append(vector)

get_vector()
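Since every image yields one 128-dimensional vector and every song has 11 images, the extracted vectors can be viewed as an array of shape (number of songs, 11, 128). The following sketch illustrates that layout with random numbers standing in for real CNN outputs; the constants `IMAGES_PER_SONG` and `FEATURE_DIM` and the three fake songs are assumptions for this example.

```python
import numpy as np

IMAGES_PER_SONG = 11
FEATURE_DIM = 128

# Stand-in for the extracted features: one (1, 128) array per image, 3 songs in total
rng = np.random.default_rng(0)
vector_list = [rng.random((1, FEATURE_DIM), dtype=np.float32)
               for _ in range(3 * IMAGES_PER_SONG)]

# Stack into (n_images, 128), then group consecutive images by song
features = np.concatenate(vector_list, axis=0)
per_song = features.reshape(-1, IMAGES_PER_SONG, FEATURE_DIM)

print(features.shape)   # (33, 128)
print(per_song.shape)   # (3, 11, 128)
```

With this layout, image `i` belongs to song `i // 11`, which is the same index arithmetic used in the similarity step.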

Each song consists of 11 images, i.e. 11 × 128 features; compute the cosine similarity between the 11 × 128 features of different songs

def cos_sim(vector_a, vector_b):
    """
    Compute the cosine similarity between two vectors, rescaled to the range 0-1.
    :param vector_a: vector a
    :param vector_b: vector b
    :return: sim
    """
    vector_a = np.asarray(vector_a).ravel()
    vector_b = np.asarray(vector_b).ravel()
    num = float(np.dot(vector_a, vector_b))
    denom = np.linalg.norm(vector_a) * np.linalg.norm(vector_b)
    cos = num / denom
    sim = 0.5 + 0.5 * cos  # map [-1, 1] to [0, 1]
    return sim

cos_list = []  # one [song index, most similar song index, similarity] entry per image

def get_all_vec_cos():
    for i in range(len(vector_list)):
        max_cos = 0
        max_index = i
        for j in range(len(vector_list)):
            # Skip images of the same song (every 11 consecutive images form one song)
            if i // 11 == j // 11:
                continue
            temp_cos = cos_sim(vector_list[i], vector_list[j])
            if temp_cos > max_cos:
                max_cos = temp_cos
                max_index = j // 11
        cos_list.append([i // 11, max_index, max_cos])
        print("cos:", i, "  ", cos_list[i])
    print("cos_list:", len(cos_list))

get_all_vec_cos()
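The pairwise search above calls the similarity function once per image pair, which grows quadratically with the number of images. As a possible alternative, the same rescaled cosine similarity can be computed for all pairs at once with a matrix product. This is a sketch rather than the original implementation: the function names are invented here, the features are random placeholders, and it assumes no feature vector is all zeros.

```python
import numpy as np

def rescaled_cosine_matrix(features):
    """All-pairs similarity 0.5 + 0.5 * cos for the rows of an (n, d) array."""
    features = np.asarray(features, dtype=np.float64)
    norms = np.linalg.norm(features, axis=1, keepdims=True)  # assumed non-zero
    unit = features / norms
    return 0.5 + 0.5 * (unit @ unit.T)

def best_match_per_image(features, images_per_song=11):
    """For each image, return the song index of its most similar image from another song."""
    sim = rescaled_cosine_matrix(features)
    song_ids = np.arange(len(sim)) // images_per_song
    # Exclude pairs that belong to the same song
    same_song = song_ids[:, None] == song_ids[None, :]
    sim[same_song] = -np.inf
    best_image = sim.argmax(axis=1)
    best_scores = sim[np.arange(len(sim)), best_image]
    return song_ids[best_image], best_scores

# Random stand-in for 3 songs x 11 images x 128 features
rng = np.random.default_rng(0)
demo_features = rng.random((33, 128))

best_songs, best_scores = best_match_per_image(demo_features)
print(best_songs[:11])   # best-matching song for each image of song 0
print(best_scores[:3])
```

The trade-off is memory: the full similarity matrix holds one value per image pair, so for very large collections the computation would need to be done in blocks.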

Process the songs one by one and return, as a list, the three most similar songs for each song (by song index)

from collections import Counter

most_video = []  # for each song, up to three (song index, vote count) pairs

def get_most_video():
    # Split cos_list into chunks of 11 entries, one chunk per song
    split_cos_list = []
    for j in range(0, len(cos_list), 11):
        split_cos_list.append(cos_list[j:j + 11])
    for song_entries in split_cos_list:
        # Collect the best-matching song of each of the 11 images
        index = [item[1] for item in song_entries]
        # Keep the three song indices that occur most often
        most_index = Counter(index).most_common(3)
        most_video.append(most_index)

get_most_video()
# print(most_video)
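The final step is a majority vote: each of a song's 11 images names its best-matching song, and the three most frequent names win. A minimal sketch of that vote with made-up song indices (the `votes` list and the `top_songs` helper are illustrative, not from the original code):

```python
from collections import Counter

def top_songs(nearest_song_per_image, k=3):
    """Return the k most frequent song indices as (song index, vote count) pairs."""
    return Counter(nearest_song_per_image).most_common(k)

# Hypothetical best-matching song index for each of the 11 images of one song
votes = [4, 4, 7, 4, 2, 7, 4, 9, 7, 4, 2]

result = top_songs(votes)
print(result)  # [(4, 5), (7, 3), (2, 2)]
```

Each result is a pair of song index and vote count, so turning it into song names would still require a lookup from index to name, for example via the file names stored alongside the images.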
