[Code Walkthrough] Spectrum Sharing in Vehicular Networks Based on Multi-Agent RL - Python Implementation
Original paper: Spectrum Sharing in Vehicular Networks Based on Multi-Agent Reinforcement Learning
Paper translation & notes: [Paper Notes] Spectrum Sharing in Vehicular Networks Based on Multi-Agent Reinforcement Learning
Code: https://github.com/le-liang/MARLspectrumSharingV2X
Visio flowcharts used in this post (drawn by the author; corrections and feedback welcome): https://download.csdn.net/download/m0_37495408/12353933
Usage:
(from the original author's GitHub README)
- To train the multi-agent RL model: main_marl_train.py + Environment_marl.py + replay_memory.py
- To train the baseline single-agent RL model: main_sarl_train.py + Environment_marl.py + replay_memory.py
- To test all models in the same environment: main_test.py + Environment_marl_test.py + replay_memory.py + '/model'
- Figures 3 and 4 in the paper can be reproduced directly by running main_test.py; change the V2V payload size via self.demand_size in Environment_marl_test.py.
- Figure 5 can only be obtained from the returns recorded during training.
- Figures 6-7 show the performance of an arbitrary episode in which the random baseline fails but the MARL transmission succeeds. In fact, most such episodes exhibit interesting behaviour that suggests multi-agent cooperation; interpretation is left to the reader.
- The "test" mode in main_marl_train.py is not recommended.
Basic class definitions
Environment_marl.py defines the four basic classes of the architecture: V2Vchannels, V2Ichannels, Vehicle, and Environ. Environ has by far the most methods; Vehicle has no methods, only a few attributes; the other two classes each have two methods (one to compute path loss and one to update shadow fading).
Vehicle
The constructor takes three arguments: start position, start direction, and velocity. Internally it defines two lists, neighbors and destinations, holding the neighbours and the V2V receivers respectively (the two are numerically identical here, because each vehicle's V2V receivers are defined to be its neighbours).

```python
class Vehicle:
    # Vehicle simulator: include all the information for a vehicle
    def __init__(self, start_position, start_direction, velocity):
        self.position = start_position
        self.direction = start_direction
        self.velocity = velocity
        self.neighbors = []
        self.destinations = []
```
The meaning of destinations can be seen from the code below:

```python
def renew_neighbor(self):  # this method belongs to class Environ
    """ Determine the neighbors of each vehicle """
    for i in range(len(self.vehicles)):
        self.vehicles[i].neighbors = []
        self.vehicles[i].actions = []
    z = np.array([[complex(c.position[0], c.position[1]) for c in self.vehicles]])
    Distance = abs(z.T - z)
    for i in range(len(self.vehicles)):
        sort_idx = np.argsort(Distance[:, i])
        for j in range(self.n_neighbor):
            self.vehicles[i].neighbors.append(sort_idx[j + 1])
        destination = self.vehicles[i].neighbors
        self.vehicles[i].destinations = destination
```
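The complex-number trick in renew_neighbor is compact but easy to miss: packing (x, y) into x + jy lets a single broadcasted subtraction produce the whole pairwise distance matrix. A minimal standalone sketch (the positions are made-up values for illustration):

```python
import numpy as np

# Hypothetical vehicle positions (x, y) in metres
positions = [(10.0, 0.0), (13.0, 4.0), (100.0, 40.0)]

# Pack each position into a complex number: x + jy
z = np.array([[complex(x, y) for x, y in positions]])

# Broadcasting z.T - z gives an N x N matrix of coordinate differences;
# its magnitude is the Euclidean distance between every pair of vehicles.
distance = abs(z.T - z)
print(distance)                     # distance[i, j] = ||pos_i - pos_j||
print(np.argsort(distance[:, 0]))   # nearest neighbours of vehicle 0 (itself comes first)
```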
V2Vchannels
Internal parameters: the BS and MS heights are both set to 1.5 m and the shadowing standard deviation to 3 dB, both from TR 36.885 Table A.1.4-1; the carrier frequency is 2 GHz and the decorrelation distance is 10 m.

```python
class V2Vchannels:
    # Simulator of the V2V channels
    def __init__(self):
        self.t = 0
        self.h_bs = 1.5
        self.h_ms = 1.5
        self.fc = 2
        self.decorrelation_distance = 10
        self.shadow_std = 3
```
It contains two methods:
Computing path loss
```python
def get_path_loss(self, position_A, position_B):
    d1 = abs(position_A[0] - position_B[0])
    d2 = abs(position_A[1] - position_B[1])
    d = math.hypot(d1, d2) + 0.001  # sqrt(d1*d1 + d2*d2)
    # effective breakpoint (BP) distance
    d_bp = 4 * (self.h_bs - 1) * (self.h_ms - 1) * self.fc * (10 ** 9) / (3 * 10 ** 8)

    def PL_Los(d):
        if d <= 3:
            return 22.7 * np.log10(3) + 41 + 20 * np.log10(self.fc / 5)
        else:
            if d < d_bp:
                return 22.7 * np.log10(d) + 41 + 20 * np.log10(self.fc / 5)
            else:
                return 40.0 * np.log10(d) + 9.45 - 17.3 * np.log10(self.h_bs) - 17.3 * np.log10(self.h_ms) + 2.7 * np.log10(self.fc / 5)

    def PL_NLos(d_a, d_b):
        n_j = max(2.8 - 0.0024 * d_b, 1.84)
        return PL_Los(d_a) + 20 - 12.5 * n_j + 10 * n_j * np.log10(d_b) + 3 * np.log10(self.fc / 5)

    if min(d1, d2) < 7:
        PL = PL_Los(d)
    else:
        PL = min(PL_NLos(d1, d2), PL_NLos(d2, d1))
    return PL  # + self.shadow_std * np.random.normal()
```

Note: the code above follows the stochastic channel model described in [2], p. 328.
The path loss uses the Manhattan-grid layout LOS model:

PL_LOS(d) = 22.7·log10(d) + 41 + 20·log10(f_c/5), for 3 m < d < d'_BP

and

PL_LOS(d) = 40.0·log10(d) + 9.45 − 17.3·log10(h'_BS) − 17.3·log10(h'_MS) + 2.7·log10(f_c/5), for d ≥ d'_BP

The slopes 22.7 and 40.0 correspond to path-loss exponents of roughly n1 ≈ 2.27 and n2 = 4.0 before and after the breakpoint. d' denotes the effective breakpoint distance, written d_bp in the code: d'_BP = 4·(h_BS − 1)·(h_MS − 1)·f_c·10^9 / c, which for f_c = 2 GHz and 1.5 m antenna heights gives roughly 6.7 m.

Manhattan-grid layout NLOS model (as implemented in the code, with d_a the distance along one street and d_b along the perpendicular one):

PL_NLOS(d_a, d_b) = PL_LOS(d_a) + 20 − 12.5·n_j + 10·n_j·log10(d_b) + 3·log10(f_c/5), with n_j = max(2.8 − 0.0024·d_b, 1.84)
The min() in the second half of the code is described on p. 344 of [2]: it estimates the path loss under the assumption that the receiver is located on the perpendicular street, evaluating both argument orders and taking the smaller value.
The formulas in the code come from IST-4-027756 WINNER II D1.1.2 V1.2.
That document contains a parameter table that matches the values used in the code exactly.
Updating shadow fading

```python
def get_shadowing(self, delta_distance, shadowing):
    return np.exp(-1 * (delta_distance / self.decorrelation_distance)) * shadowing \
           + math.sqrt(1 - np.exp(-2 * (delta_distance / self.decorrelation_distance))) * np.random.normal(0, 3)  # standard deviation is 3 dB
```

This update rule comes from the text following the channel-model table in [1], Annex A.1.4: S_new = exp(−Δd/d_corr)·S_old + sqrt(1 − exp(−2·Δd/d_corr))·N(0, σ), where Δd is the distance moved during one slow-fading update and d_corr is the decorrelation distance.
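A quick self-contained check of this first-order update (the Δd value is an assumed example; the 10 m decorrelation distance and 3 dB standard deviation match the V2V parameters above) shows that it keeps the marginal standard deviation at σ while correlating consecutive samples by exp(−Δd/d_corr):

```python
import numpy as np

d_corr = 10.0      # decorrelation distance [m], V2V value in the code
sigma = 3.0        # shadowing standard deviation [dB]
delta_d = 2.5      # distance moved per slow-fading step (assumed value)

rho = np.exp(-delta_d / d_corr)
shadow = np.random.normal(0, sigma, size=100000)   # current shadowing samples

# One step of the exponentially correlated (first-order) shadowing process
shadow_next = rho * shadow + np.sqrt(1 - rho ** 2) * np.random.normal(0, sigma, size=shadow.shape)

# The marginal std stays ~3 dB while consecutive samples are correlated by rho
print(np.std(shadow_next), np.corrcoef(shadow, shadow_next)[0, 1], rho)
```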
V2Ichannels
It has the same two methods as V2Vchannels, but the path-loss computation no longer distinguishes LOS from NLOS:

```python
def get_path_loss(self, position_A):
    d1 = abs(position_A[0] - self.BS_position[0])
    d2 = abs(position_A[1] - self.BS_position[1])
    distance = math.hypot(d1, d2)
    return 128.1 + 37.6 * np.log10(math.sqrt(distance ** 2 + (self.h_bs - self.h_ms) ** 2) / 1000)  # + self.shadow_std * np.random.normal()

def get_shadowing(self, delta_distance, shadowing):
    nVeh = len(shadowing)
    self.R = np.sqrt(0.5 * np.ones([nVeh, nVeh]) + 0.5 * np.identity(nVeh))
    return np.multiply(np.exp(-1 * (delta_distance / self.Decorrelation_distance)), shadowing) \
           + np.sqrt(1 - np.exp(-2 * (delta_distance / self.Decorrelation_distance))) * np.random.normal(0, 8, nVeh)
```

Both methods implement [1], Table A.1.4-2 and the explanatory text that follows it (note the 8 dB shadowing standard deviation used for the V2I links).
Environ
The constructor takes four lists (down_lane, up_lane, left_lane, right_lane) giving the lane positions in each direction, the map width and height, the number of vehicles, and the number of neighbours. Beyond these, it also holds quite a few internal parameters:

```python
class Environ:
    def __init__(self, down_lane, up_lane, left_lane, right_lane, width, height, n_veh, n_neighbor):
        self.V2Vchannels = V2Vchannels()
        self.V2Ichannels = V2Ichannels()
        self.vehicles = []

        self.demand = []
        self.V2V_Shadowing = []
        self.V2I_Shadowing = []
        self.delta_distance = []
        self.V2V_channels_abs = []
        self.V2I_channels_abs = []

        self.V2I_power_dB = 23  # dBm
        self.V2V_power_dB_List = [23, 15, 5, -100]  # the power levels
        self.V2I_power = 10 ** (self.V2I_power_dB)
        self.sig2_dB = -114
        self.bsAntGain = 8
        self.bsNoiseFigure = 5
        self.vehAntGain = 3
        self.vehNoiseFigure = 9
        self.sig2 = 10 ** (self.sig2_dB / 10)

        self.n_RB = n_veh
        self.n_Veh = n_veh
        self.n_neighbor = n_neighbor
        self.time_fast = 0.001
        self.time_slow = 0.1  # update slow fading/vehicle position every 100 ms
        self.bandwidth = int(1e6)  # bandwidth per RB, 1 MHz
        # self.bandwidth = 1500
        self.demand_size = int((4 * 190 + 300) * 8 * 2)  # V2V payload: 1060 Bytes every 100 ms
        # self.demand_size = 20

        self.V2V_Interference_all = np.zeros((self.n_Veh, self.n_neighbor, self.n_RB)) + self.sig2
```
Adding vehicles: there are two methods, add_new_vehicles (which takes a start position, direction, and velocity) and add_new_vehicles_by_number(n). The latter is interesting: it takes a single argument n, yet it adds 4n vehicles rather than n, one per driving direction (up/down/left/right) in each iteration, at random positions (a sketch of the idea follows).
Updating vehicle positions: renew_position() iterates over every vehicle and moves it according to its direction and speed; at an intersection the vehicle turns (clockwise) with some probability, and at the map boundary it is made to turn clockwise so that it stays on the map.
Updating neighbours: renew_neighbor(self) was already described in the Vehicle section above.
Updating the channel: renew_channel(self). This defines an important quantity, channels_abs, the sum of path loss and shadow fading in dB, covering all vehicles.
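The following is only an illustrative sketch of the add_new_vehicles_by_number logic, not the repository code verbatim; the velocity range and the way lanes and positions are drawn are assumptions based on the description above.

```python
import numpy as np

def add_new_vehicles_by_number(env, n):
    """Illustrative sketch: each call adds 4*n vehicles, one per direction per iteration."""
    for _ in range(n):
        for direction, lanes in (('d', env.down_lanes), ('u', env.up_lanes),
                                 ('l', env.left_lanes), ('r', env.right_lanes)):
            ind = np.random.randint(len(lanes))
            if direction in ('d', 'u'):
                # vertical lanes: fixed x coordinate, random y along the map height
                start_position = [lanes[ind], np.random.randint(0, env.height)]
            else:
                # horizontal lanes: random x along the map width, fixed y coordinate
                start_position = [np.random.randint(0, env.width), lanes[ind]]
            env.add_new_vehicles(start_position, direction,
                                 np.random.randint(10, 15))  # speed in m/s (assumed range)
```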
```python
def renew_channel(self):
    """ Renew slow fading channel """
    self.V2V_pathloss = np.zeros((len(self.vehicles), len(self.vehicles))) + 50 * np.identity(len(self.vehicles))
    self.V2I_pathloss = np.zeros((len(self.vehicles)))
    self.V2V_channels_abs = np.zeros((len(self.vehicles), len(self.vehicles)))
    self.V2I_channels_abs = np.zeros((len(self.vehicles)))
    for i in range(len(self.vehicles)):
        for j in range(i + 1, len(self.vehicles)):
            self.V2V_Shadowing[j][i] = self.V2V_Shadowing[i][j] = self.V2Vchannels.get_shadowing(self.delta_distance[i] + self.delta_distance[j], self.V2V_Shadowing[i][j])
            self.V2V_pathloss[j, i] = self.V2V_pathloss[i][j] = self.V2Vchannels.get_path_loss(self.vehicles[i].position, self.vehicles[j].position)
    self.V2V_channels_abs = self.V2V_pathloss + self.V2V_Shadowing

    self.V2I_Shadowing = self.V2Ichannels.get_shadowing(self.delta_distance, self.V2I_Shadowing)
    for i in range(len(self.vehicles)):
        self.V2I_pathloss[i] = self.V2Ichannels.get_path_loss(self.vehicles[i].position)
    self.V2I_channels_abs = self.V2I_pathloss + self.V2I_Shadowing
```

Updating the fast-fading channel: renew_channels_fastfading(self). It first adds a new axis to channels_abs, with one layer per RB, then subtracts an independent fast-fading term from every element, namely 20·log10(|h|) with h a unit-variance complex Gaussian (i.e. Rayleigh fading expressed in dB).

```python
def renew_channels_fastfading(self):
    """ Renew fast fading channel """
    # copy the slow-fading matrix once per RB along a new last axis
    V2V_channels_with_fastfading = np.repeat(self.V2V_channels_abs[:, :, np.newaxis], self.n_RB, axis=2)
    # A - 20*log10(|h|), h ~ CN(0, 1)
    self.V2V_channels_with_fastfading = V2V_channels_with_fastfading - 20 * np.log10(
        np.abs(np.random.normal(0, 1, V2V_channels_with_fastfading.shape) + 1j * np.random.normal(0, 1, V2V_channels_with_fastfading.shape)) / math.sqrt(2))

    V2I_channels_with_fastfading = np.repeat(self.V2I_channels_abs[:, np.newaxis], self.n_RB, axis=1)
    self.V2I_channels_with_fastfading = V2I_channels_with_fastfading - 20 * np.log10(
        np.abs(np.random.normal(0, 1, V2I_channels_with_fastfading.shape) + 1j * np.random.normal(0, 1, V2I_channels_with_fastfading.shape)) / math.sqrt(2))
```
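The shape manipulation and the Rayleigh draw can be illustrated with a small standalone sketch (the slow-fading values here are random placeholders, not the simulator's):

```python
import numpy as np

n_veh, n_rb = 4, 4
# Pretend slow-fading gains (path loss + shadowing) in dB, one value per V2V pair
channels_abs = np.random.uniform(60, 120, size=(n_veh, n_veh))

# Expand to one copy per resource block: shape (n_veh, n_veh, n_RB)
channels_per_rb = np.repeat(channels_abs[:, :, np.newaxis], n_rb, axis=2)

# Per-RB Rayleigh fast fading: h ~ CN(0, 1), loss term is -20*log10(|h|)
h = (np.random.normal(0, 1, channels_per_rb.shape)
     + 1j * np.random.normal(0, 1, channels_per_rb.shape)) / np.sqrt(2)
channels_with_fastfading = channels_per_rb - 20 * np.log10(np.abs(h))

print(channels_with_fastfading.shape)   # (4, 4, 4): vehicle x vehicle x RB
```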
Computing the reward: Compute_Performance_Reward_Train(self, actions_power). The input here is crucial: it is the RL action, defined in main_marl_train.py as a 3-D integer array. Reading it as (layer, row, column): one layer per vehicle, one row per neighbour, and two columns holding the chosen RB (its index) and the chosen power (an index into V2V_power_dB_List), as shown below:

```python
for i in range(n_veh):
    for j in range(n_neighbor):
        state_old = get_state(env, [i, j], 1, epsi_final)
        action = predict(sesses[i * n_neighbor + j], state_old, epsi_final, True)
        action_all_testing[i, j, 0] = action % n_RB  # chosen RB
        action_all_testing[i, j, 1] = int(np.floor(action / n_RB))  # power level
```
The computation proceeds as follows:
- Extract the RB choice and the power choice from the action.
- Compute the V2I rate V2I_Rate # the returned vector has length n_RB, but each entry really corresponds to one V2I link, since the number of V2I links equals the number of RBs
- Compute the V2V rate V2V_Rate # one entry per V2V link; the rates of all V2V links are returned
- Iterate over every RB and find, from actions, the vehicles that share that RB.
- Compute the rate in two interference steps: V2I-to-V2V interference and V2V-to-V2V interference.
- Update the remaining demand and the remaining time of the time limit.
- Form the reward elements (reward_elements = V2V_Rate / 10, with entries whose demand has reached 0 set to 1).
- Set active_links to 0 wherever the remaining demand has been delivered (this is one of only two places where active_links is modified; the other is initialisation, where it is set to all ones).
- active_links is set to all ones in two places:
- in Environment_marl.py, inside new_random_game (which is called once at the very start of *train.py);
- at the beginning of each episode in *train.py, where active_links is reset directly to all ones.
The code:

```python
def Compute_Performance_Reward_Train(self, actions_power):
    actions = actions_power[:, :, 0]  # the channel_selection_part
    power_selection = actions_power[:, :, 1]  # power selection

    # ------------ Compute V2I rate --------------------
    V2I_Rate = np.zeros(self.n_RB)
    V2I_Interference = np.zeros(self.n_RB)  # V2I interference
    for i in range(len(self.vehicles)):
        for j in range(self.n_neighbor):
            if not self.active_links[i, j]:
                continue
            V2I_Interference[actions[i][j]] += 10 ** ((self.V2V_power_dB_List[power_selection[i, j]] - self.V2I_channels_with_fastfading[i, actions[i, j]]
                                                       + self.vehAntGain + self.bsAntGain - self.bsNoiseFigure) / 10)
    self.V2I_Interference = V2I_Interference + self.sig2
    V2I_Signals = 10 ** ((self.V2I_power_dB - self.V2I_channels_with_fastfading.diagonal() + self.vehAntGain + self.bsAntGain - self.bsNoiseFigure) / 10)
    V2I_Rate = np.log2(1 + np.divide(V2I_Signals, self.V2I_Interference))  # V2I channel capacity

    # ------------ Compute V2V rate -------------------------
    V2V_Interference = np.zeros((len(self.vehicles), self.n_neighbor))
    V2V_Signal = np.zeros((len(self.vehicles), self.n_neighbor))
    actions[(np.logical_not(self.active_links))] = -1  # inactive links will not transmit regardless of selected power levels
    for i in range(self.n_RB):  # scanning all bands
        indexes = np.argwhere(actions == i)  # find spectrum-sharing V2Vs
        for j in range(len(indexes)):
            receiver_j = self.vehicles[indexes[j, 0]].destinations[indexes[j, 1]]
            V2V_Signal[indexes[j, 0], indexes[j, 1]] = 10 ** ((self.V2V_power_dB_List[power_selection[indexes[j, 0], indexes[j, 1]]]
                                                               - self.V2V_channels_with_fastfading[indexes[j][0], receiver_j, i] + 2 * self.vehAntGain - self.vehNoiseFigure) / 10)
            # V2I links interference to V2V links
            V2V_Interference[indexes[j, 0], indexes[j, 1]] = 10 ** ((self.V2I_power_dB - self.V2V_channels_with_fastfading[i, receiver_j, i] + 2 * self.vehAntGain - self.vehNoiseFigure) / 10)
            # V2V interference
            for k in range(j + 1, len(indexes)):  # spectrum-sharing V2Vs
                receiver_k = self.vehicles[indexes[k][0]].destinations[indexes[k][1]]
                V2V_Interference[indexes[j, 0], indexes[j, 1]] += 10 ** ((self.V2V_power_dB_List[power_selection[indexes[k, 0], indexes[k, 1]]]
                                                                          - self.V2V_channels_with_fastfading[indexes[k][0]][receiver_j][i] + 2 * self.vehAntGain - self.vehNoiseFigure) / 10)
                V2V_Interference[indexes[k, 0], indexes[k, 1]] += 10 ** ((self.V2V_power_dB_List[power_selection[indexes[j, 0], indexes[j, 1]]]
                                                                          - self.V2V_channels_with_fastfading[indexes[j][0]][receiver_k][i] + 2 * self.vehAntGain - self.vehNoiseFigure) / 10)
    self.V2V_Interference = V2V_Interference + self.sig2
    V2V_Rate = np.log2(1 + np.divide(V2V_Signal, self.V2V_Interference))

    self.demand -= V2V_Rate * self.time_fast * self.bandwidth
    self.demand[self.demand < 0] = 0  # eliminate negative demands

    self.individual_time_limit -= self.time_fast

    reward_elements = V2V_Rate / 10
    reward_elements[self.demand <= 0] = 1

    self.active_links[np.multiply(self.active_links, self.demand <= 0)] = 0  # transmission finished, turned to "inactive"

    return V2I_Rate, V2V_Rate, reward_elements
```
Note: three values are returned, and the last one is not the final reward; the final reward is a weighted combination of these quantities (see act_for_training below). A simplified single-link version of the dB-domain link budget used above is sketched next.
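This minimal sketch uses the antenna gains, noise figures, and noise power listed in the Environ constructor; the channel values are made-up examples, not simulator output:

```python
import numpy as np

# Link-budget constants taken from the Environ parameters above
veh_ant_gain, bs_ant_gain = 3, 8           # dBi
veh_noise_figure = 9                        # dB
sig2_dB = -114                              # noise power per RB [dBm]
bandwidth = 1e6                             # 1 MHz per RB

def db_to_lin(x_db):
    return 10 ** (x_db / 10)

# Hypothetical numbers for one V2V link sharing an RB with one V2I link
tx_power_dB = 23        # chosen V2V power level [dBm]
v2v_channel_dB = 95     # path loss + shadowing + fast fading towards the V2V receiver [dB]
v2i_power_dB = 23       # interfering V2I transmitter power [dBm]
int_channel_dB = 110    # channel from the V2I transmitter to the V2V receiver [dB]

signal = db_to_lin(tx_power_dB - v2v_channel_dB + 2 * veh_ant_gain - veh_noise_figure)
interf = db_to_lin(v2i_power_dB - int_channel_dB + 2 * veh_ant_gain - veh_noise_figure)
noise = db_to_lin(sig2_dB)

rate_bps = bandwidth * np.log2(1 + signal / (interf + noise))
print(rate_bps / 1e6, "Mbit/s")   # spectral efficiency scaled by the 1 MHz RB
```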
Acting for training: act_for_training(self, actions) takes the actions, calls Compute_Performance_Reward_Train, and combines the results into the final reward:

```python
def act_for_training(self, actions):
    action_temp = actions.copy()
    V2I_Rate, V2V_Rate, reward_elements = self.Compute_Performance_Reward_Train(action_temp)

    lambdda = 0.
    reward = lambdda * np.sum(V2I_Rate) / (self.n_Veh * 10) + (1 - lambdda) * np.sum(reward_elements) / (self.n_Veh * self.n_neighbor)

    return reward
```

Acting for testing: act_for_testing(self, actions) is very similar and also calls Compute_Performance_Reward_Train, but it returns V2I_rate, V2V_success, and V2V_rate instead.

```python
def act_for_testing(self, actions):
    action_temp = actions.copy()
    V2I_Rate, V2V_Rate, reward_elements = self.Compute_Performance_Reward_Train(action_temp)
    V2V_success = 1 - np.sum(self.active_links) / (self.n_Veh * self.n_neighbor)  # V2V success rate

    return V2I_Rate, V2V_success, V2V_Rate
```
These three quantities are the per-step results within an episode; the testing part of main_marl_train.py shows how they are used:

```python
for test_step in range(n_step_per_episode):
    # trained models
    action_all_testing = np.zeros([n_veh, n_neighbor, 2], dtype='int32')
    for i in range(n_veh):
        for j in range(n_neighbor):
            state_old = get_state(env, [i, j], 1, epsi_final)
            action = predict(sesses[i * n_neighbor + j], state_old, epsi_final, True)
            action_all_testing[i, j, 0] = action % n_RB  # chosen RB
            action_all_testing[i, j, 1] = int(np.floor(action / n_RB))  # power level

    action_temp = action_all_testing.copy()
    V2I_rate, V2V_success, V2V_rate = env.act_for_testing(action_temp)
    V2I_rate_per_episode.append(np.sum(V2I_rate))  # sum V2I rate in bps

    rate_marl[idx_episode, test_step, :, :] = V2V_rate
    demand_marl[idx_episode, test_step + 1, :, :] = env.demand
```
Computing interference: Compute_Interference(self, actions) accumulates V2V_Interference_all with +=:

```python
def Compute_Interference(self, actions):
    V2V_Interference = np.zeros((len(self.vehicles), self.n_neighbor, self.n_RB)) + self.sig2

    channel_selection = actions.copy()[:, :, 0]  # column 0 of every layer: chosen RB
    power_selection = actions.copy()[:, :, 1]    # column 1 of every layer: power level
    channel_selection[np.logical_not(self.active_links)] = -1  # mark inactive links with -1

    # interference from V2I links
    for i in range(self.n_RB):
        for k in range(len(self.vehicles)):
            for m in range(len(channel_selection[k, :])):
                V2V_Interference[k, m, i] += 10 ** ((self.V2I_power_dB - self.V2V_channels_with_fastfading[i][self.vehicles[k].destinations[m]][i] + 2 * self.vehAntGain - self.vehNoiseFigure) / 10)

    # interference from peer V2V links
    for i in range(len(self.vehicles)):
        for j in range(len(channel_selection[i, :])):
            for k in range(len(self.vehicles)):
                for m in range(len(channel_selection[k, :])):
                    # if i == k or channel_selection[i,j] >= 0:
                    if i == k and j == m or channel_selection[i, j] < 0:
                        continue
                    V2V_Interference[k, m, channel_selection[i, j]] += 10 ** ((self.V2V_power_dB_List[power_selection[i, j]]
                                                                               - self.V2V_channels_with_fastfading[i][self.vehicles[k].destinations[m]][channel_selection[i, j]] + 2 * self.vehAntGain - self.vehNoiseFigure) / 10)
    self.V2V_Interference_all = 10 * np.log10(V2V_Interference)
```
It is used in get_state in main_marl_train.py to build the V2V interference component of the state:

```python
def get_state(env, idx=(0, 0), ind_episode=1., epsi=0.02):
    """ Get state from the environment """
    # components: V2I/V2V fast fading, V2V interference, V2I/V2V channel info (PL + shadowing),
    # remaining time, remaining payload

    # V2I_channel = (env.V2I_channels_with_fastfading[idx[0], :] - 80) / 60
    V2I_fast = (env.V2I_channels_with_fastfading[idx[0], :] - env.V2I_channels_abs[idx[0]] + 10) / 35

    # V2V_channel = (env.V2V_channels_with_fastfading[:, env.vehicles[idx[0]].destinations[idx[1]], :] - 80) / 60
    V2V_fast = (env.V2V_channels_with_fastfading[:, env.vehicles[idx[0]].destinations[idx[1]], :] - env.V2V_channels_abs[:, env.vehicles[idx[0]].destinations[idx[1]]] + 10) / 35

    V2V_interference = (-env.V2V_Interference_all[idx[0], idx[1], :] - 60) / 60

    V2I_abs = (env.V2I_channels_abs[idx[0]] - 80) / 60.0
    V2V_abs = (env.V2V_channels_abs[:, env.vehicles[idx[0]].destinations[idx[1]]] - 80) / 60.0

    load_remaining = np.asarray([env.demand[idx[0], idx[1]] / env.demand_size])
    time_remaining = np.asarray([env.individual_time_limit[idx[0], idx[1]] / env.time_slow])

    # return np.concatenate((np.reshape(V2V_channel, -1), V2V_interference, V2I_abs, V2V_abs, time_remaining, load_remaining, np.asarray([ind_episode, epsi])))
    return np.concatenate((V2I_fast, np.reshape(V2V_fast, -1), V2V_interference, np.asarray([V2I_abs]), V2V_abs, time_remaining, load_remaining, np.asarray([ind_episode, epsi])))
    # all the quantities of interest appear here: V2I_fast, V2V_fast, V2V_interference, V2I_abs, V2V_abs
```

Some readers may be puzzled at this point: why compute the V2V interference again? Didn't we already compute it? Yes: computing V2V_Rate also required the V2V interference, but that computation iterates over the RB assignments, whereas Compute_Interference simply loops over every vehicle/link directly, and its result is used only to build the observation.
ReplayMemory
This part comes from replay_memory.py, which only defines a single class, ReplayMemory. Note that every agent owns its own memory, as can be seen in class Agent of main_marl_train.py:

```python
class Agent(object):
    def __init__(self, memory_entry_size):
        self.discount = 1
        self.double_q = True
        self.memory_entry_size = memory_entry_size
        self.memory = ReplayMemory(self.memory_entry_size)
```
Initialisation: the constructor takes the size of one memory entry, entry_size (the length of a state vector):

```python
class ReplayMemory:
    def __init__(self, entry_size):
        self.entry_size = entry_size
        self.memory_size = 200000
        self.actions = np.empty(self.memory_size, dtype=np.uint8)
        self.rewards = np.empty(self.memory_size, dtype=np.float64)
        self.prestate = np.empty((self.memory_size, self.entry_size), dtype=np.float16)
        self.poststate = np.empty((self.memory_size, self.entry_size), dtype=np.float16)
        self.batch_size = 2000
        self.count = 0
        self.current = 0
```

Adding a transition: add(self, prestate, poststate, reward, action). As the arguments show, each entry is a full transition (previous state, next state, reward, action):

```python
def add(self, prestate, poststate, reward, action):
    self.actions[self.current] = action
    self.rewards[self.current] = reward
    self.prestate[self.current] = prestate
    self.poststate[self.current] = poststate
    self.count = max(self.count, self.current + 1)
    self.current = (self.current + 1) % self.memory_size
```
Each agent records its own transition at every time step. The training part of main_marl_train.py shows the call to add; the for loop below sits inside an outer loop over episodes, so at every step of every episode a transition is added for every agent (last line):

```python
for i_step in range(n_step_per_episode):  # n_step_per_episode = 0.1 / 0.001 = 100
    time_step = i_episode * n_step_per_episode + i_step  # global step index
    state_old_all = []
    action_all = []
    action_all_training = np.zeros([n_veh, n_neighbor, 2], dtype='int32')
    for i in range(n_veh):
        for j in range(n_neighbor):
            state = get_state(env, [i, j], i_episode / (n_episode - 1), epsi)
            state_old_all.append(state)
            action = predict(sesses[i * n_neighbor + j], state, epsi)
            action_all.append(action)
            action_all_training[i, j, 0] = action % n_RB  # chosen RB
            action_all_training[i, j, 1] = int(np.floor(action / n_RB))  # power level

    # All agents take actions simultaneously, obtain shared reward, and update the environment.
    action_temp = action_all_training.copy()
    train_reward = env.act_for_training(action_temp)
    record_reward[time_step] = train_reward

    env.renew_channels_fastfading()
    env.Compute_Interference(action_temp)

    for i in range(n_veh):
        for j in range(n_neighbor):
            state_old = state_old_all[n_neighbor * i + j]
            action = action_all[n_neighbor * i + j]
            state_new = get_state(env, [i, j], i_episode / (n_episode - 1), epsi)
            agents[i * n_neighbor + j].memory.add(state_old, state_new, train_reward, action)  # add entry to this agent's memory
```
Sampling: sample(self). After many calls to add, each agent's memory holds many transitions; during training, batch_size of them are drawn at once:

```python
def sample(self):
    if self.count < self.batch_size:
        indexes = range(0, self.count)
    else:
        indexes = random.sample(range(0, self.count), self.batch_size)
    prestate = self.prestate[indexes]
    poststate = self.poststate[indexes]
    actions = self.actions[indexes]
    rewards = self.rewards[indexes]
    return prestate, poststate, actions, rewards
```
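A minimal usage sketch of the class defined above (it assumes ReplayMemory is in scope; the state length 33 and the 16-action range are assumptions matching the 4-vehicle, 1-neighbour, 4-RB setup discussed later):

```python
import numpy as np

memory = ReplayMemory(entry_size=33)   # 33 = assumed state length for this setup

for _ in range(10):                    # pretend transitions from one episode
    s_old = np.random.rand(33)
    s_new = np.random.rand(33)
    reward = np.random.rand()
    action = np.random.randint(16)     # 4 RBs x 4 power levels
    memory.add(s_old, s_new, reward, action)

prestate, poststate, actions, rewards = memory.sample()
print(prestate.shape, actions.shape)   # fewer than batch_size entries stored -> all 10 returned
```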
Main script: main_marl_train.py
Defining class Agent: Agent(object). Its constructor takes the memory entry size and holds a few algorithm parameters; note that its memory is the ReplayMemory just described.

```python
class Agent(object):
    def __init__(self, memory_entry_size):
        self.discount = 1
        self.double_q = True
        self.memory_entry_size = memory_entry_size
        self.memory = ReplayMemory(self.memory_entry_size)
```
Parameter initialisation: this part is written directly at module level, with no function. It covers roughly: map properties (lane coordinates, overall map size), the numbers of vehicles, neighbours, RBs, and episodes, and a few algorithm parameters, as shown below.
To understand the map parameters up_lanes / down_lanes / left_lanes / right_lanes, note that the system model follows the urban case of 3GPP TR 36.885: each street has four lanes (two per direction), each lane is 3.5 m wide, the road grid size measured between the yellow lines is 433 m x 250 m, and the whole area is 1299 m x 750 m. The simulation scales everything down by a factor of 2 (which is why width and height are divided by 2), and this shows up in the lane lists as the i / 2.0.
Take up_lanes as an example. Since a lane is 3.5 m wide and a vehicle treated as a point moves along the centre of its lane, the first entry inside the inner list is 3.5/2; the second, 3.5 + 3.5/2, is the centre of the second lane in the same direction; the third, 250 + 3.5/2, is the first same-direction lane beyond the next block, and so on. See the sketch after the code below for the resulting coordinates.
```python
up_lanes = [i / 2.0 for i in [3.5 / 2, 3.5 + 3.5 / 2, 250 + 3.5 / 2, 250 + 3.5 + 3.5 / 2, 500 + 3.5 / 2, 500 + 3.5 + 3.5 / 2]]
down_lanes = [i / 2.0 for i in [250 - 3.5 - 3.5 / 2, 250 - 3.5 / 2, 500 - 3.5 - 3.5 / 2, 500 - 3.5 / 2, 750 - 3.5 - 3.5 / 2, 750 - 3.5 / 2]]
left_lanes = [i / 2.0 for i in [3.5 / 2, 3.5 / 2 + 3.5, 433 + 3.5 / 2, 433 + 3.5 + 3.5 / 2, 866 + 3.5 / 2, 866 + 3.5 + 3.5 / 2]]
right_lanes = [i / 2.0 for i in [433 - 3.5 - 3.5 / 2, 433 - 3.5 / 2, 866 - 3.5 - 3.5 / 2, 866 - 3.5 / 2, 1299 - 3.5 - 3.5 / 2, 1299 - 3.5 / 2]]

width = 750 / 2
height = 1298 / 2

IS_TRAIN = 1
IS_TEST = 1 - IS_TRAIN

label = 'marl_model'

n_veh = 4
n_neighbor = 1
n_RB = n_veh

env = Environment_marl.Environ(down_lanes, up_lanes, left_lanes, right_lanes, width, height, n_veh, n_neighbor)
env.new_random_game()  # initialize parameters in env

# n_episode = 3000
n_episode = 600
n_step_per_episode = int(env.time_slow / env.time_fast)  # slow = 0.1, fast = 0.001
epsi_final = 0.02
epsi_anneal_length = int(0.8 * n_episode)
mini_batch_step = n_step_per_episode
target_update_step = n_step_per_episode * 4

n_episode_test = 100  # test episodes
```
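Evaluating the first lane list makes the geometry concrete (the printed values are computed directly from the expression above):

```python
# The actual x coordinates of the upward lanes on the half-scale map
up_lanes = [i / 2.0 for i in [3.5 / 2, 3.5 + 3.5 / 2, 250 + 3.5 / 2, 250 + 3.5 + 3.5 / 2,
                              500 + 3.5 / 2, 500 + 3.5 + 3.5 / 2]]
print(up_lanes)
# [0.875, 2.625, 125.875, 127.625, 250.875, 252.625]
# i.e. two upward lanes per street (centres 1.75 m apart), with streets 125 m apart after halving
```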
Getting the state: get_state(env, idx=(0,0), ind_episode=1., epsi=0.02). The input is env plus the link index, and the output concatenates:
- V2V_fast: the fast-fading component, i.e. channels_with_fastfading (PL + shadowing minus the random fast-fading term; see "Updating the fast-fading channel" under Environ above) minus channels_abs
- V2I_fast: likewise
- V2V_interference (see "Computing interference" under Environ above)
- V2I_abs (PL + shadowing)
- V2V_abs (PL + shadowing)
- plus the remaining time, the remaining payload, and the (training-iteration index, ε) pair
Note that V2I_abs in the code is shifted by −80 and divided by 60. The code author's explanation from the GitHub discussion:
"This is to roughly normalize DQN inputs for the ease of training. The numbers are obtained from several trial runs."
```python
def get_state(env, idx=(0, 0), ind_episode=1., epsi=0.02):
    """ Get state from the environment """
    # components: V2I/V2V fast fading, V2V interference, V2I/V2V channel info (PL + shadowing),
    # remaining time, remaining payload

    # V2I_channel = (env.V2I_channels_with_fastfading[idx[0], :] - 80) / 60
    V2I_fast = (env.V2I_channels_with_fastfading[idx[0], :] - env.V2I_channels_abs[idx[0]] + 10) / 35

    # V2V_channel = (env.V2V_channels_with_fastfading[:, env.vehicles[idx[0]].destinations[idx[1]], :] - 80) / 60
    V2V_fast = (env.V2V_channels_with_fastfading[:, env.vehicles[idx[0]].destinations[idx[1]], :] - env.V2V_channels_abs[:, env.vehicles[idx[0]].destinations[idx[1]]] + 10) / 35

    V2V_interference = (-env.V2V_Interference_all[idx[0], idx[1], :] - 60) / 60

    V2I_abs = (env.V2I_channels_abs[idx[0]] - 80) / 60.0
    V2V_abs = (env.V2V_channels_abs[:, env.vehicles[idx[0]].destinations[idx[1]]] - 80) / 60.0

    load_remaining = np.asarray([env.demand[idx[0], idx[1]] / env.demand_size])
    time_remaining = np.asarray([env.individual_time_limit[idx[0], idx[1]] / env.time_slow])

    # return np.concatenate((np.reshape(V2V_channel, -1), V2V_interference, V2I_abs, V2V_abs, time_remaining, load_remaining, np.asarray([ind_episode, epsi])))
    return np.concatenate((V2I_fast, np.reshape(V2V_fast, -1), V2V_interference, np.asarray([V2I_abs]), V2V_abs, time_remaining, load_remaining, np.asarray([ind_episode, epsi])))
```
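With n_veh = n_RB = 4 and n_neighbor = 1, the concatenation above has a fixed length, which is what the DQN input size n_input must equal. A quick tally (the component sizes follow from the slicing in get_state; the resulting 33 is an inferred value, not quoted from the repository):

```python
n_veh, n_RB = 4, 4

state_len = (
    n_RB            # V2I_fast: one fast-fading term per RB
    + n_veh * n_RB  # V2V_fast: every transmitter x every RB towards this receiver
    + n_RB          # V2V_interference measured on each RB
    + 1             # V2I_abs: slow fading of this agent's V2I link
    + n_veh         # V2V_abs towards this receiver
    + 1 + 1         # remaining time, remaining payload
    + 2             # training-iteration index and epsilon
)
print(state_len)    # 33
```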
Defining the neural networks:

```python
with g.as_default():
    # ============== Training network ========================
    x = tf.placeholder(tf.float32, [None, n_input])  # input
    w_1 = tf.Variable(tf.truncated_normal([n_input, n_hidden_1], stddev=0.1))
    w_2 = tf.Variable(tf.truncated_normal([n_hidden_1, n_hidden_2], stddev=0.1))
    w_3 = tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_3], stddev=0.1))
    w_4 = tf.Variable(tf.truncated_normal([n_hidden_3, n_output], stddev=0.1))
    b_1 = tf.Variable(tf.truncated_normal([n_hidden_1], stddev=0.1))
    b_2 = tf.Variable(tf.truncated_normal([n_hidden_2], stddev=0.1))
    b_3 = tf.Variable(tf.truncated_normal([n_hidden_3], stddev=0.1))
    b_4 = tf.Variable(tf.truncated_normal([n_output], stddev=0.1))

    layer_1 = tf.nn.relu(tf.add(tf.matmul(x, w_1), b_1))
    layer_1_b = tf.layers.batch_normalization(layer_1)
    layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1_b, w_2), b_2))
    layer_2_b = tf.layers.batch_normalization(layer_2)
    layer_3 = tf.nn.relu(tf.add(tf.matmul(layer_2_b, w_3), b_3))
    layer_3_b = tf.layers.batch_normalization(layer_3)
    y = tf.nn.relu(tf.add(tf.matmul(layer_3_b, w_4), b_4))
    g_q_action = tf.argmax(y, axis=1)

    # compute loss
    g_target_q_t = tf.placeholder(tf.float32, None, name="target_value")
    g_action = tf.placeholder(tf.int32, None, name='g_action')
    action_one_hot = tf.one_hot(g_action, n_output, 1.0, 0.0, name='action_one_hot')
    q_acted = tf.reduce_sum(y * action_one_hot, reduction_indices=1, name='q_acted')
    g_loss = tf.reduce_mean(tf.square(g_target_q_t - q_acted), name='g_loss')  # TD error
    optim = tf.train.RMSPropOptimizer(learning_rate=0.001, momentum=0.95, epsilon=0.01).minimize(g_loss)  # gradient descent

    # ==================== Prediction network ========================
    x_p = tf.placeholder(tf.float32, [None, n_input])  # input
    w_1_p = tf.Variable(tf.truncated_normal([n_input, n_hidden_1], stddev=0.1))
    w_2_p = tf.Variable(tf.truncated_normal([n_hidden_1, n_hidden_2], stddev=0.1))
    w_3_p = tf.Variable(tf.truncated_normal([n_hidden_2, n_hidden_3], stddev=0.1))
    w_4_p = tf.Variable(tf.truncated_normal([n_hidden_3, n_output], stddev=0.1))
    b_1_p = tf.Variable(tf.truncated_normal([n_hidden_1], stddev=0.1))
    b_2_p = tf.Variable(tf.truncated_normal([n_hidden_2], stddev=0.1))
    b_3_p = tf.Variable(tf.truncated_normal([n_hidden_3], stddev=0.1))
    b_4_p = tf.Variable(tf.truncated_normal([n_output], stddev=0.1))

    layer_1_p = tf.nn.relu(tf.add(tf.matmul(x_p, w_1_p), b_1_p))
    layer_1_p_b = tf.layers.batch_normalization(layer_1_p)
    layer_2_p = tf.nn.relu(tf.add(tf.matmul(layer_1_p_b, w_2_p), b_2_p))
    layer_2_p_b = tf.layers.batch_normalization(layer_2_p)
    layer_3_p = tf.nn.relu(tf.add(tf.matmul(layer_2_p_b, w_3_p), b_3_p))
    layer_3_p_b = tf.layers.batch_normalization(layer_3_p)
    y_p = tf.nn.relu(tf.add(tf.matmul(layer_3_p_b, w_4_p), b_4_p))

    g_target_q_idx = tf.placeholder('int32', [None, None], 'output_idx')  # input: an (n, 2) list of indices
    target_q_with_idx = tf.gather_nd(y_p, g_target_q_idx)  # gather the selected entries of y_p

    init = tf.global_variables_initializer()
    saver = tf.train.Saver()
```
Only the overall structure is described here; for what each piece means, see the "Sampling a mini-batch and computing the loss" part below, which explains the network structure alongside the algorithm.
The graph splits into three parts: the training network, the loss computation, and the prediction (target) network; call them N1, N2, N3. N1 and N3 have exactly the same structure and are the DQN that outputs Q-values; the difference is that N1 is updated at every training step, while N3 is only refreshed periodically. N2 takes N1's output and computes the loss used to update N1 iteratively.
Prediction: predict(sess, s_t, ep, test_ep=False) drives the network (ε-greedily) to produce an action:

```python
def predict(sess, s_t, ep, test_ep=False):
    n_power_levels = len(env.V2V_power_dB_List)
    if np.random.rand() < ep and not test_ep:
        pred_action = np.random.randint(n_RB * n_power_levels)
    else:
        pred_action = sess.run(g_q_action, feed_dict={x: [s_t]})[0]
    return pred_action
```

The returned action is a single int, but it encodes both the RB and the power level. It is decoded the same way in both the training and testing loops later in the script:

```python
action = predict(sesses[i * n_neighbor + j], state, epsi)
action_all.append(action)
action_all_training[i, j, 0] = action % n_RB  # chosen RB
action_all_training[i, j, 1] = int(np.floor(action / n_RB))  # power level
```
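The encoding is a simple base-n_RB scheme over n_RB x n_power_levels = 16 discrete actions; a quick round-trip check:

```python
import numpy as np

n_RB = 4
n_power_levels = 4                        # len(V2V_power_dB_List) = [23, 15, 5, -100]
n_output = n_RB * n_power_levels          # 16 discrete actions per agent

for action in range(n_output):
    rb = action % n_RB                    # chosen resource block
    power = int(np.floor(action / n_RB))  # index into V2V_power_dB_List
    assert action == power * n_RB + rb    # the mapping is a base-n_RB encoding
print("all", n_output, "actions round-trip correctly")
```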
Sampling a mini-batch and computing the loss: q_learning_mini_batch(current_agent, current_sess) takes a single agent and uses the memory's sample method described above; double Q-learning is also switched on here.

```python
def q_learning_mini_batch(current_agent, current_sess):
    """ Training a sampled mini-batch """
    batch_s_t, batch_s_t_plus_1, batch_action, batch_reward = current_agent.memory.sample()

    if current_agent.double_q:  # double Q-learning
        pred_action = current_sess.run(g_q_action, feed_dict={x: batch_s_t_plus_1})
        q_t_plus_1 = current_sess.run(target_q_with_idx, {x_p: batch_s_t_plus_1, g_target_q_idx: [[idx, pred_a] for idx, pred_a in enumerate(pred_action)]})
        batch_target_q_t = current_agent.discount * q_t_plus_1 + batch_reward
    else:
        q_t_plus_1 = current_sess.run(y_p, {x_p: batch_s_t_plus_1})
        max_q_t_plus_1 = np.max(q_t_plus_1, axis=1)
        batch_target_q_t = current_agent.discount * max_q_t_plus_1 + batch_reward

    _, loss_val = current_sess.run([optim, g_loss], {g_target_q_t: batch_target_q_t, g_action: batch_action, x: batch_s_t})
    return loss_val
```

Addendum (Apr. 23): this function is best read together with the network structure above, and it is a little involved. As the if/else suggests, it implements both vanilla DQN and double Q-learning; note that both branches only compute the target-network side. Feeding the training network (top-left of the algorithm diagram) and performing its gradient update are done by the final statement:

```python
_, loss_val = current_sess.run([optim, g_loss], {g_target_q_t: batch_target_q_t, g_action: batch_action, x: batch_s_t})
```
This code is easiest to follow alongside the diagrams; the algorithm diagram and the code flowchart are shown here (the flowchart was drawn by the author in Visio and does not follow a standard notation, so please excuse any mistakes).
Vanilla DQN
Double DQN
The difference from vanilla DQN lies in the target network: vanilla DQN builds the target directly from the prediction network (the "predict / updated only periodically" box in the figure) plus a max, whereas double DQN cascades the training network and the prediction network to build the target, so the training network selects the action and the prediction network evaluates it.
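The difference is easiest to see on the target values themselves. A small NumPy sketch with made-up Q-tables (discount 1, as in the Agent class):

```python
import numpy as np

gamma = 1.0                           # the code uses discount = 1
rng = np.random.default_rng(0)

# Pretend Q-values for a batch of 3 next-states and 16 actions
q_online = rng.normal(size=(3, 16))   # training network Q(s', .)
q_target = rng.normal(size=(3, 16))   # target/prediction network Q'(s', .)
rewards = rng.normal(size=3)

# Vanilla DQN: the target network both selects and evaluates the action
target_vanilla = rewards + gamma * q_target.max(axis=1)

# Double DQN: the online network selects, the target network evaluates
best_a = q_online.argmax(axis=1)
target_double = rewards + gamma * q_target[np.arange(3), best_a]

print(target_vanilla, target_double)  # double-DQN targets are typically less over-estimated
```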
Training stage
for each episode:
- determine ε from the episode index (linearly decreasing, then constant)
- every 100 episodes, update vehicle positions, neighbours, the slow-fading channel, and fast fading
- reset demand, time_limit, and active_links (all ones)
- for each step of the episode:
  - initialise state_old_all, action_all, action_all_training
  - for every link:
    - get that link's state [per link]
    - get an action via predict (encoding the RB and power choices) [per link]
    - store it into action_all_training = [vehicle, neighbour, RB/power] [collecting the per-link results]
  - obtain the reward via act_for_training [this is for all links together]; in the SARL variant the reward computation moves inside the per-link loop above, everything else stays the same
  - append the reward to record_reward
  - update fast fading
  - compute the interference from the actions
  - for every link:
    - compute the new state
    - add (state_old, state_new, train_reward, action) to the agent's memory [so each memory entry is per link]
  - every mini_batch_step steps, run q_learning_mini_batch to obtain the loss
  - every target_update_step steps, update the target Q network
```python
record_reward = np.zeros([n_episode * n_step_per_episode, 1])
record_loss = []

if IS_TRAIN:
    for i_episode in range(n_episode):
        print("-------------------------")
        print('Episode:', i_episode)
        if i_episode < epsi_anneal_length:
            epsi = 1 - i_episode * (1 - epsi_final) / (epsi_anneal_length - 1)  # epsilon decreases over each episode
        else:
            epsi = epsi_final

        # every 100 episodes: renew positions, neighbors, slow fading, fast fading
        if i_episode % 100 == 0:
            env.renew_positions()  # update vehicle position
            env.renew_neighbor()
            env.renew_channel()  # update channel slow fading
            env.renew_channels_fastfading()  # update channel fast fading

        env.demand = env.demand_size * np.ones((env.n_Veh, env.n_neighbor))
        env.individual_time_limit = env.time_slow * np.ones((env.n_Veh, env.n_neighbor))
        env.active_links = np.ones((env.n_Veh, env.n_neighbor), dtype='bool')

        for i_step in range(n_step_per_episode):  # n_step_per_episode = 0.1 / 0.001 = 100
            time_step = i_episode * n_step_per_episode + i_step  # global step index
            state_old_all = []
            action_all = []
            action_all_training = np.zeros([n_veh, n_neighbor, 2], dtype='int32')
            for i in range(n_veh):
                for j in range(n_neighbor):
                    state = get_state(env, [i, j], i_episode / (n_episode - 1), epsi)
                    state_old_all.append(state)
                    action = predict(sesses[i * n_neighbor + j], state, epsi)
                    action_all.append(action)
                    action_all_training[i, j, 0] = action % n_RB  # chosen RB
                    action_all_training[i, j, 1] = int(np.floor(action / n_RB))  # power level

            # All agents take actions simultaneously, obtain shared reward, and update the environment.
            action_temp = action_all_training.copy()
            train_reward = env.act_for_training(action_temp)
            record_reward[time_step] = train_reward

            env.renew_channels_fastfading()
            env.Compute_Interference(action_temp)

            for i in range(n_veh):
                for j in range(n_neighbor):
                    state_old = state_old_all[n_neighbor * i + j]
                    action = action_all[n_neighbor * i + j]
                    state_new = get_state(env, [i, j], i_episode / (n_episode - 1), epsi)
                    agents[i * n_neighbor + j].memory.add(state_old, state_new, train_reward, action)  # add entry to this agent's memory

                    # training this agent
                    if time_step % mini_batch_step == mini_batch_step - 1:
                        loss_val_batch = q_learning_mini_batch(agents[i * n_neighbor + j], sesses[i * n_neighbor + j])
                        record_loss.append(loss_val_batch)
                        if i == 0 and j == 0:
                            print('step:', time_step, 'agent', i * n_neighbor + j, 'loss', loss_val_batch)
                    if time_step % target_update_step == target_update_step - 1:
                        update_target_q_network(sesses[i * n_neighbor + j])
                        if i == 0 and j == 0:
                            print('Update target Q network...')

    print('Training Done. Saving models...')
    for i in range(n_veh):
        for j in range(n_neighbor):
            model_path = label + '/agent_' + str(i * n_neighbor + j)
            save_models(sesses[i * n_neighbor + j], model_path)

    current_dir = os.path.dirname(os.path.realpath(__file__))
    reward_path = os.path.join(current_dir, "model/" + label + '/reward.mat')
    scipy.io.savemat(reward_path, {'reward': record_reward})

    record_loss = np.asarray(record_loss).reshape((-1, n_veh * n_neighbor))
    loss_path = os.path.join(current_dir, "model/" + label + '/train_loss.mat')
    scipy.io.savemat(loss_path, {'train_loss': record_loss})
```
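The ε schedule used in the loop above is easy to inspect in isolation (the values simply reproduce the formula with the parameters defined earlier):

```python
import numpy as np

n_episode = 600
epsi_final = 0.02
epsi_anneal_length = int(0.8 * n_episode)   # 480 episodes of annealing

epsi = np.full(n_episode, epsi_final)
idx = np.arange(epsi_anneal_length)
epsi[idx] = 1 - idx * (1 - epsi_final) / (epsi_anneal_length - 1)

print(epsi[0], epsi[240], epsi[479], epsi[599])  # 1.0 -> ~0.51 -> 0.02 -> 0.02
```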
Testing stage
First the models saved during training are loaded.
for each test episode:
- update vehicle positions, neighbours, the slow-fading channel, and fast fading
- reset demand, time_limit, and active_links (all ones)
- for each step of the episode:
  - initialise action_all_testing
  - get an action via predict (encoding the RB and power choices)
  - store it into action_all_testing = [vehicle, neighbour, RB/power]
  - obtain V2I_rate, V2V_success, V2V_rate via act_for_testing
  - sum V2I_rate and append it to V2I_rate_per_episode
  - store V2V_rate into rate_marl
  - record the remaining demand (demand_marl)
```python
if IS_TEST:
    print("\nRestoring the model...")
    for i in range(n_veh):
        for j in range(n_neighbor):
            model_path = label + '/agent_' + str(i * n_neighbor + j)
            load_models(sesses[i * n_neighbor + j], model_path)

    V2I_rate_list = []
    V2V_success_list = []
    V2I_rate_list_rand = []
    V2V_success_list_rand = []
    rate_marl = np.zeros([n_episode_test, n_step_per_episode, n_veh, n_neighbor])
    rate_rand = np.zeros([n_episode_test, n_step_per_episode, n_veh, n_neighbor])
    demand_marl = env.demand_size * np.ones([n_episode_test, n_step_per_episode + 1, n_veh, n_neighbor])
    demand_rand = env.demand_size * np.ones([n_episode_test, n_step_per_episode + 1, n_veh, n_neighbor])
    power_rand = np.zeros([n_episode_test, n_step_per_episode, n_veh, n_neighbor])

    for idx_episode in range(n_episode_test):
        print('----- Episode', idx_episode, '-----')

        env.renew_positions()
        env.renew_neighbor()
        env.renew_channel()
        env.renew_channels_fastfading()

        env.demand = env.demand_size * np.ones((env.n_Veh, env.n_neighbor))
        env.individual_time_limit = env.time_slow * np.ones((env.n_Veh, env.n_neighbor))
        env.active_links = np.ones((env.n_Veh, env.n_neighbor), dtype='bool')

        env.demand_rand = env.demand_size * np.ones((env.n_Veh, env.n_neighbor))
        env.individual_time_limit_rand = env.time_slow * np.ones((env.n_Veh, env.n_neighbor))
        env.active_links_rand = np.ones((env.n_Veh, env.n_neighbor), dtype='bool')

        V2I_rate_per_episode = []
        V2I_rate_per_episode_rand = []
        for test_step in range(n_step_per_episode):
            # trained models
            action_all_testing = np.zeros([n_veh, n_neighbor, 2], dtype='int32')
            for i in range(n_veh):
                for j in range(n_neighbor):
                    state_old = get_state(env, [i, j], 1, epsi_final)
                    action = predict(sesses[i * n_neighbor + j], state_old, epsi_final, True)
                    action_all_testing[i, j, 0] = action % n_RB  # chosen RB
                    action_all_testing[i, j, 1] = int(np.floor(action / n_RB))  # power level

            action_temp = action_all_testing.copy()
            V2I_rate, V2V_success, V2V_rate = env.act_for_testing(action_temp)
            V2I_rate_per_episode.append(np.sum(V2I_rate))  # sum V2I rate in bps

            rate_marl[idx_episode, test_step, :, :] = V2V_rate
            demand_marl[idx_episode, test_step + 1, :, :] = env.demand

            # random baseline
            action_rand = np.zeros([n_veh, n_neighbor, 2], dtype='int32')
            action_rand[:, :, 0] = np.random.randint(0, n_RB, [n_veh, n_neighbor])  # band
            action_rand[:, :, 1] = np.random.randint(0, len(env.V2V_power_dB_List), [n_veh, n_neighbor])  # power

            V2I_rate_rand, V2V_success_rand, V2V_rate_rand = env.act_for_testing_rand(action_rand)
            V2I_rate_per_episode_rand.append(np.sum(V2I_rate_rand))  # sum V2I rate in bps
            rate_rand[idx_episode, test_step, :, :] = V2V_rate_rand
            demand_rand[idx_episode, test_step + 1, :, :] = env.demand_rand

            for i in range(n_veh):
                for j in range(n_neighbor):
                    power_rand[idx_episode, test_step, i, j] = env.V2V_power_dB_List[int(action_rand[i, j, 1])]

            # update the environment and compute interference
            env.renew_channels_fastfading()
            env.Compute_Interference(action_temp)

            if test_step == n_step_per_episode - 1:
                V2V_success_list.append(V2V_success)
                V2V_success_list_rand.append(V2V_success_rand)

        V2I_rate_list.append(np.mean(V2I_rate_per_episode))
        V2I_rate_list_rand.append(np.mean(V2I_rate_per_episode_rand))

        print(round(np.average(V2I_rate_per_episode), 2), 'rand', round(np.average(V2I_rate_per_episode_rand), 2))
        print(V2V_success_list[idx_episode], 'rand', V2V_success_list_rand[idx_episode])
```
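For intuition, V2V_success at the end of an episode is just the fraction of links whose payload counter reached zero; a toy example with made-up active_links:

```python
import numpy as np

# active_links is True while a V2V pair still has payload left at the end of the episode
active_links = np.array([[False], [False], [True], [False]])   # hypothetical 4 vehicles x 1 neighbour
n_veh, n_neighbor = active_links.shape

v2v_success = 1 - np.sum(active_links) / (n_veh * n_neighbor)
print(v2v_success)   # 0.75: three of the four links delivered their 1060-byte payload in time
```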
References
[1] 3GPP TR 36.885 technical report
[2] 《5G移动通信技术》