Formula Derivation

  The three-layer BP neural network is shown in the figure above. Here $x_i$ denotes the input value (and also the output value) of the $i$-th input-layer node, $z_j$ the output value of the $j$-th hidden-layer node, and $y_k$ the output value of the $k$-th output-layer node; $v_{ij}$ is the weight from the $i$-th input-layer node to the $j$-th hidden-layer node, and $w_{jk}$ the weight from the $j$-th hidden-layer node to the $k$-th output-layer node; the threshold of the $j$-th hidden-layer node is $\theta_j$, and the threshold of the $k$-th output-layer node is $\gamma_k$. The activation function is the sigmoid function

$$f(x)=\frac{1}{1+e^{-x}}$$

whose derivative is

$$f'(x)=f(x)\big(1-f(x)\big)$$

Input of the $j$-th hidden-layer node:

$$\alpha_j=\sum_{i}v_{ij}x_i$$

Output of the $j$-th hidden-layer node:

$$z_j=f(\alpha_j-\theta_j)$$

Input of the $k$-th output-layer node:

$$\beta_k=\sum_{j}w_{jk}z_j$$

Output of the $k$-th output-layer node:

$$y_k=f(\beta_k-\gamma_k)$$

Error function of the output layer (the factor $\tfrac{1}{2}$ cancels the 2 produced by differentiation, so the update formulas below come out clean; the code computes the total error the same way):

$$E=\frac{1}{2}\sum_{k}(y_k-t_k)^2$$

where $t_k$ is the true label from the training set. The connection weights are adjusted according to the back-propagated error; each weight is corrected in the direction opposite to the gradient of the error function. With learning rate $\eta$, $\Delta w_{jk}$ is computed as

$$\begin{aligned}\Delta w_{jk}&=-\eta\frac{\partial E}{\partial w_{jk}}\\&=-\eta(y_k-t_k)\,y_k(1-y_k)\,z_j\end{aligned}$$

and $w_{jk}$ is updated as

$$w_{jk}^{\,new}=w_{jk}+\Delta w_{jk}$$

$\Delta\gamma_k$ is computed as

$$\begin{aligned}\Delta\gamma_k&=-\eta\frac{\partial E}{\partial \gamma_k}\\&=\eta(y_k-t_k)\,y_k(1-y_k)\end{aligned}$$

and $\gamma_k$ is updated as

$$\gamma_k^{\,new}=\gamma_k+\Delta\gamma_k$$

$\Delta v_{ij}$ is computed as

$$\begin{aligned}\Delta v_{ij}&=-\eta\frac{\partial E}{\partial v_{ij}}\\&=-\eta\sum_k\big((y_k-t_k)y_k(1-y_k)w_{jk}\big)\,z_j(1-z_j)\,x_i\end{aligned}$$

and $v_{ij}$ is updated as

$$v_{ij}^{\,new}=v_{ij}+\Delta v_{ij}$$

$\Delta\theta_j$ is computed as

$$\begin{aligned}\Delta\theta_j&=-\eta\frac{\partial E}{\partial \theta_j}\\&=\eta\sum_k\big((y_k-t_k)y_k(1-y_k)w_{jk}\big)\,z_j(1-z_j)\end{aligned}$$

and $\theta_j$ is updated as

$$\theta_j^{\,new}=\theta_j+\Delta\theta_j$$

  Data preprocessing uses the Z-score method:

$$x_{new}=\frac{x-\mu}{\sigma}$$

where $\mu$ is the sample mean and $\sigma$ is the sample standard deviation; the transformation standardizes each feature to mean 0 and standard deviation 1.
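  Spelling out the intermediate chain-rule step behind these formulas (everything follows from the definitions above):

$$\frac{\partial E}{\partial w_{jk}}=\frac{\partial E}{\partial y_k}\cdot\frac{\partial y_k}{\partial \beta_k}\cdot\frac{\partial \beta_k}{\partial w_{jk}}=(y_k-t_k)\cdot y_k(1-y_k)\cdot z_j$$

since $\partial E/\partial y_k=y_k-t_k$, $\partial y_k/\partial\beta_k=f'(\beta_k-\gamma_k)=y_k(1-y_k)$, and $\partial\beta_k/\partial w_{jk}=z_j$. The gradient with respect to $v_{ij}$ chains the same way but passes through every output node, which is where the $\sum_k$ in $\Delta v_{ij}$ comes from; the threshold gradients differ only by the factor $\partial(\beta_k-\gamma_k)/\partial\gamma_k=-1$, which flips the sign.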
  The training procedure of the BP neural network is shown in the figure below.
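  In outline, training proceeds as implemented in bpnn_Train below: (1) initialize the weights randomly and the thresholds to zero; (2) Z-score normalize the training data and min-max scale the labels; (3) for each sample, run forward propagation and then back propagation to update all weights and thresholds; (4) repeat until the iteration limit is reached or the error falls below a small threshold.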

C Language Code

//********bpnn.h file********//
#ifndef BPNN_H
#define BPNN_H
#define MAX_NUM_INPUT   260   // maximum number of input layer nodes
#define MAX_NUM_HIDDEN  100   // maximum number of hidden layer nodes
#define MAX_NUM_OUTPUT  1     // maximum number of output layer nodes
#define MAX_NUM_LAYER_OUT 260 // stride of one layer in the layer_out buffer; must be >= the largest layer size
typedef struct BPNN
{
    int trained;        // 0 untrained, 1 trained
    int num_input;      // number of input layer nodes
    int num_hidden;     // number of hidden layer nodes
    int num_output;     // number of output layer nodes
    double rate;        // learning rate
    double weight_input_hidden[MAX_NUM_INPUT][MAX_NUM_HIDDEN];   // weights from the input layer to the hidden layer
    double weight_hidden_output[MAX_NUM_HIDDEN][MAX_NUM_OUTPUT]; // weights from the hidden layer to the output layer
    double threshold_hidden[MAX_NUM_HIDDEN];  // thresholds of the hidden layer
    double threshold_output[MAX_NUM_OUTPUT];  // thresholds of the output layer
    double error[MAX_NUM_OUTPUT];             // output error of each output node
    double error_total;                       // total error
    double mean_std[MAX_NUM_INPUT][2];        // mean and standard deviation of the training data
} BPNN;
void bpnn_Init(BPNN *bpnn_ptr,int num_input,int num_hidden,int num_output,double learn_rate);
void bpnn_ForwardPropagation(BPNN *bpnn_ptr,const double *data,const double *label,double *layer_out);
void bpnn_BackPropagation(BPNN *bpnn_ptr,const double *layer_out);
void bpnn_Train(BPNN *bpnn_ptr,double *data,double *label,int num_sample,int num_input,int num_hidden,int num_output,double learn_rate,int num_iter);
void bpnn_Predict(BPNN *bpnn_ptr,double *data,double *label,int num_sample);
void bpnn_FileOutput(BPNN *bpnn_ptr,char *model);
void bpnn_LoadModel(BPNN *bpnn_ptr,char *model);
void bpnn_Normalize(BPNN *bpnn_ptr,double *x,int row,int col);
void Min_Max(double *x,int row,int col);
double Zscore(double x,double mean,double std);
double Sigmoid(double x);
#endif
//********bpnn.c file********//
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include "math.h"
#include "time.h"
#include "BPNN.h"
/**
 * @description: Init bp network.
 * @param:  bpnn_ptr    points to the bp network.
 *          num_input   number of properties of each sample; number of input layer nodes.
 *          num_hidden  number of hidden layer nodes.
 *          num_output  number of labels of each sample; number of output layer nodes.
 *          learn_rate  learning rate.
 * @return: none.
 */
void bpnn_Init(BPNN *bpnn_ptr,int num_input,int num_hidden,int num_output,double learn_rate)
{
    int i, j;
    bpnn_ptr->trained = 0;
    bpnn_ptr->num_input = num_input;
    bpnn_ptr->num_hidden = num_hidden;
    bpnn_ptr->num_output = num_output;
    bpnn_ptr->rate = learn_rate;
    bpnn_ptr->error_total = 0;
    srand((unsigned)time(NULL)); // set random number seed
    for (i = 0; i < num_input; i++)
    {
        for (j = 0; j < num_hidden; j++)
        {
            bpnn_ptr->weight_input_hidden[i][j] = ((double)rand()) / RAND_MAX - 0.5; // init weight to [-0.5, 0.5]
        }
    }
    for (i = 0; i < num_hidden; i++)
    {
        bpnn_ptr->threshold_hidden[i] = 0; // init threshold of hidden layer to 0
        for (j = 0; j < num_output; j++)
        {
            bpnn_ptr->weight_hidden_output[i][j] = ((double)rand()) / RAND_MAX - 0.5; // init weight to [-0.5, 0.5]
        }
    }
    for (j = 0; j < num_output; j++)
    {
        bpnn_ptr->threshold_output[j] = 0; // init threshold of output layer to 0
        bpnn_ptr->error[j] = 0;            // init output error to 0
    }
}
/**
 * @description: forward propagation; computes the output of the network.
 * @param:  bpnn_ptr    points to the bp network.
 *          data        points to one training sample.
 *          label       points to the corresponding training label.
 *          layer_out   layer_out[i*MAX_NUM_LAYER_OUT+j] is the output of node j in layer i.
 * @return: none.
 */
void bpnn_ForwardPropagation(BPNN *bpnn_ptr,const double *data,const double *label,double *layer_out)
{
    int i, j;
    double temp;
    for (i = 0; i < bpnn_ptr->num_input; i++) // output of the input layer
    {
        layer_out[i] = data[i];
    }
    for (j = 0; j < bpnn_ptr->num_hidden; j++) // output of the hidden layer
    {
        temp = -(bpnn_ptr->threshold_hidden[j]);
        for (i = 0; i < bpnn_ptr->num_input; i++)
        {
            temp += (bpnn_ptr->weight_input_hidden[i][j]) * layer_out[i];
        }
        layer_out[MAX_NUM_LAYER_OUT + j] = Sigmoid(temp);
    }
    bpnn_ptr->error_total = 0;
    for (j = 0; j < bpnn_ptr->num_output; j++) // output of the output layer
    {
        temp = -(bpnn_ptr->threshold_output[j]);
        for (i = 0; i < bpnn_ptr->num_hidden; i++)
        {
            temp += (bpnn_ptr->weight_hidden_output[i][j]) * layer_out[MAX_NUM_LAYER_OUT + i];
        }
        layer_out[2 * MAX_NUM_LAYER_OUT + j] = Sigmoid(temp);
        bpnn_ptr->error[j] = layer_out[2 * MAX_NUM_LAYER_OUT + j] - label[j];
        bpnn_ptr->error_total += 0.5 * (bpnn_ptr->error[j]) * (bpnn_ptr->error[j]); // E = 0.5 * sum of squared errors
    }
}
/**
 * @description: back propagation; updates weights and thresholds.
 * @param:  bpnn_ptr    points to the bp network.
 *          layer_out   layer_out[i*MAX_NUM_LAYER_OUT+j] is the output of node j in layer i.
 * @return: none.
 */
void bpnn_BackPropagation(BPNN *bpnn_ptr,const double *layer_out)
{
    double g[MAX_NUM_OUTPUT], e[MAX_NUM_HIDDEN], t;
    double rate;
    int i, j;
    rate = bpnn_ptr->rate;
    for (i = 0; i < bpnn_ptr->num_output; i++) // output layer gradient term: g[k] = (y_k - t_k) * y_k * (1 - y_k)
    {
        g[i] = (bpnn_ptr->error[i]) * (layer_out[2 * MAX_NUM_LAYER_OUT + i]) * (1 - layer_out[2 * MAX_NUM_LAYER_OUT + i]);
    }
    for (i = 0; i < bpnn_ptr->num_hidden; i++) // hidden layer gradient term, computed before the weights change (matches the derivation)
    {
        t = 0;
        for (j = 0; j < bpnn_ptr->num_output; j++)
        {
            t += (bpnn_ptr->weight_hidden_output[i][j]) * g[j];
        }
        e[i] = t * layer_out[MAX_NUM_LAYER_OUT + i] * (1 - layer_out[MAX_NUM_LAYER_OUT + i]);
    }
    for (i = 0; i < bpnn_ptr->num_output; i++)
    {
        bpnn_ptr->threshold_output[i] += rate * g[i]; // gamma_k update
    }
    for (i = 0; i < bpnn_ptr->num_hidden; i++)
    {
        bpnn_ptr->threshold_hidden[i] += rate * e[i]; // theta_j update
        for (j = 0; j < bpnn_ptr->num_output; j++)
        {
            bpnn_ptr->weight_hidden_output[i][j] += -rate * g[j] * layer_out[MAX_NUM_LAYER_OUT + i]; // w_jk update
        }
    }
    for (i = 0; i < bpnn_ptr->num_input; i++)
    {
        for (j = 0; j < bpnn_ptr->num_hidden; j++)
        {
            bpnn_ptr->weight_input_hidden[i][j] += -rate * e[j] * layer_out[i]; // v_ij update
        }
    }
}
/**
 * @description: train back propagation network.
 * @param:  bpnn_ptr    points to the bp network.
 *          data        points to the training data; one row per sample.
 *          label       points to the training labels.
 *          num_sample  number of samples.
 *          num_input   number of properties of each sample; number of input layer nodes.
 *          num_hidden  number of hidden layer nodes.
 *          num_output  number of labels of each sample; number of output layer nodes.
 *          learn_rate  learning rate.
 *          num_iter    number of iterations.
 * @return: none.
 */
void bpnn_Train(BPNN *bpnn_ptr,double *data,double *label,int num_sample,int num_input,int num_hidden,int num_output,double learn_rate,int num_iter)
{
    int iter, sample;
    double layer_out[3][MAX_NUM_LAYER_OUT]; // layer_out[i][j] is the output of node j in layer i: 0 input, 1 hidden, 2 output
    printf("Training...\r\n");
    bpnn_Init(bpnn_ptr, num_input, num_hidden, num_output, learn_rate);
    bpnn_Normalize(bpnn_ptr, data, num_sample, num_input);
    Min_Max(label, num_sample, num_output); // labels must lie in [0, 1] for the sigmoid output
    for (iter = 0; iter < num_iter; iter++)
    {
        for (sample = 0; sample < num_sample; sample++)
        {
            bpnn_ForwardPropagation(bpnn_ptr, &data[sample * num_input], &label[sample * num_output], &layer_out[0][0]);
            bpnn_BackPropagation(bpnn_ptr, &layer_out[0][0]);
        }
        if (bpnn_ptr->error_total < 0.0000001) // early stop: error of the last sample in this pass is small enough
            break;
    }
    bpnn_ptr->trained = 1;
    printf("Training over!\r\nfinal error: %.4f\r\niteration times: %d\r\n", bpnn_ptr->error_total, iter);
}
/**
 * @description: use the trained network to predict labels.
 * @param:  bpnn_ptr    points to the bp network.
 *          data        points to the data to predict; one row per sample.
 *          label       output buffer; receives the predicted labels.
 *          num_sample  number of samples.
 * @return: none.
 */
void bpnn_Predict(BPNN *bpnn_ptr,double *data,double *label,int num_sample)
{
    double layer_out[3][MAX_NUM_LAYER_OUT]; // layer_out[i][j] is the output of node j in layer i: 0 input, 1 hidden, 2 output
    int i, j;
    if (bpnn_ptr->trained == 0)
    {
        printf("Network untrained!");
        return;
    }
    bpnn_Normalize(bpnn_ptr, data, num_sample, bpnn_ptr->num_input); // data must be normalized with the training statistics
    for (i = 0; i < num_sample; i++)
    {
        // label is passed only to satisfy the interface; the error it produces is ignored here
        bpnn_ForwardPropagation(bpnn_ptr, &data[i * (bpnn_ptr->num_input)], &label[i * (bpnn_ptr->num_output)], &layer_out[0][0]);
        for (j = 0; j < bpnn_ptr->num_output; j++)
        {
            label[i * (bpnn_ptr->num_output) + j] = layer_out[2][j];
        }
    }
}
/**
 * @description: write network information to "bpnn_out.txt" and save the network model to the file named by 'model'.
 * @param:  bpnn_ptr    points to the bp network.
 *          model       file name of the model.
 * @return: none.
 */
void bpnn_FileOutput(BPNN *bpnn_ptr,char *model)
{
    FILE *file = NULL;
    int i, j;
    file = fopen("bpnn_out.txt", "w");
    if (file == NULL)
    {
        printf("Error!");
        exit(1);
    }
    fprintf(file, "Number of nodes in input layer: %d\n", bpnn_ptr->num_input);
    fprintf(file, "Number of nodes in hidden layer: %d\n", bpnn_ptr->num_hidden);
    fprintf(file, "Number of nodes in output layer: %d\n", bpnn_ptr->num_output);
    fprintf(file, "\nHidden layer threshold: ");
    for (i = 0; i < bpnn_ptr->num_hidden; i++)
    {
        fprintf(file, " %.2lf ", bpnn_ptr->threshold_hidden[i]);
    }
    fprintf(file, "\nOutput layer threshold: ");
    for (i = 0; i < bpnn_ptr->num_output; i++)
    {
        fprintf(file, " %.2lf ", bpnn_ptr->threshold_output[i]);
    }
    fprintf(file, "\n\nWeight of input layer to hidden layer: ");
    for (i = 0; i < bpnn_ptr->num_input; i++)
    {
        fprintf(file, "\n%d row: ", i);
        for (j = 0; j < bpnn_ptr->num_hidden; j++)
        {
            fprintf(file, " %.2lf ", bpnn_ptr->weight_input_hidden[i][j]);
        }
    }
    fprintf(file, "\n\nWeight of hidden layer to output layer: ");
    for (i = 0; i < bpnn_ptr->num_hidden; i++)
    {
        fprintf(file, "\n%d row: ", i);
        for (j = 0; j < bpnn_ptr->num_output; j++)
        {
            fprintf(file, " %.3lf ", bpnn_ptr->weight_hidden_output[i][j]);
        }
    }
    fprintf(file, "\n\n\"%s\" is the network model.", model);
    fclose(file);
    file = fopen(model, "wb");
    if (file == NULL)
    {
        printf("Error!");
        exit(1);
    }
    fwrite(bpnn_ptr, sizeof(BPNN), 1, file);
    fclose(file);
}
/**
 * @description: load the network model from the file named by 'model'.
 * @param:  bpnn_ptr    points to the bp network.
 *          model       file name of the model.
 * @return: none.
 */
void bpnn_LoadModel(BPNN *bpnn_ptr,char *model)
{
    FILE *file = NULL;
    file = fopen(model, "rb");
    if (file == NULL)
    {
        printf("Error!");
        exit(1);
    }
    fread(bpnn_ptr, sizeof(BPNN), 1, file);
    fclose(file);
}
/**
 * @description: normalize input values.
 * @param:  bpnn_ptr  points to the bp network.
 *          x         points to the matrix.
 *          row       number of samples.
 *          col       number of properties of each sample; number of input layer nodes.
 * @return: none.
 */
void bpnn_Normalize(BPNN *bpnn_ptr,double *x,int row,int col)
{
    double sum1, sum2, mean, std;
    int i, j;
    if (bpnn_ptr->trained) // a trained network normalizes with the statistics stored during training
    {
        for (j = 0; j < col; j++)
        {
            for (i = 0; i < row; i++)
            {
                x[i * col + j] = Zscore(x[i * col + j], bpnn_ptr->mean_std[j][0], bpnn_ptr->mean_std[j][1]);
            }
        }
        return;
    }
    for (j = 0; j < col; j++)
    {
        sum1 = 0;
        sum2 = 0;
        for (i = 0; i < row; i++)
        {
            sum1 += x[i * col + j];
            sum2 += x[i * col + j] * x[i * col + j];
        }
        mean = sum1 / row;
        std = sqrt(sum2 / row - mean * mean);
        bpnn_ptr->mean_std[j][0] = mean; // mean value
        bpnn_ptr->mean_std[j][1] = std;  // standard deviation
        for (i = 0; i < row; i++)
        {
            x[i * col + j] = Zscore(x[i * col + j], mean, std);
        }
    }
}
/**
 * @description: min-max normalization; scales each column of the data to [0, 1].
 * @param:  x   points to the matrix.
 *          row number of samples.
 *          col number of columns of the matrix.
 * @return: none.
 */
void Min_Max(double *x,int row,int col)
{
    double max, min, temp;
    int i, j;
    for (j = 0; j < col; j++)
    {
        max = x[j];
        min = x[j];
        for (i = 0; i < row; i++)
        {
            temp = x[i * col + j];
            max = (temp > max) ? temp : max;
            min = (temp < min) ? temp : min;
        }
        for (i = 0; i < row; i++)
        {
            temp = x[i * col + j];
            x[i * col + j] = (temp - min) / (max - min);
        }
    }
}
/**
 * @description: Z-score normalization.
 * @param:  x     data.
 *          mean  mean value.
 *          std   standard deviation.
 * @return: function output.
 */
double Zscore(double x,double mean,double std)
{
    return (x - mean) / std;
}
/**
 * @description: sigmoid function.
 * @param:  x   input variable.
 * @return: function output.
 */
double Sigmoid(double x)
{
    return 1.0 / (1.0 + exp(-x));
}

  All of the core code is listed above, and every function is commented. Because VSCode occasionally garbles Chinese characters, the comments were machine-translated into English, so the wording may not be standard. To use the code, first read the training and test data sets into arrays (see the code comments for the array format), then call bpnn_Train to train the network and bpnn_Predict to predict labels for the test data. See the comments for the remaining functions.
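  As a quick illustration, here is a minimal calling sketch; the XOR-style data, the layer sizes, the learning rate, the iteration count, and the model file name are all made up for this example:

//********example usage (illustrative sketch, not part of the library)********//
#include <stdio.h>
#include "bpnn.h"

static BPNN bpnn; // the struct is large, so keep it out of the stack frame

int main(void)
{
    // hypothetical XOR data: 4 samples, 2 input properties, 1 label each
    double data[4 * 2]  = {0, 0,  0, 1,  1, 0,  1, 1};
    double label[4 * 1] = {0, 1, 1, 0};
    double test[2 * 2]  = {0, 1,  1, 1};
    double pred[2 * 1]  = {0};
    int i;
    // 2 input nodes, 4 hidden nodes, 1 output node, learning rate 0.5, up to 10000 iterations
    bpnn_Train(&bpnn, data, label, 4, 2, 4, 1, 0.5, 10000);
    bpnn_Predict(&bpnn, test, pred, 2);
    for (i = 0; i < 2; i++)
    {
        printf("test sample %d -> %.3f\r\n", i, pred[i]);
    }
    bpnn_FileOutput(&bpnn, "xor.bin"); // write bpnn_out.txt and save the model to "xor.bin"
    return 0;
}

Note that bpnn_Train min-max scales the labels to [0, 1] and bpnn_Predict reuses the Z-score statistics stored during training, so the predicted values come back on the scaled range rather than the original label range.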
Full code download: 三层BP神经网络C语言代码 (three-layer BP neural network C language code).
