Introduction to Artificial Neural Networks

There has been hype about artificial intelligence, machine learning, and neural networks for quite a while now. I have been working on these things for over a year, so I would like to share some of my knowledge and give my point of view on neural networks. This will not be a math-heavy introduction, because I just want to build the idea here.

I will start from the neural network as a whole and then explain each of its components. If you feel that something is not right or need help with any of this, feel free to contact me. I will be happy to help.

When to use a Neural Network?

Let’s assume we want to solve a problem where we are given a set of images and have to build an automated system that can categorize each image under its correct label.

The problem looks simple, but how do we come up with logic that uses only raw pixel values and target labels? We can try comparing pixels and edges, but we won’t be able to come up with hand-written rules that do this task effectively, say with an accuracy of 90% or more.

When we have high-dimensional data such as images and we don’t know the relationship between the input (images) and the output (labels), we should use neural networks.

What is a Neural Network?

Artificial neural networks (ANNs), usually simply called neural networks, are computing systems vaguely inspired by the biological neural networks that constitute animal brains. An ANN is based on a collection of connected units or nodes called artificial neurons, which loosely model the neurons in a biological brain.

A neural network is a set of neurons stacked in layers, one after the other, so that the network learns the relationship between the input and the output variables. It can be applied to many kinds of problems, such as classification, regression, or generative tasks like next-word prediction and image captioning.

We already have a lot of algorithms in machine learning, such as SVM, logistic regression, linear regression, and many more, which do the same thing, i.e. they also try to learn the relationship between the input and output variables. So why neural networks?

Why are Neural Networks used over Traditional Machine Learning?

Traditional ML algorithms are good and they are not compute-intensive, but they do not work well on high-dimensional, unstructured data such as images, audio, or text.

Traditional algorithms are still the building blocks of neural networks (a single neuron is closely related to logistic regression), but on their own they do not capture such relationships as well as neural networks do. Given enough data and compute power, a neural network can learn very complex relationships.

Now that we know when to use a neural network, we can start exploring its components and how they work.

What are the components of a Neural Network?

A neural network has some basic components, which are:

  1. Neurons or layers
  2. Loss function
  3. Optimizer

What is a Neuron?

A neuron is the building block of neural networks. A neuron has a weight for each input that is fed to it, plus a bias.

Let’s assume we have a classification problem where we have a set of features such as weight, height, BMI, medical history, and age, and based on these we have to classify whether a person is likely to have a heart problem or not.

Now, we want to give our neural network this data and want it to learn the mapping between these features and the output (heart disease or not).

Let me introduce one of the functions used in neural networks: the sigmoid, or logistic, function.

Sigmoid Function

Traditional algorithms like logistic regression use the same function. Logistic regression first takes all the inputs or features and assigns each feature a weight W. In the end, it passes the weighted sum through a sigmoid function, which outputs a probability.

Figure: Artificial neuron

Let’s assume we have 4 features X1, X2, X3, and X4, and based on them we want to classify whether the person is likely to have heart disease or not. We can assume that the two classes are 1 and 0.

The weighted sum computed by the neuron looks like this:

[W1*X1 + W2*X2 + W3*X3 + W4*X4] + bias → one value

We pass this value to the sigmoid function, which is

sigmoid(X) = 1 / (1 + e^(-X))

Here X represents the value of [W1*X1 + W2*X2 + W3*X3 + W4*X4] + bias.

Suppose X = 0. Then e to the power 0 gives us 1, and the whole expression evaluates to 1/2. This means that if the weighted sum of the features plus the bias is zero, we get a probability of 0.5.

Here we only have two classes, so a probability of 0.5 means the model is not sure about the predicted class, i.e. both classes have an equal chance when the whole computation comes out to zero.

  1. If the value is high, the sigmoid generates a value closer to 1, which means class 1 is more likely.
  2. If the value is low (a large negative number), the sigmoid generates a value closer to 0, which means class 0 is more likely.
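The behaviour of a single neuron described above can be sketched in a few lines of Python. This is only a minimal illustration: the feature values, weights, and bias below are made-up numbers, and the function names `sigmoid` and `neuron` are my own choices rather than anything from a library.

```python
import math

def sigmoid(x):
    """Logistic function: maps any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron(features, weights, bias):
    """Weighted sum of the features plus the bias, passed through the sigmoid."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Hypothetical values for the four features X1..X4 and the weights W1..W4.
features = [0.5, 1.2, -0.3, 2.0]
weights = [0.4, -0.1, 0.8, 0.3]
bias = 0.1

print(sigmoid(0.0))                     # a weighted sum of zero gives 0.5
print(neuron(features, weights, bias))  # probability of class 1 for this input
print(sigmoid(6.0), sigmoid(-6.0))      # high value -> near 1, low value -> near 0
```

Changing the weights or the bias changes the score for the same features, which is exactly what training will exploit.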

These scores depend on the features and weights of the model, i.e. the X’s and W’s that we used. Since we can’t change the features, we have to update the weights in such a way that the output matches the expected output.

We also have one more term, known as the bias, which shifts the sigmoid to the right or left. It is used to fit the data better.

Updating the weights is the job of the optimizer, which will be discussed later.

The idea is simple: each feature/input is paired with a weight, and the dot product of the weights and the features (plus the bias) is fed to an activation function, here the sigmoid, which generates a score.

This function is used in logistic regression and is heavily used in neural networks, but in neural networks it is used in the form of layers. One neuron in a layer is just a sigmoid function with some weights and a bias.

Layers

The whole idea of neural networks is based on the Universal Approximation Theorem.

The intuition behind this theorem is as follows. If we have a very complex function between the input and the output, we can learn an approximation of it by dividing that function into smaller chunks, with each chunk learned by one neuron or one part of the neural network.

Thus, by stacking up layers of neurons, we can learn complex functions.

Now that we know about the sigmoid and the Universal Approximation Theorem, we can stack up neurons and form a layer.

Consider a very basic neural network. It has 3 neurons in the input layer, which means it can take in 3 features as input.

It has 4 neurons in the hidden layer, which represent 4 sigmoid functions. Each of them learns a part of the complex function between input and output.

Finally, it has 2 neurons in the output layer, which represent the two categories.
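To make the 3-4-2 architecture concrete, here is a rough sketch of one forward pass through such a network. It assumes every neuron (including the output neurons) uses the sigmoid, as in the description above, and it uses random, untrained weights. The helper names `dense_layer` and `random_layer` and the input values are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense_layer(inputs, weights, biases):
    """One layer: every neuron sees all inputs and applies its own weights and bias."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def random_layer(n_inputs, n_neurons, rng):
    """Randomly initialised weights (one row per neuron) and zero biases."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases

rng = random.Random(0)  # fixed seed so the run is repeatable
hidden_w, hidden_b = random_layer(3, 4, rng)  # 3 inputs -> 4 hidden neurons
output_w, output_b = random_layer(4, 2, rng)  # 4 hidden -> 2 output neurons

features = [0.2, 0.7, -1.0]  # hypothetical input with 3 features
hidden = dense_layer(features, hidden_w, hidden_b)
scores = dense_layer(hidden, output_w, output_b)
print(len(hidden), len(scores))  # 4 2
```

The output of the hidden layer becomes the input of the output layer, which is all that "stacking" means here. With untrained weights the two scores are meaningless; they only become useful once the weights are learned.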

Now that we have created an architecture, we want the network to learn. For that we need some more components: a loss function and an optimizer.

Loss Function

The loss function is just a mathematical expression that tells the network how well it is performing. We have different loss functions for different problem statements. The loss function defines what kind of relationship the network is trying to learn.

Regression Loss Functions

If we want the network to predict something like the air quality index or the rating of a restaurant, we don’t want it to predict classes or probabilities. Instead, we want it to predict numbers in some range.

In that case, we would use mean squared error or root mean squared error, which compares the value generated by the network with the ground truth (the actual value) and gives a loss value based on the difference between the two.

If the difference between the predicted and true values is high, the loss will be high; otherwise it will be low.
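As a small illustration of these two regression losses, the sketch below computes them by hand. The ratings are invented for the example, and the function names are my own.

```python
import math

def mean_squared_error(predicted, actual):
    """Average of the squared differences between predictions and true values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)

def root_mean_squared_error(predicted, actual):
    """Square root of the MSE, expressed in the same units as the target."""
    return math.sqrt(mean_squared_error(predicted, actual))

actual = [3.0, 4.5, 2.0]  # hypothetical true restaurant ratings
close = [3.0, 4.0, 2.0]   # predictions near the truth
far = [1.0, 1.5, 5.0]     # predictions far from the truth

print(mean_squared_error(close, actual), root_mean_squared_error(close, actual))
print(mean_squared_error(far, actual), root_mean_squared_error(far, actual))
```

The predictions that sit far from the true values produce the larger loss, which is the signal the network later uses to improve.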

Classification Loss Functions

Suppose we want the network to predict a category, such as whether a person is likely to have heart disease or not, the genre of a piece of music, or whether an image shows a dog or a cat.

In that case, we would go for binary cross-entropy or categorical cross-entropy, which takes the predicted probabilities from the network, compares them with the actual probability distribution, and gives a loss value based on the difference.
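For the two-class case, binary cross-entropy can be sketched as follows. The labels and probabilities are invented, and the small `eps` clamp is my own addition to avoid taking the logarithm of zero.

```python
import math

def binary_cross_entropy(predicted_probs, labels, eps=1e-12):
    """Average of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for p, y in zip(predicted_probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

labels = [1, 0, 1]                  # hypothetical true classes
confident_right = [0.9, 0.1, 0.8]   # probabilities of class 1, mostly correct
confident_wrong = [0.1, 0.9, 0.2]   # probabilities of class 1, mostly wrong

print(binary_cross_entropy(confident_right, labels))
print(binary_cross_entropy(confident_wrong, labels))
```

Predictions that put high probability on the wrong class are penalised much more heavily than predictions that are roughly right.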

We can also define a custom loss function if we want to solve a new kind of problem.

Optimizer

The role of neurons and layers is to generate scores, and the role of the loss function is to tell how far the predicted score is from the ground truth or target.

The optimizer comes into the picture after the loss is calculated. It tries to find the relationship between the loss and the weights and biases of the network. The goal of the optimizer is to bring the loss as low as possible, so that the predictions the model makes are closer to the target.

To do so, the optimizer captures how each weight and bias in the network relates to the loss function.

Loss function = some function of (weights and biases)

The derivative of a function with respect to some variable X tells us the relationship between that function and X. It tells us how much the function is going to change if the value of X changes, and in which direction.
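One way to build intuition for this is to estimate a derivative numerically. The sketch below uses a made-up loss that depends on a single weight and has its minimum at w = 2. Both the loss and the helper `numerical_derivative` are illustrative assumptions, not part of any library.

```python
def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of how fast f changes around x."""
    return (f(x + h) - f(x - h)) / (2 * h)

def loss(w):
    """Hypothetical one-weight loss with its minimum at w = 2."""
    return (w - 2.0) ** 2

# Positive slope: increasing w from 5 increases the loss.
print(numerical_derivative(loss, 5.0))
# Negative slope: increasing w from -1 decreases the loss.
print(numerical_derivative(loss, -1.0))
```

The sign of the derivative gives the direction of change and its magnitude gives how strongly the loss reacts to the weight.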

Change of the loss with respect to one weight = d(Loss function) / d(some weight W)

Since the loss function is a function of multiple weights and biases, this change is a partial derivative of the loss function with respect to one weight. It is also known as the gradient.

Now we know the relationship between the loss function and the weights, so we can update the weights in such a way that the loss function is minimized. This process is run for every weight in the network, and the weights are updated every time we calculate the loss.

The gradient points in the direction in which the loss increases with respect to the weight. If we added the gradient to the weight, we would be increasing the loss instead of decreasing it. To avoid that, we subtract the gradient every time we update the weight, which decreases the loss.

Updated Weight = Previous Weight - Gradient

Various optimizers can be used, such as Stochastic Gradient Descent, Mini-Batch Gradient Descent, or Adam. They add further ideas such as a learning rate, which scales the size of each update, and momentum, but the basic idea is the gradient. This algorithm is known as Gradient Descent.
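The update rule can be sketched for a single weight as below. The loss is a made-up function with its minimum at w = 2, and the learning rate of 0.1, the starting weight, and the number of steps are arbitrary choices for illustration. Real optimizers apply the same idea to every weight and bias at once.

```python
def loss(w):
    """Hypothetical one-weight loss with its minimum at w = 2."""
    return (w - 2.0) ** 2

def gradient(w):
    """Derivative of the loss above with respect to w."""
    return 2.0 * (w - 2.0)

learning_rate = 0.1  # assumed step-size factor that scales each update
w = 5.0              # arbitrary starting weight
initial_loss = loss(w)

for step in range(50):
    # Subtract the (scaled) gradient so the loss goes down.
    w = w - learning_rate * gradient(w)

print(w, loss(w))  # w ends up close to 2 and the loss close to 0
```

Flipping the minus sign to a plus makes the weight run away from the minimum, which is why the gradient is subtracted rather than added.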

I will be writing more on neural networks, where I will try to cover the maths as well as the ideas behind the maths.

Translated from: https://medium.com/swlh/introduction-to-neural-networks-d0ff7e9a647b
