Support Vector Machine Concepts, Illustrated

The Support Vector Machine (SVM) is a dual-purpose supervised machine learning algorithm: it serves as both a regression algorithm (Support Vector Regressor) and a classification algorithm (Support Vector Classifier). This article will primarily focus on how SVM works as a classifier.

The Support Vector Machine, also known as a ‘Support Vector Network’, is a discriminative machine learning classification algorithm. It separates data points into two classes at a time using a decision boundary (a hyperplane in this case). This does not mean it is only a binary classifier; it simply handles two classes per decision, and multi-class problems are tackled by combining several such two-class decisions. The primary objective of the Support Vector Classifier is to find the ‘Optimal Separating Hyperplane’ (the decision boundary).
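One common way to turn "two classes at a time" into a multi-class prediction is a one-vs-one vote. The sketch below is only an illustration of that idea: `toy_decide` and the `centers` table are hypothetical stand-ins for trained two-class SVMs, not part of the original article.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(x, classes, pairwise_decide):
    """Predict a class by majority vote over every two-class contest."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    return votes.most_common(1)[0][0]


# Hypothetical stand-in for trained binary classifiers:
# each contest is won by the class whose center is closer to x.
centers = {"red": 0.0, "green": 5.0, "blue": 10.0}


def toy_decide(a, b, x):
    return a if abs(x - centers[a]) <= abs(x - centers[b]) else b


prediction = one_vs_one_predict(6.0, list(centers), toy_decide)
print(prediction)
```

With three classes there are three two-class contests, and the class that wins the most contests is returned.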

I have an article that briefly explains the difference between Generative and Discriminative Algorithms. Find the link embedded here and at the end of the article as well.

As a classifier, the Support Vector Machine (SVM) can handle both linearly separable and non-linearly separable data points, using its superpower (the kernel trick).

The algorithm was first introduced in the 1960s and later improved on in the 1990s by Vapnik et al. At that time it was the first algorithm to beat the neural network at handwritten digit classification. It hence gained popularity and has since been used for many other use cases. It finds applications in areas such as:

Face detection
Bioinformatics (cancer classification, diabetes prediction, protein classification, et cetera)
Text and hypertext classification
Anomaly detection
Clustering (known as Support Vector Clustering (SVC))
Satellite image classification, et cetera.

So ‘Support Vector + Machine’, huh. What does it really mean? Let’s break down the basic terminology.

SUPPORT VECTOR MACHINE BASIC TERMINOLOGY

Vector: This is simply a training example/data point. It is also known as a ‘feature vector’ in machine learning.

Support + Vector: The support vectors are simply the subset of the data points closest to the hyperplane/decision boundary.

Image Source: https://github.com/Rajat2712/SVM

These vectors are intuitively called ‘Support vectors’ because they support the hyperplane/decision boundary, acting as its pillars.

The Hyperplane: In geometry, a hyperplane is the generalization of a plane to n dimensions: a subspace with one dimension fewer (n-1) than the space it sits in. In one-dimensional space it is a point, in two-dimensional space it is a line, in three-dimensional space it is an ordinary plane, and in four or more dimensions it is simply called a ‘Hyperplane’. Take note of this, because it is really how the Support Vector Machine works behind the scenes: the dimensions are the features represented in the data.

For example, say we want to tell whether a product gets purchased or not (a binary classification). If only one feature (say Gender) is available in the dataset, the data lives in a one-dimensional space and the separating/decision boundary is an (n-1 = 0) 0-dimensional space (if I can call it that): just a point showing the separation of the classes (purchased or not). If there are two features (Age and Gender), the data lives in a two-dimensional space (2D), with Age and Gender on the x- and y-axes, and the decision boundary is a simple line. Similarly, with three features (Age, Gender, Income), the decision boundary is a two-dimensional plane in a three-dimensional space. With four or more features, the boundary is called a ‘Hyperplane’, with n-1 dimensions.

It is important to note that the features used for a given machine learning problem can be chosen with a technique called ‘feature selection’, as not all features are necessarily useful; some can be redundant and create unnecessary noise in the data.

A hyperplane simply separates an n-dimensional space into two halves. In machine learning terms, it is a form of decision boundary that algorithms like the Support Vector Machine use to classify or separate data points. It has two sides, the negative side and the positive side, and a data point/instance lies on one side or the other, signifying the group/class it belongs to.
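As a minimal sketch of the "positive side / negative side" idea, a hyperplane can be written as w·x + b = 0, and the sign of w·x + b tells which side a point falls on. The weights, bias and points below are made-up values chosen only for illustration.

```python
def decision_value(w, b, x):
    """Compute w . x + b for one data point."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def classify(w, b, x):
    """Return +1 for the positive side of the hyperplane, -1 for the negative side."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical hyperplane in 2D (a line): x1 - x2 + 0.5 = 0
w, b = [1.0, -1.0], 0.5
points = [[2.0, 1.0], [0.0, 3.0]]
labels = [classify(w, b, p) for p in points]
print(labels)
```

The same two functions work unchanged for any number of features, since only the length of `w` and `x` changes.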

Margin: This is the distance between the decision boundary and the nearest support vectors on either side. Equivalently, for a hyperplane described by a weight vector w and a bias b, it is the shortest distance between the hyperplane and the support vectors.
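To make the definition concrete, the distance from a point x to the hyperplane w·x + b = 0 is |w·x + b| / ||w||, and the margin is the smallest such distance over the data. This is a sketch using that standard geometric formula; the hyperplane and the points are invented for the example.

```python
import math


def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def margin(w, b, points):
    """Margin = distance from the hyperplane to its closest point."""
    return min(distance_to_hyperplane(w, b, p) for p in points)


# Hypothetical hyperplane and data points
w, b = [3.0, 4.0], -5.0
points = [[3.0, 4.0], [0.0, 0.0], [1.0, 3.0]]
print(margin(w, b, points))
```

The point that attains the minimum here, (0, 0), plays the role of a support vector in this toy setup.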

Linearly Separable Data points: Data points are said to be linearly separable if a straight separating boundary/hyperplane can be drawn that cleanly divides the different class groups. Linearly separable data points can mostly be handled with linear machine learning classifiers, such as Logistic regression.

Non-Linearly Separable data points: This is the exact opposite of linearly separable data points. No matter how one tries to draw a straight line through such data, some data points will get misclassified one way or the other. SVM has a special way of classifying this type of data: it uses kernel functions to represent the data points in a higher-dimensional space and then finds the optimal separating hyperplane there.
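A small sketch of why a higher-dimensional representation helps: the 1D points below cannot be split by a single cut point, but after mapping each x to (x, x²) a straight line separates them. The data, the feature map and the line are all assumptions chosen for illustration.

```python
# Toy 1D data: the outer points are class +1, the inner points are class -1.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [1, 1, -1, -1, -1, 1, 1]


def separable_by_threshold(xs, ys):
    """True if one cut point on the number line splits the two classes."""
    labels = [y for _, y in sorted(zip(xs, ys))]
    changes = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    return changes <= 1


def feature_map(x):
    """Represent a 1D point in 2D as (x, x squared)."""
    return (x, x * x)


# In the 2D space, the line x2 = 2.5 (w = (0, 1), b = -2.5) separates the classes.
w, b = (0.0, 1.0), -2.5
predictions = [
    1 if w[0] * f[0] + w[1] * f[1] + b > 0 else -1
    for f in map(feature_map, xs)
]

print(separable_by_threshold(xs, ys))  # the 1D data needs more than one cut
print(predictions == ys)               # the 2D representation is separable
```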

Hard Margin: This is the type of margin used for linearly separable data points in the Support Vector Machine. As the name ‘Hard Margin’ suggests, it is very rigid in classification: no training point is allowed inside the margin or on the wrong side, so it can result in overfitting. It works best when the data is linearly separable, without outliers or a lot of noise.

Soft Margin: This is the type of margin used when the data points are not cleanly separable. As the name implies, it is less rigid than the hard margin. It is robust to outliers and allows some misclassifications. However, it can also result in underfitting in some cases.
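One way to see the difference is through slack values. In the usual soft-margin formulation (assumed here, since the article does not spell it out), each point gets a slack max(0, 1 - y(w·x + b)): zero means the point respects the margin, a value between 0 and 1 means it sits inside the margin, and a value above 1 means it is misclassified. A hard margin demands that every slack is zero. The hyperplane and data below are hypothetical.

```python
def slack(w, b, x, y):
    """Margin violation of one labelled point: max(0, 1 - y * (w . x + b))."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(0.0, 1.0 - y * score)


# Hypothetical hyperplane and labelled points (x, y)
w, b = [1.0, 0.0], 0.0
data = [
    ([2.0, 0.0], 1),    # well on the correct side
    ([-2.0, 1.0], -1),  # well on the correct side
    ([0.5, 0.0], 1),    # correct side, but inside the margin
    ([1.0, 1.0], -1),   # wrong side of the boundary
]

slacks = [slack(w, b, x, y) for x, y in data]
hard_margin_satisfied = all(s == 0.0 for s in slacks)
misclassified = [s > 1.0 for s in slacks]

print(slacks)
print(hard_margin_satisfied)
```

A soft margin tolerates the last two points at a cost, whereas a hard margin would reject this hyperplane outright.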

Maximum Margin Hyperplane: As mentioned earlier, the primary objective of the Support Vector Classifier is to find the optimal separating hyperplane, where data points are classified with as few errors as possible. The maximum margin hyperplane is the hyperplane drawn such that it gives the ‘largest margin’, and it is the one used for classification. It is the simplest way to classify data that is linearly separable. The chosen hyperplane has the largest separating distance from the support vectors on both sides. In the illustration, three separating hyperplanes are drawn: one (h1) does not accurately separate the data points, while the other two (h2 and h3) do. Of these, the maximum margin hyperplane is h3.

Notice the difference in margin distance between h2 and h3.
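The comparison between candidate hyperplanes can be sketched numerically. The data and the three candidates below are invented to mimic the h1/h2/h3 situation described above (they are not taken from the original figure): each candidate is scored by its smallest signed distance to a correctly labelled point, and a negative score means it fails to separate the classes.

```python
import math


def signed_distance(w, b, x):
    """Signed perpendicular distance from x to the hyperplane w . x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm


def geometric_margin(w, b, data):
    """Smallest y * signed distance; negative if any point is on the wrong side."""
    return min(y * signed_distance(w, b, x) for x, y in data)


# Hypothetical labelled points (x, y)
data = [((3.0, 3.0), 1), ((4.0, 3.0), 1), ((1.0, 1.0), -1), ((0.0, 1.0), -1)]

# Hypothetical candidate hyperplanes as (w, b)
candidates = {
    "h1": ((0.0, 1.0), -3.5),  # does not separate the classes
    "h2": ((1.0, 0.0), -2.8),  # separates, but hugs the positive class
    "h3": ((1.0, 1.0), -4.0),  # separates with room on both sides
}

margins = {name: geometric_margin(w, b, data) for name, (w, b) in candidates.items()}
best = max(margins, key=margins.get)
print(margins)
print(best)
```

An actual SVM searches over all possible hyperplanes rather than three fixed candidates; this only shows the criterion it optimizes.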

The Kernel Trick: Kernels, or kernel functions, are the methods that linear classifiers such as SVM use to classify non-linearly separable data points. This is done by representing the data points in a higher-dimensional space than the original: for example, 1D data can be represented as 2D data, 2D data as 3D data, et cetera. So why is it called a ‘kernel trick’? SVM cleverly re-represents non-linear data points using any of the kernel functions in such a way that the data seems to have been transformed, then finds the optimal separating hyperplane. In reality, however, the data points remain the same; they are never actually transformed. The kernel function only computes how similar two points would be in the higher-dimensional space, without ever computing their coordinates there. This is why it is called a ‘kernel trick’.

A trick indeed. Don’t you agree?

The kernel trick offers a way to calculate the relationships between pairs of data points using kernel functions, which gives the effect of a higher-dimensional representation with far less computation. Models that use this technique are called ‘kernelized models’.
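The "less computation" claim can be checked on a small case. For 2D points, the degree-2 kernel (x·y)² gives exactly the same number as first mapping each point to 3D with (x1², √2·x1·x2, x2²) and then taking a dot product. This is a sketch of that well-known identity; the function names and sample points are chosen just for this example.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def explicit_map(x):
    """Explicit 3D feature map matching the degree-2 kernel for 2D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(x, y):
    """Same similarity, computed in the original 2D space."""
    return dot(x, y) ** 2


x, y = (1.0, 2.0), (3.0, 4.0)
via_map = dot(explicit_map(x), explicit_map(y))  # transform first, then compare
via_kernel = poly2_kernel(x, y)                  # compare without transforming
print(via_map, via_kernel)
```

The kernel route never builds the 3D vectors, which is the whole point: for higher degrees or the RBF kernel, the explicit space becomes huge or infinite while the kernel stays a cheap calculation.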

There are several kernel functions SVM can use to perform this task. Some of the most common ones are:

Polynomial Kernel Function: This works on the dot product of two data points and raises it to a chosen degree, which could be 2, 3, et cetera, i.e. the result is a squared product or higher. This represents the data in a higher-dimensional space made up of polynomial combinations of the original features.

The Radial Basis Function (RBF): This function behaves like a ‘weighted nearest neighbor model’. It corresponds to representing the data in infinite dimensions, where the observations nearest to a new data point carry the most influence on its classification. The radial function can be either Gaussian or Laplace. How quickly an observation's influence fades with distance depends on a hyperparameter known as gamma. This is the most commonly used kernel.

The Sigmoid Function: Based on the hyperbolic tangent function (tanh), which finds more application in neural networks as an activation function. This kernel has been used in image classification.

The Linear Kernel: Used for linearly separable data. It simply represents the relationship between data points using a plain dot product.

For the Polynomial kernel, x and y represent two observations (data points), k represents the polynomial coefficient and p represents the degree of the polynomial. Both k and p are chosen using cross-validation.

For the Radial Basis kernel, the formula referred to here is the Gaussian RBF.
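The kernel formulas from the original article did not survive as text, so the sketch below uses the commonly quoted textbook forms as an assumption: linear x·y, polynomial (x·y + k)^p, Gaussian RBF exp(-gamma·||x - y||²), and sigmoid tanh(gamma·x·y + r). The parameter names `k`, `p`, `gamma` and `r` follow that convention and the default values are arbitrary.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, k=1.0, p=2):
    """(x . y + k) ** p, with coefficient k and degree p."""
    return (dot(x, y) + k) ** p


def gaussian_rbf_kernel(x, y, gamma=1.0):
    """exp(-gamma * squared distance); equals 1 for identical points."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=1.0, r=0.0):
    """tanh(gamma * x . y + r)."""
    return math.tanh(gamma * dot(x, y) + r)


x, y = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, y))
print(polynomial_kernel(x, y, k=1.0, p=2))
print(gaussian_rbf_kernel(x, y, gamma=0.5))
print(sigmoid_kernel(x, y, gamma=0.01))
```

Note how the RBF value shrinks as two points move apart, and shrinks faster for larger gamma, which is the "weighted nearest neighbor" behaviour described earlier.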

The C-parameter: This is a regularization parameter used to prevent overfitting. It is inversely related to the margin: a larger C value gives a smaller margin, and a smaller C value gives a larger margin. It helps manage the trade-off between bias and variance, which SVM has to deal with just like most machine learning algorithms.
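The inverse relationship between C and the margin can be sketched with the usual soft-margin objective, assumed here to be 0.5·||w||² + C·(sum of margin violations). The toy 1D data and the two candidate solutions are made up: "wide" has a large margin but one violation, "narrow" has a small margin and no violations.

```python
def objective(w, b, data, C):
    """Soft-margin objective: 0.5 * ||w||^2 + C * total hinge loss."""
    regularizer = 0.5 * sum(wi * wi for wi in w)
    hinge = sum(
        max(0.0, 1.0 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
        for x, y in data
    )
    return regularizer + C * hinge


# Hypothetical 1D data: the point at 0.5 sits close to the boundary.
data = [((2.0,), 1), ((0.5,), 1), ((-2.0,), -1)]

# Margin width is 1 / ||w||, so a smaller w means a wider margin.
wide = ((0.5,), 0.0)    # margin 2.0, but the point at 0.5 violates it
narrow = ((2.0,), 0.0)  # margin 0.5, no violations


def preferred(C):
    """Which candidate has the lower objective for this value of C."""
    return "wide" if objective(*wide, data, C) < objective(*narrow, data, C) else "narrow"


print(preferred(0.1))   # small C tolerates the violation
print(preferred(10.0))  # large C punishes it
```

A real solver optimizes over all w and b rather than picking between two candidates, but the same pressure applies: raising C makes violations expensive and pushes the solution toward a narrower margin.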

END NOTE

The aim of this article was to explain some of the basic terminology associated with the Support Vector Machine. Please refer to the links below for further study.

Thank you for reading!

CONNECT ON SOCIAL MEDIA

LinkedIn: www.linkedin.com/in/aminah-mardiyyah-rufa-i

Twitter: @diyyah92

REFERENCES AND RESOURCES

Translated from: https://medium.com/swlh/the-support-vector-machine-basic-concept-a5106bd3cc5f
