These are reading notes on Linear Algebra and Its Applications.

Hyperplanes

Hyperplanes play a special role in the geometry of $\R^n$ because they divide the space into two disjoint pieces, just as a plane separates $\R^3$ into two parts and a line cuts through $\R^2$. The key to working with hyperplanes is to use simple implicit descriptions, rather than the explicit or parametric representations of lines and planes used in the earlier work with affine sets.

An implicit equation of a line in $\R^2$ has the form $ax + by = d$. An implicit equation of a plane in $\R^3$ has the form $ax + by + cz = d$. Both equations describe the line or plane as the set of all points at which a linear expression (also called a linear functional (线性函数)) has a fixed value, $d$.

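For a linear functional $f$ on $\R^n$ and a scalar $d$, the symbol $[f:d]$ denotes the set $\{\boldsymbol x\in\R^n: f(\boldsymbol x)=d\}$. If $f$ is a linear functional on $\R^n$, then the standard matrix of this linear transformation $f$ is a $1\times n$ matrix $A$, say $A =\begin{bmatrix} a_1& a_2&\cdots& a_n\end{bmatrix}$. So

$$[f:0]=\{\boldsymbol x\in\R^n: A\boldsymbol x=0\}=\text{Nul}\,A \qquad (1)$$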

If $f$ is a nonzero functional, then $\text{rank}\,A = 1$, and $\dim \text{Nul}\,A = n - 1$. Thus, the subspace $[f: 0]$ has dimension $n-1$ and so is a hyperplane. Also, if $d$ is any number in $\R$, then

$$[f:d]=\{\boldsymbol x\in\R^n: A\boldsymbol x=d\} \qquad (2)$$

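Recall that the set of solutions of $A\boldsymbol x= \boldsymbol b$ is obtained by translating the solution set of $A\boldsymbol x=\boldsymbol 0$ by any particular solution $\boldsymbol p$ of $A\boldsymbol x= \boldsymbol b$. Then

$$[f:d]=[f:0]+\boldsymbol p \quad \text{for any } \boldsymbol p \text{ in } [f:d] \qquad (3)$$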

Thus the sets $[f: d]$ are hyperplanes parallel to $[f: 0]$. See Figure 1.

When $A$ is a $1\times n$ matrix, the equation $A\boldsymbol x = d$ may be written with an inner product $\boldsymbol n\cdot\boldsymbol x$, using $\boldsymbol n$ in $\R^n$ with the same entries as $A$. Thus, from (2),

$$[f:d]=\{\boldsymbol x\in\R^n: \boldsymbol n\cdot\boldsymbol x=d\} \qquad (4)$$

Then $[f: 0]=\{\boldsymbol x\in \R^n:\boldsymbol n\cdot \boldsymbol x=0\}$, which shows that $[f:0]$ is the orthogonal complement of the subspace spanned by $\boldsymbol n$. In the terminology of calculus and geometry for $\R^3$, $\boldsymbol n$ is called a normal vector (法向量) to $[f:0]$. (A “normal” vector in this sense need not have unit length.) Also, $\boldsymbol n$ is said to be normal to each parallel hyperplane $[f:d]$, even though $\boldsymbol n\cdot \boldsymbol x$ is not zero when $d\neq 0$.

Another name for $[f:d]$ is a level set (水平集) of $f$, and $\boldsymbol n$ is sometimes called the gradient (梯度) of $f$ when $f(\boldsymbol x)= \boldsymbol n\cdot \boldsymbol x$ for each $\boldsymbol x$.
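The relationship between a level set $[f:d]$, its parallel subspace $[f:0]$, and the normal vector can be checked numerically. The following is a minimal sketch; the functional $f(\boldsymbol x)=3x_1+x_2-2x_3$, the level $d=6$, and the points `p` and `q` are illustrative assumptions, not data from the book.

```python
def dot(u, v):
    """Inner product of two vectors given as sequences of numbers."""
    return sum(a * b for a, b in zip(u, v))

def in_level_set(n, d, x):
    """True when x lies in [f:d] for the functional f(x) = n . x."""
    return dot(n, x) == d

# Hypothetical data: f(x) = 3*x1 + x2 - 2*x3 and the level d = 6.
n = (3, 1, -2)
d = 6

p = (2, 0, 0)                              # one point of [f:6]
q = (0, 6, 0)                              # another point of [f:6]
h = tuple(b - a for a, b in zip(p, q))     # q - p

print(in_level_set(n, d, p), in_level_set(n, d, q))   # True True
print(dot(n, h))                                      # 0, so q - p is in [f:0]

# Translating p by multiples of a vector in [f:0] stays inside [f:6].
translates = [tuple(pi + t * hi for pi, hi in zip(p, h)) for t in (-1, 0, 2)]
print(all(in_level_set(n, d, x) for x in translates))  # True
```

The difference of any two points of $[f:d]$ is orthogonal to $\boldsymbol n$, which is the sense in which $\boldsymbol n$ is normal to every parallel hyperplane.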

The next three examples show connections between implicit and explicit descriptions of hyperplanes.

EXAMPLE 4
In $\R^2$, give an explicit description of the line $x-4y=13$ in parametric vector form.
SOLUTION
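One way to work this example, sketched under the assumption that $y$ is taken as the free variable: treat $x-4y=13$ as the nonhomogeneous equation $A\boldsymbol x=b$ with $A=\begin{bmatrix}1&-4\end{bmatrix}$ and $b=13$. The names $\boldsymbol p$ and $\boldsymbol q$ are introduced here for illustration.

$$
x=13+4y,\qquad
\begin{bmatrix}x\\y\end{bmatrix}
=\begin{bmatrix}13+4y\\y\end{bmatrix}
=\begin{bmatrix}13\\0\end{bmatrix}+y\begin{bmatrix}4\\1\end{bmatrix}
=\boldsymbol p+y\,\boldsymbol q,\qquad y\in\R
$$

Here $\boldsymbol p=(13,0)$ is a particular point on the line, and $\boldsymbol q=(4,1)$ spans $[f:0]$ for $f(x,y)=x-4y$, since $1(4)-4(1)=0$.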

EXAMPLE 5
Let

, and let $L_1$ be the line through $\boldsymbol v_1$ and $\boldsymbol v_2$. Find a linear functional $f$ and a constant $d$ such that $L_1=[f:d]$.
SOLUTION
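The line $L_1$ is parallel to the translated line $L_0$ through $\boldsymbol v_2-\boldsymbol v_1$ and the origin. The defining equation for $L_0$ has the form

$$\begin{bmatrix}a&b\end{bmatrix}\begin{bmatrix}x\\y\end{bmatrix}=0\quad\text{or}\quad \boldsymbol n\cdot\boldsymbol x=0,\quad\text{where } \boldsymbol n=\begin{bmatrix}a\\b\end{bmatrix} \qquad (5)$$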

Since $\boldsymbol n$ is orthogonal to the subspace $L_0$, which contains $\boldsymbol v_2-\boldsymbol v_1$, we have

$$\boldsymbol n\cdot(\boldsymbol v_2-\boldsymbol v_1)=0$$

By inspection, a solution is $\begin{bmatrix}a&b\end{bmatrix}=\begin{bmatrix}2&5\end{bmatrix}$. Let $f(x, y) =2x+5y$. From (5), $L_0=[f:0]$, and $L_1=[f:d]$ for some $d$. Since $\boldsymbol v_1$ is on line $L_1$, $d=f(\boldsymbol v_1)=2(1) +5(2)=12$. Thus, the equation for $L_1$ is $2x + 5y = 12$.
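The steps of Example 5 can be sketched in Python. The helper `line_through` is a hypothetical name. The point `v1 = (1, 2)` is read off from $d=f(\boldsymbol v_1)=2(1)+5(2)$; the point `v2 = (6, 0)` is an assumption, chosen because it satisfies $2x+5y=12$ (the notes do not show $\boldsymbol v_2$).

```python
def line_through(v1, v2):
    """Return (n, d) such that n . x = d describes the line through v1 and v2 in R^2."""
    dx, dy = v2[0] - v1[0], v2[1] - v1[1]
    if dx == 0 and dy == 0:
        raise ValueError("the two points must be distinct")
    # Rotating the direction (dx, dy) by 90 degrees gives a vector orthogonal to it.
    n = (-dy, dx)
    # d is the value of f(x) = n . x at any point of the line, here v1.
    d = n[0] * v1[0] + n[1] * v1[1]
    return n, d

v1 = (1, 2)   # from d = f(v1) = 2(1) + 5(2) in the text
v2 = (6, 0)   # assumed second point on 2x + 5y = 12

n, d = line_through(v1, v2)
print(n, d)   # (2, 5) 12
```

Any nonzero multiple of `n` (with `d` scaled the same way) describes the same line, so the output of such a helper is only unique up to scaling.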

EXAMPLE 6
Let

Find an implicit description $[f:d]$ of the plane $H_1$ that passes through $\boldsymbol v_1$, $\boldsymbol v_2$, and $\boldsymbol v_3$.
SOLUTION
$H_1$ is parallel to a plane $H_0$ through the origin that contains the translated points

Since these two points are linearly independent, $H_0= \text{Span}\{\boldsymbol v_2-\boldsymbol v_1, \boldsymbol v_3-\boldsymbol v_1\}$. Let $\boldsymbol n=\begin{bmatrix} a\\b\\c\end{bmatrix}$ be the normal to $H_0$. Then $\boldsymbol v_2- \boldsymbol v_1$ and $\boldsymbol v_3 -\boldsymbol v_1$ are each orthogonal to $\boldsymbol n$:

These two equations form a system whose augmented matrix can be row reduced:

Row operations yield $a=-\tfrac{1}{2}c$ and $b=\tfrac{5}{4}c$, with $c$ free.

Set $c= 4$, for instance. Then $\boldsymbol n=\begin{bmatrix} -2\\5\\4\end{bmatrix}$ and $H_0=[f:0]$, where $f(\boldsymbol x)=-2x_1+5x_2 + 4x_3$.

The parallel hyperplane $H_1$ is $[f :d]$. To find $d$, use the fact that $\boldsymbol v_1$ is in $H_1$, and compute $d = f(\boldsymbol v_1)= f(1, 1, 1)= 7$.
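The procedure of Example 6 can be sketched with exact arithmetic. Only $\boldsymbol v_1=(1,1,1)$ appears in the text; the points `v2` and `v3` below are assumptions, chosen so that they lie on the plane $-2x_1+5x_2+4x_3=7$. The helper `normal_with_fixed_c` is a hypothetical name, and it uses Cramer's rule on the two orthogonality equations rather than row reduction.

```python
from fractions import Fraction

def normal_with_fixed_c(u, w, c):
    """Solve n . u = 0 and n . w = 0 for n = (a, b, c), with the third entry c given.

    Moving the c-terms to the right-hand side leaves a 2x2 system in a and b,
    solved here by Cramer's rule. Requires u[0]*w[1] - u[1]*w[0] != 0.
    """
    det = Fraction(u[0] * w[1] - u[1] * w[0])
    if det == 0:
        raise ValueError("a and b are not determined by c for these vectors")
    r1, r2 = -c * u[2], -c * w[2]            # right-hand sides of the 2x2 system
    a = (r1 * w[1] - u[1] * r2) / det
    b = (u[0] * r2 - r1 * w[0]) / det
    return (a, b, Fraction(c))

v1 = (1, 1, 1)                    # from the text: d = f(1, 1, 1)
v2, v3 = (2, -1, 4), (3, 1, 2)    # assumed points on the same plane

u = tuple(q - p for p, q in zip(v1, v2))   # v2 - v1
w = tuple(q - p for p, q in zip(v1, v3))   # v3 - v1

n = normal_with_fixed_c(u, w, 4)           # set c = 4, as in the example
d = sum(ni * xi for ni, xi in zip(n, v1))  # d = f(v1)
print([int(t) for t in n], int(d))         # [-2, 5, 4] 7
```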

The procedure in Example 6 generalizes to higher dimensions. However, for the special case of $\R^3$, one can also use the cross-product formula (叉积公式) to compute $\boldsymbol n$, using a symbolic determinant as a mnemonic device:

If only the formula for $f$ is needed, the cross-product calculation may be written as an ordinary determinant:
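For $\R^3$, the cross-product route can be sketched as follows. The vectors `u` and `w` stand for $\boldsymbol v_2-\boldsymbol v_1$ and $\boldsymbol v_3-\boldsymbol v_1$; their entries are assumptions (the notes do not show $\boldsymbol v_2$ and $\boldsymbol v_3$), chosen to agree with $\boldsymbol n=(-2,5,4)$ and $d=7$ from Example 6. The function names are hypothetical.

```python
def cross(u, w):
    """Cross product u x w of two vectors in R^3."""
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def det_with_columns(u, w, x):
    """Determinant of the 3x3 matrix with columns u, w, x.

    Expanding along the third column gives (u x w) . x, so this is
    the functional f(x) = n . x with n = u x w.
    """
    n = cross(u, w)
    return n[0] * x[0] + n[1] * x[1] + n[2] * x[2]

u = (1, -2, 3)   # assumed v2 - v1
w = (2, 0, 1)    # assumed v3 - v1

n = cross(u, w)
print(n)                                   # (-2, 5, 4)
print(det_with_columns(u, w, (1, 1, 1)))   # 7, the value d = f(v1)
```

The determinant form avoids computing $\boldsymbol n$ separately when only values of $f$ are needed.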

PROOF
Suppose that $H$ is a hyperplane, take $\boldsymbol p\in H$, and let $H_0= H -\boldsymbol p$. Then $H_0$ is an $(n-1)$-dimensional subspace. Next, take any point $\boldsymbol y$ that is not in $H_0$. By the Orthogonal Decomposition Theorem,

$$\boldsymbol y=\boldsymbol y_1+\boldsymbol n$$

where $\boldsymbol y_1$ is a vector in $H_0$ and $\boldsymbol n$ is orthogonal to every vector in $H_0$. The function $f$ defined by

$$f(\boldsymbol x)=\boldsymbol n\cdot\boldsymbol x\quad\text{for } \boldsymbol x\in\R^n$$

is a linear functional, by properties of the inner product. Now, $[f :0]$ is a hyperplane that contains $H_0$, by construction of $\boldsymbol n$. Since both are $(n-1)$-dimensional subspaces, it follows that

$$H_0=[f:0]=\{\boldsymbol x\in\R^n:\boldsymbol n\cdot\boldsymbol x=0\}$$

Finally, let $d= f(\boldsymbol p)=\boldsymbol n\cdot \boldsymbol p$. Then, as in (3) shown earlier,

$$[f:d]=[f:0]+\boldsymbol p=H_0+\boldsymbol p=H$$

The converse statement that $[f :d]$ is a hyperplane follows from (1) and (3) above.
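The construction in the proof can be sketched numerically. This is a minimal illustration, assuming $H_0$ is given by a linearly independent spanning set; the subspace, the point `y`, and the point `p` below are hypothetical data, and the projection is computed with Gram-Schmidt, which is one of several ways to carry out the orthogonal decomposition.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normal_from_decomposition(basis, y):
    """Return n = y - proj_{H0}(y), where H0 is spanned by the independent vectors in basis."""
    ortho = []
    for b in basis:                          # Gram-Schmidt on the basis of H0
        v = [Fraction(t) for t in b]
        for q in ortho:
            coef = dot(v, q) / dot(q, q)
            v = [vi - coef * qi for vi, qi in zip(v, q)]
        ortho.append(v)
    n = [Fraction(t) for t in y]
    for q in ortho:                          # subtract the projection of y onto H0
        coef = dot(n, q) / dot(q, q)
        n = [ni - coef * qi for ni, qi in zip(n, q)]
    return n

# Hypothetical data: H = H0 + p in R^3, and y is a point outside H0.
basis = [(1, 0, 1), (0, 1, 1)]
p = (1, 2, 0)
y = (0, 0, 1)

n = normal_from_decomposition(basis, y)
d = dot(n, p)                                # d = f(p) = n . p

# Every point p + s*b1 + t*b2 of H has f-value d.
points = [tuple(pi + s * b1 + t * b2 for pi, b1, b2 in zip(p, *basis))
          for s in (-1, 0, 2) for t in (0, 3)]
print(all(dot(n, x) == d for x in points))   # True
```

Because `y` is not in $H_0$, the resulting `n` is nonzero, so $f$ is a nonzero functional.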

Many important applications of hyperplanes depend on the possibility of “separating” two sets by a hyperplane. The following terminology and notation will help to make this idea more precise.

Topology: 拓扑
open ball: 开球
A set is open: 开集
A set is closed: 闭集
A set is bounded: 有界集
A set is compact: 紧致集

EXERCISE 27
Give an example of a closed subset $S$ of $\R^2$ such that $\text{conv}\,S$ is not closed.
SOLUTION
$S=\{\boldsymbol p \mid \boldsymbol p=(x,y),\ y=1/x,\ x\geq 1/2\}$
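To see why $\text{conv}\,S$ fails to be closed for this $S$, note that every point of $S$ other than $(1/2, 2)$ has $y<2$, so no point $(x,2)$ with $x>1/2$ can be a convex combination of points of $S$. Yet such points are limits of points of $\text{conv}\,S$. A minimal numerical sketch, with the hypothetical helper `chord_height` evaluating the chord from $(1/2,2)$ to $(t,1/t)$ at $x=1$:

```python
from fractions import Fraction

def chord_height(t, x0=1):
    """Height at x = x0 of the chord joining (1/2, 2) to (t, 1/t); requires t > x0 > 1/2."""
    t = Fraction(t)
    # Weight s on the endpoint (t, 1/t) so that the x-coordinate equals x0.
    s = (Fraction(x0) - Fraction(1, 2)) / (t - Fraction(1, 2))
    return (1 - s) * 2 + s / t

for t in (2, 10, 100, 1000):
    # Each value is below 2, and the values increase toward 2 as t grows.
    print(t, float(chord_height(t)))
```

The points $(1, \texttt{chord\_height}(t))$ lie in $\text{conv}\,S$ and converge to $(1,2)$, which is not in $\text{conv}\,S$.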

EXERCISE 29
Prove that the open ball $B(\boldsymbol p,\delta)=\{\boldsymbol x:\left\|\boldsymbol x-\boldsymbol p\right\|<\delta\}$ is a convex set.
SOLUTION
[Hint: Use the Triangle Inequality.] (三角不等式)
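Following the hint, one possible argument runs as follows. Take $\boldsymbol x,\boldsymbol y\in B(\boldsymbol p,\delta)$ and $0\leq t\leq 1$, and set $\boldsymbol z=(1-t)\boldsymbol x+t\boldsymbol y$ (the names $\boldsymbol y$, $\boldsymbol z$, and $t$ are introduced here). Since $\boldsymbol p=(1-t)\boldsymbol p+t\boldsymbol p$,

$$
\left\|\boldsymbol z-\boldsymbol p\right\|
=\left\|(1-t)(\boldsymbol x-\boldsymbol p)+t(\boldsymbol y-\boldsymbol p)\right\|
\leq(1-t)\left\|\boldsymbol x-\boldsymbol p\right\|+t\left\|\boldsymbol y-\boldsymbol p\right\|
<(1-t)\delta+t\delta=\delta
$$

The strict inequality holds because at least one of $1-t$ and $t$ is positive. Hence $\boldsymbol z\in B(\boldsymbol p,\delta)$, so the ball contains the segment between any two of its points.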

EXAMPLE 7
Let

as shown in Figure 3. Then the set $S$ is closed since it contains all its boundary points. The set $S$ is bounded since $S\subset B(\boldsymbol 0, 3)$. Thus $S$ is also compact.

Notation: If $f$ is a linear functional, then $f(A)\leq d$ means $f(\boldsymbol x)\leq d$ for each $\boldsymbol x\in A$.

strictly separate: 严格分割

Notice that strict separation requires that the two sets be disjoint, while mere separation does not.

PROOF
Suppose that $(\text{conv}\,A)\cap(\text{conv}\,B)=\varnothing$. Since the convex hull of a compact set is compact, Theorem 12 ensures that there is a hyperplane $H$ that strictly separates $\text{conv}\,A$ and $\text{conv}\,B$. Clearly, $H$ also strictly separates the smaller sets $A$ and $B$.

Conversely, suppose the hyperplane $H =[f :d]$ strictly separates $A$ and $B$. Without loss of generality, assume that $f(A) < d$ and $f(B) > d$. Let $\boldsymbol x = c_1\boldsymbol x_1+\cdots+ c_k\boldsymbol x_k$ be any convex combination of elements of $A$. Then, since $c_1+\cdots+c_k=1$,

$$f(\boldsymbol x)=c_1f(\boldsymbol x_1)+\cdots+c_kf(\boldsymbol x_k)<c_1d+\cdots+c_kd=d$$

Thus $f(\text{conv}\,A) < d$. Likewise, $f(\text{conv}\,B) > d$, so $H=[f :d]$ strictly separates $\text{conv}\,A$ and $\text{conv}\,B$. By Theorem 12, $\text{conv}\,A$ and $\text{conv}\,B$ must be disjoint.
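For finite sets, the converse half of the argument can be illustrated directly: if $[f:d]$ strictly separates $A$ and $B$, any convex combination of points of $A$ stays on the same side. The sets, the functional, and the function names below are hypothetical, and the check covers only the finite case.

```python
from fractions import Fraction

def f(n, x):
    """Value of the linear functional f(x) = n . x."""
    return sum(a * b for a, b in zip(n, x))

def strictly_separates(n, d, A, B):
    """True when f(A) < d < f(B) or f(B) < d < f(A), for finite point sets A and B."""
    fa = [f(n, x) for x in A]
    fb = [f(n, x) for x in B]
    return max(fa) < d < min(fb) or max(fb) < d < min(fa)

def convex_combination(points, weights):
    """Weighted sum of points; weights are assumed nonnegative and summing to 1."""
    return tuple(sum(c * p[i] for c, p in zip(weights, points))
                 for i in range(len(points[0])))

# Hypothetical data in R^2 with f(x, y) = x + y and d = 3.
A = [(0, 0), (1, 0), (0, 1)]
B = [(3, 3), (4, 2), (2, 4)]
n, d = (1, 1), 3

print(strictly_separates(n, d, A, B))          # True

weights = (Fraction(1, 2), Fraction(1, 4), Fraction(1, 4))
x = convex_combination(A, weights)             # a point of conv A
print(f(n, x) < d)                             # True, as in the proof
```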

EXERCISE 14
Let $F_1$ and $F_2$ be 4-dimensional flats in $\R^6$, and suppose that $F_1\cap F_2 \neq\varnothing$. What are the possible dimensions of $F_1\cap F_2$?
SOLUTION

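The answer below is my own. The argument feels long-winded and not fully rigorous, so treat it as a reference only.
If you have a better solution, I would be glad to discuss it.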

Let $F_1=W_1+\boldsymbol p_1$ and $F_2=W_2+\boldsymbol p_2$, where $W_1$ and $W_2$ are 4-dimensional subspaces. Let $\{\boldsymbol a_1,\dots,\boldsymbol a_4\}$ and $\{\boldsymbol b_1,\dots,\boldsymbol b_4\}$ be bases of $W_1$ and $W_2$, respectively. Write $F_1\cap F_2=W+\boldsymbol p$, where $W$ is a subspace and $\boldsymbol p\in F_1\cap F_2$, and let $\boldsymbol x\in W$. Then there exist $m_i,n_i\in\R$ ($1\leq i\leq4$) such that

$$
\boldsymbol p_1+m_1\boldsymbol a_1+\cdots+m_4\boldsymbol a_4=\boldsymbol p_2+n_1\boldsymbol b_1+\cdots+n_4\boldsymbol b_4=\boldsymbol x+\boldsymbol p\\
\boldsymbol p_1-\boldsymbol p+m_1\boldsymbol a_1+\cdots+m_4\boldsymbol a_4=\boldsymbol p_2-\boldsymbol p+n_1\boldsymbol b_1+\cdots+n_4\boldsymbol b_4=\boldsymbol x\ \ \ (1)
$$

Notice that $\dim (F_1\cap F_2)=\dim W$, where $W$ is the set of all $\boldsymbol x$ that satisfy $(1)$.

Since $\boldsymbol x$ can be $\boldsymbol 0$, there exist $m_i',n_i'\in\R$ ($1\leq i\leq4$) such that

$$
\boldsymbol p_1+m_1'\boldsymbol a_1+\cdots+m_4'\boldsymbol a_4=\boldsymbol p_2+n_1'\boldsymbol b_1+\cdots+n_4'\boldsymbol b_4=\boldsymbol p\ \ \ (2)
$$

From $(2)$, we know that $\boldsymbol p_1-\boldsymbol p$ is a linear combination of $\{\boldsymbol a_1,\dots,\boldsymbol a_4\}$ and $\boldsymbol p_2-\boldsymbol p$ is a linear combination of $\{\boldsymbol b_1,\dots,\boldsymbol b_4\}$. Thus by $(1)$, there exist $t_i,s_i\in\R$ ($1\leq i\leq4$) such that

$$
t_1\boldsymbol a_1+\cdots+t_4\boldsymbol a_4=s_1\boldsymbol b_1+\cdots+s_4\boldsymbol b_4=\boldsymbol x\ \ \ (3)\\
t_1\boldsymbol a_1+\cdots+t_4\boldsymbol a_4-s_1\boldsymbol b_1-\cdots-s_4\boldsymbol b_4=\boldsymbol 0\ \ \ (4)
$$

Let $A=\begin{bmatrix}\boldsymbol a_1&\cdots&\boldsymbol a_4&\boldsymbol b_1&\cdots&\boldsymbol b_4\end{bmatrix}$, a $6\times 8$ matrix. Then according to $(4)$:

$$
A\begin{bmatrix}t_1\\\vdots\\t_4\\-s_1\\\vdots\\-s_4\end{bmatrix}=\boldsymbol 0\\
\because 4\leq \text{rank}\,A\leq6,\quad \dim\text{Nul}\,A=8-\text{rank}\,A\\
\therefore 2\leq \dim\text{Nul}\,A\leq 4
$$

Here $\text{rank}\,A\geq 4$ because $\boldsymbol a_1,\dots,\boldsymbol a_4$ are linearly independent, and $\text{rank}\,A\leq 6$ because $A$ has six rows.

By $(3)$, each $\boldsymbol x\in W$ determines the coefficient vector $\begin{bmatrix}t_1\\\vdots\\t_4\end{bmatrix}$ uniquely, because $\boldsymbol a_1,\dots,\boldsymbol a_4$ are linearly independent, so $W$ and the set of admissible coefficient vectors $\begin{bmatrix}t_1\\\vdots\\t_4\end{bmatrix}$ have the same dimension. Likewise, $\begin{bmatrix}t_1\\\vdots\\t_4\end{bmatrix}\mapsto\begin{bmatrix}s_1\\\vdots\\s_4\end{bmatrix}$ is a well-defined linear transformation, because $\boldsymbol b_1,\dots,\boldsymbol b_4$ are linearly independent. Hence $\dim\text{Nul}\,A$ equals the dimension of the set of admissible $\begin{bmatrix}t_1\\\vdots\\t_4\end{bmatrix}$. So $\dim (F_1\cap F_2)=\dim\text{Nul}\,A$ and $2\leq \dim (F_1\cap F_2)\leq4$.
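The count $\dim(F_1\cap F_2)=8-\text{rank}\,A$ can be checked on concrete subspaces. The sketch below is an illustration, not part of the original answer: the subspaces are spanned by standard basis vectors of $\R^6$ (a hypothetical choice), and `rank`, `e`, and `intersection_dim` are helper names introduced here. It shows that each of the values 2, 3, and 4 actually occurs.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def e(i, n=6):
    """i-th standard basis vector of R^n (1-based)."""
    return [1 if j == i else 0 for j in range(1, n + 1)]

def intersection_dim(basis1, basis2):
    """dim(W1 ∩ W2) = dim Nul A = (number of columns of A) - rank A, where A = [basis1 basis2]."""
    columns = basis1 + basis2
    # Rank is unchanged by transposition, so the columns can be passed as rows.
    return len(columns) - rank(columns)

W1 = [e(1), e(2), e(3), e(4)]
candidates = ([e(1), e(2), e(3), e(4)],    # W2 = W1
              [e(1), e(2), e(3), e(5)],    # W1 + W2 has dimension 5
              [e(1), e(2), e(5), e(6)])    # W1 + W2 = R^6
dims = [intersection_dim(W1, W2) for W2 in candidates]
print(dims)   # [4, 3, 2]
```

Taking $F_1=W_1$ and $F_2=W_2$ (both through the origin, so the intersection is nonempty) gives flats that realize each possible dimension.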
