Python论文复现：《基于稀疏指标的优化变分模态分解方法》

信号分解方法中，虽然变分模态分解（Variational Mode Decomposition, VMD）有严格的数学推导，能有效抑制端点效应、模态混叠等问题，但其分解模态数需预设。然而实际工程中，真实信号的频谱较为嘈杂且频带个数较难确定，一般观察分析具体信号的频谱图设置合理的模态数。
相比人工选取方法，自适应选取方法通常定义分解好坏的指标，进一步确定该指标下的最佳模态数。考虑到《基于稀疏指标的优化变分模态分解方法》从IMF频带稀疏性（VMD分解的初衷）的角度寻优，且稀疏指标有严格的理论支撑《信号的稀疏性分析》，故本文对其进行复现。

vmd分解

VMD 假定所有分量都是集中在各自中心频率附近的窄带信号，根据分量窄带条件建立约束优化问题，从而估计信号分量的中心频率以及重构相应分量。具体原理不再赘述，由于之前有粉丝不知道我用的什么代码，故在此公开，可单独放在vmdpy.py文件，后面的主程序Auto_VMD.py会调用：

import numpy as npdef  VMD(f, alpha, tau, K, DC, init, tol):"""u,u_hat,omega = VMD(f, alpha, tau, K, DC, init, tol)Variational mode decompositionPython implementation by Vinícius Rezende Carvalho - vrcarva@gmail.comcode based on Dominique Zosso's MATLAB code, available at:https://www.mathworks.com/matlabcentral/fileexchange/44765-variational-mode-decompositionOriginal paper:Dragomiretskiy, K. and Zosso, D. (2014) ‘Variational Mode Decomposition’, IEEE Transactions on Signal Processing, 62(3), pp. 531–544. doi: 10.1109/TSP.2013.2288675.Input and Parameters:---------------------f       - the time domain signal (1D) to be decomposedalpha   - the balancing parameter of the data-fidelity constrainttau     - time-step of the dual ascent ( pick 0 for noise-slack )K       - the number of modes to be recoveredDC      - true if the first mode is put and kept at DC (0-freq)init    - 0 = all omegas start at 01 = all omegas start uniformly distributed2 = all omegas initialized randomlytol     - tolerance of convergence criterion; typically around 1e-6Output:-------u       - the collection of decomposed modesu_hat   - spectra of the modesomega   - estimated mode center-frequencies"""if len(f)%2:f = f[:-1]# Period and sampling frequency of input signalfs = 1./len(f)ltemp = len(f)//2 fMirr =  np.append(np.flip(f[:ltemp],axis = 0),f)  fMirr = np.append(fMirr,np.flip(f[-ltemp:],axis = 0))# Time Domain 0 to T (of mirrored signal)T = len(fMirr)t = np.arange(1,T+1)/T  # Spectral Domain discretizationfreqs = t-0.5-(1/T)# Maximum number of iterations (if not converged yet, then it won't anyway)Niter = 500# For future generalizations: individual alpha for each modeAlpha = alpha*np.ones(K)# Construct and center f_hatf_hat = np.fft.fftshift((np.fft.fft(fMirr)))f_hat_plus = np.copy(f_hat) #copy f_hatf_hat_plus[:T//2] = 0# Initialization of omega_komega_plus = np.zeros([Niter, K])if init == 1:for i in range(K):omega_plus[0,i] = (0.5/K)*(i)elif init == 2:omega_plus[0,:] = np.sort(np.exp(np.log(fs) + (np.log(0.5)-np.log(fs))*np.random.rand(1,K)))else:omega_plus[0,:] = 0# if DC mode imposed, set its omega to 0if DC:omega_plus[0,0] = 0# start with empty dual variableslambda_hat = np.zeros([Niter, len(freqs)], dtype = complex)# other initsuDiff = tol+np.spacing(1) # update stepn = 0 # loop countersum_uk = 0 # accumulator# matrix keeping track of every iterant // could be discarded for memu_hat_plus = np.zeros([Niter, len(freqs), K],dtype=complex)    #*** Main loop for iterative updates***while ( uDiff > tol and  n < Niter-1 ): # not converged and below iterations limit# update first mode accumulatork = 0sum_uk = u_hat_plus[n,:,K-1] + sum_uk - u_hat_plus[n,:,0]# update spectrum of first mode through Wiener filter of residualsu_hat_plus[n+1,:,k] = (f_hat_plus - sum_uk - lambda_hat[n,:]/2)/(1.+Alpha[k]*(freqs - omega_plus[n,k])**2)# update first omega if not held at 0if not(DC):omega_plus[n+1,k] = np.dot(freqs[T//2:T],(abs(u_hat_plus[n+1, T//2:T, k])**2))/np.sum(abs(u_hat_plus[n+1,T//2:T,k])**2)# update of any other modefor k in np.arange(1,K):#accumulatorsum_uk = u_hat_plus[n+1,:,k-1] + sum_uk - u_hat_plus[n,:,k]# mode spectrumu_hat_plus[n+1,:,k] = (f_hat_plus - sum_uk - lambda_hat[n,:]/2)/(1+Alpha[k]*(freqs - omega_plus[n,k])**2)# center frequenciesomega_plus[n+1,k] = np.dot(freqs[T//2:T],(abs(u_hat_plus[n+1, T//2:T, k])**2))/np.sum(abs(u_hat_plus[n+1,T//2:T,k])**2)# Dual ascentlambda_hat[n+1,:] = lambda_hat[n,:] + tau*(np.sum(u_hat_plus[n+1,:,:],axis = 1) - f_hat_plus)# loop countern = n+1# converged yet?uDiff = np.spacing(1)for i in range(K):uDiff = uDiff + (1/T)*np.dot((u_hat_plus[n,:,i]-u_hat_plus[n-1,:,i]),np.conj((u_hat_plus[n,:,i]-u_hat_plus[n-1,:,i])))uDiff = np.abs(uDiff)        #Postprocessing and cleanup#discard empty space if converged earlyNiter = np.min([Niter,n])omega = omega_plus[:Niter,:]idxs = np.flip(np.arange(1,T//2+1),axis = 0)# Signal reconstructionu_hat = np.zeros([T, K],dtype = complex)u_hat[T//2:T,:] = u_hat_plus[Niter-1,T//2:T,:]u_hat[idxs,:] = np.conj(u_hat_plus[Niter-1,T//2:T,:])u_hat[0,:] = np.conj(u_hat[-1,:])    u = np.zeros([K,len(t)])for k in range(K):u[k,:] = np.real(np.fft.ifft(np.fft.ifftshift(u_hat[:,k])))# remove mirror partu = u[:,T//4:3*T//4]# recompute spectrumu_hat = np.zeros([u.shape[1],K],dtype = complex)for k in range(K):u_hat[:,k]=np.fft.fftshift(np.fft.fft(u[k,:]))return u, u_hat, omega

边际谱

论文作者是在每个IMF的边际谱上计算稀疏化指标，而边际谱是希尔伯特谱在时间维度上的积分。笔者首先将求边际谱的代码函数化，由于需要进行希尔伯特变换，本代码需要调用PyEMD与scipy库。
h ( z ) = ∫ 0 T H ( t , f ) d t h(z) = \int_0^T H(t,f)dt\ h(z)=∫0TH(t,f)dt

from PyEMD import Visualisation
from scipy.signal import hilbert#求窄带信号的边际谱
def mspect(Fs,signal,draw=1):fmin,fmax=0,Fs/2size=len(signal)//2df=(fmax-fmin)/(size-1)t=np.arange(0,len(signal)/Fs,1/Fs)vis = Visualisation()#希尔伯特变化signal=signal.reshape(1,-1)#求瞬时频率freqs = abs(vis._calc_inst_freq(signal, t, order=False, alpha=None))#求瞬时幅值amp= abs(hilbert(signal))#去掉为1的维度freqs=np.squeeze(freqs)amp=np.squeeze(amp)result=np.zeros(size)for i,j in zip(freqs,amp):if i>=fmin and i<=fmax:result[round((i-fmin)/df)]+=jf=np.arange(fmin,size*df,df)#可视化if draw==1:                           #可视化plt.figure()plt.rcParams['font.sans-serif']='Times New Roman'plt.plot(f,result)plt.xlabel('f/HZ',fontsize=16)plt.ylabel('amplitude',fontsize=16)plt.title('Marginal Spectrum',fontsize=20)return f,result

基于稀疏指标自适应寻找最佳分解K值

总结论文思路如下：
1）初始化VMD参数，惩罚因子 α \alpha α为3000，拉格朗日乘子更新因子为0.01，分解模态数K为2；
2）VMD分解并计算各IMF的边际谱，计算各IMF的稀疏度（考虑了能量权值因子）；

S i = max ⁡ { M S i } max ⁡ { max ⁡ { M S 1 } ⋯ max ⁡ { M S k } } { E ( M S i 2 ) / [ E ( M S i ) ] 2 } {S}_{i}=\frac{\max \left\{ M{{S}_{i}} \right\}}{\max \left\{ \max \left\{ M{{S}_{1}} \right\}\cdots \max \left\{ M{{S}_{k}} \right\} \right\}}\left\{ E(MS_{i}^{2})/{{\left[ E(M{{S}_{i}}) \right]}^{2}} \right\} Si=max{max{MS1}⋯max{MSk}}max{MSi}{E(MSi2)/[E(MSi)]2}

3）取各IMF边际谱稀疏度作为该分解模态数K下的整体稀疏度；

S K = 1 K ∑ i = 1 K S i {{S}_{K}}=\frac{1}{K}\sum\limits_{i=1}^{K}{{{S}_{i}}} SK=K1i=1∑KSi

4）当 S K < S K − 1 ( K > 2 ) {{S}_{K}}<{{S}_{K-1}}(K>2) SK<SK−1(K>2)时，选取最佳分解模态数为K-1，进入步骤5），反之令 K = K + 1 K=K+1 K=K+1回到步骤2）继续迭代；
5）采用最佳的分解模态数进行VMD分解。
该文在确定最佳分解模态数时是选取第一个极大值点，或者稀疏度随K单调递减时，选取第一个点2。然而，实际信号极值点可能不为最大值点，且若出现先递减后递增（稀疏度大于K=2）的情况时，该方法无法取到最佳K值。
本人对选取方法做了一点小改变，即：预设最大K值（依据信号复杂度设置，本人取10），计算K从2至最大值期间的稀疏度，取最大稀疏度对应的K值作为最佳分解模态数。主函数Auto_VMD.py具体代码如下（画时频图的代码，我之前的博文有）：

from vmdpy import VMD
import matplotlib.pyplot as plt
import numpy as np
from PyEMD import Visualisation
from scipy.signal import hilbert#求窄带信号的边际谱
def mspect(Fs,signal,draw=1):fmin,fmax=0,Fs/2size=len(signal)//2df=(fmax-fmin)/(size-1)t=np.arange(0,len(signal)/Fs,1/Fs)vis = Visualisation()#希尔伯特变化signal=signal.reshape(1,-1)#求瞬时频率freqs = abs(vis._calc_inst_freq(signal, t, order=False, alpha=None))#求瞬时幅值amp= abs(hilbert(signal))#去掉为1的维度freqs=np.squeeze(freqs)amp=np.squeeze(amp)result=np.zeros(size)for i,j in zip(freqs,amp):if i>=fmin and i<=fmax:result[round((i-fmin)/df)]+=jf=np.arange(fmin,size*df,df)#可视化if draw==1:                           #可视化plt.figure()plt.rcParams['font.sans-serif']='Times New Roman'plt.plot(f,result)plt.xlabel('f/HZ',fontsize=16)plt.ylabel('amplitude',fontsize=16)plt.title('Marginal Spectrum',fontsize=20)return f,result#基于稀疏指标自适应确定K值的VMD分解
def Auto_VMD_main(signal,Fs,draw=1,maxK=10):#vmd参数设置alpha = 3000       # moderate bandwidth constraint   2000tau = 0.            # noise-tolerance (no strict fidelity enforcement)DC = 0             # no DC part imposedinit = 1           # initialize omegas uniformlytol = 1e-7#寻找最佳KS=[[],[]]flag,idx=-2,2for K in range(2,maxK+1):IMFs,_,_=VMD(signal, alpha, tau, K, DC, init, tol)                                    #分解信号M_spect=[]max_M=[]for i in range(len(IMFs)):# _,_=fftlw(Fs,IMFs[i,:],1)_,M=mspect(Fs,IMFs[i,:],0)max_M.append(max(M))temp=np.mean(M**2)/(np.mean(M)**2)M_spect.append(temp)max_M=max_M/max(max_M)S_index=np.mean(max_M*M_spect)if S_index>flag:flag=S_indexidx=KS[0].append(K)S[1].append(S_index)#用最佳K值分解信号IMFs, _, _ = VMD(signal, alpha, tau, idx, DC, init, tol)#可视化寻优过程与最终结果if draw==1:plt.figure()plt.rcParams['font.sans-serif']='Times New Roman'plt.plot(S[0],S[1])plt.scatter([idx],[flag],c='r',marker='*')plt.xlabel('K',fontsize=16)plt.ylabel('Sparse index',fontsize=16)plt.title('Optimization Process',fontsize=20)plt.figure()for i in range(len(IMFs)):plt.subplot(len(IMFs),1,i+1)plt.plot(t,IMFs[i])if i==0:plt.rcParams['font.sans-serif']='Times New Roman'plt.title('Decomposition Signal',fontsize=14)elif i==len(IMFs)-1:plt.rcParams['font.sans-serif']='Times New Roman'plt.xlabel('Time/s')return IMFsif __name__=='__main__':#仿真信号1Fs=6000   #采样频率t = np.arange(0, 1.0, 1.0 / Fs)signal=np.multiply(np.sin(2*np.pi*100*t),(np.cos(2*np.pi*1000*t)+np.cos(2*np.pi*1500*t)+np.cos(2*np.pi*2000*t)))# #仿真信号2# Fs=1000   #采样频率# t = np.arange(0, 1.0, 1.0 / Fs)# f1,f2,f3 = 100,200,300# signal = np.piecewise(t, [t < 1, t < 0.6, t < 0.3],#                     [lambda t: np.sin(2 * np.pi * f1 * t), lambda t: np.sin(2 * np.pi * f2 * t),#                       lambda t: np.sin(2 * np.pi * f3 * t)])    # #仿真信号3# Fs=1000   #采样频率# t = np.arange(0, 1.0, 1.0 / Fs)# f1,f2,f3 = 100,200,300# signal = 3*np.sin(2*np.pi*f1*t)+6*np.sin(2*np.pi*f2*t)+5*np.sin(2*np.pi*f3*t) IMFs=Auto_VMD_main(signal,Fs,draw=1,maxK=10)from eemd_hht import hhtlwtt,ff,c_matrix=hhtlw(IMFs,t,f_range=[0,Fs/2],t_range=[0,t[-1]],ft_size=[128,128])     #画希尔伯特谱

仿真信号分析

仿真如下信号验证，采样频率为6000 Hz，信号时间长度为1 s：

y = sin ⁡ ( 2 π f 1 t ) ∗ [ cos ⁡ ( 2 π f 2 t ) + cos ⁡ ( 2 π f 3 t ) + cos ⁡ ( 2 π f 4 t ) ] y=\sin \left( 2\pi {{f}_{1}}t \right)*\left[ \cos \left( 2\pi {{f}_{2}}t \right)+\cos \left( 2\pi {{f}_{3}}t \right)+\cos \left( 2\pi {{f}_{4}}t \right) \right] y=sin(2πf1t)∗[cos(2πf2t)+cos(2πf3t)+cos(2πf4t)]

f 1 , f 2 , f 3 , f 4 = 100 , 1000 , 1500 , 2000 {{f}_{1}},{{f}_{2}},{{f}_{3}},{{f}_{4}}=100,1000,1500,2000 f1,f2,f3,f4=100,1000,1500,2000
由于傅里叶变化的思想是采用标准正弦波来拟合信号，而本文信号由于出现正弦波相乘，经过三角函数积化和差可转换为6个正弦波，故原始信号频谱包含6个频率成分。
y = 1 2 sin ⁡ ( 2 π ( f 1 + f 2 ) t ) + sin ⁡ ( 2 π ( f 1 − f 2 ) t ) + sin ⁡ ( 2 π ( f 1 + f 3 ) t ) + y=\frac{1}{2}\sin \left( 2\pi \left( {{f}_{1}}+{{f}_{2}} \right)t \right)+\sin \left( 2\pi \left( {{f}_{1}}-{{f}_{2}} \right)t \right)+\sin \left( 2\pi \left( {{f}_{1}}+{{f}_{3}} \right)t \right)+ y=21sin(2π(f1+f2)t)+sin(2π(f1−f2)t)+sin(2π(f1+f3)t)+
sin ⁡ ( 2 π ( f 1 − f 3 ) t ) + sin ⁡ ( 2 π ( f 1 + f 4 ) t ) + sin ⁡ ( 2 π ( f 1 − f 4 ) t ) \sin \left( 2\pi \left( {{f}_{1}}-{{f}_{3}} \right)t \right)+\sin \left( 2\pi \left( {{f}_{1}}+{{f}_{4}} \right)t \right)+\sin \left( 2\pi \left( {{f}_{1}}-{{f}_{4}} \right)t \right) sin(2π(f1−f3)t)+sin(2π(f1+f4)t)+sin(2π(f1−f4)t)
原始信号的时域图与频谱图如下：

采用稀疏度自动寻找最佳K值为6，寻优过程以及分解后的时频图如下：

转子试验台数据分析

采用转子试验台数据（无故障状态）分析，采样频率为30720 Hz，样本长度为1024，原始时域及频域图如下，大致可以分为6-8个频带，图中圈圈为本方法确定的最佳模态分量数下各IMF的频带中心，基本符合信号包含的窄带个数：
采用稀疏度自动寻找最佳K值为6，寻优过程以及分解后的时频图如下：
各IMF的边际谱如下，基本对应频谱中的各频率成分：

Python论文复现：VMD之自适应选择分解模态数K值相关推荐

point 如何求elbow_如何选择kmeans中的k值——肘部法则–Elbow Method和轮廓系数–Silhouette...
肘部法则–Elbow Method 我们知道k-means是以最小化样本与质点平方误差作为目标函数,将每个簇的质点与簇内样本点的平方距离误差和称为畸变程度(distortions),那么,对于一个簇, ...
如何选择kmeans中的k值——肘部法则–Elbow Method和轮廓系数–Silhouette Coefficient...
肘部法则–Elbow Method 我们知道k-means是以最小化样本与质点平方误差作为目标函数,将每个簇的质点与簇内样本点的平方距离误差和称为畸变程度(distortions),那么,对于一个簇, ...
论文复现：Learning Efficient Convolutional Networks through Network Slimming
论文核心论文提出了一种结构化剪枝策略,剪枝对象为 channel ,对 channel 重要性的评价标准使用的是 Batch Normalization 层中的缩放因子,这不会给网络带来额外的开销. ...
python K-means聚类分析聚类数的选择-肘部法则和轮廓系数
如何选择kmeans中的k值--肘部法则–Elbow Method和轮廓系数–Silhouette Coefficient
[转载]「交叉验证」到底如何选择K值？
「交叉验证」到底如何选择K值? 原文链接:https://cloud.tencent.com/developer/article/1410946 交叉验证(cross validation)一般被用于 ...
k近邻算法之 k值的选择
k近邻算法之 k值的选择举例说明: K值过小: [过拟合] 容易受到异常点的影响 [如:美人鱼本身就是喜剧片,假如统计的时候记为动作片,则对预测值的影响太大] k值过大: [欠拟合] ...
R语言层次聚类：通过内平方和WSS选择最优的聚类K值、可视化不同K下的BSS和WSS、通过Calinski-Harabasz指数（准则）与聚类簇个数的关系获取最优聚类簇的个数
R语言层次聚类:通过内平方和(Within Sum of Squares,WSS)选择最佳的聚类K值.以内平方和(WSS)和K的关系并通过弯头法ÿ
R语言层次聚类：通过内平方和（Within Sum of Squares, WSS）选择最优的聚类K值、以内平方和（WSS）和K的关系并通过弯头法（elbow method）获得最优的聚类个数
通过内平方和(Within Sum of Squares, WSS)选择最佳的聚类K值.以内平方和(WSS)和K的关系并通过弯头法(elbow method)获得最佳的聚类个数目录
K-近邻算法之K值的选择（带案例）
三.K值的选择 K值选择问题,李航博士的一书「统计学习方法」上所说: 选择较小的K值,就相当于用较小的领域中的训练实例进行预测,"学习"近似误差会减小,只有与输入实例较近或相似的训 ...

Python论文复现：VMD之自适应选择分解模态数K值