Original post; if you find it useful, please give it a star on GitHub.
level: SIGKDD ACM
author:Thanawin Rakthanmanon
date: August 12–16, 2012
keyword:

Sequence data matching

Paper: DTW

Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping

Research Objective

Application Area: time series motif discovery [25] [26], anomaly detection [35] [31], time series summarization, shapelet extraction [39], clustering, and classification [6]
Purpose: fast exact sequential search instead of approximate search

Problem Statement

Time series subsequences must be normalized; otherwise the tiny changes we make (in offset or scale) are completely dwarfed by the changes we might expect to see in a real-world deployment. It is not sufficient to normalize the entire dataset: each subsequence must be z-normalized individually.
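A minimal sketch of per-subsequence z-normalization, assuming the population standard deviation is used. The class and method names are illustrative and are not taken from the UCR Suite code.

```java
final class ZNorm {
    /** Returns a z-normalized copy of x (zero mean, unit standard deviation). */
    static double[] zNormalize(double[] x) {
        double sum = 0.0, sumSq = 0.0;
        for (double v : x) {
            sum += v;
            sumSq += v * v;
        }
        double mean = sum / x.length;
        // Population standard deviation; clamp at zero to absorb rounding error.
        double std = Math.sqrt(Math.max(sumSq / x.length - mean * mean, 0.0));
        // Assumption of this sketch: constant subsequences are only centered.
        if (std == 0.0) std = 1.0;
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) out[i] = (x[i] - mean) / std;
        return out;
    }
}
```

Two subsequences that differ only in offset and scale, such as {1, 2, 3} and {10, 20, 30}, map to the same normalized shape, which is why the comparison becomes meaningful.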

Arbitrary Query Lengths cannot be Indexed

Methods

Problem Formulation:
- Definition 1: time series to search: $T=t_1,t_2,...,t_m$
- Definition 2: subsequence to query: $Q=T_{i,k}=t_i,t_{i+1},...,t_{i+k-1}$, $i\in[1,m-k+1]$
- Definition 3: the Euclidean distance (ED) between Q and C, where |Q|=|C|=n, is:
  $$ED(Q,C)=\sqrt{\sum_{i=1}^n(q_i-c_i)^2}$$
ED&&DTW

【Opinion 1】 Using the Squared Distance
$$ED(Q,C)=\sqrt{\sum_{i=1}^n(q_i-c_i)^2} \;\longrightarrow\; ED^2(Q,C)=\sum_{i=1}^n(q_i-c_i)^2$$
The square root is monotonic, so dropping it does not change the ranking of nearest neighbors and saves one operation per comparison.

【Opinion 2】 Using Lower Bounding

a）LB_Kim

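LB_Kim estimates the similarity of two sequences from the distances between four pairs of feature points: points 1 and 2 are the first and last points, and points 3 and 4 are the minimum and maximum points of each sequence. Its time complexity is O(n). Some improved variants take additional points on top of these four. Formula: $LB_{Kim}(Q,C)=\max_{i=1,2,3,4}d(f_i^Q,f_i^C)$. After z-normalization, the minimum and maximum features contribute little.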
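A minimal sketch of the four-feature bound described above. It assumes the point-wise distance d is the squared difference; the class and method names are hypothetical.

```java
final class LbKim {
    /** Squared difference used as the point-wise distance d (an assumption of this sketch). */
    static double d(double a, double b) {
        return (a - b) * (a - b);
    }

    /** Maximum distance over the four features: first point, last point, minimum, maximum. */
    static double lbKim(double[] q, double[] c) {
        double qMin = q[0], qMax = q[0], cMin = c[0], cMax = c[0];
        for (double v : q) {
            qMin = Math.min(qMin, v);
            qMax = Math.max(qMax, v);
        }
        for (double v : c) {
            cMin = Math.min(cMin, v);
            cMax = Math.max(cMax, v);
        }
        double first = d(q[0], c[0]);
        double last = d(q[q.length - 1], c[c.length - 1]);
        return Math.max(Math.max(first, last), Math.max(d(qMin, cMin), d(qMax, cMax)));
    }
}
```

For Q = {1, 3, 2} and C = {2, 5, 4} the four feature distances are 1, 4, 1 and 4, so the bound is 4.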

b）LB_Yi

LB_Yi improves on LB_Kim by using the range between the maximum and minimum values of the compared sequence C to measure similarity:

$$LB_{Yi}(Q,C)=\sum_{q_i>\max(C)}d(q_i,\max(C))+\sum_{q_i<\min(C)}d(q_i,\min(C))$$

c) LB_Keogh

LB_Keogh is a lower-bounding function that performs better than LB_Kim and LB_Yi. Keogh uses upper and lower envelopes, which gives a tighter lower bound that is less prone to false dismissals. U and L denote the upper and lower envelope functions.
$$LB_{Keogh}(Q,C)=\sum_{i=1}^n\begin{cases}(q_i-u_i)^2, & q_i>u_i\\(q_i-l_i)^2, & q_i<l_i\\0, & \text{otherwise}\end{cases}$$
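A minimal sketch of the envelope-based bound, following the formula as written in these notes (each q_i is compared against the envelope values u_i and l_i). It assumes the envelope is built around C with a warping window of half-width r and uses a naive O(n*r) construction rather than Lemire's streaming method; all names are illustrative.

```java
final class LbKeogh {
    /** Naive envelope of c for a window of half-width r; returns {lower, upper}. */
    static double[][] envelope(double[] c, int r) {
        int n = c.length;
        double[] lower = new double[n];
        double[] upper = new double[n];
        for (int i = 0; i < n; i++) {
            double lo = c[i], hi = c[i];
            for (int j = Math.max(0, i - r); j <= Math.min(n - 1, i + r); j++) {
                lo = Math.min(lo, c[j]);
                hi = Math.max(hi, c[j]);
            }
            lower[i] = lo;
            upper[i] = hi;
        }
        return new double[][] {lower, upper};
    }

    /** Sum of squared excursions of q outside the envelope [l, u]. */
    static double lbKeogh(double[] q, double[] l, double[] u) {
        double lb = 0.0;
        for (int i = 0; i < q.length; i++) {
            if (q[i] > u[i]) {
                lb += (q[i] - u[i]) * (q[i] - u[i]);
            } else if (q[i] < l[i]) {
                lb += (q[i] - l[i]) * (q[i] - l[i]);
            }
        }
        return lb;
    }
}
```

Points of q that stay inside the envelope contribute nothing, so a sequence compared against its own envelope always yields a bound of 0.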

【Opinion 3】 Using Early Abandoning of ED and LB_Keogh

[Figure: DTW.assets/image-20200228100225112.png]
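A minimal sketch of early abandoning for the squared Euclidean distance: the running sum is compared with the best-so-far distance after every term, and the computation stops as soon as the candidate can no longer win. Returning +Infinity to signal an abandoned candidate is a convention of this sketch, and the names are hypothetical.

```java
final class EarlyAbandonEd {
    /** Squared ED between q and c; returns +Infinity once the running sum exceeds bestSoFar. */
    static double squaredEd(double[] q, double[] c, double bestSoFar) {
        double sum = 0.0;
        for (int i = 0; i < q.length; i++) {
            double diff = q[i] - c[i];
            sum += diff * diff;
            if (sum > bestSoFar) {
                return Double.POSITIVE_INFINITY; // this candidate cannot beat the best match
            }
        }
        return sum;
    }
}
```

The same pattern applies to LB_Keogh, because it is also an incrementally accumulated sum of non-negative terms.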

【Opinion 3】 Using Early Abandoning of DTW

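In the right-hand figure, DTW is computed incrementally from the left, while the remaining part on the right uses the LB_Keogh contributions computed earlier. The code is as follows: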

    /// Calculate Dynamic Time Warping distance
    /// A, B: data and query, respectively
    /// cb  : cumulative bound used for early abandoning
    /// r   : size of Sakoe-Chiba warping band
    private static double dtw(double[] A, double[] B, double[] cb, int m, int r,
                              double bsf = double.PositiveInfinity)
    {
        double[] cost;
        double[] cost_prev;
        double[] cost_tmp;
        int i, j, k;
        double x, y, z, min_cost;

        // Instead of using a matrix of size O(m^2) or O(mr), we reuse two arrays of size O(r).
        cost = new double[2 * r + 1];
        for (k = 0; k < 2 * r + 1; k++) cost[k] = double.PositiveInfinity;

        cost_prev = new double[2 * r + 1];
        for (k = 0; k < 2 * r + 1; k++) cost_prev[k] = double.PositiveInfinity;

        for (i = 0; i < m; i++)
        {
            k = max(0, r - i);
            min_cost = double.PositiveInfinity;

            for (j = max(0, i - r); j <= min(m - 1, i + r); j++, k++)
            {
                // Initialize all row and column
                if ((i == 0) && (j == 0))
                {
                    cost[k] = dist(A[0], B[0]);
                    min_cost = cost[k];
                    continue;
                }

                if ((j - 1 < 0) || (k - 1 < 0)) y = double.PositiveInfinity;
                else y = cost[k - 1];
                if ((i - 1 < 0) || (k + 1 > 2 * r)) x = double.PositiveInfinity;
                else x = cost_prev[k + 1];
                if ((i - 1 < 0) || (j - 1 < 0)) z = double.PositiveInfinity;
                else z = cost_prev[k];

                // Classic DTW calculation
                cost[k] = min(min(x, y), z) + dist(A[i], B[j]);

                // Find minimum cost in row for early abandoning (possibly to use column instead of row).
                if (cost[k] < min_cost)
                {
                    min_cost = cost[k];
                }
            }

            // We can abandon early if the current cumulative distance together with the lower bound is larger than bsf
            if (i + r < m - 1 && min_cost + cb[i + r + 1] >= bsf)
            {
                return min_cost + cb[i + r + 1];
            }

            // Move current array to previous array.
            cost_tmp = cost;
            cost = cost_prev;
            cost_prev = cost_tmp;
        }
        k--;

        // The DTW distance is in the last cell of the matrix of size O(m^2), or at the middle of our array.
        double final_dtw = cost_prev[k];
        return final_dtw;
    }

【Opinion 4】 The UCR Suite

Early Abandoning Z-Normalization
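A minimal sketch of the idea, assuming squared Euclidean distance and a query that is already z-normalized. The mean and standard deviation of each window come from running sums, and each data point is normalized only when the distance loop reaches it, so an abandoned candidate also skips the rest of its normalization. The class name, method name and the variance floor are assumptions of this sketch.

```java
final class OnlineZNormSearch {
    /** Returns the start index of the subsequence of t closest to the z-normalized query qNorm. */
    static int bestMatch(double[] t, double[] qNorm) {
        int m = qNorm.length;
        double ex = 0.0, ex2 = 0.0; // running sum and running sum of squares
        double bestSoFar = Double.POSITIVE_INFINITY;
        int bestStart = -1;
        for (int i = 0; i < t.length; i++) {
            ex += t[i];
            ex2 += t[i] * t[i];
            if (i >= m - 1) {
                int start = i - m + 1;
                double mean = ex / m;
                // Small floor (an assumption) so constant windows do not divide by zero.
                double std = Math.sqrt(Math.max(ex2 / m - mean * mean, 1e-12));
                double d = 0.0;
                // Normalize each point just in time; stop once d reaches bestSoFar.
                for (int k = 0; k < m && d < bestSoFar; k++) {
                    double x = (t[start + k] - mean) / std;
                    double diff = x - qNorm[k];
                    d += diff * diff;
                }
                if (d < bestSoFar) {
                    bestSoFar = d;
                    bestStart = start;
                }
                // Slide the window: drop the oldest point from the running sums.
                ex -= t[start];
                ex2 -= t[start] * t[start];
            }
        }
        return bestStart;
    }
}
```

Long-running sums accumulate floating-point error, so a production implementation would periodically recompute them; this sketch omits that step.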

Reordering Early Abandoning
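A minimal sketch of reordering, assuming the query is z-normalized and the distance is the squared Euclidean distance. The query indices are visited in order of decreasing absolute value, because points far from the mean tend to contribute the largest terms and therefore trigger abandoning sooner. The names are hypothetical.

```java
import java.util.Arrays;

final class ReorderedEd {
    /** Query indices sorted by decreasing |q[i]| (q is assumed to be z-normalized). */
    static Integer[] sortOrder(double[] q) {
        Integer[] order = new Integer[q.length];
        for (int i = 0; i < q.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(Math.abs(q[b]), Math.abs(q[a])));
        return order;
    }

    /** Number of terms accumulated before the squared ED exceeds bestSoFar. */
    static int stepsToAbandon(double[] q, double[] c, Integer[] order, double bestSoFar) {
        double sum = 0.0;
        int steps = 0;
        for (int idx : order) {
            double diff = q[idx] - c[idx];
            sum += diff * diff;
            steps++;
            if (sum > bestSoFar) break;
        }
        return steps;
    }
}
```

Because the query is fixed, the order is computed once and reused for every candidate subsequence.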

Reversing the Query/Data Role in LB_Keogh

Corresponding code:

Cascading Lower Bounds
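A minimal sketch of the cascade, assuming the lower bounds are supplied in order of increasing cost and that the expensive distance (DTW in the paper) is only computed when no bound prunes the candidate. The functional-interface design and all names are choices of this sketch, not the structure of the UCR Suite code.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

final class Cascade {
    /**
     * Tries each lower bound in order (cheapest first). Returns +Infinity if any bound
     * already reaches bestSoFar; otherwise returns the exact distance.
     */
    static double distanceOrPrune(double[] candidate, double bestSoFar,
                                  List<ToDoubleFunction<double[]>> lowerBounds,
                                  ToDoubleFunction<double[]> exactDistance) {
        for (ToDoubleFunction<double[]> lb : lowerBounds) {
            if (lb.applyAsDouble(candidate) >= bestSoFar) {
                return Double.POSITIVE_INFINITY; // pruned without computing the exact distance
            }
        }
        return exactDistance.applyAsDouble(candidate);
    }
}
```

Ordering the bounds from cheap and loose to expensive and tight means most candidates are rejected by the cheapest test that is able to reject them.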

Experiment

Random walks of length 20 million with increasingly long queries

Supporting Very Long queries: DNA

Applications: online time series motifs, classification of historical musical scores, classification of ancient coins, clustering of star light curves

Notes

Code available: http://www.cs.ucr.edu/~eamonn/UCRsuite.html

Code Explanation:

Computing the envelope

    /// Finding the envelope of min and max value for LB_Keogh
    /// Implementation idea is introduced by Daniel Lemire in his paper
    /// "Faster Retrieval with a Two-Pass Dynamic-Time-Warping Lower Bound", Pattern Recognition 42(9), 2009.
    public static void lower_upper_lemire(double[] t, int len, int r, double[] l, double[] u)
    {
        Deque du = new Deque();
        Deque dl = new Deque();

        init(ref du, 2 * r + 2);
        init(ref dl, 2 * r + 2);

        push_back(ref du, 0);
        push_back(ref dl, 0);

        for (int i = 1; i < len; i++)
        {
            if (i > r)
            {
                u[i - r - 1] = t[front(ref du)];
                l[i - r - 1] = t[front(ref dl)];
            }
            if (t[i] > t[i - 1])
            {
                pop_back(ref du);
                while (!du.Empty && t[i] > t[back(ref du)])
                    pop_back(ref du);
            }
            else
            {
                pop_back(ref dl);
                while (!dl.Empty && t[i] < t[back(ref dl)])
                    pop_back(ref dl);
            }
            push_back(ref du, i);
            push_back(ref dl, i);
            if (i == 2 * r + 1 + front(ref du))
                pop_front(ref du);
            else if (i == 2 * r + 1 + front(ref dl))
                pop_front(ref dl);
        }
        for (int i = len; i < len + r + 1; i++)
        {
            u[i - r - 1] = t[front(ref du)];
            l[i - r - 1] = t[front(ref dl)];
            if (i - front(ref du) >= 2 * r + 1)
                pop_front(ref du);
            if (i - front(ref dl) >= 2 * r + 1)
                pop_front(ref dl);
        }
    }

    /// LB_Keogh 1: Create envelope for the query
    /// Note that because the query is known, the envelope can be created once at the beginning.
    ///
    /// Variable explanation:
    /// order : sorted indices for the query.
    /// uo, lo: upper and lower envelopes for the query, already sorted.
    /// t     : a circular array keeping the current data.
    /// j     : index of the starting location in t
    /// cb    : (output) current bound at each position. It will be used later for early abandoning in DTW.
    private static double lb_keogh_cumulative(long[] order, double[] t, double[] uo, double[] lo, double[] cb,
                                              long j, int len, double mean, double std,
                                              double best_so_far = double.PositiveInfinity)
    {
        double lb = 0;
        double x, d;

        for (int i = 0; i < len && lb < best_so_far; i++)
        {
            x = (t[(order[i] + j)] - mean) / std;
            d = 0;
            if (x > uo[i])
                d = dist(x, uo[i]);
            else if (x < lo[i])
                d = dist(x, lo[i]);
            lb += d;
            cb[order[i]] = d;
        }
        return lb;
    }

level:
author: François Petitjean Faculty of IT, Monash University
date: 2014
keyword:

DTW Sequence Analyze

Paper: DBA

Dynamic Time Warping Averaging of Time Series allows Faster and more Accurate Classification

Summary

Exploits a recent result that allows meaningful averaging of warped time series, which makes it possible to create ultra-efficient Nearest "Centroid" classifiers that are at least as accurate as their more lethargic Nearest Neighbor cousins.
Application areas: reducing the data cardinality, reducing the data dimensionality (the idea works well when the raw data is oversampled), and reducing the number of objects the nearest neighbor algorithm must see.

Research Objective

Application Area: sequence analysis
Purpose: using the DBA method to represent a category

Problem Statement

previous work:

NN-DTW algorithms are competitive or superior in domains as diverse as gesture recognition, robotics and ECG classification [1].
DBA can be used to speed up NN-DTW by constructing the most representative time series of each class and using only those for training.
Sometimes NCC (Nearest Centroid Classifier) and NN have approximately the same accuracy; in such cases we prefer NCC because it is faster and requires less memory.
Sometimes NCC is more accurate than NN; in such cases we prefer NCC because of the accuracy gains, and the reduced computational requirements come for free.

Methods

Problem Formulation:
- Definitions: a dataset $D=\{T_1,...,T_N\}$ of time series, where each $T=(t_1,t_2,...,t_L)$ and $L$ is the length.
- Averaging under time warping: finding the multiple alignment of a set of sequences, or its average sequence (often called the consensus sequence in biology), is a typical chicken-and-egg problem: knowing the average sequence provides a multiple alignment, and vice versa. Finding the solution to the multiple alignment problem (and thus finding an average sequence) has been shown to be NP-complete, with the exact solution requiring $O(L^N)$ operations for $N$ sequences of length $L$.
- Average object: given a set of objects $O=\{O_1,...,O_N\}$ in a space $E$ induced by a measure $d$, the average object $\vec{o}$ is the object that minimizes the sum of squared distances to the set:
$$\arg\min_{\vec{o}\in E}\sum_{i=1}^N d^2(\vec{o},O_i)$$
- Average time series for DTW:
$$\arg\min_{\vec{T}\in E}\sum_{i=1}^N DTW^2(\vec{T},T_i)$$
- DBA: the best-so-far method to average time series for Dynamic Time Warping. DBA iteratively refines an average sequence $\vec{T}$ and follows an expectation-maximization scheme:
  - Consider $\vec{T}$ fixed and find the best multiple alignment $M$ of the set of sequences $D$ consistent with $\vec{T}$.
  - Consider $M$ fixed and update $\vec{T}$ as the best average sequence consistent with $M$.
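The two alternating steps can be sketched as a single refinement pass. This is a simplified illustration, assuming unconstrained DTW with squared point-wise cost and a full cost matrix; it is not the authors' implementation, and the names are hypothetical.

```java
final class DbaSketch {
    /** One DBA refinement step: align every series to avg with DTW, then average the aligned values. */
    static double[] dbaUpdate(double[] avg, double[][] series) {
        double[] sum = new double[avg.length];
        int[] count = new int[avg.length];
        for (double[] s : series) {
            int n = avg.length, m = s.length;
            // Alignment step: cumulative DTW cost matrix between avg and s.
            double[][] cost = new double[n][m];
            for (int a = 0; a < n; a++) {
                for (int b = 0; b < m; b++) {
                    double d = (avg[a] - s[b]) * (avg[a] - s[b]);
                    double best;
                    if (a == 0 && b == 0) best = 0.0;
                    else if (a == 0) best = cost[0][b - 1];
                    else if (b == 0) best = cost[a - 1][0];
                    else best = Math.min(cost[a - 1][b - 1], Math.min(cost[a - 1][b], cost[a][b - 1]));
                    cost[a][b] = d + best;
                }
            }
            // Backtrack the warping path and collect the values aligned to each coordinate of avg.
            int i = n - 1, j = m - 1;
            while (true) {
                sum[i] += s[j];
                count[i]++;
                if (i == 0 && j == 0) break;
                if (i == 0) j--;
                else if (j == 0) i--;
                else {
                    double diag = cost[i - 1][j - 1], up = cost[i - 1][j], left = cost[i][j - 1];
                    if (diag <= up && diag <= left) { i--; j--; }
                    else if (up <= left) i--;
                    else j--;
                }
            }
        }
        // Update step: each coordinate becomes the mean of the values aligned to it.
        double[] updated = new double[avg.length];
        for (int k = 0; k < updated.length; k++) updated[k] = sum[k] / count[k];
        return updated;
    }
}
```

Calling `dbaUpdate` repeatedly, feeding each result back in as the new average, gives the iterative refinement; the choice of starting sequence and stopping rule is left open here.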

[Figure: DTW.assets/image-20200305122617085.png]

Evaluation

(Left) NN has an error rate of 12.60%, while the Nearest Centroid classifier (right), using the same instances, achieves an error rate of just 5.22%.

Notes

Nearest centroid classifier [1]
code available: [23] Matlab and Java source code for DBA

level:
author: Germain Forestier,University of Haute-Alsace, Mulhouse, France
date:
keyword:

DTW, Series generate

Paper: Generating Synthetic series

Generating synthetic time series to augment sparse datasets

Summary

extend DBA to calculate a weighted average of time series under DTW.
enlarge training sets by generating synthetic (or artificial) examples.
can generate an unlimited number of synthetic time series and tailor the weights distribution to achieve diversity.
deals with the cold start problem, and uses synthetic time series to double the size of the training sets regardless of their original sizes.

Research Objective

Application Area: sequence analysis
Purpose: address the cold start problem by using synthetic time series to double the size of the training sets regardless of their original sizes.

Problem Statement

Background

In some cases, it can be easier to express our knowledge of the problem by generating synthetic data than by modifying the classifier itself. For instance, images containing street numbers on houses can be slightly rotated without changing what number they actually are. Voice can be slightly accelerated or slowed down without modifying the meaning. We can replace some words in a sentence with a close synonym without completely altering its meaning.

previous work:

Le Guennec et al. proposed to stretch or shrink randomly selected slices of a time series in order to create synthetic examples.

Methods

Problem Formulation:
- Definition 1: a dataset $D=\{T_1,T_2,...,T_N\}$, where each time series, for example $T_1=\langle t_1,t_2,...,t_L\rangle$, has length $L$.
- Definition 2: average time series for DTW:
$$\arg\min_{\vec{T}\in E}\sum_{i=1}^N DTW^2(\vec{T},T_i)$$
- Definition 3: weighted average of time series under DTW. Given a weighted set of time series $D=\{(T_1,w_1),...,(T_N,w_N)\}$ in a space $E$ induced by DTW, the average time series $\vec{T}$ is the time series that minimizes:

$$\arg\min_{\vec{T}\in E}\sum_{i=1}^N w_i\cdot DTW^2(\vec{T},T_i)$$

DBA uses an expectation-maximization scheme and iteratively refines a starting average $\vec{T}$ by:
- Expectation: considering $\vec{T}$ fixed and finding the best multiple alignment $M$ of the set of sequences $D$ consistent with $\vec{T}$.
- Maximization: considering $M$ fixed and updating $\vec{T}$ as the best average sequence consistent with $M$.
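For the weighted case, the update of one coordinate of the average can be sketched as a weighted mean over the values that the alignment associates with that coordinate. The sketch assumes the alignment has already been computed; the data layout and names are hypothetical.

```java
final class WeightedDbaUpdate {
    /**
     * aligned[i] holds the values of series i that DTW aligned to one coordinate of the average;
     * w[i] is the weight of series i. Returns the weighted mean of all aligned values.
     */
    static double updateCoordinate(double[][] aligned, double[] w) {
        double weightedSum = 0.0, totalWeight = 0.0;
        for (int i = 0; i < aligned.length; i++) {
            for (double v : aligned[i]) {
                weightedSum += w[i] * v;
                totalWeight += w[i];
            }
        }
        return weightedSum / totalWeight;
    }
}
```

With all weights equal this reduces to the plain mean used by unweighted DBA.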

【Problem Definition】

How to compute a weighted average consistently with dynamic time warping.
How to decide upon the weights to give to each time series.

[Weighted average of time series for DTW]

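DTW_alignment: follow the DTW warping path to find the aligned pairs; for each position of the average, add up the values of the elements aligned to it, count them, and take the mean. The code fragment is as follows: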

    while (pathMatrix[i][j] != DBA.NIL) {
        updatedMean[i] += T[j];
        nElementsForMean[i]++;
        move = pathMatrix[i][j];
        i += moveI[move];
        j += moveJ[move];
    }
    assert (i != 0 || j != 0);
    updatedMean[i] += T[j];
    nElementsForMean[i]++;

The Java code for the medoid method is as follows:

    private static int approximateMedoidIndex(double[][] sequences, double[][] mat) {
        /*
         * We are finding the medoid. As this can take a bit of time, if
         * there are more than 50 time series, we sample 50 as possible
         * medoid candidates.
         */
        ArrayList<Integer> allIndices = new ArrayList<>();
        for (int i = 0; i < sequences.length; i++) {
            allIndices.add(i);
        }
        Collections.shuffle(allIndices);
        ArrayList<Integer> medianIndices = new ArrayList<>();
        for (int i = 0; i < sequences.length && i < 50; i++) {
            medianIndices.add(allIndices.get(i));
        }

        int indexMedoid = -1;
        double lowestSoS = Double.MAX_VALUE;

        for (int medianCandidateIndex : medianIndices) {
            double[] possibleMedoid = sequences[medianCandidateIndex];
            double tmpSoS = sumOfSquares(possibleMedoid, sequences, mat);
            if (tmpSoS < lowestSoS) {
                indexMedoid = medianCandidateIndex;
                lowestSoS = tmpSoS;
            }
        }
        return indexMedoid;
    }

    private static double sumOfSquares(double[] sequence, double[][] sequences, double[][] mat) {
        double sos = 0.0;
        for (int i = 0; i < sequences.length; i++) {
            double dist = DTW(sequence, sequences[i], mat);
            sos += dist * dist;
        }
        return sos;
    }

[Average All]

We first propose to sample the weights vector following a flat Dirichlet distribution with unit concentration parameter, $w \sim Dir(1)$. We used a low value for the shape parameter (0.2 in this paper) of the Gamma-distributed random variable used for the Dirichlet distribution, in order to give more weight to a time series that is then used as the initial object to update in the Weighted DBA algorithm.

The following two methods first select a subset of the time series to average.
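A minimal sketch of drawing such a weight vector, relying on the standard fact that normalizing independent Gamma(shape, 1) draws yields a symmetric Dirichlet sample. The Gamma sampler uses the Marsaglia-Tsang method, which is a choice of this sketch since the Java standard library has no Gamma sampler; the names are hypothetical.

```java
import java.util.Random;

final class DirichletWeights {
    /** Gamma(shape, 1) sample via Marsaglia-Tsang, with the usual boost for shape < 1. */
    static double sampleGamma(double shape, Random rng) {
        if (shape < 1.0) {
            double u = rng.nextDouble();
            return sampleGamma(shape + 1.0, rng) * Math.pow(u, 1.0 / shape);
        }
        double d = shape - 1.0 / 3.0;
        double c = 1.0 / Math.sqrt(9.0 * d);
        while (true) {
            double x = rng.nextGaussian();
            double v = 1.0 + c * x;
            if (v <= 0.0) continue;
            v = v * v * v;
            double u = rng.nextDouble();
            if (u < 1.0 - 0.0331 * x * x * x * x) return d * v;
            if (Math.log(u) < 0.5 * x * x + d * (1.0 - v + Math.log(v))) return d * v;
        }
    }

    /** Normalized Gamma draws: a non-negative weight vector that sums to 1. */
    static double[] sampleWeights(int n, double shape, Random rng) {
        double[] w = new double[n];
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            w[i] = sampleGamma(shape, rng);
            total += w[i];
        }
        for (int i = 0; i < n; i++) w[i] /= total;
        return w;
    }
}
```

With a small shape such as 0.2, most of the mass typically lands on one or two series, which matches the stated goal of giving more weight to a single time series.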

[ Average Selected (AS) ]

[Figure: DTW.assets/image-20200305112129474.png]

[ Average Selected with Distance (ASD)]

Notes

code available: https://github.com/fpetitjean/DBA
Dirichlet distribution

[Figure: DTW.assets/image-20200305113855885.png]

level: PMLR 2017, Sydney, Australia
author: Marco Cuturi, Mathieu Blondel
date: 2017
keyword:

DTW, Sequence prediction

Paper: Soft-DTW

Soft-DTW: a Differentiable Loss Function for Time-Series

Summary

Proposes a differentiable learning loss between time series, building upon the DTW discrepancy.
Computes the soft-minimum of all alignment costs; both its value and its gradient can be computed with quadratic time complexity.
Proposes using soft-DTW as a fitting term to compare the output of a machine synthesizing a time series segment with a ground-truth observation.

Research Objective

Application Area: sequence analysis
Purpose: define a loss for the case where the data being compared are sequences rather than vectors

Problem Statement

The gradients of soft-DTW with respect to all of its variables can be computed as a by-product of the computation of the discrepancy itself, with an added quadratic storage cost.

Methods

【Opinion 1】DTW and soft-DTW loss function
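A minimal sketch of the soft-DTW value, assuming a squared point-wise cost and the usual dynamic-programming recursion in which the hard minimum of DTW is replaced by a soft-minimum with smoothing parameter gamma. Only the forward value is computed here, not the gradient, and the names are hypothetical.

```java
final class SoftDtw {
    /** Soft-minimum: -gamma * log(sum_k exp(-x_k / gamma)), computed in a numerically stable way. */
    static double softMin(double a, double b, double c, double gamma) {
        double min = Math.min(a, Math.min(b, c));
        if (Double.isInfinite(min)) return min; // all three predecessors are unreachable
        double sum = Math.exp(-(a - min) / gamma)
                   + Math.exp(-(b - min) / gamma)
                   + Math.exp(-(c - min) / gamma);
        return min - gamma * Math.log(sum);
    }

    /** Soft-DTW value between x and y with squared point-wise cost; O(n*m) time and memory. */
    static double softDtw(double[] x, double[] y, double gamma) {
        int n = x.length, m = y.length;
        double[][] r = new double[n + 1][m + 1];
        for (double[] row : r) java.util.Arrays.fill(row, Double.POSITIVE_INFINITY);
        r[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double cost = (x[i - 1] - y[j - 1]) * (x[i - 1] - y[j - 1]);
                r[i][j] = cost + softMin(r[i - 1][j - 1], r[i - 1][j], r[i][j - 1], gamma);
            }
        }
        return r[n][m];
    }
}
```

As gamma approaches 0 the soft-minimum approaches the hard minimum and the value approaches classic DTW; for larger gamma the value can drop below the DTW cost and even become negative.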

【To be read more carefully】I do not yet understand what follows.

Evaluation

Environment:
- Dataset:
Average with soft-DTW loss

Clustering with the soft-DTW geometry

Multistep-ahead prediction

Conclusion

Notes (to study further)

code available: https://github.com/mblondel/soft-dtw
