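Removing Repeating Characters from a String

Introduction

I often answer questions in which a string needs to be "cleaned" of multiple space characters. The most common fix is to remove leading or trailing spaces, and for that there are very handy intrinsic VB functions (LTrim, RTrim, Trim). However, these functions do not affect any repeated space characters between the first and last non-space characters (internal spaces).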

In this article, I will explore different solutions to this problem and evaluate their performance. You should be able to apply this in any of the Office products (Access, Excel, PowerPoint, Word), VBScript, or VB6. However, the collection object is not available in the VBScript environment, so you won't be able to use or test any of the methods that use the collection object in your VBS batch jobs.

Note: You can use these methods to remove any repeating characters inside a string, not just repeated space characters.

Problem Context

You often have no say in the data you must process. If you receive well-formatted, normalized, and structured data, count yourself lucky. However, if you receive any of the following, this article may help you.

  • unstructured data (text files) that you will need to parse or format
  • text or memo (varchar) database fields with raw data
  • HTML or XML that may look good in a browser, but isn't easy to parse or format
  • XML or JSON (any tagged data format) that you need to compact before sending or saving

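Performance Methodologies

The performance of the measured methods is affected by the length of the string and the number of (internal) space characters to be removed. For simplicity, I evaluate the code and methods against different permutations of the Gettysburg Address: one paragraph, the entire address, ten copies of the address concatenated into one string, and one hundred copies concatenated into one string. A description of the permutation routine is at the end of this article.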

In addition to different permutations of the Gettysburg Address, I am measuring the code against both the string permutations and a version of each string permutation with additional inserted space characters between the words. A description of the space-insertion routine is at the end of this article.

I'm showing different methods in order of simplicity of the code, with the intrinsic VB functions (Replace, Split, Join) being covered first before the ActiveX objects (Regexp, WorksheetFunction.Trim). In the Performance Results and comparisons section, I order the methods from fastest to slowest. I will include some explanation of why these behave differently.

Although I used different timing methods in my tests, I only include the most precise method in the attached sample code. When the timed statements run very quickly (in less than a millisecond), most of the easiest-to-code timing methods fail to register that any time has passed.
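
As a rough illustration of sub-millisecond timing, here is a minimal sketch built on the Windows high-resolution performance counter. It is an assumption on my part that this resembles the method in the attached sample code. The helper name TimeOneInStr and the search text are hypothetical. The Declare statements use the common trick of receiving the 64-bit counter in a Currency variable; the Currency scaling cancels out in the division.

```vba
Option Explicit

' Assumed API declarations (not from the article's attachment).
#If VBA7 Then
    Private Declare PtrSafe Function QueryPerformanceCounter Lib "kernel32" _
        (ByRef lpCount As Currency) As Long
    Private Declare PtrSafe Function QueryPerformanceFrequency Lib "kernel32" _
        (ByRef lpFreq As Currency) As Long
#Else
    Private Declare Function QueryPerformanceCounter Lib "kernel32" _
        (ByRef lpCount As Currency) As Long
    Private Declare Function QueryPerformanceFrequency Lib "kernel32" _
        (ByRef lpFreq As Currency) As Long
#End If

' Hypothetical helper: seconds taken by one InStr() search that finds nothing.
Function TimeOneInStr(ByVal strText As String) As Double
    Dim curFreq As Currency
    Dim curStart As Currency
    Dim curStop As Currency
    Dim lngPos As Long

    QueryPerformanceFrequency curFreq
    QueryPerformanceCounter curStart
    lngPos = InStr(strText, "@@not in the text@@")
    QueryPerformanceCounter curStop

    ' Both values carry the same Currency scaling, so the ratio is in seconds.
    TimeOneInStr = (curStop - curStart) / curFreq
End Function
```

A single reading like this is noisy, which is why the measurements below are repeated and summarized with a median.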

It is always difficult to get reliable timing data from a Windows OS. There are so many things happening with other applications, utilities, and services that you need to give yourself permission to ignore outliers. In these tests, I am only concerned with normal execution values. On the advice of a friend who works in statistics, I used the median function, which eliminates spikes in the results. The median is the middle value in a sorted list of values, or the average of the two middle values when the list has an even number of entries.

Before each test, I set the priority of the Windows process to High and closed all non-essential applications. As part of the launching process, the application window is minimized. I walked away from my laptop for the 5-7 minutes required for each test run.

Preparation of the data

Here are the steps I took to measure the algorithms.

  1. Make three separate 'performance test' executions.
  2. Each performance test creates four different permutation lengths of the address.
  3. For each address permutation, the set of algorithms is invoked against the plain permutation and a 'space-stuffed' version of the address.
  4. Invoke each algorithm 21 times.
  5. Calculate the median of each algorithm.
  6. Normalize each algorithm's median against the median of an InStr() for that iteration's permutation (plain or stuffed). The InStr() function is looking for a string that does not exist in that iteration's string.
  7. The three normalized 'performance test' execution sets are averaged.
  8. The averaged (normalized) results are sorted by each test's overall average.
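
Steps 5 and 6 above can be sketched roughly as follows. This is not the code from the attached sample. MedianOf and NormalizedScore are hypothetical names, and the simple insertion sort is only an assumption that suits short lists such as 21 timings.

```vba
' Hypothetical helper: median of an array of timings (the input is not modified).
Function MedianOf(ByRef dblTimes() As Double) As Double
    Dim dblSorted() As Double
    Dim dblItem As Double
    Dim i As Long, j As Long
    Dim lngLow As Long, lngCount As Long

    dblSorted = dblTimes                     'work on a copy
    lngLow = LBound(dblSorted)
    lngCount = UBound(dblSorted) - lngLow + 1

    'insertion sort, ascending
    For i = lngLow + 1 To UBound(dblSorted)
        dblItem = dblSorted(i)
        j = i - 1
        Do While j >= lngLow
            If dblSorted(j) <= dblItem Then Exit Do
            dblSorted(j + 1) = dblSorted(j)
            j = j - 1
        Loop
        dblSorted(j + 1) = dblItem
    Next

    If lngCount Mod 2 = 1 Then
        MedianOf = dblSorted(lngLow + lngCount \ 2)
    Else
        MedianOf = (dblSorted(lngLow + lngCount \ 2 - 1) + _
                    dblSorted(lngLow + lngCount \ 2)) / 2
    End If
End Function

' Hypothetical helper: a method's median relative to the InStr() median.
Function NormalizedScore(ByRef dblMethod() As Double, _
                         ByRef dblInStr() As Double) As Double
    NormalizedScore = MedianOf(dblMethod) / MedianOf(dblInStr)
End Function
```

With 21 runs per algorithm the list length is odd, so the median is simply the eleventh sorted value.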

Gettysburg Address Data Profile

The four permutations of the address provide a wide range of data profiles for the code to process. Without any additional internal spaces inserted, the profiles look like this:

Permutation          Words   Non-space chars   Total chars
First paragraph         30               146           175
Entire Gburg Addr      271              1186          1464
Gburg Addr x10        2710             11860         14640
Gburg Addr x100      27100            118600        146400

Simple Replace -- [just two]

Don't fall into the trap of thinking that a single call to the Replace() function will reliably remove all instances of repeated characters. You need to iterate the Replace() function until there are no more repeated characters.

Function JustTwo(ByVal parmString As String) As String
    '====================================
    'Replace all double space strings with a single space.
    'Iterate until there are no more double space character
    'strings
    '====================================
    Dim strTemp As String
    strTemp = parmString
    Do Until InStr(strTemp, "  ") = 0
        strTemp = Replace(strTemp, "  ", " ")
    Loop
    JustTwo = strTemp
End Function
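
To see why one pass is not enough, consider this small sketch. SinglePassReplace is a hypothetical name used only for the illustration. Replace() scans left to right and does not re-examine the text it has just produced, so a run of three or four spaces still leaves a double space behind.

```vba
' Hypothetical helper: one Replace() pass, with no iteration.
Function SinglePassReplace(ByVal strIn As String) As String
    SinglePassReplace = Replace(strIn, "  ", " ")
End Function

Sub DemoSinglePass()
    'Three spaces: the first pair becomes one space, the third space remains.
    Debug.Print Len(SinglePassReplace("a" & Space(3) & "b"))   'prints 4
    'Four spaces: two pairs become two single spaces, still a double space.
    Debug.Print Len(SinglePassReplace("a" & Space(4) & "b"))   'prints 4
End Sub
```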

Ambitious Replace - [three & two]

Now that we know we have a simple, reliable, and fast method, it's time to ask whether it can run any faster. Obviously, the answer is "yes". Otherwise, I wouldn't be able to write a very interesting and educational article. In the following method, I added a second loop, so the routine first replaces all runs of three consecutive spaces and then all runs of two consecutive spaces.

Function ThreeTwo(ByVal parmString As String) As String
    '================================================
    'Replace all three consecutive spaces with one space,
    'then replace all two consecutive spaces with one space
    '================================================
    Dim strTemp As String
    strTemp = parmString
    'Replace three space strings with a single space until
    'no more instances of three space strings exist
    Do Until InStr(strTemp, "   ") = 0
        strTemp = Replace(strTemp, "   ", " ")
    Loop
    'Replace two space strings with a single space until no
    'more instances of two space strings exist
    Do Until InStr(strTemp, "  ") = 0
        strTemp = Replace(strTemp, "  ", " ")
    Loop
    ThreeTwo = strTemp
End Function

Complex Replace - [multiple replaces]

Let's kick the Ambitious Replace approach up a notch...BAM! If you are familiar with the Shell sort, this should be a somewhat familiar algorithm. We attempt to replace some longer space character sequences before shorter ones. This algorithm extends the Ambitious Replace algorithm beyond just the three-space and two-space sequences. If you know something about your data, you might get great results by optimizing the string sizes you replace.

Function MultiLengths(ByVal parmString As String, _
                      ByVal parmLengths As Variant) As String
    '==================================
    'Iterate the parmLengths array and invoke the Replace() function
    'with a space string of each length.
    '==================================
    Dim vItem As Variant
    Dim strTemp As String
    Dim strFind As String
    strTemp = parmString
    For Each vItem In parmLengths
        strFind = Space(vItem)    'create vItem length space string
        Do Until InStr(strTemp, strFind) = 0
            strTemp = Replace(strTemp, strFind, " ")
        Loop
    Next
    MultiLengths = strTemp
End Function
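
The following sketch shows one way such a routine might be called. The function is restated under the hypothetical name CollapseByLengths so the example runs on its own, and the lengths 13, 5, 3, and 2 are arbitrary illustrations rather than tuned values. Two assumptions matter for correctness: the list should end with 2 so that no double spaces survive, and it must never contain 1, because replacing a single space with a single space would loop forever.

```vba
' Restated for the sketch: collapse runs of spaces, longest lengths first.
Function CollapseByLengths(ByVal strIn As String, _
                           ByVal vLengths As Variant) As String
    Dim vItem As Variant
    Dim strFind As String
    For Each vItem In vLengths
        strFind = Space(vItem)
        Do Until InStr(strIn, strFind) = 0
            strIn = Replace(strIn, strFind, " ")
        Loop
    Next
    CollapseByLengths = strIn
End Function

Sub DemoCollapseByLengths()
    Dim strStuffed As String
    strStuffed = "four" & Space(20) & "score" & Space(7) & "and"
    'Lengths are illustrative; the final 2 guarantees a clean result.
    Debug.Print CollapseByLengths(strStuffed, Array(13, 5, 3, 2))
End Sub
```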

Intrinsic function performance

Before we look at the Split() methods, it may help to understand what portion of the total elapsed time is taken up by the InStr(), Split(), and Join() functions alone.

Max InStr() perf:

Max (sec)    Min          Median       Avg          Q3           Test
0.000003283  0.000001676  0.000002095  0.000002165  0.000002165  Plain 175
0.000005029  0.000004540  0.000004889  0.000004872  0.000004959  Plain 1464
0.000034990  0.000031010  0.000033873  0.000033730  0.000034292  Plain 14640
0.000322806  0.000295429  0.000320222  0.000318942  0.000321130  Plain 146400
0.000003492  0.000002584  0.000002933  0.000002910  0.000002933  Stuffed 175
0.000012083  0.000011105  0.000011594  0.000011617  0.000011733  Stuffed 4700
0.000108883  0.000098127  0.000104622  0.000104429  0.000105181  Stuffed 47759
0.001042451  0.001010883  0.001034070  0.001032131  0.001037632  Stuffed 477271

Performance relative to the Max InStr():

              Plain 175   Plain 1464  Plain 14640 Plain 146400
Join() perf:    1.9          3.7         4.4         5.9
Split2 perf:    2.5          5.2         6.7         7.2
Split perf:     5.7         17.6        22.9        24.6

              Stuffed 514 Stuffed 4700    Stuffed 47759   Stuffed 477271
Join() perf:     1.5         1.5             1.4               1.9
Split2 perf:    18.2        39.3            47.5             398.0
Split perf:     35.5        79.4            96.6            1793.5

The fact that the InStr() function performed best in all tests prompted me to use its performance as a normalization value for the measured code results. Also, note that the Split2 performance (delimiter = two space characters) is roughly 2 to 4.5 times faster than the Split performance (delimiter = single space character).

Simple Split & Join - [split & join]

The Split() function is an easy-to-use and useful parsing function. Unfortunately, it has no ability to ignore repeated delimiters. Therefore, we must iterate the Split() results, looking for non-zero-length items.

Function SplitJoin(ByVal parmString As String) As String
    '================================================
    'Split() string with single space character delimiter,
    'add non-empty strings to the strWords array.
    'Then Join() the strWords array items with
    'a single space character
    '================================================
    Dim strWords() As String
    Dim strParsed() As String
    Dim vItem As Variant
    Dim lngLoop As Long
    Dim lngWord As Long
    Dim strtemp As String
    strParsed = Split(parmString, " ")
    ReDim strWords(0 To UBound(strParsed))
    lngLoop = 0
    lngWord = 0
    'Add non-empty strings to strWords array
    For lngLoop = 0 To UBound(strParsed)
        strtemp = strParsed(lngLoop)
        If Len(strtemp) <> 0 Then
            strWords(lngWord) = strtemp
            lngWord = lngWord + 1
        End If
    Next
    'reduce size of the strWords array to equal the number
    'of non-empty strings we found.
    ReDim Preserve strWords(0 To lngWord - 1)
    SplitJoin = Join(strWords, " ")
End Function

Note: My performance measurement of the parsed array iteration prompted me to replace the For Each…Next loop with the traditional For...Next loop.  It was measurably faster.

Split2 & Join - [split2 & join]

While the simple Split & Join method works, it isn't the most efficient algorithm. Looking at the performance test results, I found that simplicity doesn't always lead to the fastest code. An apt analogy is the simplicity of the very simple sort algorithms, which perform very poorly with large amounts of data. In this algorithm, I split on a two-space string. There is a trade-off with this approach: for correct processing, I must Trim() any leading or trailing spaces from the Split() function results.

Function Split2Join(ByVal parmString As String) As String
    '================================================
    'Mostly the same as SplitJoin, but using double space character
    'string as delimiter for the Split() function
    '
    'Split() string with double space character delimiter,
    'Trim() each item and move the non-empty strings to the
    'front of the strParsed array,
    'Redim the strParsed array down to the number of words we have,
    'then Join() the strParsed array items with
    'a single space character
    '================================================
    Dim strParsed() As String
    Dim lngLoop As Long
    Dim lngWord As Long
    Dim strtemp As String
    strParsed = Split(parmString, "  ")
    lngWord = 0
    'Move non-empty strings to the front of the strParsed array
    For lngLoop = 0 To UBound(strParsed)
        'Trim() removes the stray space left by odd-length runs
        strtemp = Trim(strParsed(lngLoop))
        If Len(strtemp) <> 0 Then
            strParsed(lngWord) = strtemp
            lngWord = lngWord + 1
        End If
    Next
    'reduce size of the strParsed array to equal the number
    'of non-empty strings we found.
    ReDim Preserve strParsed(0 To lngWord - 1)
    Split2Join = Join(strParsed, " ")
End Function
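
The need for Trim() with a two-space delimiter can be seen in a small sketch. ShowPieces is a hypothetical helper that joins the Split() results with a "|" character so that stray spaces become visible.

```vba
' Hypothetical helper: expose what a two-space Split() leaves behind.
Function ShowPieces(ByVal strIn As String) As String
    ShowPieces = Join(Split(strIn, "  "), "|")
End Function

Sub DemoSplit2Leftovers()
    'Odd run (3 spaces): one delimiter is consumed, one space is left over.
    Debug.Print ShowPieces("a" & Space(3) & "b")    'prints a| b
    'Even run (4 spaces): two delimiters, so an empty item appears instead.
    Debug.Print ShowPieces("a" & Space(4) & "b")    'prints a||b
End Sub
```

Odd-length runs leave a single space attached to the next item, which is what the Trim() call cleans up. Even-length runs produce empty items, which the length test skips.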

Split and concatenate - [split & concat]

If your strings are relatively short, you might use this split and concatenate method. However, as shown in the performance comparison section of this article, string concatenation can quickly become a performance beast.

Function SplitConcat(ByVal parmString As String) As String
    '================================================
    'Split() string with single space character delimiter,
    'concatenate non-empty strings to the returned value
    '================================================
    Dim strParsed() As String
    Dim strTemp As String
    Dim lngLoop As Long
    strParsed = Split(parmString, " ")
    lngLoop = 0
    strTemp = vbNullString
    For lngLoop = 0 To UBound(strParsed)
        strTemp = strParsed(lngLoop)
        If Len(strTemp) <> 0 Then
            SplitConcat = SplitConcat & strTemp & " "     'concatenate with space
        End If
    Next
    SplitConcat = RTrim(SplitConcat)
End Function

Split2 and concatenate - [split2 & concat]

Here, I do the same two-space split with the required Trim() of the extra spaces. Although this does perform better than the single-space Split(), the concatenation operations ruin the performance for larger strings.

Function Split2Concat(ByVal parmString As String) As String
    '================================================
    'Mostly the same as SplitConcat, but using double space character
    'string as delimiter for the Split() function
    '
    'Split() string with a double space character delimiter,
    'concatenate non-empty strings to the returned value
    '================================================
    Dim strParsed() As String
    Dim strtemp As String
    Dim lngLoop As Long
    strParsed = Split(parmString, "  ")
    strtemp = vbNullString
    For lngLoop = 0 To UBound(strParsed)
        strtemp = Trim(strParsed(lngLoop))
        If Len(strtemp) <> 0 Then
            Split2Concat = Split2Concat & strtemp & " "     'concatenate with space
        End If
    Next
    Split2Concat = RTrim(Split2Concat)      'remove trailing space
End Function

Split2 and buffer - [Split2 & buffer]

Here, I do the same two-space split with the required Trim() of the extra spaces. Buffering is a technique that uses the Mid() function to provide a fast alternative to concatenation.

NOTE: This cannot be done in the VBScript environment.

Function Split2Buffer(ByVal parmString As String) As String
    '================================================
    'Mostly the same as SplitBuffer, except using a double space
    'delimiter for the Split() function.
    '
    'Split() string with a double space character delimiter,
    'assign non-empty strings to next output buffer position,
    'return the trimmed output buffer string
    '================================================
    Dim strParsed() As String
    Dim strTemp As String
    Dim lngLoop As Long
    Dim lngWordPosn As Long
    Dim strBuffer As String
    strParsed = Split(parmString, "  ")
    strTemp = vbNullString
    lngWordPosn = 1
    strBuffer = Space(Len(parmString))
    For lngLoop = 0 To UBound(strParsed)
        strTemp = Trim(strParsed(lngLoop))
        If Len(strTemp) <> 0 Then
            Mid$(strBuffer, lngWordPosn, Len(strTemp)) = strTemp
            lngWordPosn = lngWordPosn + Len(strTemp) + 1
        End If
    Next
    Split2Buffer = RTrim(strBuffer)
End Function
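
The buffering idea on its own can be sketched as below. BufferJoin is a hypothetical name, and the sketch assumes the words contain no trailing spaces of their own, because the final RTrim$() would strip them. The buffer is allocated once, already filled with spaces, and each word is written into place with the Mid$ statement, so no intermediate strings are created.

```vba
' Hypothetical helper: join the non-empty words with single spaces
' by writing them into a pre-sized buffer instead of concatenating.
Function BufferJoin(ByRef strWords() As String) As String
    Dim strBuffer As String
    Dim lngLoop As Long
    Dim lngPosn As Long
    Dim lngTotal As Long

    'Worst-case size: every word plus one separator each.
    For lngLoop = LBound(strWords) To UBound(strWords)
        lngTotal = lngTotal + Len(strWords(lngLoop)) + 1
    Next
    strBuffer = Space(lngTotal)          'allocate once, pre-filled with spaces

    lngPosn = 1
    For lngLoop = LBound(strWords) To UBound(strWords)
        If Len(strWords(lngLoop)) <> 0 Then
            'Mid$ statement overwrites characters in place
            Mid$(strBuffer, lngPosn, Len(strWords(lngLoop))) = strWords(lngLoop)
            'skip one position; the pre-filled space acts as the separator
            lngPosn = lngPosn + Len(strWords(lngLoop)) + 1
        End If
    Next
    BufferJoin = RTrim$(strBuffer)       'drop the unused tail of the buffer
End Function
```

Because the buffer starts out as spaces, skipping one position after each word leaves the separator in place for free.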

Split2 and buffer to the function variable - [Split2BufferFcn]

Here, I do the same two-space split with the required Trim() of the extra spaces. Buffering is a technique that uses the Mid() function to provide a fast alternative to concatenation. In this test, I use the function's return string value as the buffer instead of a local string variable.

NOTE: This cannot be done in the VBScript environment.

Function Split2BufferFcn(ByVal parmString As String) As String
    '================================================
    'Mostly the same as SplitBuffer, except using a double space
    'delimiter for the Split() function.
    '
    'Split() string with a double space character delimiter,
    'assign non-empty strings to next output buffer position,
    'return the trimmed output buffer string
    '================================================
    Dim strParsed() As String
    Dim strTemp As String
    Dim lngLoop As Long
    Dim lngWordPosn As Long
    strParsed = Split(parmString, "  ")
    strTemp = vbNullString
    lngWordPosn = 1
    Split2BufferFcn = Space(Len(parmString))
    For lngLoop = 0 To UBound(strParsed)
        strTemp = Trim(strParsed(lngLoop))
        If Len(strTemp) <> 0 Then
            Mid$(Split2BufferFcn, lngWordPosn) = strTemp
            lngWordPosn = lngWordPosn + Len(strTemp) + 1
        End If
    Next
    Split2BufferFcn = RTrim(Split2BufferFcn)
End Function

Split into collection and Join - [split & col]

The VB collection object is very efficient at storing strings, especially when you don't know how many strings you will need to store. We can add the non-empty strings (words) to a collection object. In order to use the Join() function, we still have to populate an array. For clarity, I used a different array, strWords, rather than reusing strParsed.

Function SplitCol(ByVal parmString As String) As String
    '================================================
    'Split() string with single space character delimiter,
    'adding the non-empty strings to a collection object.
    'Copy the collection items to an array and
    'Join() them as the returned value
    '================================================
    Dim strParsed() As String
    Dim strtemp As String
    Dim lngLoop As Long
    Dim strWords() As String
    Dim colWords As New Collection
    Dim vItem As Variant
    strParsed = Split(parmString, " ")
    For lngLoop = 0 To UBound(strParsed)
        strtemp = strParsed(lngLoop)
        If Len(strtemp) <> 0 Then
            colWords.Add strtemp
        End If
    Next
    ReDim strWords(1 To colWords.Count)
    lngLoop = 1
    For Each vItem In colWords
        strWords(lngLoop) = vItem
        lngLoop = lngLoop + 1
    Next
    SplitCol = Join(strWords, " ")
End Function

Split2 into collection and Join - [split2 & col]

The only difference between this test and the SplitCol() test above is the use of a double-space delimiter for the Split() function.

Function Split2Col(ByVal parmString As String) As String
    '================================================
    'Mostly the same as SplitCol, except using a double space
    'delimiter for the Split() function.
    '
    'Split() string with double space character delimiter,
    'adding the non-empty strings to a collection object.
    'Copy the collection items to an array and
    'Join() them as the returned value
    '================================================
    Dim strParsed() As String
    Dim strtemp As String
    Dim lngLoop As Long
    Dim strWords() As String
    Dim colWords As New Collection
    Dim vItem As Variant
    strParsed = Split(parmString, "  ")
    For lngLoop = 0 To UBound(strParsed)
        strtemp = Trim(strParsed(lngLoop))
        If Len(strtemp) <> 0 Then
            colWords.Add strtemp
        End If
    Next
    ReDim strWords(1 To colWords.Count)
    lngLoop = 1
    For Each vItem In colWords
        strWords(lngLoop) = vItem
        lngLoop = lngLoop + 1
    Next
    Split2Col = Join(strWords, " ")
End Function

Split into dictionary and Join - [split & dic]

With the collection object approach (above), we still have to transfer the words into the strWords array in order to use the Join() function. However, if we use a dictionary, which is an ActiveX object, we can apply the Join() function directly to the dictionary object's Items array.

Function SplitDic(ByVal parmString As String) As String
    '================================================
    'Split() string with a single space character delimiter,
    'adding non-empty strings to the dictionary,
    'then Join() the dictionary object's items array.
    '================================================
    Dim strParsed() As String
    Dim lngLoop As Long
    Dim lngKey As Long
    Dim strtemp As String
    Static dicWords As Object
    If dicWords Is Nothing Then
        Set dicWords = CreateObject("scripting.dictionary")
    Else
        dicWords.RemoveAll
    End If
    strParsed = Split(parmString, " ")
    lngKey = 1
    For lngLoop = 0 To UBound(strParsed)
        strtemp = strParsed(lngLoop)
        If Len(strtemp) <> 0 Then
            dicWords.Add CStr(lngKey), strtemp
            lngKey = lngKey + 1
        End If
    Next
    SplitDic = Join(dicWords.items, " ")
End Function

Split2 into dictionary and Join - [split2 & dic]

In this approach, we split on a double space instead of a single space and populate the dictionary object with the non-empty strings.

Function Split2Dic(ByVal parmString As String) As String
    '================================================
    'Mostly the same as SplitDic, but using double space character
    'string as delimiter for the Split() function
    '
    'Split() string with a double space character delimiter,
    'adding non-empty strings to the dictionary,
    'then Join() the dictionary object's items array
    '================================================
    Dim strParsed() As String
    Dim lngLoop As Long
    Dim lngKey As Long
    Dim strtemp As String
    Static dicWords As Object
    If dicWords Is Nothing Then
        Set dicWords = CreateObject("scripting.dictionary")
    Else
        dicWords.RemoveAll
    End If
    strParsed = Split(parmString, "  ")
    lngKey = 1
    For lngLoop = 0 To UBound(strParsed)
        strtemp = Trim(strParsed(lngLoop))
        If Len(strtemp) <> 0 Then
            dicWords.Add CStr(lngKey), strtemp
            lngKey = lngKey + 1
        End If
    Next
    Split2Dic = Join(dicWords.items, " ")
End Function

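The Regular Expression Object

The regular expression ActiveX object is a very powerful tool for parsing text and pattern matching. It also has the ability to perform find/replace operations. Although the intrinsic VB functions usually outperform the regexp methods, there are some cases where the regexp object really shines. If you aren't familiar with the regexp object, you can find a link to an excellent introductory article in the references section.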

Since ActiveX objects take some time to instantiate, I measured two different ways of using the regexp object, minimizing the instantiation and pattern compilation overhead. The first way is to pass the regexp object into a function. The second way is to use a static variable in the function.

Although I tested three regexp patterns, only the first two should be used for removing duplicate space characters. The third pattern might also remove other non-visible characters, such as tabs, carriage returns, and line feeds. I included this last pattern in order to measure the overhead of looking for any non-visible character against looking for just the space character.

The routine for the passed regexp object:

Function RegexpReplace(ByVal parmString As String, parmRegexp As Object) As String
    '================================================
    'Use parameter regexp object to remove duplicate spaces.
    'The parameter regexp object will already have its pattern property set
    'by the calling code.
    '================================================
    RegexpReplace = parmRegexp.Replace(parmString, " ")
End Function

Regexp replace -- [regexp '  +'] and [RegexpReplace1]

Pattern: "  +" (two space characters followed by a plus sign)

Matches: a space character followed by one or more space characters.

Function RegexpReplace1(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to remove duplicate spaces
    '================================================
    Static oRE As Object
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "  +"
    End If
    RegexpReplace1 = oRE.Replace(parmString, " ")
End Function

Regexp replace -- [regexp ' {2,}'] and [RegexpReplace2]

Pattern: " {2,}"

Matches: two or more space characters.

Function RegexpReplace2(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to remove duplicate spaces
    '================================================
    Static oRE As Object
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = " {2,}"
    End If
    RegexpReplace2 = oRE.Replace(parmString, " ")
End Function

Regexp replace -- [regexp ' \s+'] and [RegexpReplace3]

Pattern: " \s+"

Matches: a space character, Chr(32), followed by one or more 'space class' characters. The 'space class' characters are any of the following: space, tab, carriage return, line feed, form feed.

Function RegexpReplace3(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to remove duplicate spaces
    '
    'WARNING: This pattern will remove characters other than
    '       space characters due to the use of the \s in the pattern
    '================================================
    Static oRE As Object
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = " \s+"
    End If
    RegexpReplace3 = oRE.Replace(parmString, " ")
End Function
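
The practical difference between the space-only patterns and the \s pattern can be shown with a short sketch. CollapseWith is a hypothetical helper that creates a fresh regexp object on every call, which is fine for a demonstration but not for timing.

```vba
' Hypothetical helper: replace every match of strPattern with one space.
Function CollapseWith(ByVal strPattern As String, _
                      ByVal strIn As String) As String
    Dim oRE As Object
    Set oRE = CreateObject("vbscript.regexp")
    oRE.Global = True
    oRE.Pattern = strPattern
    CollapseWith = oRE.Replace(strIn, " ")
End Function

Sub DemoPatterns()
    Dim strLines As String
    strLines = "a " & vbCrLf & "b"      'space, then a line break

    'Space-only patterns leave the line break alone...
    Debug.Print CollapseWith(" {2,}", strLines) = strLines    'prints True
    '...but the \s pattern swallows it along with the space.
    Debug.Print CollapseWith(" \s+", strLines)                'prints a b
End Sub
```

If line breaks or tabs carry meaning in your data, this is the reason to prefer one of the first two patterns.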

Regexp parse -- [regexp parse]

In this approach, I use the regular expression object to parse the different words in the string, then use the Join() function to rebuild the address with a single space character between each word.

Pattern: "[^ ]+"

Matches: sequences of one or more non-space characters. This should preserve the words, punctuation, carriage returns, and line feed characters.

Function RegexParse(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to parse the words.
    'Copy the parsed words to an array and Join them with
    'a single space delimiter
    '================================================
    Dim oMatches As Object
    Dim oM As Object
    Dim strWords() As String
    Dim lngLoop As Long
    Static oRE As Object
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "[^ ]+"
    End If
    Set oMatches = oRE.Execute(parmString)
    ReDim strWords(0 To oMatches.Count - 1)
    lngLoop = 0
    For Each oM In oMatches
        strWords(lngLoop) = oM.Value
        lngLoop = lngLoop + 1
    Next
    RegexParse = Join(strWords, " ")
End Function

Regexp parse -- [RegexParseBuffer]

In this approach, I use the regular expression object to parse the different words in the string, then use the Mid() function to rebuild the address with a single space character between each word through the buffering technique. This buffering technique is a faster alternative to concatenation.

NOTE: This cannot be done in the VBScript environment.

Pattern: "[^ ]+"

Matches: sequences of one or more non-space characters. This should preserve the words, punctuation, carriage returns, and line feed characters.

Function RegexParseBuffer(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to parse the words.
    'Copy the parsed words to the buffer
    '================================================
    Dim oMatches As Object
    Dim oM As Object
    Static oRE As Object
    Dim strBuffer As String
    Dim lngWordPosn As Long
    Dim strTemp As String
    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "[^ ]+"
    End If
    Set oMatches = oRE.Execute(parmString)
    strBuffer = Space(Len(parmString))
    lngWordPosn = 1
    For Each oM In oMatches
        strTemp = oM.Value
        Mid$(strBuffer, lngWordPosn, Len(strTemp)) = strTemp
        lngWordPosn = lngWordPosn + Len(strTemp) + 1
    Next
    RegexParseBuffer = RTrim(strBuffer)
End Function

The WorksheetFunction Object

The Excel application object has a collection of WorksheetFunction methods. These are the intrinsic functions (with some exceptions) that you can use in cell formulas. You can also use these functions in the VBA and VBScript environments. When removing internal duplicate spaces, you can use the WorksheetFunction.Trim() method. Thanks to Patrick Matthews for reminding me of this function's existence. When I had used it in the past, I thought I was using the VBA Trim() function. There are limitations and caveats that come with using this function. As a result, I discourage using it outside of the Excel environment when the length of the string might exceed 32K. The test code prevents this function (native and COM) from executing when the string is too large.

  • The maximum string length allowed is 32K, which is the maximum cell content length.
  • In non-Excel environments, you have to instantiate an Excel.Application object before you can use the WorksheetFunction.Trim() function.
  • There is measurable overhead going through the COM interface of the Excel.Application object.

The Class code -- [clsXL trim]

To make it easier to use, I placed all the relevant code in a class. This ensures that the Excel.Application object is freed from memory when your application ends.

Option Explicit

Dim oXL As Object
Dim fnTrim As Object

Private Sub Class_Initialize()
    Set oXL = CreateObject("Excel.Application")
    Set fnTrim = oXL.WorksheetFunction
End Sub

Private Sub Class_Terminate()
    Set fnTrim = Nothing
    oXL.Quit
    Set oXL = Nothing
End Sub

Public Function CleanInternalSpaces(ByVal parmString As String) As String
    CleanInternalSpaces = fnTrim.Trim(parmString)
End Function

The WorksheetFunction.Trim() code - [WksFunc Trim]

You don't need to put the WorksheetFunction.Trim() call inside a function. I placed it in a function for a better side-by-side performance comparison with the code in the class module.

Function WksFunctionTrim(ByVal parmString As String) As String
    '================================================
    'If running in the Excel VBA environment, invoke the
    'Trim Worksheetfunction
    '================================================
    WksFunctionTrim = WorksheetFunction.Trim(parmString)
End Function

Performance Results and Comparisons

Before getting into the analysis and recommendations, let's look at the performance results.

Plain Data - Just permutations/slices of the address

In these Plain tests, there aren't any internal double space character strings to remove. These test results let us see the costs of invoking these routines unnecessarily.

Method          Plain 175   Plain 1464  Plain 14640 Plain 146400    Plain Avg
RegexpReplace3    3.7         4.1         4.2          4.9             4.2
RegexpReplace1    4.2         4.2         4.2          4.8             4.3
regexp ' \s+'     5.0         4.5         4.4          5.0             4.7
RegexpReplace2    4.0         4.9         5.4          6.1             5.1
just two          2.6         5.2         7.0          7.5             5.5
regexp ' {2,}'    5.8         6.0         5.7          6.1             5.9
regexp '  +'      8.7         6.5         4.8          4.7             6.2
Split2 & concat   4.7         6.8         8.2          9.9             7.4
Split2 & Join     5.5         7.2         8.6          9.2             7.6
Split2BufferFcn   4.9         8.6        11.0         12.7             9.3
Split2 & buffer   5.0         8.8        11.2         13.0             9.5
three & two       3.6         9.4        13.1         13.9            10.0
Split2 & col      8.8         9.5        10.0         12.5            10.2
WksFunc Trim     12.8        12.2         8.0                         11.0
Split2 & dic     11.4        10.2        10.6         12.0            11.0
Multiple repls   10.2        24.2        32.3         33.4            25.0
Split & buffer   17.8        59.5        83.5         89.3            62.5
Split & Join     22.1        73.9       101.7        111.4            77.3
Split & col      47.0       142.7       197.7        207.6           148.7
RegexParseBuf    48.8       167.5       238.7        244.5           174.9
clsXL trim      335.0       182.7        45.8                        187.8
regexp parse     60.6       192.6       271.3        277.9           200.6
Split & dic      82.9       273.8       396.3        525.4           319.6
Split & concat   25.6       100.4       400.4       4594.7          1280.3

There is such a wide range of (relative performance) values that we need to use a log scale when displaying all the methods.  The worst performers go off the top side of the chart.

Stuffed Data - Stuff those slices with extra spaces

Method          Stuffed 514 Stuffed 4700    Stuffed 47759   Stuffed 477271  Stuffed Avg
RegexpReplace2    6.0         7.1             7.3              9.9             7.6
RegexpReplace1    6.4         7.4             7.4             10.0             7.8
regexp ' {2,}'    6.9         7.9             7.5             10.2             8.1
RegexpReplace3    6.4         8.3             8.6             11.3             8.7
regexp ' \s+'     7.2         8.6             8.7             11.2             8.9
regexp '  +'     10.3         9.0             7.9             10.1             9.3
WksFunc Trim     10.9         8.6                                              9.7
RegexParseBuff   39.3        75.8            83.5             87.5            71.5
regexp parse     48.1        86.4            93.3             95.7            80.9
Multiple repls   58.8       111.2           116.1            125.0           102.8
three & two      39.0        82.0            94.3            211.1           106.6
clsXL trim      254.5        92.6                                            173.6
Split2BufferFcn  64.1       141.3           165.0            525.6           224.0
Split2 & buffer  64.9       143.4           167.3            522.1           224.4
Split2 & Join    69.9       150.1           172.1            522.6           228.7
just two         64.1       138.2           160.1            563.7           231.5
Split2 & col     82.8       177.6           204.0            553.2           254.4
Split2 & dic    106.8       228.8           263.8            651.2           312.6
Split2 & concat  71.5       164.7           305.1           1877.6           604.7
Split & buffer  119.9       271.8           321.9           1993.2           676.7
Split & Join    124.8       273.5           325.1           2020.8           686.1
Split & col     141.4       305.4           359.1           2025.7           707.9
Split & dic     167.2       365.2           422.6           2153.2           777.0
Split & concat  127.1       289.7           461.5           3176.0          1013.6

When we look at the performance of the methods doing actual work, we still have to use a log scale.

Performance Analysis

The plain permutations show us that some of these algorithms carry a significant cost.

The stuffed permutations show us that memory management and string handling can cause algorithms to behave very badly.
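
To make the string-handling cost concrete, here is a minimal sketch (not the benchmark code from this article) that joins the same words in two ways. The names JoinByConcat, JoinByBuffer and DemoJoin are hypothetical. The assumption behind the comparison is that each `&` may allocate a new string and copy everything built so far, while the Mid$ statement writes into a buffer that is allocated only once.

```vb
Option Explicit

'Hypothetical helper: build the result by repeated concatenation.
'Each pass may reallocate and copy the whole string built so far.
Function JoinByConcat(parmWords() As String) As String
    Dim lngLoop As Long
    Dim strResult As String

    For lngLoop = LBound(parmWords) To UBound(parmWords)
        strResult = strResult & parmWords(lngLoop) & " "
    Next
    JoinByConcat = RTrim$(strResult)
End Function

'Hypothetical helper: size the buffer once, then overwrite it in place.
Function JoinByBuffer(parmWords() As String) As String
    Dim lngLoop As Long
    Dim lngPosn As Long
    Dim lngTotal As Long
    Dim strBuffer As String

    'First pass: total length of all words plus one separator each
    For lngLoop = LBound(parmWords) To UBound(parmWords)
        lngTotal = lngTotal + Len(parmWords(lngLoop)) + 1
    Next
    strBuffer = Space$(lngTotal)

    'Second pass: copy each word to its position in the buffer
    lngPosn = 1
    For lngLoop = LBound(parmWords) To UBound(parmWords)
        If Len(parmWords(lngLoop)) <> 0 Then
            Mid$(strBuffer, lngPosn) = parmWords(lngLoop)
        End If
        lngPosn = lngPosn + Len(parmWords(lngLoop)) + 1
    Next
    JoinByBuffer = RTrim$(strBuffer)
End Function

Sub DemoJoin()
    Dim strWords() As String
    strWords = Split("Four score and seven", " ")
    'Both helpers should return the same text
    Debug.Print JoinByConcat(strWords) = JoinByBuffer(strWords)
End Sub
```

Both helpers produce the same output; only the way the result string grows differs, which is where the long stuffed strings are expected to hurt the concatenation version.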

Performance Recommendations

Here is the list of recommendations:

  • Check to see if there is anything to be done, no matter what algorithms and functions you use. InStr() is fast, so it is worth the overhead.
  • Although the vbscript.regexp object isn't normally as fast as the Split() function, the simplicity of Split() and its limited delimiter options make it slower than regexp when the task calls for a pattern that Split() cannot express.
  • The Replace() approach is usually faster than splitting the string, and the Regexp Replace method is much faster still for repeated character removal.
  • Trim() and Join() are also very fast functions.
  • Avoid string concatenation when long strings are possible. Use Join() or the buffering technique instead.
  • The use of collections and dictionaries won't make up for the inefficiencies of the Split() function when removing duplicate character strings.
  • Clear out or reset variables when timing.
  • When creating log files, think about how the data will be used and make your parsing tasks easier.
  • Validate your code. Take a unit testing approach and verify that what you are testing actually produces correct/expected results.
  • Building the result in a local variable performs better than repeatedly altering the function value.
  • Local object variables perform better than parameterized/passed objects.
  • When iterating arrays, the standard For...Next loop is faster than the For Each...Next loop.
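
Several of these recommendations can be combined in one small routine. The following is only a sketch: CleanSpaces is a hypothetical name, not one of the functions timed in this article. It does the cheap InStr() check first, keeps the regexp object in a local Static variable, and reuses the ' {2,}' pattern from RegexpReplace2.

```vb
Option Explicit

'Hypothetical wrapper, not part of the article's test module.
Function CleanSpaces(ByVal parmString As String) As String
    Static oRE As Object        'created on first use, reused afterwards

    'Cheap guard: if there is no double space, there is nothing to do
    If InStr(parmString, "  ") = 0 Then
        CleanSpaces = parmString
        Exit Function
    End If

    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True       'replace every run, not just the first
        oRE.Pattern = " {2,}"   'two or more consecutive space characters
    End If
    CleanSpaces = oRE.Replace(parmString, " ")
End Function
```

Leading and trailing spaces are reduced to a single space rather than removed, so wrap the call in Trim() if that matters for your data.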

Inserting spaces

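When space characters are inserted into each permutation, there may be anywhere from 1 to 26 spaces between each pair of words in the address, tending toward an average of about 13 consecutive spaces (13.5 for a uniform draw from 1 to 26).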

Function StuffWithSpaces(ByVal parmString As String, parmSeed) As String
    '================================================
    'Add Random number of internal space characters to the address permutation
    'Since I am specifying a max space length of 26, the average space sequence
    'will be 13 characters long.
    '================================================
    Dim lngRnd As Long
    Dim strWords() As String
    Dim lngLoop As Long
    Const cMaxSpaces As Long = 26

    Rnd -1                   'reset the random sequence
    Randomize parmSeed       'initialize the random sequence
    strWords = Split(parmString, " ")
    For lngLoop = 0 To UBound(strWords) - 1
        lngRnd = Int(Rnd() * cMaxSpaces) + 1
        strWords(lngLoop) = strWords(lngLoop) & Space(lngRnd)
    Next
    StuffWithSpaces = Join(strWords, vbNullString) 'don't add any more spaces with the
                                                   'Join() operation
End Function
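
One way to sanity-check the stuffed text is to measure the runs of spaces it actually contains. The helper below is a hypothetical sketch (AvgSpaceRun is not part of the original module). It walks the string once and returns the average length of the consecutive-space runs it finds.

```vb
Option Explicit

'Hypothetical helper: average length of the runs of space characters
'found in parmString. Returns 0 when the string contains no spaces.
Function AvgSpaceRun(ByVal parmString As String) As Double
    Dim lngLoop As Long
    Dim lngRuns As Long         'number of separate runs of spaces
    Dim lngSpaces As Long       'total number of space characters
    Dim blnInRun As Boolean

    For lngLoop = 1 To Len(parmString)
        If Mid$(parmString, lngLoop, 1) = " " Then
            lngSpaces = lngSpaces + 1
            If Not blnInRun Then
                lngRuns = lngRuns + 1
                blnInRun = True
            End If
        Else
            blnInRun = False
        End If
    Next
    If lngRuns > 0 Then AvgSpaceRun = lngSpaces / lngRuns
End Function
```

For example, AvgSpaceRun("a  b    c") sees runs of 2 and 4 spaces and returns 3. Passing it the result of StuffWithSpaces should give a value near the expected average for a long enough input.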

Code and Files

The text of the Gettysburg Address: GettysburgAddress.txt

The log file from a test run:  DeSpaceLog.txt

A parsed and massaged version of the log file with statistics:  DeSpaceLog.xls

The code:

Option Explicit

Private Declare Function getTickCount Lib "kernel32" Alias "GetTickCount" () As Long
Private Declare Function CPUFrequency Lib "kernel32" _
    Alias "QueryPerformanceFrequency" (cyFrequency As Currency) As Long
Private Declare Function CPUTickCount Lib "kernel32" _
    Alias "QueryPerformanceCounter" (cyTickCount As Currency) As Long

Enum eSizeRequest
    eFirstParagraph = 1
    eSameAsDocument = 2
    eTenFold = 3
    eHundredFold = 4
End Enum

Sub Despace()
    Dim strTemp As String
    Dim sngStart As Single
    Dim dblStart As Double
    Dim lngStart As Long
    Dim oRE As Object
    Dim curFreq As Currency
    Dim curStart As Currency
    Dim curEnd As Currency
    Dim vItem As Variant
    Dim strFind As String
    Dim lngLoop As Long
    Dim vParsed As Variant
    Dim strWords() As String
    Dim colWords As New Collection
    Dim dicWords As Object
    Dim oMatches As Object
    Dim oM As Object
    Dim strFileData As String
    Dim strTestString As String
    Dim lngSize As Long
    Dim lngIterator As Long
    Dim lngPlainStuffed As Long
    Const cIterations As Long = 21
    Dim colLog As New Collection
    Dim lngFirstHit As Long
    Dim strCurrentTask As String
    Const cPath As String = "c:\users\mark\documents\"
    Dim clsXL As New clsWksFuncTrim

    vParsed = Array()
    Open cPath & "gettysburgaddress.txt" For Input As #1
    strFileData = Input(LOF(1), #1)
    Close
    '=======================================================
    'iterate the different codes with the following
    '   * first paragraph
    '   * entire file contents
    '   * x10 and x100 copies of the entire file contents
    'for each iteration,
    '   test with the base text (as written)
    '   test with inserted spaces.
    '=======================================================
    CPUFrequency curFreq
    For lngSize = 1 To 4
        strTestString = StringSizes(strFileData, lngSize)
        For lngPlainStuffed = 0 To 1
            If lngPlainStuffed = 1 Then
                strTestString = StuffWithSpaces(strTestString, 42)
            End If
            strCurrentTask = lngSize & vbTab & Array("Plain: ", "Stuffed: ")(lngPlainStuffed) & vbTab & Len(strTestString) & vbTab & InStr(strTestString, "  ")
            For lngIterator = 1 To cIterations
                CPUTickCount curStart
                lngFirstHit = InStr(strTestString, "zz")
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Max Instr() time: " & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = JustTwo(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "just two" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (2)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = ThreeTwo(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "three & two" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (3&2)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = MultiLengths(strTemp, Array(19, 11, 7, 3, 2))
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Multiple replaces" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Multi)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                Set oRE = CreateObject("vbscript.regexp")
                oRE.Global = True
                oRE.Pattern = "  +"
                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = RegexpReplace(strTemp, oRE)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "regexp '  +'" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Regexp 1)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                oRE.Pattern = " {2,}"
                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = RegexpReplace(strTemp, oRE)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "regexp ' {2,}'" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Regexp 2)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                oRE.Pattern = " \s+"
                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = RegexpReplace(strTemp, oRE)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "regexp ' \s+'" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Regexp 3)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = RegexpReplace1(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "RegexpReplace1" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (RegexpReplace1)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = RegexpReplace2(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "RegexpReplace2" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (RegexpReplace2)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = RegexpReplace3(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "RegexpReplace3" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (RegexpReplace3)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                If Len(strTemp) < 32768 Then
                    strTemp = strTestString
                    sngStart = Timer
                    lngStart = getTickCount()
                    CPUTickCount curStart
                    strTemp = clsXL.CleanInternalSpaces(strTemp)
                    CPUTickCount curEnd
                    colLog.Add strCurrentTask & vbTab & "clsXL trim" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                    If strTemp <> StringSizes(strFileData, lngSize) Then
                        Debug.Print "strTemp not cleaned properly (clsXL trim)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                    End If

                    strTemp = strTestString
                    sngStart = Timer
                    lngStart = getTickCount()
                    CPUTickCount curStart
                    strTemp = WksFunctionTrim(strTemp)
                    CPUTickCount curEnd
                    colLog.Add strCurrentTask & vbTab & "WksFunc Trim" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                    If strTemp <> StringSizes(strFileData, lngSize) Then
                        Debug.Print "strTemp not cleaned properly (WksFunc Trim)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                    End If
                End If

                strTemp = strTestString
                Erase vParsed
                CPUTickCount curStart
                vParsed = Split(strTemp, " ")
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split time: " & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")

                strTemp = strTestString
                Erase vParsed
                CPUTickCount curStart
                vParsed = Split(strTemp, "  ")
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split2 time: " & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = SplitJoin(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split & Join" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split & Join)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strWords = Split(strTemp, " ")
                strTemp = vbNullString
                CPUTickCount curStart
                strTemp = Join(strWords, " ")
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Join time: " & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = Split2Join(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split2 & Join" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split2 & Join)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = SplitCol(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split & col" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split & col)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = Split2Col(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split2 & col" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split2 & col)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = SplitDic(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split & dic" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split & dic)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = Split2Dic(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split2 & dic" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split2 & dic)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = SplitConcat(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split & concat" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split & concat)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = Split2Concat(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split2 & concat" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split2 & concat)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = SplitBuffer(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split & buffer" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split & buffer)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = Split2Buffer(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split2 & buffer" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split2 & buffer)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = Split2BufferFcn(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "Split2BufferFcn" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Split2BufferFcn)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

'                Erase strWords
                oRE.Pattern = "[^ ]+"
                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = RegexParse(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "regexp parse" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (Regexp parse)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If

                strTemp = strTestString
                sngStart = Timer
                lngStart = getTickCount()
                CPUTickCount curStart
                strTemp = RegexParseBuffer(strTemp)
                CPUTickCount curEnd
                colLog.Add strCurrentTask & vbTab & "RegexParseBuffer" & vbTab & "CPU cycles: " & vbTab & Format((curEnd - curStart) / curFreq, "0.000000000")
                If strTemp <> StringSizes(strFileData, lngSize) Then
                    Debug.Print "strTemp not cleaned properly (RegexParseBuffer)." & vbTab & "lngSize: " & lngSize & vbTab & "lngPlainStuffed: " & lngPlainStuffed
'                    Stop
                End If
            Next lngIterator
            DoEvents
        Next lngPlainStuffed
    Next lngSize

    Open cPath & "DeSpaceLog.txt" For Output As #1
    For Each vItem In colLog
        Print #1, vItem
    Next
    Close
    Debug.Print Now, "Despace() Finished"
    AppActivate Application.Caption
    Set clsXL = Nothing
    MsgBox "Despace() Finished", vbOKOnly, Now
End Sub

Function JustTwo(ByVal parmString As String) As String
    '================================================
    'Replace all double space strings with a single space.
    'Iterate until there are no more double space character
    'strings
    '================================================
    Dim strTemp As String

    strTemp = parmString
    Do Until InStr(strTemp, "  ") = 0
        strTemp = Replace(strTemp, "  ", " ")
    Loop
    JustTwo = strTemp
End Function

Function MultiLengths(ByVal parmString As String, _
                      ByVal parmLengths As Variant) As String
    '================================================
    'Iterate the parmLengths array and invoke the Replace() function
    'with a space string of each length.
    '================================================
    Dim vItem As Variant
    Dim strTemp As String
    Dim strFind As String

    strTemp = parmString
    For Each vItem In parmLengths
        strFind = Space(vItem)    'create a vItem length string of spaces
        Do Until InStr(strTemp, strFind) = 0
            strTemp = Replace(strTemp, strFind, " ")
        Loop
    Next
    MultiLengths = strTemp
End Function

Function RegexParse(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to parse the words.
    'Copy the parsed words to an array and Join them with
    'a single space delimiter
    '================================================
    Dim oMatches As Object
    Dim oM As Object
    Dim strWords() As String
    Dim lngLoop As Long
    Static oRE As Object

    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "[^ ]+"
    End If
    Set oMatches = oRE.Execute(parmString)
    ReDim strWords(0 To oMatches.Count - 1)
    lngLoop = 0
    For Each oM In oMatches
        strWords(lngLoop) = oM.Value
        lngLoop = lngLoop + 1
    Next
    RegexParse = Join(strWords, " ")
End Function

Function RegexParseBuffer(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to parse the words.
    'Copy the parsed words to the buffer
    '================================================
    Dim oMatches As Object
    Dim oM As Object
    Static oRE As Object
    Dim strBuffer As String
    Dim lngWordPosn As Long
    Dim strTemp As String

    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "[^ ]+"
    End If
    Set oMatches = oRE.Execute(parmString)
    strBuffer = Space(Len(parmString))
    lngWordPosn = 1
    For Each oM In oMatches
        strTemp = oM.Value
        Mid$(strBuffer, lngWordPosn, Len(strTemp)) = strTemp
        lngWordPosn = lngWordPosn + Len(strTemp) + 1
    Next
    RegexParseBuffer = RTrim(strBuffer)
End Function

Function RegexpReplace(ByVal parmString As String, parmRegexp As Object) As String
    '================================================
    'Use parameter regexp object to remove duplicate spaces.
    'The parameter regexp object will already have its pattern property set
    'by the calling code.
    '================================================
    RegexpReplace = parmRegexp.Replace(parmString, " ")
End Function

Function RegexpReplace1(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to remove duplicate spaces
    '================================================
    Static oRE As Object

    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = "  +"
    End If
    RegexpReplace1 = oRE.Replace(parmString, " ")
End Function

Function RegexpReplace2(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to remove duplicate spaces
    '================================================
    Static oRE As Object

    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = " {2,}"
    End If
    RegexpReplace2 = oRE.Replace(parmString, " ")
End Function

Function RegexpReplace3(ByVal parmString As String) As String
    '================================================
    'Use local static regexp object to remove duplicate spaces
    '
    'WARNING: This pattern will remove characters other than
    '       space characters due to the use of the \s in the pattern
    '================================================
    Static oRE As Object

    If oRE Is Nothing Then
        Set oRE = CreateObject("vbscript.regexp")
        oRE.Global = True
        oRE.Pattern = " \s+"
    End If
    RegexpReplace3 = oRE.Replace(parmString, " ")
End FunctionFunction SplitBuffer(ByVal parmString As String) As String'================================================'Split() string with single space character delimiter,'assign non-empty strings to next output buffer position,'returned the trimmed output buffer string'================================================Dim strParsed() As StringDim strTemp As StringDim lngLoop As LongDim lngWordPosn As LongDim strBuffer As StringstrParsed = Split(parmString, " ")strTemp = vbNullStringlngWordPosn = 1strBuffer = Space(Len(parmString))For lngLoop = 0 To UBound(strParsed)strTemp = strParsed(lngLoop)If Len(strTemp) <> 0 ThenMid$(strBuffer, lngWordPosn) = strTemplngWordPosn = lngWordPosn + Len(strTemp) + 1End IfNextSplitBuffer = RTrim(strBuffer)End FunctionFunction Split2Buffer(ByVal parmString As String) As String'================================================'Mostly the same as SplitBuffer, except using a double space'delimiter for the Split() function.''Split() string with a double space character delimiter,'assign non-empty strings to next output buffer position,'returned the trimmed output buffer string'================================================Dim strParsed() As StringDim strTemp As StringDim lngLoop As LongDim lngWordPosn As LongDim strBuffer As StringstrParsed = Split(parmString, "  ")strTemp = vbNullStringlngWordPosn = 1strBuffer = Space(Len(parmString))For lngLoop = 0 To UBound(strParsed)strTemp = Trim(strParsed(lngLoop))If Len(strTemp) <> 0 ThenMid$(strBuffer, lngWordPosn, Len(strTemp)) = strTemplngWordPosn = lngWordPosn + Len(strTemp) + 1End IfNextSplit2Buffer = RTrim(strBuffer)End FunctionFunction Split2BufferFcn(ByVal parmString As String) As String'================================================'Mostly the same as SplitBuffer, except using a double space'delimiter for the Split() function.''Split() string with a double space character delimiter,'assign non-empty strings to next output buffer position,'returned the trimmed output buffer 
string'================================================Dim strParsed() As StringDim strTemp As StringDim lngLoop As LongDim lngWordPosn As LongstrParsed = Split(parmString, "  ")strTemp = vbNullStringlngWordPosn = 1Split2BufferFcn = Space(Len(parmString))For lngLoop = 0 To UBound(strParsed)strTemp = Trim(strParsed(lngLoop))If Len(strTemp) <> 0 ThenMid$(Split2BufferFcn, lngWordPosn) = strTemplngWordPosn = lngWordPosn + Len(strTemp) + 1End IfNextSplit2BufferFcn = RTrim(Split2BufferFcn)End FunctionFunction SplitCol(ByVal parmString As String) As String'================================================'Split() string with single space character delimiter,'adding the non-empty strings to a collection object.'Copy the collection items to an array and'Join() them as the returned value'================================================Dim strParsed() As StringDim strTemp As StringDim lngLoop As LongDim strWords() As StringDim colWords As New CollectionDim vItem As VariantstrParsed = Split(parmString, " ")For lngLoop = 0 To UBound(strParsed)strTemp = strParsed(lngLoop)If Len(strTemp) <> 0 ThencolWords.Add strTempEnd IfNextReDim strWords(1 To colWords.Count)lngLoop = 1For Each vItem In colWordsstrWords(lngLoop) = vItemlngLoop = lngLoop + 1NextSplitCol = Join(strWords, " ")End FunctionFunction Split2Col(ByVal parmString As String) As String'================================================'Mostly the same as SplitCol, except using a double space'delimiter for the Split() function.''Split() string with double space character delimiter,'adding the non-empty strings to a collection object.'Copy the collection items to an array and'Join() them as the returned value'================================================Dim strParsed() As StringDim strTemp As StringDim lngLoop As LongDim strWords() As StringDim colWords As New CollectionDim vItem As VariantstrParsed = Split(parmString, "  ")For lngLoop = 0 To UBound(strParsed)strTemp = Trim(strParsed(lngLoop))If Len(strTemp) <> 0 
ThencolWords.Add strTempEnd IfNextReDim strWords(1 To colWords.Count)lngLoop = 1For Each vItem In colWordsstrWords(lngLoop) = vItemlngLoop = lngLoop + 1NextSplit2Col = Join(strWords, " ")End FunctionFunction SplitConcat(ByVal parmString As String) As String'================================================'Split() string with single space character delimiter,'concatenate non-empty strings with a trailing space character'================================================Dim strParsed() As StringDim strTemp As StringDim lngLoop As LongDim strSplitConcat As StringstrParsed = Split(parmString, " ")lngLoop = 0strTemp = vbNullStringFor lngLoop = 0 To UBound(strParsed)strTemp = strParsed(lngLoop)If Len(strTemp) <> 0 ThenstrSplitConcat = strSplitConcat & strTemp & " "     'concatenate with spaceEnd IfNextSplitConcat = RTrim(strSplitConcat)End FunctionFunction Split2Concat(ByVal parmString As String) As String'================================================'Mostly the same as SplitConcat, but using double space character'string as delimiter for the Split() function''Split() string with a double space character delimiter,'concatenate non-empty strings with a trailing space character'================================================Dim strParsed() As StringDim strTemp As StringDim lngLoop As LongDim strSplit2Concat As StringstrParsed = Split(parmString, "  ")strTemp = vbNullStringFor lngLoop = 0 To UBound(strParsed)strTemp = Trim(strParsed(lngLoop))If Len(strTemp) <> 0 ThenstrSplit2Concat = strSplit2Concat & strTemp & " "       'concatenate with spaceEnd IfNextSplit2Concat = RTrim(strSplit2Concat)       'remove trailing spaceEnd FunctionFunction SplitDic(ByVal parmString As String) As String'================================================'Split() string with a single space character delimiter,'adding non-empty strings to the dictionary,'then Join() the dictionary object's items array.'================================================Dim strParsed() As StringDim lngLoop As 
LongDim lngKey As LongDim strTemp As StringStatic dicWords As ObjectIf dicWords Is Nothing ThenSet dicWords = CreateObject("scripting.dictionary")ElsedicWords.RemoveAllEnd IfstrParsed = Split(parmString, " ")lngKey = 1For lngLoop = 0 To UBound(strParsed)strTemp = strParsed(lngLoop)If Len(strTemp) <> 0 ThendicWords.Add CStr(lngKey), strTemplngKey = lngKey + 1End IfNextSplitDic = Join(dicWords.items, " ")End FunctionFunction Split2Dic(ByVal parmString As String) As String'================================================'Mostly the same as SplitDic, but using double space character'string as delimiter for the Split() function''Split() string with a double space character delimiter,'adding non-empty strings to the dictionary,'then Join() the dictionary object's items array'================================================Dim strParsed() As StringDim lngLoop As LongDim lngKey As LongDim strTemp As StringStatic dicWords As ObjectIf dicWords Is Nothing ThenSet dicWords = CreateObject("scripting.dictionary")ElsedicWords.RemoveAllEnd IfstrParsed = Split(parmString, "  ")lngKey = 1For lngLoop = 0 To UBound(strParsed)strTemp = Trim(strParsed(lngLoop))If Len(strTemp) <> 0 ThendicWords.Add CStr(lngKey), strTemplngKey = lngKey + 1End IfNextSplit2Dic = Join(dicWords.items, " ")End FunctionFunction SplitJoin(ByVal parmString As String) As String'================================================'Split() string with single space character delimiter,'move non-empty strings to the front of the strParsed array,'Redim the strParsed array down to the number of words we have,'then Join() the strParsed array items with'a single space character'================================================Dim strParsed() As StringDim lngLoop As LongDim lngWord As LongDim strTemp As StringstrParsed = Split(parmString, " ")lngWord = 0'Move non-empty strings to the front of the strParsed arrayFor lngLoop = 0 To UBound(strParsed)strTemp = strParsed(lngLoop)If Len(strTemp) <> 0 ThenstrParsed(lngWord) = 
strTemp
        lngWord = lngWord + 1
    End If
Next
'reduce size of the strParsed array to equal the number
'of non-empty strings we found.
ReDim Preserve strParsed(0 To lngWord - 1)
SplitJoin = Join(strParsed, " ")
End Function

Function SplitJoin_InPlace(ByVal parmString As String) As String
    '================================================
    'Split() string with single space character delimiter,
    'move non-empty strings to the front of the strParsed array,
    'Redim the strParsed array down to the number of words we have,
    'then Join() the strParsed array items with
    'a single space character
    '================================================
    Dim strParsed() As String
    Dim lngLoop As Long
    Dim lngWord As Long
    Dim strTemp As String

    'Single space delimiter, as described above. A double space
    'delimiter without Trim() would leave stray spaces in the items.
    strParsed = Split(parmString, " ")
    lngWord = 0

    'Move non-empty strings to the front of the strParsed array
    For lngLoop = 0 To UBound(strParsed)
        strTemp = strParsed(lngLoop)
        If Len(strTemp) <> 0 Then
            strParsed(lngWord) = strTemp
            lngWord = lngWord + 1
        End If
    Next

    'reduce size of the strParsed array to equal the number
    'of non-empty strings we found.
    ReDim Preserve strParsed(0 To lngWord - 1)
    SplitJoin_InPlace = Join(strParsed, " ")
End Function

Function Split2Join(ByVal parmString As String) As String
    '================================================
    'Mostly the same as SplitJoin, but using double space character
    'string as delimiter for the Split() function
    '
    'Split() string with double space character delimiter,
    'Trim() each item and move non-empty strings to the front
    'of the strParsed array,
    'Redim the strParsed array down to the number of words we have,
    'then Join() the strParsed array items with
    'a single space character
    '================================================
    Dim strParsed() As String
    Dim lngLoop As Long
    Dim lngWord As Long
    Dim strTemp As String

    strParsed = Split(parmString, "  ")
    lngWord = 0

    'Move non-empty strings to the front of the strParsed array
    For lngLoop = 0 To UBound(strParsed)
        strTemp = Trim(strParsed(lngLoop))
        If Len(strTemp) <> 0 Then
            strParsed(lngWord) = strTemp
            lngWord = lngWord + 1
        End If
    Next

    'reduce size of the strParsed array to equal the number
    'of non-empty strings we found.
    ReDim Preserve strParsed(0 To lngWord - 1)
    Split2Join = Join(strParsed, " ")
End Function

Function StringSizes(ByVal parmString As String, parmSizeRequest As eSizeRequest) As String
    '================================================
    'Return size permutation of Gettysburg address.
    'Parameters:
    '   1: First paragraph
    '   2: The (original) address = parmString
    '   3: 10 concatenations of the address
    '   4: 100 concatenations of the address
    '================================================
    Dim lngLoop As Long
    Dim strTemp() As String

    Select Case parmSizeRequest
        Case eSizeRequest.eFirstParagraph   'first paragraph
            StringSizes = Split(parmString, vbCrLf, 2)(0)
        Case eSizeRequest.eSameAsDocument   'same as parameter
            StringSizes = parmString
        Case eSizeRequest.eTenFold          'repeat ten times
            ReDim strTemp(1 To 10)
            For lngLoop = 1 To 10
                strTemp(lngLoop) = parmString
            Next
            StringSizes = Join(strTemp, vbNullString)
        Case eSizeRequest.eHundredFold      'repeat one hundred times
            ReDim strTemp(1 To 100)
            For lngLoop = 1 To 100
                strTemp(lngLoop) = parmString
            Next
            StringSizes = Join(strTemp, vbNullString)
        Case Else
            StringSizes = vbNullString
    End Select
End Function

Function StuffWithSpaces(ByVal parmString As String, parmSeed) As String
    '================================================
    'Add Random number of internal space characters
    '================================================
    Dim lngRnd As Long
    Dim strWords() As String
    Dim lngLoop As Long
    Const cMaxSpaces As Long = 26
    Dim lngSum As Long      'used to verify avg inserted spaces length

    Rnd -1                  'reset random sequence
    Randomize parmSeed      'begin random sequence
    strWords = Split(parmString, " ")

    'Append 1 to cMaxSpaces spaces to every word except the last
    For lngLoop = 0 To UBound(strWords) - 1
        lngRnd = Int(Rnd() * cMaxSpaces) + 1
        strWords(lngLoop) = strWords(lngLoop) & Space(lngRnd)
    Next
    StuffWithSpaces = Join(strWords, vbNullString)
End Function

Sub testit()
    'minimize code window before invoking test code
    Debug.Print Now, "Before Despace"
    SendKeys "% N", False
    DoEvents
    Despace
    Debug.Print Now, "After Despace"
End Sub

Function ThreeTwo(ByVal parmString As String) As String
    '================================================
    'Replace all three consecutive spaces with one space,
    'then replace all two consecutive spaces with one space
    '================================================
    Dim strTemp As String

    strTemp = parmString

    'Replace three space strings with a single space until
    'no more instances of three space strings exist
    Do Until InStr(strTemp, "   ") = 0
        strTemp = Replace(strTemp, "   ", " ")
    Loop

    'Replace two space strings with a single space until no
    'more instances of two space strings exist
    Do Until InStr(strTemp, "  ") = 0
        strTemp = Replace(strTemp, "  ", " ")
    Loop

    ThreeTwo = strTemp
End Function

Function WksFunctionTrim(ByVal parmString As String) As String
    '================================================
    'If running in the Excel VBA environment, invoke the
    'Trim Worksheetfunction
    '================================================
    WksFunctionTrim = WorksheetFunction.Trim(parmString)
End Function
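
The Do loops in ThreeTwo are needed because a single Replace() call does not rescan its own output: each pass only shrinks a run of spaces, it does not necessarily finish the job. The following is a minimal sketch of that behavior, assuming a VBA or VB6 host (Debug.Print is not available in VBScript). The routine name DemoReplacePasses and the sample string are hypothetical and are not part of the original listing.

```vb
Sub DemoReplacePasses()
    'Hypothetical demo routine, not part of the article's test harness.
    'Shows how many Replace() passes a run of five spaces needs.
    Dim strTemp As String
    Dim lngPasses As Long

    strTemp = "a" & Space(5) & "b"      'sample input: a, five spaces, b

    'Same loop shape as the second loop in ThreeTwo
    Do Until InStr(strTemp, "  ") = 0
        strTemp = Replace(strTemp, "  ", " ")
        lngPasses = lngPasses + 1
        'Brackets make the remaining spaces visible in the Immediate window
        Debug.Print lngPasses, "[" & strTemp & "]"
    Loop
    'The run shrinks 5 -> 3 -> 2 -> 1 spaces, so the loop body runs three times
End Sub
```

Running the three-space loop first, as ThreeTwo does, is one way to shorten long runs in fewer passes before the two-space loop finishes the cleanup.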

References and Related Articles

Fast String Builder Class - http:A_8311.html

Better Concat Function - http:A_7811.html

Using Regular Expressions in VBA environment - http:A_1336.html

Analysis of the VB's Random Number Generator Functions - http:A_11114.html


Translated from: https://www.experts-exchange.com/articles/17559/Efficient-String-Clean-up-Removing-Internal-Duplicate-Spaces.html
