Parsing WinMerge-Generated Patch Files with Python

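This is the first Python code I have written that uses a class.

I had written this parser in VBA before. Having just spent five days learning Python, I wanted to test what I had learned, and to put a semicolon at the end of the May Day holiday.

The code is pretty rough, and even I find it painful to read. (Coding conventions, exception handling, use of classes, and so on: I have not properly learned any of them yet.)

I also have not worked out how to manage the parsed data, so for now it is only written to the log.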

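Features

1. Parse a patch file generated by WinMerge and extract the relevant information.

2. Information extracted: the file names, and the line-number ranges of the code before and after each change.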
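One possible way to hold the extracted information, instead of only logging it, is a pair of small dataclasses. This is a minimal sketch, not part of the original code: the names `Hunk`, `FileChange` and `hunk_from_groups` are hypothetical, and the input dictionary is assumed to have the same keys as the dictionaries printed in the run log (`BEF_START`, `BEF_STOP`, `CHG_SYM`, `AFT_START`, `AFT_STOP`).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hunk:
    """One change block, e.g. the header '14,45c14' (hypothetical container)."""
    kind: str        # 'a' (add), 'c' (change) or 'd' (delete)
    bef_start: int   # first line in the file before the change
    bef_stop: int    # last line in the file before the change
    aft_start: int   # first line in the file after the change
    aft_stop: int    # last line in the file after the change


@dataclass
class FileChange:
    """All hunks that follow one 'diff ...' command line (hypothetical container)."""
    command_line: str
    hunks: List[Hunk] = field(default_factory=list)


def hunk_from_groups(groups: dict) -> Hunk:
    """Build a Hunk from a parsed header dict; a missing stop equals the start."""
    bef_start = int(groups["BEF_START"])
    aft_start = int(groups["AFT_START"])
    return Hunk(
        kind=groups["CHG_SYM"],
        bef_start=bef_start,
        bef_stop=int(groups.get("BEF_STOP") or bef_start),
        aft_start=aft_start,
        aft_stop=int(groups.get("AFT_STOP") or aft_start),
    )


change = FileChange(command_line="diff i circle.yml circle.yml")
change.hunks.append(hunk_from_groups(
    {"BEF_START": "14", "BEF_STOP": "45", "CHG_SYM": "c", "AFT_START": "14"}))
```

Storing single-line ranges with `stop == start` keeps later range checks uniform, at the cost of no longer distinguishing `11c11` from a hypothetical `11,11c11`.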

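Someone is bound to ask what this is good for.

Here is how I currently plan to use it:

1. Extract the files touched by the current requirement change, together with their changed line numbers.

2. Analyze the warnings from static analysis, mark the warnings that fall on changed files and changed lines, and generate a document that can be used for a focused review.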
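The second step could look roughly like the sketch below. Everything in it is an assumption made for illustration: the shape of the `changed` mapping, the `(file, line, message)` warning tuples, the warning texts, and the helper names `is_changed` and `mark_warnings`. The ranges use the "after" line numbers of the `circle.yml` sample, on the assumption that the static analysis runs on the changed version of the file.

```python
from typing import Dict, List, Tuple

# file name -> list of (start, stop) line ranges in the file after the change
ChangedRanges = Dict[str, List[Tuple[int, int]]]
Warning = Tuple[str, int, str]  # (file name, line number, message)


def is_changed(ranges: ChangedRanges, file_name: str, line_no: int) -> bool:
    """Return True if line_no lies inside any changed range of file_name."""
    return any(start <= line_no <= stop
               for start, stop in ranges.get(file_name, []))


def mark_warnings(warnings: List[Warning],
                  ranges: ChangedRanges) -> List[Tuple[str, int, str, bool]]:
    """Append a flag to every warning telling whether it hits a changed line."""
    return [(name, line, msg, is_changed(ranges, name, line))
            for name, line, msg in warnings]


# "After" line numbers taken from the headers 11c11, 14,45c14 and 66c35.
changed = {"circle.yml": [(11, 11), (14, 14), (35, 35)]}
# Made-up warnings, only to exercise the function.
warnings = [
    ("circle.yml", 14, "example warning on a changed line"),
    ("circle.yml", 20, "example warning on an untouched line"),
    ("other.yml", 14, "example warning in an unchanged file"),
]
for row in mark_warnings(warnings, changed):
    print(row)
```

Pure deletions (for example `8,11d7`) leave no lines in the changed file, so how to treat warnings next to a deletion is a separate decision that this sketch does not make.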

Most importantly: I hope the experts out there will give me some pointers.

Input patch file

If you are not familiar with the patch format, see the comments in the code.
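For orientation, each change block in this format starts with a header such as `11c11`, `14,45c14` or `8,11d7`: a line number or `start,stop` range in the old file, a letter (`a` add, `c` change, `d` delete), then a line number or range in the new file. The sketch below shows one way to read such a header with a single regular expression. The names `HUNK_HEADER` and `parse_hunk_header` are hypothetical, and the group names are simply chosen to match the keys that appear in the run log.

```python
import re
from typing import Dict, Optional

# start[,stop] + change letter + start[,stop]; both stops are optional.
HUNK_HEADER = re.compile(
    r"^(?P<BEF_START>\d+)(?:,(?P<BEF_STOP>\d+))?"
    r"(?P<CHG_SYM>[acd])"
    r"(?P<AFT_START>\d+)(?:,(?P<AFT_STOP>\d+))?$"
)


def parse_hunk_header(line: str) -> Optional[Dict[str, str]]:
    """Return the named groups of a header line, or None if it is not a header."""
    match = HUNK_HEADER.match(line.strip())
    if match is None:
        return None
    # Drop the optional groups that did not take part in the match.
    return {key: value for key, value in match.groupdict().items()
            if value is not None}


print(parse_hunk_header("14,45c14"))
print(parse_hunk_header("8,11d7"))
print(parse_hunk_header("<    fedora33_gmake:"))
```

Anchoring the pattern with `$` means that a content line which merely begins with something like `11c11` is not mistaken for a header.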

diff i circle.yml circle.yml
11c11
<    fedora33_gmake:
---
>    fedora32_gmake:
14,45c14
<        - image: docker.io/fedora:33
---
>        - image: docker.io/fedora:32
66c35
<              make roundtrip CIRCLECI=1 ROUNDTRIP_MAX_ENTRIES=25
---
>              make check roundtrip CIRCLECI=1
104,130c73
<              MAKE=bmake bmake validate-input check codecheck CIRCLECI=1
<
<    fedora30_bmake_roundtrip:
---
>              MAKE=bmake bmake validate-input check roundtrip codecheck CIRCLECI=1
132c75
<    fedora33_distcheck:
---
>    fedora_distcheck:
135c78
<        - image: docker.io/fedora:33
---
>        - image: docker.io/fedora:latest
diff i Units/review-needed.r/test.vhd.t/input.vhd Units/review-needed.r/test.vhd.t/input.vhd
4649c4649
< end parameterize;
---
> end paramterize;
diff i win32/ctags_vs2013.vcxproj win32/ctags_vs2013.vcxproj
8,11d7
<     <ProjectConfiguration Include="Debug|x64">
<     </ProjectConfiguration>
16,19d11
<     </ProjectConfiguration>
34,39d25
<   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">

Output

Line 000001: [diff i circle.yml circle.yml]
Find File!  [True]
Line 000002: [11c11]
{'BEF_START': '11', 'CHG_SYM': 'c', 'AFT_START': '11'}
Find LineNo!    [True]
Line 000003: [<    fedora33_gmake:]
Find Content!   [True]
Line 000004: [---]
Find Content!   [True]
Line 000005: [>    fedora32_gmake:]
Find Content!   [True]
Line 000006: [14,45c14]
{'BEF_START': '14', 'BEF_STOP': '45', 'CHG_SYM': 'c', 'AFT_START': '14'}
Find LineNo!    [True]
Line 000007: [<        - image: docker.io/fedora:33]
Find Content!   [True]
Line 000008: [---]
Find Content!   [True]
Line 000009: [>        - image: docker.io/fedora:32]
Find Content!   [True]
Line 000010: [66c35]
{'BEF_START': '66', 'CHG_SYM': 'c', 'AFT_START': '35'}
Find LineNo!    [True]
Line 000011: [<              make roundtrip CIRCLECI=1 ROUNDTRIP_MAX_ENTRIES=25]
Find Content!   [True]
Line 000012: [---]
Find Content!   [True]
Line 000013: [>              make check roundtrip CIRCLECI=1]
Find Content!   [True]
Line 000014: [104,130c73]
{'BEF_START': '104', 'BEF_STOP': '130', 'CHG_SYM': 'c', 'AFT_START': '73'}
Find LineNo!    [True]
Line 000015: [<              MAKE=bmake bmake validate-input check codecheck CIRCLECI=1]
Find Content!   [True]
Line 000016: [<]
This line can not be parsed(failure!)   [False]
Line 000017: [<    fedora30_bmake_roundtrip:]
Find Content!   [True]
Line 000018: [---]
Find Content!   [True]
Line 000019: [>              MAKE=bmake bmake validate-input check roundtrip codecheck CIRCLECI=1]
Find Content!   [True]
Line 000020: [132c75]
{'BEF_START': '132', 'CHG_SYM': 'c', 'AFT_START': '75'}
Find LineNo!    [True]
Line 000021: [<    fedora33_distcheck:]
Find Content!   [True]
Line 000022: [---]
Find Content!   [True]
Line 000023: [>    fedora_distcheck:]
Find Content!   [True]
Line 000024: [135c78]
{'BEF_START': '135', 'CHG_SYM': 'c', 'AFT_START': '78'}
Find LineNo!    [True]
Line 000025: [<        - image: docker.io/fedora:33]
Find Content!   [True]
Line 000026: [---]
Find Content!   [True]
Line 000027: [>        - image: docker.io/fedora:latest]
Find Content!   [True]
Line 000028: [diff i Units/review-needed.r/test.vhd.t/input.vhd Units/review-needed.r/test.vhd.t/input.vhd]
Find File!  [True]
Line 000029: [4649c4649]
{'BEF_START': '4649', 'CHG_SYM': 'c', 'AFT_START': '4649'}
Find LineNo!    [True]
Line 000030: [< end parameterize;]
Find Content!   [True]
Line 000031: [---]
Find Content!   [True]
Line 000032: [> end paramterize;]
Find Content!   [True]
Line 000033: [diff i win32/ctags_vs2013.vcxproj win32/ctags_vs2013.vcxproj]
Find File!  [True]
Line 000034: [8,11d7]
{'BEF_START': '8', 'BEF_STOP': '11', 'CHG_SYM': 'd', 'AFT_START': '7'}
Find LineNo!    [True]
Line 000035: [<     <ProjectConfiguration Include="Debug|x64">]
Find Content!   [True]
Line 000036: [<     </ProjectConfiguration>]
Find Content!   [True]
Line 000037: [16,19d11]
{'BEF_START': '16', 'BEF_STOP': '19', 'CHG_SYM': 'd', 'AFT_START': '11'}
Find LineNo!    [True]
Line 000038: [<     </ProjectConfiguration>]
Find Content!   [True]
Line 000039: [34,39d25]
{'BEF_START': '34', 'BEF_STOP': '39', 'CHG_SYM': 'd', 'AFT_START': '25'}
Find LineNo!    [True]
Line 000040: [<   <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Release|x64'" Label="Configuration">]
Find Content!   [True]
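
In the log above, line 16 (`[<]`) is the only one reported as unparsable. The parser strips whitespace from both ends of every line before testing it, so an empty removed line loses the space after `<` and no longer starts with the expected `< ` prefix. A minimal sketch of one possible fix follows; `is_content_line` is a hypothetical helper, not part of the original class, and it assumes that only content lines start with `<` or `>` in this patch format.

```python
def is_content_line(raw_line: str) -> bool:
    """Return True for '<' lines, '>' lines and the '---' separator.

    Only the line terminator is removed, and the test is on the marker
    character alone, so a bare '<' or '>' (an empty changed line) still counts.
    """
    line = raw_line.rstrip("\r\n")
    return line.startswith(("<", ">")) or line == "---"


for sample in ("<\n", "<    fedora30_bmake_roundtrip:\n", "---\n", "104,130c73\n"):
    print(repr(sample), is_content_line(sample))
```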

# **********************************************************************************************************************
# -*- coding: UTF-8 -*-
#
# Parse a WinMerge patch file.
# Extracts the names of the changed files and the changed line numbers (before and after the change).
#
# How to generate the patch file: menu -> Generate Patch  [required option: include command line]
#
# Pycharm 2021.1.1 + Python 3.9.4
# WinMerge 2.16.2.10
#
# Author:   sgx6660888
# Blog:     https://blog.csdn.net/sgx6660888
# **********************************************************************************************************************
import re


class ParsePatchFile():
    # ******************************************************************************************************************
    #  class variable shared by all instances
    # ******************************************************************************************************************
    # ******************************************************************************************************************
    # diff i circle.yml circle.yml   ===> command line: relative paths and file names before and after the change
    # 11c11                          ===> line numbers before/after and kind of change (a: add  c: change  d: delete)
    # <    fedora33_gmake:           ===> content before the change
    # ---                            ===> separator
    # >    fedora32_gmake:           ===> content after the change
    # ******************************************************************************************************************
    # Define the rule string for checking
    __RULE_PATTEN_FILE = "diff "
    __RULE_SPLIT_SYM_FILE = " "
    __RULE_PATTEN_CONTENT_1 = "< "
    __RULE_PATTEN_CONTENT_2 = "---"
    __RULE_PATTEN_CONTENT_3 = "> "

    # Define the rule of line no using regular expression for checking
    # The keyword in Dictionary of analysis result
    __QUOTE_BEF_START = "BEF_START"
    __QUOTE_BEF_STOP = "BEF_STOP"
    __QUOTE_AFT_START = "AFT_START"
    __QUOTE_AFT_STOP = "AFT_STOP"
    # The trailing ")" closes the named group opened in __QUOTE_DIGIT1..4
    __QUOTE_DIGIT = "[1-9][0-9]{0,})"
    __QUOTE_DIGIT1 = "(?P<" + __QUOTE_BEF_START + ">" + __QUOTE_DIGIT
    __QUOTE_DIGIT2 = "(?P<" + __QUOTE_BEF_STOP + ">" + __QUOTE_DIGIT
    __QUOTE_DIGIT3 = "(?P<" + __QUOTE_AFT_START + ">" + __QUOTE_DIGIT
    __QUOTE_DIGIT4 = "(?P<" + __QUOTE_AFT_STOP + ">" + __QUOTE_DIGIT
    __QUOTE_CHG = "(?P<CHG_SYM>[acd])"

    # Pattern: e.g. 11,12c12,24
    __RULE_PATTEN_LINENO_1 = "^" + __QUOTE_DIGIT1 + "," + __QUOTE_DIGIT2 \
                             + __QUOTE_CHG + __QUOTE_DIGIT3 + "," + __QUOTE_DIGIT4
    # Pattern: e.g. 11c12,24
    __RULE_PATTEN_LINENO_2 = "^" + __QUOTE_DIGIT1 + __QUOTE_CHG + __QUOTE_DIGIT3 + "," + __QUOTE_DIGIT4
    # Pattern: e.g. 11,12c12
    __RULE_PATTEN_LINENO_3 = "^" + __QUOTE_DIGIT1 + "," + __QUOTE_DIGIT2 + __QUOTE_CHG + __QUOTE_DIGIT3
    # Pattern: e.g. 11c12
    __RULE_PATTEN_LINENO_4 = "^" + __QUOTE_DIGIT1 + __QUOTE_CHG + __QUOTE_DIGIT3

    # ******************************************************************************************************************
    #   Public
    # ******************************************************************************************************************
    def __init__(self):
        self.PatchFileName = ""
        self.NowLineNo = 0

    def ParseStart(self, FileName: str) -> bool:
        self.PatchFileName = FileName
        ret = self.__ParseSub(FileName)
        if ret:
            print("This file %s has been parsed." % (FileName))
        return ret

    # ******************************************************************************************************************
    #   Private
    # ******************************************************************************************************************
    def __ParseSub(self, file_name: str) -> bool:
        self.PatchFileName = file_name
        try:
            # Open file (the with statement closes it, so no explicit close() is needed)
            with open(self.PatchFileName, errors='ignore') as file_object:
                for line_buff in file_object:
                    line_buff = line_buff.strip()  # Delete the space in head and tail
                    self.NowLineNo += 1  # record the line no
                    if line_buff == "":  # skip the space line
                        continue
                    # parse the line
                    print("Line %06d: [%s]" % (self.NowLineNo, line_buff))
                    ret = self.__ParseLine(line_buff)
                    print("[%s]" % ret)
        except FileNotFoundError:
            print("[__ParseSub] FileNotFoundError!")
            return False
        except OSError:
            # Any other I/O failure (the original reported this from the else branch of the try,
            # which runs only when no exception occurred)
            print("[__ParseSub] SYSTEM ERROR!")
            return False
        return True

    def __ParseLine(self, lineBuff: str) -> bool:
        rslt = []
        # check all rules
        # TODO: Add Rule schedule function
        #       1.Run the rules by the priority (not the rule no)
        #       2.Count the times of rule for optimize the priority of the rules
        return self.__ParseRules(lineBuff, rslt)

    def __ParseRules(self, lineBuff: str, Rslt) -> bool:
        # TODO: Run the rules and add the analysis data into buff.
        # Check function
        if self.__ParseCheckRule_CONTENT(lineBuff, Rslt):
            print("Find Content!", end="\t")
            # TODO:Action function (Add the Result to data struct)
            ret = True
        elif self.__ParseCheckRule_LINENO(lineBuff, Rslt):
            print("Find LineNo!", end="\t")
            # TODO:Action function (Add the Result to data struct)
            ret = True
        elif self.__ParseCheckRule_FILE(lineBuff, Rslt):
            print("Find File!", end="\t")
            # TODO:Action function (Add the Result to data struct)
            ret = True
        else:
            print("This line can not be parsed(failure!)", end="\t")
            ret = False
        return ret

    def __ParseCheckRule_FILE(self, lineBuff: str, Rslt) -> bool:
        # Split the file name
        # TODO: Parse the file name
        # Need attention how to do it if include the space in path name
        # e.g: diff i circle.yml circle.yml
        return lineBuff.startswith(self.__RULE_PATTEN_FILE)

    def __ParseCheckRule_LINENO(self, lineBuff: str, Rslt) -> bool:
        # 11c11
        # Try the patterns in order, from the most specific to the least specific
        for patten in (self.__RULE_PATTEN_LINENO_1, self.__RULE_PATTEN_LINENO_2,
                       self.__RULE_PATTEN_LINENO_3, self.__RULE_PATTEN_LINENO_4):
            if self.__ParseCheckRule_LINENO_Patten(patten, lineBuff, Rslt):
                return True
        return False

    def __ParseCheckRule_LINENO_Patten(self, Patten: str, lineBuff: str, Rslt) -> bool:
        # 11c11
        reg = re.compile(Patten, re.I)
        reg_match = reg.match(lineBuff)
        if reg_match:
            line_grp = reg_match.groupdict()
            print(line_grp)
            return True
        return False

    def __ParseCheckRule_CONTENT(self, lineBuff: str, Rslt) -> bool:
        # No Need to parse the string
        # <    fedora33_gmake:
        # ---
        # >    fedora32_gmake:
        # NOTE: lines are stripped before they reach this check, so an empty changed line arrives as a
        #       bare "<" or ">" and does not match the "< " / "> " prefixes; it is reported as unparsable.
        return lineBuff.startswith((self.__RULE_PATTEN_CONTENT_1,
                                    self.__RULE_PATTEN_CONTENT_2,
                                    self.__RULE_PATTEN_CONTENT_3))


# For Test
if __name__ == '__main__':
    pf = ParsePatchFile()
    pf.ParseStart("E:/Study/010.ProgramLanguage/python/sample/ParseWinMergePatchFile/testData/all_file.txt")
