系列文章:

Python 日志logging模块初探及多线程踩坑(1)

Python 日志logging模块初探及多线程踩坑(2)


接着上面一篇文章,我们这篇来写一个多进程兼容且无损性能的 TimedRotatingFileHandler

TimedRotatingFileHandler 是 Python 官方提供的一个可以基于时间自动切分日志的 Handler 类。但是它是没有办法支持多进程的日志切换,多进程进行日志切换的时候会丢失日志数据,导致日志错乱。在win10上还会报错:‘win32Error :  文件被占用的错误’.

我们先好好看看源码,究竟是什么问题导致它的线程不安全:

class TimedRotatingFileHandler(BaseRotatingHandler):"""Handler for logging to a file, rotating the log file at certain timedintervals.If backupCount is > 0, when rollover is done, no more than backupCountfiles are kept - the oldest ones are deleted."""def __init__(self, filename, when='h', interval=1, backupCount=0, encoding=None, delay=False, utc=False, atTime=None):BaseRotatingHandler.__init__(self, filename, 'a', encoding, delay)self.when = when.upper()self.backupCount = backupCountself.utc = utcself.atTime = atTime# Calculate the real rollover interval, which is just the number of# seconds between rollovers.  Also set the filename suffix used when# a rollover occurs.  Current 'when' events supported:# S - Seconds# M - Minutes# H - Hours# D - Days# midnight - roll over at midnight# W{0-6} - roll over on a certain day; 0 - Monday## Case of the 'when' specifier is not important; lower or upper case# will work.if self.when == 'S':self.interval = 1 # one secondself.suffix = "%Y-%m-%d_%H-%M-%S"self.extMatch = r"^\d{4}-\d{2}-\d{2}_\d{2}-\d{2}-\d{2}(\.\w+)?$"elif self.when == 'M':self.interval = 60 # one minuteself.suffix = "%Y-%m-%d_%H-%M"self.extMatch = r"^\d{4}-\d{2}-\d{2}_\d{2}-\d{2}(\.\w+)?$"elif self.when == 'H':self.interval = 60 * 60 # one hourself.suffix = "%Y-%m-%d_%H"self.extMatch = r"^\d{4}-\d{2}-\d{2}_\d{2}(\.\w+)?$"elif self.when == 'D' or self.when == 'MIDNIGHT':self.interval = 60 * 60 * 24 # one dayself.suffix = "%Y-%m-%d"self.extMatch = r"^\d{4}-\d{2}-\d{2}(\.\w+)?$"elif self.when.startswith('W'):self.interval = 60 * 60 * 24 * 7 # one weekif len(self.when) != 2:raise ValueError("You must specify a day for weekly rollover from 0 to 6 (0 is Monday): %s" % self.when)if self.when[1] < '0' or self.when[1] > '6':raise ValueError("Invalid day specified for weekly rollover: %s" % self.when)self.dayOfWeek = int(self.when[1])self.suffix = "%Y-%m-%d"self.extMatch = r"^\d{4}-\d{2}-\d{2}(\.\w+)?$"else:raise ValueError("Invalid rollover interval specified: %s" % self.when)self.extMatch = re.compile(self.extMatch, re.ASCII)self.interval = self.interval * interval # multiply by units requested# The following line added because the filename passed in could be a# path object (see Issue #27493), but self.baseFilename will be a stringfilename = self.baseFilenameif os.path.exists(filename):t = os.stat(filename)[ST_MTIME]else:t = int(time.time())self.rolloverAt = self.computeRollover(t)def computeRollover(self, currentTime):"""Work out the rollover time based on the specified time."""result = currentTime + self.interval# If we are rolling over at midnight or weekly, then the interval is already known.# What we need to figure out is WHEN the next interval is.  In other words,# if you are rolling over at midnight, then your base interval is 1 day,# but you want to start that one day clock at midnight, not now.  So, we# have to fudge the rolloverAt value in order to trigger the first rollover# at the right time.  After that, the regular interval will take care of# the rest.  Note that this code doesn't care about leap seconds. :)if self.when == 'MIDNIGHT' or self.when.startswith('W'):# This could be done with less code, but I wanted it to be clearif self.utc:t = time.gmtime(currentTime)else:t = time.localtime(currentTime)currentHour = t[3]currentMinute = t[4]currentSecond = t[5]currentDay = t[6]# r is the number of seconds left between now and the next rotationif self.atTime is None:rotate_ts = _MIDNIGHTelse:rotate_ts = ((self.atTime.hour * 60 + self.atTime.minute)*60 +self.atTime.second)r = rotate_ts - ((currentHour * 60 + currentMinute) * 60 +currentSecond)if r < 0:# Rotate time is before the current time (for example when# self.rotateAt is 13:45 and it now 14:15), rotation is# tomorrow.r += _MIDNIGHTcurrentDay = (currentDay + 1) % 7result = currentTime + r# If we are rolling over on a certain day, add in the number of days until# the next rollover, but offset by 1 since we just calculated the time# until the next day starts.  There are three cases:# Case 1) The day to rollover is today; in this case, do nothing# Case 2) The day to rollover is further in the interval (i.e., today is#         day 2 (Wednesday) and rollover is on day 6 (Sunday).  Days to#         next rollover is simply 6 - 2 - 1, or 3.# Case 3) The day to rollover is behind us in the interval (i.e., today#         is day 5 (Saturday) and rollover is on day 3 (Thursday).#         Days to rollover is 6 - 5 + 3, or 4.  In this case, it's the#         number of days left in the current week (1) plus the number#         of days in the next week until the rollover day (3).# The calculations described in 2) and 3) above need to have a day added.# This is because the above time calculation takes us to midnight on this# day, i.e. the start of the next day.if self.when.startswith('W'):day = currentDay # 0 is Mondayif day != self.dayOfWeek:if day < self.dayOfWeek:daysToWait = self.dayOfWeek - dayelse:daysToWait = 6 - day + self.dayOfWeek + 1newRolloverAt = result + (daysToWait * (60 * 60 * 24))if not self.utc:dstNow = t[-1]dstAtRollover = time.localtime(newRolloverAt)[-1]if dstNow != dstAtRollover:if not dstNow:  # DST kicks in before next rollover, so we need to deduct an houraddend = -3600else:           # DST bows out before next rollover, so we need to add an houraddend = 3600newRolloverAt += addendresult = newRolloverAtreturn resultdef shouldRollover(self, record):"""Determine if rollover should occur.record is not used, as we are just comparing times, but it is needed sothe method signatures are the same"""t = int(time.time())if t >= self.rolloverAt:return 1return 0def getFilesToDelete(self):"""Determine the files to delete when rolling over.More specific than the earlier method, which just used glob.glob()."""dirName, baseName = os.path.split(self.baseFilename)fileNames = os.listdir(dirName)result = []prefix = baseName + "."plen = len(prefix)for fileName in fileNames:if fileName[:plen] == prefix:suffix = fileName[plen:]if self.extMatch.match(suffix):result.append(os.path.join(dirName, fileName))result.sort()if len(result) < self.backupCount:result = []else:result = result[:len(result) - self.backupCount]return resultdef doRollover(self):"""do a rollover; in this case, a date/time stamp is appended to the filenamewhen the rollover happens.  However, you want the file to be named for thestart of the interval, not the current time.  If there is a backup count,then we have to get a list of matching filenames, sort them and removethe one with the oldest suffix."""if self.stream:self.stream.close()self.stream = None# get the time that this sequence started at and make it a TimeTuplecurrentTime = int(time.time())dstNow = time.localtime(currentTime)[-1]t = self.rolloverAt - self.intervalif self.utc:timeTuple = time.gmtime(t)else:timeTuple = time.localtime(t)dstThen = timeTuple[-1]if dstNow != dstThen:if dstNow:addend = 3600else:addend = -3600timeTuple = time.localtime(t + addend)dfn = self.rotation_filename(self.baseFilename + "." +time.strftime(self.suffix, timeTuple))if os.path.exists(dfn):os.remove(dfn)self.rotate(self.baseFilename, dfn)if self.backupCount > 0:for s in self.getFilesToDelete():os.remove(s)if not self.delay:self.stream = self._open()newRolloverAt = self.computeRollover(currentTime)while newRolloverAt <= currentTime:newRolloverAt = newRolloverAt + self.interval#If DST changes and midnight or weekly rollover, adjust for this.if (self.when == 'MIDNIGHT' or self.when.startswith('W')) and not self.utc:dstAtRollover = time.localtime(newRolloverAt)[-1]if dstNow != dstAtRollover:if not dstNow:  # DST kicks in before next rollover, so we need to deduct an houraddend = -3600else:           # DST bows out before next rollover, so we need to add an houraddend = 3600newRolloverAt += addendself.rolloverAt = newRolloverAt

首先日志会被打印在一个叫 baseFilename 名字的文件中。然后在 Rotate 的时候会根据你想要打印的参数生成对应新文件的名字也就是上面函数的 dfn 的值。然后会将现在的文件重命名为 dfn 的值。之后在重新创建一个 baseFilename 的文件。然后继续往这个文件里面写。

当我们一直往 new.log 中写日志。现在该 rotate 了我们会把 new.log rename 成 new.log.2019-5-14 然后再创建一个 new.log 继续写日志,过程就是这样。

让我们来注意导致多进程问题和报错文件被占用 的最关键的几句话:

doRollover 函数中:

    if os.path.exists(dfn):os.remove(dfn)# Issue 18940: A file may not have been created if delay is True.if os.path.exists(self.baseFilename):os.rename(self.baseFilename, dfn)

根据上面的例子继续来描述。比如现在 dfn 就是 new.log.2019-5-14 。那么我会看有没有存在这个文件,如果有我就会先删除掉,然后再看下 new.log 是否存在,如果存在就执行 rename。

所以问题就很明确了,如果同时有多个进程进入临界区,那么会导致 dfn 文件被删除多次,另外下面的 rename 可能也会产生混乱。

现在我们的思路是: 首先认为文件存在即是已经有人 rename 成功过了,并且在判断文件不存在的时候只允许一个人去 rename ,其他进程如果正好进入临界区就等一等。

怎么做呢,加个锁即可。

又有人问?这个跟作者上一篇文章有什么区别呢?

区别还是有的,上篇文章当日志往日志文件写入时,我们加锁。下个线程写入时,需等待前面的线程完成后才能写入。所以性能比较慢:P

这次我们吸取上面的教训,在文件rename的一瞬间,锁住它,完成分割新的一个文件时,再释放它。它的过程是非常快的。其他线程几乎没有等待,继续向文件写入日志。这样日志不会丢失,性能不受影响。

下面贴个代码(重写doRollover方法):

class MultiCompatibleTimedRotatingFileHandler(TimedRotatingFileHandler):def doRollover(self):if self.stream:self.stream.close()self.stream = None# get the time that this sequence started at and make it a TimeTuplecurrentTime = int(time.time())dstNow = time.localtime(currentTime)[-1]t = self.rolloverAt - self.intervalif self.utc:timeTuple = time.gmtime(t)else:timeTuple = time.localtime(t)dstThen = timeTuple[-1]if dstNow != dstThen:if dstNow:addend = 3600else:addend = -3600timeTuple = time.localtime(t + addend)dfn = self.baseFilename + "." + time.strftime(self.suffix, timeTuple)# 兼容多进程并发 LOG_ROTATEif not os.path.exists(dfn):f = open(self.baseFilename, 'a')fcntl.lockf(f.fileno(), fcntl.LOCK_EX)if os.path.exists(self.baseFilename):os.rename(self.baseFilename, dfn)if self.backupCount > 0:for s in self.getFilesToDelete():os.remove(s)if not self.delay:self.stream = self._open()newRolloverAt = self.computeRollover(currentTime)while newRolloverAt <= currentTime:newRolloverAt = newRolloverAt + self.interval# If DST changes and midnight or weekly rollover, adjust for this.if (self.when == 'MIDNIGHT' or self.when.startswith('W')) and not self.utc:dstAtRollover = time.localtime(newRolloverAt)[-1]if dstNow != dstAtRollover:if not dstNow:  # DST kicks in before next rollover, so we need to deduct an houraddend = -3600else:  # DST bows out before next rollover, so we need to add an houraddend = 3600newRolloverAt += addendself.rolloverAt = newRolloverAt

注意,windows 的python模块是没有fcntl模块的,大家可以自己写一个 文件锁 替换它或者goole找一个。

当然也可以参考我上一篇文章:Python 日志logging模块初探及多线程踩坑(1)_小牛呼噜噜的博客-CSDN博客_logging.filehandler

~~

参考:

https://www.cnblogs.com/gide/p/6811927.html  python中给程序加锁之fcntl模块的使用

http://blog.jobbole.com/104331/  Linux 中 fcntl()、lockf、flock 的区别

用 Python 写一个多进程 多线程 兼容的 TimedRotatingFileHandler - piperck - 博客园

Python 日志logging模块初探及多线程踩坑(2)相关推荐

  1. python 日志 logging模块(详细解析)

    1 基本使用 转自:https://www.cnblogs.com/wf-linux/archive/2018/08/01/9400354.html 配置logging基本的设置,然后在控制台输出日志 ...

  2. python 的日志logging模块学习

    2019独角兽企业重金招聘Python工程师标准>>> python 的日志logging模块学习 分类: python 2011-08-02 23:51 8338人阅读 评论(0) ...

  3. python中日志logging模块和异常捕获traceback模块的使用

    python中日志logging模块和异常捕获traceback模块的使用 参考文章: (1)python中日志logging模块和异常捕获traceback模块的使用 (2)https://www. ...

  4. python之 logging模块详细使用【转载】

    很多程序都有记录日志的需求,并且日志中包含的信息既有正常的程序访问日志,还可能有错误.警告等信息输出,python的logging模块提供了标准的日志接口,你可以通过它存储各种格式的日志,主要用于输出 ...

  5. python 之logging 模块

    很多程序都有记录日志的需求,并且日志中包含的信息即有正常的程序访问日志,还可能有错误.警告等信息输出,python的logging模块提供了标准的日志接口,你可以通过它存储各种格式的日志,loggin ...

  6. Python零基础速成班-第11讲-Python日志Logging,小游戏设计game of life

    Python零基础速成班-第11讲-Python日志Logging,小游戏设计game of life 学习目标 Python日志Logging 小游戏设计game of life 课后作业(2必做) ...

  7. Python中logging模块的基本用法

    本文首发于:行者AI 在开发项目时,我们不可能将所有的信息都打印在控制台中.我们可以使用Python标准库提供的logging API来处理.相比print,logging模块提供了许多强大而灵活的功 ...

  8. python stdout_详解使用python的logging模块在stdout输出的两种方法

    详解使用python的logging模块在stdout输出 前言: 使用python的logging模块时,除了想将日志记录在文件中外,还希望在前台执行python脚本时,可以将日志直接输出到标准输出 ...

  9. python多个日志模块怎么隔离_在多个模块中使用Python日志logging

    最好的做法是在每个模块中都有一个像这样定义的logging器: import logging logger = logging.getLogger(__name__) 靠近模块的顶部,然后在模块的其他 ...

最新文章

  1. 怎样用c语言解一元一次方程,问一道算法题目(解一元一次方程的问题)
  2. 请检查网站服务器是否正常.,网站访问不了?如何做自我检查?
  3. 【工具类】手动获取被spring管理的bean对象
  4. 关于Oracle数据库中行迁移/行链接的问题(一)
  5. 2016福州大学软件工程第四次团队作业-系统设计成绩汇总
  6. 广告贴:本周六的SharePoint技术交流会
  7. MybatisPlus自定义sql分页和分页失效问题解决
  8. iOS开发-Masonry约束宽高比
  9. 腾讯下载的qlv格式视频转换mp4格式方法
  10. 如何实现图片和图题的组合
  11. 那些测绘工作中让你事半功倍的小工具软件,我不推荐你轻易安装
  12. 学计算机u盘多少g合适,U盘设定分配单元的大小多少合适
  13. mysql用户域账户登录_使用本地系统帐户和域用户帐户两者区别(microsoft SQLServer2000...
  14. Java实现对已有的PDF添加页码
  15. “龙腾计划”启动!邀请 500 家企业加入,与龙蜥社区一起拥抱无限生态
  16. office2016、office365和office其它版本JH
  17. 解决w950播放RM视频的问题[转]
  18. 流程引擎之compileflow简介
  19. 手动安装ceph和使用
  20. 中国房地产还能火多久?

热门文章

  1. JSP开发--MVC模式(三)
  2. AMiner发布:2022年人工智能全球最具影响力学者榜单AI 2000
  3. Private VLAN 与Switchport Protected
  4. Shell脚本遍历指定网段的在线ip
  5. 多少人败给了一个字:等 ! 宋丹丹、梁静茹、张国荣。。。
  6. 黑马瑞吉外卖项目开发笔记
  7. css写七步诗,兄弟情谊的诗句
  8. EditText光标始终保持在文字末尾
  9. 不要再「外包」AI 模型了!最新研究发现:有些破坏机器学习模型安全的「后门」无法被检测到...
  10. Elasticsearch高级(Shards、高可用集群搭建)