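Requirement:

Find, among the files on the local machine, the books listed in the reading list <信息安全从业者书单> (a recommended reading list for information security practitioners).

How it works:

Walk through README.md line by line and look up each book locally through the Everything SDK.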
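
The title-extraction step can be sketched on its own, without Everything installed. This is a minimal sketch, assuming (as the full script does) that book entries are bullet lines starting with '·' and that the title sits between 《 and 》; the helper name `extract_titles` and the sample lines are hypothetical and not taken from the real README.md.

```python
import re

# Captures the text between the first 《 and the next 》 on a line.
TITLE_RE = re.compile(r'《([^》]+)》')

def extract_titles(lines):
    """Return the first 《...》 title found on each bullet line."""
    titles = []
    for line in lines:
        # Only bullet lines are treated as book entries (assumption).
        if not line.startswith('·'):
            continue
        match = TITLE_RE.search(line)
        if match:
            titles.append(match.group(1).strip())
    return titles

# Hypothetical sample lines for illustration only.
sample = [
    '# Some heading\n',
    '· 《Example Book One》 some note\n',
    '· no title on this line\n',
    '· 《Example Book Two:Second Edition》\n',
]
print(extract_titles(sample))
```

Unlike the `find`-based slicing in the full script, the regular expression drops the 《》 brackets directly, so the later keyword clean-up has less to do.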

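1. Computing the file CRC32

The checksum is only used to tell local files apart, not to protect against tampering. CRC32 is faster to compute than md5 or sha1, so CRC32 is used here.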

#!/usr/bin/env python
# -*- coding:utf-8 -*-
import zlib
import os
import sys  # needed for the sys.version_info checks below

block_size = 1024 * 1024

# Read only the first block_size bytes of the file and compute their CRC32
def crc32_simple(filepath):
    try:
        with open(filepath, 'rb') as f:
            s = f.read(block_size)
            return zlib.crc32(s, 0)
    except Exception as e:
        print(str(e))
        return 0

# Compute the CRC32 of the whole file, one block at a time
def crc32_file(filepath):
    crc = 0
    try:
        # "with" closes the file even if reading fails
        with open(filepath, 'rb') as fd:
            while True:
                buffer = fd.read(block_size)
                if len(buffer) == 0:  # EOF or empty file
                    break
                crc = zlib.crc32(buffer, crc)
        # Python 2 can return a signed value; normalize it to unsigned
        if sys.version_info[0] < 3 and crc < 0:
            crc += 2 ** 32
        return crc  # returned as a decimal integer
    except Exception as e:
        if sys.version_info[0] < 3:
            error = unicode(e)
        else:
            error = str(e)
        print(error)
        return 0
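
The block-by-block loop relies on `zlib.crc32` accepting the previous checksum as its second argument. The following is a minimal sketch that checks this on an in-memory byte string instead of a file; `crc32_chunks` and `format_crc` are hypothetical helper names introduced only for illustration.

```python
import zlib

def crc32_chunks(chunks):
    """Feed byte chunks to zlib.crc32 one by one, carrying the running value."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)
    # Mask so the result is unsigned on both Python 2 and Python 3.
    return crc & 0xFFFFFFFF

def format_crc(crc):
    """Format as 0x followed by eight upper-case hex digits."""
    return '0x{:08X}'.format(crc)

data = b'The quick brown fox jumps over the lazy dog'
# Split into 7-byte pieces to imitate reading a file block by block.
pieces = [data[i:i + 7] for i in range(0, len(data), 7)]

chunked = crc32_chunks(pieces)
one_shot = zlib.crc32(data) & 0xFFFFFFFF
print(format_crc(chunked), chunked == one_shot)
```

`format_crc` pads to eight digits, whereas `hex()` would drop leading zeros, so padding keeps the values aligned in the output file.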

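2. Choosing the file-size unit automatically

A recursive helper converts a byte count into the largest fitting unit from ['B', 'KB', 'MB', 'GB', 'TB', 'PB'], e.g. 16473740 bytes --> 15.727 MB. Note that the three digits after the dot are the integer remainder in the next-smaller unit (here 727 KB out of 1024), not a decimal fraction; the exact value is about 15.711 MB.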

# Return the size in a suitable unit, e.g. 16473740 bytes --> 15.727 MB
def FormatSize(size):
    print(size)  # debug output: the raw byte count

    # Recursive helper: largest unit value plus a three-digit remainder.
    # The remainder is counted in the next-smaller unit (0..1023).
    def formatsize(integer, remainder, level):
        if integer >= 1024:
            remainder = integer % 1024
            integer //= 1024
            level += 1
            return formatsize(integer, remainder, level)
        else:
            return integer, remainder, level

    units = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
    integer, remainder, level = formatsize(size, 0, 0)
    # Anything beyond PB falls back to the last unit
    if level + 1 > len(units):
        level = -1
    return '{}.{:>03d} {}'.format(integer, remainder, units[level])
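
If a true decimal fraction is wanted instead of the 1024-based remainder, a loop with floating-point division is enough. This is a minimal sketch of that alternative, not the author's method; `format_size_decimal` is a hypothetical name.

```python
UNITS = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']

def format_size_decimal(size):
    """Format a byte count with three real decimal places."""
    value = float(size)
    level = 0
    # Stop at the last unit so very large sizes stay in PB.
    while value >= 1024 and level < len(UNITS) - 1:
        value /= 1024.0
        level += 1
    return '{:.3f} {}'.format(value, UNITS[level])

print(format_size_decimal(16473740))
```

For 16473740 bytes this gives 15.711 MB, where the recursive version above reports 15.727 MB because its last three digits mean 727 KB.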

3. Calling the Everything SDK; all interaction goes through Everything64.dll.

import ctypes
import datetime
import struct
import re  # needed by searchfile below

# dll imports
everything_dll = ctypes.WinDLL(r"./Everything64.dll")
everything_dll.Everything_GetResultDateModified.argtypes = [ctypes.c_int, ctypes.POINTER(ctypes.c_ulonglong)]
everything_dll.Everything_GetResultSize.argtypes = [ctypes.c_int, ctypes.POINTER(ctypes.c_ulonglong)]
everything_dll.Everything_GetResultFileNameW.argtypes = [ctypes.c_int]
everything_dll.Everything_GetResultFileNameW.restype = ctypes.c_wchar_p

# Time conversion
def get_time(filetime):
    """Convert windows filetime winticks to python datetime.datetime."""
    # convert a windows FILETIME to a python datetime
    # https://stackoverflow.com/questions/39481221/convert-datetime-back-to-windows-64-bit-filetime
    WINDOWS_TICKS = int(1/10**-7)  # 10,000,000 (100 nanoseconds or .1 microseconds)
    WINDOWS_EPOCH = datetime.datetime.strptime('1601-01-01 00:00:00', '%Y-%m-%d %H:%M:%S')
    POSIX_EPOCH = datetime.datetime.strptime('1970-01-01 00:00:00', '%Y-%m-%d %H:%M:%S')
    EPOCH_DIFF = (POSIX_EPOCH - WINDOWS_EPOCH).total_seconds()  # 11644473600.0
    WINDOWS_TICKS_TO_POSIX_EPOCH = EPOCH_DIFF * WINDOWS_TICKS  # 116444736000000000.0
    winticks = struct.unpack('<Q', filetime)[0]
    # Despite its name, this value is in seconds since the POSIX epoch
    microsecs = (winticks - WINDOWS_TICKS_TO_POSIX_EPOCH) / WINDOWS_TICKS
    return datetime.datetime.fromtimestamp(microsecs)

# defines: see Everything.h for the definitions
EVERYTHING_REQUEST_FILE_NAME = 0x00000001
EVERYTHING_REQUEST_PATH = 0x00000002
EVERYTHING_REQUEST_SIZE = 0x00000010
EVERYTHING_REQUEST_DATE_MODIFIED = 0x00000040

EVERYTHING_SORT_SIZE_DESCENDING = 6

# Keyword search.
# crc32_file and FormatSize come from the sections above; fout is the
# output file that the full script opens at module level.
def searchfile(bookName):
    # Replace brackets and punctuation with spaces; the hyphen is escaped
    # so that it is matched literally instead of forming a character range
    recom = re.compile(r'[《》::、;.,,;—— \-()()【】\'\"]')
    keyword = recom.sub(' ', bookName).strip()
    if len(keyword) < 1:
        return

    # Sort by file size, descending
    everything_dll.Everything_SetSort(EVERYTHING_SORT_SIZE_DESCENDING)
    everything_dll.Everything_SetSearchW(keyword)
    everything_dll.Everything_SetRequestFlags(EVERYTHING_REQUEST_FILE_NAME | EVERYTHING_REQUEST_PATH | EVERYTHING_REQUEST_SIZE | EVERYTHING_REQUEST_DATE_MODIFIED)

    # execute the query
    everything_dll.Everything_QueryW(1)

    # get the number of results
    num_results = everything_dll.Everything_GetNumResults()

    # show the number of results
    result = "\nResult Count: {}\n".format(num_results)
    print(keyword, result)

    # create buffers
    file_name = ctypes.create_unicode_buffer(260)
    file_modi = ctypes.c_ulonglong(1)
    file_size = ctypes.c_ulonglong(1)

    bPrint = False
    nCount = 0
    # show results
    for i in range(num_results):
        everything_dll.Everything_GetResultFullPathNameW(i, file_name, 260)
        everything_dll.Everything_GetResultDateModified(i, file_modi)
        everything_dll.Everything_GetResultSize(i, file_size)
        filepath = ctypes.wstring_at(file_name)
        # Skip shortcuts and text files
        if filepath.endswith('.lnk') or filepath.endswith('.txt'):
            continue
        # Compute the file CRC32 and format it like 0x1122AAFF
        filecrc = hex(crc32_file(filepath)).upper().replace("0X", "0x")
        filesize = FormatSize(file_size.value)
        modtime = get_time(file_modi)
        strInfo = "\nFilePath: {}\nSize: {}    CRC32:{}".format(filepath, filesize, filecrc)
        print(strInfo)
        if not bPrint:
            fout.write("\n=======↓↓↓↓↓===========\n")
            fout.write(bookName)
            fout.write("\n-----------------")
            bPrint = True
        fout.write(strInfo)
        nCount += 1
    if bPrint:
        fout.write("\nFind Count:{}".format(nCount))
        fout.write("\n=======↑↑↑↑↑===========\n")
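
The FILETIME conversion can also be tried without the DLL. A FILETIME counts 100-nanosecond ticks since 1601-01-01 UTC, so adding the ticks to that epoch as a `timedelta` gives the date directly. This is a minimal sketch using integer arithmetic; `filetime_to_utc` is a hypothetical helper, and it returns a naive UTC datetime rather than local time.

```python
import datetime

# FILETIME epoch: 1601-01-01 00:00:00 UTC
WINDOWS_EPOCH = datetime.datetime(1601, 1, 1)

def filetime_to_utc(winticks):
    """winticks: integer count of 100 ns intervals since 1601-01-01 UTC."""
    # 10 ticks make one microsecond; sub-microsecond precision is dropped.
    return WINDOWS_EPOCH + datetime.timedelta(microseconds=winticks // 10)

# 116444736000000000 ticks is the offset between the two epochs.
print(filetime_to_utc(116444736000000000))
```

With the ctypes buffer used in `searchfile`, the tick count would be read as `file_modi.value`.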

Full code

#!/usr/bin/env python
# -*- coding:utf-8 -*-
"""
@author:hiltonwei
@file: secBooksFind.py
@time: 2021/12/06
@desc: Recommended reading list for information security practitioners https://github.com/riusksk/secbook
step1 Read README.md and extract the book titles inside 《》
step2 Look up the files through the Everything SDK, compute each file's CRC32 checksum, and write the results to a txt file
"""
import zlib
import os
import sys
import ctypes
import datetime
import struct
import io
import re

# dll imports
everything_dll = ctypes.WinDLL(r"./Everything64.dll")
everything_dll.Everything_GetResultDateModified.argtypes = [ctypes.c_int, ctypes.POINTER(ctypes.c_ulonglong)]
everything_dll.Everything_GetResultSize.argtypes = [ctypes.c_int, ctypes.POINTER(ctypes.c_ulonglong)]
everything_dll.Everything_GetResultFileNameW.argtypes = [ctypes.c_int]
everything_dll.Everything_GetResultFileNameW.restype = ctypes.c_wchar_p

# Results are appended to this file
fout = open("secBooksFind.txt", 'a+')

block_size = 1024 * 1024

# Read only the first block_size bytes of the file and compute their CRC32
def crc32_simple(filepath):
    try:
        with open(filepath, 'rb') as f:
            s = f.read(block_size)
            return zlib.crc32(s, 0)
    except Exception as e:
        print(str(e))
        return 0

# Compute the CRC32 of the whole file, one block at a time
def crc32_file(filepath):
    crc = 0
    try:
        # "with" closes the file even if reading fails
        with open(filepath, 'rb') as fd:
            while True:
                buffer = fd.read(block_size)
                if len(buffer) == 0:  # EOF or empty file
                    break
                crc = zlib.crc32(buffer, crc)
        # Python 2 can return a signed value; normalize it to unsigned
        if sys.version_info[0] < 3 and crc < 0:
            crc += 2 ** 32
        return crc  # returned as a decimal integer
    except Exception as e:
        if sys.version_info[0] < 3:
            error = unicode(e)
        else:
            error = str(e)
        print(error)
        return 0

# Return the size in a suitable unit, e.g. 16473740 bytes --> 15.727 MB
def FormatSize(size):
    print(size)  # debug output: the raw byte count

    # Recursive helper: largest unit value plus a three-digit remainder.
    # The remainder is counted in the next-smaller unit (0..1023).
    def formatsize(integer, remainder, level):
        if integer >= 1024:
            remainder = integer % 1024
            integer //= 1024
            level += 1
            return formatsize(integer, remainder, level)
        else:
            return integer, remainder, level

    units = ['B', 'KB', 'MB', 'GB', 'TB', 'PB']
    integer, remainder, level = formatsize(size, 0, 0)
    # Anything beyond PB falls back to the last unit
    if level + 1 > len(units):
        level = -1
    return '{}.{:>03d} {}'.format(integer, remainder, units[level])

# Time conversion
def get_time(filetime):
    """Convert windows filetime winticks to python datetime.datetime."""
    # convert a windows FILETIME to a python datetime
    # https://stackoverflow.com/questions/39481221/convert-datetime-back-to-windows-64-bit-filetime
    WINDOWS_TICKS = int(1/10**-7)  # 10,000,000 (100 nanoseconds or .1 microseconds)
    WINDOWS_EPOCH = datetime.datetime.strptime('1601-01-01 00:00:00', '%Y-%m-%d %H:%M:%S')
    POSIX_EPOCH = datetime.datetime.strptime('1970-01-01 00:00:00', '%Y-%m-%d %H:%M:%S')
    EPOCH_DIFF = (POSIX_EPOCH - WINDOWS_EPOCH).total_seconds()  # 11644473600.0
    WINDOWS_TICKS_TO_POSIX_EPOCH = EPOCH_DIFF * WINDOWS_TICKS  # 116444736000000000.0
    winticks = struct.unpack('<Q', filetime)[0]
    # Despite its name, this value is in seconds since the POSIX epoch
    microsecs = (winticks - WINDOWS_TICKS_TO_POSIX_EPOCH) / WINDOWS_TICKS
    return datetime.datetime.fromtimestamp(microsecs)

# defines: see Everything.h for the definitions
EVERYTHING_REQUEST_FILE_NAME = 0x00000001
EVERYTHING_REQUEST_PATH = 0x00000002
EVERYTHING_REQUEST_SIZE = 0x00000010
EVERYTHING_REQUEST_DATE_MODIFIED = 0x00000040

EVERYTHING_SORT_SIZE_DESCENDING = 6

# Keyword search
def searchfile(bookName):
    # Replace brackets and punctuation with spaces; the hyphen is escaped
    # so that it is matched literally instead of forming a character range
    recom = re.compile(r'[《》::、;.,,;—— \-()()【】\'\"]')
    keyword = recom.sub(' ', bookName).strip()
    if len(keyword) < 1:
        return

    # Sort by file size, descending
    everything_dll.Everything_SetSort(EVERYTHING_SORT_SIZE_DESCENDING)
    everything_dll.Everything_SetSearchW(keyword)
    everything_dll.Everything_SetRequestFlags(EVERYTHING_REQUEST_FILE_NAME | EVERYTHING_REQUEST_PATH | EVERYTHING_REQUEST_SIZE | EVERYTHING_REQUEST_DATE_MODIFIED)

    # execute the query
    everything_dll.Everything_QueryW(1)

    # get the number of results
    num_results = everything_dll.Everything_GetNumResults()

    # show the number of results
    result = "\nResult Count: {}\n".format(num_results)
    print(keyword, result)

    # create buffers
    file_name = ctypes.create_unicode_buffer(260)
    file_modi = ctypes.c_ulonglong(1)
    file_size = ctypes.c_ulonglong(1)

    bPrint = False
    nCount = 0
    # show results
    for i in range(num_results):
        everything_dll.Everything_GetResultFullPathNameW(i, file_name, 260)
        everything_dll.Everything_GetResultDateModified(i, file_modi)
        everything_dll.Everything_GetResultSize(i, file_size)
        filepath = ctypes.wstring_at(file_name)
        # Skip shortcuts and text files
        if filepath.endswith('.lnk') or filepath.endswith('.txt'):
            continue
        # Compute the file CRC32 and format it like 0x1122AAFF
        filecrc = hex(crc32_file(filepath)).upper().replace("0X", "0x")
        filesize = FormatSize(file_size.value)
        modtime = get_time(file_modi)
        strInfo = "\nFilePath: {}\nSize: {}    CRC32:{}".format(filepath, filesize, filecrc)
        print(strInfo)
        if not bPrint:
            fout.write("\n=======↓↓↓↓↓===========\n")
            fout.write(bookName)
            fout.write("\n-----------------")
            bPrint = True
        fout.write(strInfo)
        nCount += 1
    if bPrint:
        fout.write("\nFind Count:{}".format(nCount))
        fout.write("\n=======↑↑↑↑↑===========\n")

# Read the file, strip special characters from each title inside 《》,
# and look the title up through Everything
def readMd(fileName):
    dataStr = []
    with io.open(fileName, 'r', encoding='utf-8') as f:
        dataStr = f.readlines()
    for line in dataStr:
        if line.startswith('·'):
            # The text inside 《》, brackets included
            start = line.find('《')
            end = line.find('》')
            end = end if end == -1 else end + 1
            f0 = line[start:end]
            searchfile(f0)

if __name__ == "__main__":
    readMd("README.md")
    fout.close()
