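I am now in the second semester of my second year of graduate school, and I have been writing code for this retrieval project for almost two years. I first touched the project in the second semester of my senior year. During my first year of graduate school I had no idea how research was supposed to be done, I was the only person on the project, and I wrote code haphazardly with far too much on my mind. One word sums it up: chaos.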

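The project only really got going in October, in the first semester of my second year of graduate school, when a classmate joined. Once our system was finished, I decided on a whim to tidy up the code I had written. We were about to write a paper, which meant a lot of data processing for the experimental comparison section. I had already written similar code in my first year, yet I ended up writing it again, because rewriting took less time than remembering how the old code worked. That is how I noticed how much code I had duplicated, and that is why I started cleaning up. What I really wanted was a file-tree data structure plus a set of processing routines that would turn our data processing module into a pipeline, so that the code could be reused later. Anyone doing grunt work should look for ways to work more efficiently. With that idea in mind I implemented the class below. It is written in Python, because Python is so convenient for data processing, string handling, and batch jobs. This class may only ever be useful to me. I am writing a blog post about it because, when I later supervise a first-year student on a paper, I will ask him to read and reuse the code we have written. I will not have much time to mentor a newcomer, so I will point him to this blog.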

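The data structure is simply an n-ary tree that represents a directory structure. Each node is a file (or directory). Tree traversal and node insertion are implemented with a stack and a queue. Here is the code; I will come back and add comments when I have time: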

import os
from collections import deque

# strOp and tblOp are project-local helper modules; their code is not shown in this post.
from strOp import strExt
from tblOp import tblConcat


class FileNode:
    def __init__(self, _fileName_s='', _brothers=None, _sons=None, _isDir_b=False, _parent=None):
        self.fileName_s = _fileName_s
        self.bro = _brothers
        # Create a fresh list per node; a mutable default argument ([]) would be shared.
        self.sons = _sons if _sons is not None else []
        self.isDir_b = _isDir_b
        self.parent = _parent


def addNodeUnderPathUnrecur(root, _path_s):
    """
    inputs:
        root -> the root of the directory tree. It must be the root of the tree.
        _path_s -> add the sons under the path _path_s. If _path_s is equal to
                   'D:\\CS_DATA\\' then all the files under it are added as sons
                   of the node named 'CS_DATA'.
    outputs:
        Add all the files under _path_s as its sons (one level only).
    """
    if _path_s[-1] != '\\':
        _path_s += '\\'
    node = searchNodeFromGivenFilePath(root, _path_s)
    if node is None:
        print('can not find the path %s' % _path_s)
        return
    for fileName_s in os.listdir(_path_s):
        newNode = FileNode(fileName_s, None, [], os.path.isdir(_path_s + fileName_s), node)
        if len(node.sons) != 0:
            # Link the previous last son to the new node as its brother.
            node.sons[-1].bro = newNode
        # No isSameName check here: the file system ensures that names are unique.
        node.sons.append(newNode)


def searchNodeFromGivenFilePath(root, _path_s):
    """
    inputs:
        root -> the root of the directory tree.
        _path_s -> the absolute path of a node. Examples: 'D:\\CS_DATA\\'
    outputs:
        Search the directory tree from root to find the node whose fileName_s
        is equal to 'CS_DATA'. So, you must give the absolute path.
        Either 'D:\\CS_DATA\\' or 'D:\\CS_DATA' is fine.
    """
    if _path_s[-1] != '\\':
        _path_s += '\\'
    folderStructure = _path_s.split('\\')
    if root.bro is not None:
        print('input root is not root of file tree')
        return None
    if folderStructure[0] != root.fileName_s:
        print('the head of input path is not same as root')
        return None
    stack = [root]
    # The last element of folderStructure is '' because of the trailing separator.
    for i in range(1, len(folderStructure) - 1):
        if len(stack) == 0:
            print('stack is empty')
            return None
        node = stack.pop()
        flag = 0
        for j in node.sons:
            if folderStructure[i] == j.fileName_s:
                stack.append(j)
                flag = 1
        if flag == 0:
            print('can not find the folder %s' % folderStructure[i])
            return None
    return stack.pop()


def addNodeAsSonFromGivenNode(root, _sonPath_s):
    """
    inputs:
        root -> the root of the directory tree to which the node is added.
        _sonPath_s -> the absolute path of the added node. Examples:
                      'D:\\CS_DATA\\tree\\' means add the node named 'tree'
                      to its parent 'CS_DATA'.
    outputs:
        The directory tree with the added node.
    """
    if _sonPath_s[-1] != '\\':
        _sonPath_s += '\\'
    fileStructure = _sonPath_s.split('\\')
    if len(fileStructure) <= 2:
        print('There is no son in the input path %s' % _sonPath_s)
        return
    _sonFileName_s = fileStructure[-2]
    _parentPath_s = ''
    for i in range(len(fileStructure) - 2):
        _parentPath_s = _parentPath_s + fileStructure[i] + '\\'
    _addNodeAsSonFromGivenNode(root, _parentPath_s, _sonFileName_s)


def _addNodeAsSonFromGivenNode(root, _parentPath_s, _sonFileName_s):
    """
    inputs:
        root -> the root of the directory tree.
        _parentPath_s -> the absolute path of the parent.
        _sonFileName_s -> the file name of the added node.
    outputs:
        This function is an auxiliary function of addNodeAsSonFromGivenNode.
    """
    if _parentPath_s[-1] != '\\':
        _parentPath_s += '\\'
    parentNode = searchNodeFromGivenFilePath(root, _parentPath_s)
    if parentNode is None:
        print('can not find the parent folder %s' % _parentPath_s)
        return None
    newNode = FileNode(_sonFileName_s, None, [],
                       os.path.isdir(_parentPath_s + _sonFileName_s), parentNode)
    if isSameName(parentNode, newNode):
        return
    if len(parentNode.sons) != 0:
        parentNode.sons[-1].bro = newNode
    parentNode.sons.append(newNode)


def isSameName(parentNode, sonNode):
    """
    inputs:
        parentNode -> the parent node.
        sonNode -> the son node.
    outputs:
        If sonNode is already in parentNode.sons then return True.
    """
    for node in parentNode.sons:
        if node.fileName_s == sonNode.fileName_s:
            print('has same node %s\\%s -> %s'
                  % (parentNode.fileName_s, node.fileName_s, sonNode.fileName_s))
            return True
    return False


def addNodeUnderPathRecur(root, _path_s):
    """
    inputs:
        root -> the root of the directory tree.
        _path_s -> the absolute path to be added. Examples: 'D:\\CS_DATA\\'
    outputs:
        1. Add all the file nodes under _path_s recursively.
        2. The _path_s must exist in root.
    Unsafe:
        1. Some system directories can not be added recursively.
           Examples: 'D:\\System Volume Information'
        2. No check is made for files with the same name when adding.
        3. So, this function relies on the operating system enforcing
           that rule for us.
    """
    if _path_s[-1] != '\\':
        _path_s = _path_s + '\\'
    fileStructure = _path_s.split('\\')
    if fileStructure[0] == root.fileName_s and len(fileStructure) == 2:
        print('_path_s can not be the root')
        return
    returnNode = currentNode = searchNodeFromGivenFilePath(root, _path_s)
    if currentNode is None:
        print('can not find the path')
        return
    # Breadth-first: the queue holds nodes that are created but not yet attached.
    queue = deque([])
    for fileName_s in os.listdir(_path_s):
        file_s = _path_s + fileName_s
        queue.append(FileNode(fileName_s, None, [], os.path.isdir(file_s), currentNode))
    while len(queue) != 0:
        newNode = queue.popleft()
        currentNode = newNode.parent
        if len(currentNode.sons) != 0:
            currentNode.sons[-1].bro = newNode
        currentNode.sons.append(newNode)
        if newNode.isDir_b:
            fullPathOfNewNode = getFullPathOfNode(newNode)
            for subFileName_s in os.listdir(fullPathOfNewNode):
                subNewNode = FileNode(subFileName_s, None, [],
                                      os.path.isdir(fullPathOfNewNode + subFileName_s), newNode)
                queue.append(subNewNode)
    return returnNode


def printBrosOfGivenNode(root, _path_s):
    """
    inputs:
        root -> the root of the directory tree.
        _path_s -> Examples: 'D:\\CS_DATA' , 'D:\\CS_DATA\\'
    outputs:
        print out the bros of 'CS_DATA' for 'D:\\CS_DATA'
        print out the sons of 'CS_DATA' for 'D:\\CS_DATA\\'
    """
    wantBros = _path_s[-1] != '\\'
    node = searchNodeFromGivenFilePath(root, _path_s)
    if node is None:
        print('can not find the node')
        return
    printStr = ''
    if wantBros:
        if node.parent is None:
            print('the node has no parent')
            return
        # Walk the brother links starting from the first son of the parent.
        headOfSons = node.parent.sons[0]
        printStr = headOfSons.fileName_s + ','
        while headOfSons.bro is not None:
            headOfSons = headOfSons.bro
            printStr = printStr + headOfSons.fileName_s + ','
    elif len(node.sons) == 0:
        print('its sons is empty')
    else:
        for son in node.sons:
            printStr = printStr + son.fileName_s + ','
    print(printStr[:-1])


def crtFileTreeFromPath(_path_s):
    """
    inputs:
        _path_s -> Examples: 'D:\\sketchDataset\\'
    outputs:
        This function creates the root node from 'D:',
        then calls addNodeUnderPathUnrecur to add the files under 'D:\\',
        then calls addNodeUnderPathUnrecur again to add the files under
        'D:\\sketchDataset\\'. The loop continues until the last separator
        of _path_s.
    """
    if _path_s[-1] != '\\':
        _path_s += '\\'
    fileStructure = _path_s.split('\\')
    root = FileNode(_fileName_s=fileStructure[0], _isDir_b=os.path.isdir(fileStructure[0]))
    fileStr = root.fileName_s + '\\'
    addNodeUnderPathUnrecur(root, fileStr)
    for i in range(1, len(fileStructure) - 1):
        fileStr = fileStr + fileStructure[i] + '\\'
        addNodeUnderPathUnrecur(root, fileStr)
    return root


def searchLeafNodeUnderGivenNode(root, _path_s):
    """
    inputs:
        root -> the root of the given directory tree.
        _path_s -> the absolute path of the node whose leafs are wanted.
    outputs:
        Return all the leafs under the given _path_s.
        A leaf is a node that has no sons and is not a directory.
    """
    node = searchNodeFromGivenFilePath(root, _path_s)
    leafs = []
    if node is None:
        print('can not find the node in searchLeafNodeUnderGivenNode')
        return
    queue = deque([node])
    while len(queue) != 0:
        currentNode = queue.popleft()
        if len(currentNode.sons) == 0 and not currentNode.isDir_b:
            leafs.append(currentNode)
        else:
            for son in currentNode.sons:
                queue.append(son)
    return leafs


def getFullPathOfNode(givenNode):
    """Find the full (absolute) path of the input node, with a trailing separator."""
    tmpNode = givenNode
    fullPathOfNode = tmpNode.fileName_s + '\\'
    while tmpNode.parent is not None:
        tmpNode = tmpNode.parent
        fullPathOfNode = tmpNode.fileName_s + '\\' + fullPathOfNode
    return fullPathOfNode
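The queue-based traversal used by `addNodeUnderPathRecur` and `searchLeafNodeUnderGivenNode` can also be shown in isolation. What follows is only a minimal sketch, not the project's code: `Node`, `build_tree`, `collect_leaves`, and `demo` are hypothetical names introduced for illustration, and the sketch assumes Python 3 and uses `os.path.join` instead of hard-coded `'\\'` separators so that it also runs outside Windows.

```python
import os
import shutil
import tempfile
from collections import deque


class Node:
    """Hypothetical minimal stand-in for FileNode: name, parent, and children."""

    def __init__(self, name, is_dir=False, parent=None):
        self.name = name
        self.is_dir = is_dir
        self.parent = parent
        self.sons = []  # a fresh list per node


def build_tree(top):
    """Build the directory tree rooted at `top` breadth-first with a queue."""
    root = Node(top, is_dir=True)
    queue = deque([(root, top)])
    while queue:
        node, path = queue.popleft()
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            child = Node(name, os.path.isdir(full), node)
            node.sons.append(child)
            if child.is_dir:
                queue.append((child, full))
    return root


def collect_leaves(node):
    """Return every non-directory node below `node`, found breadth-first."""
    leaves = []
    queue = deque([node])
    while queue:
        current = queue.popleft()
        if not current.is_dir:
            leaves.append(current)
        else:
            queue.extend(current.sons)
    return leaves


def demo():
    """Create a throwaway directory, build its tree, and return sorted leaf names."""
    top = tempfile.mkdtemp()
    try:
        os.makedirs(os.path.join(top, 'category', 'chair'))
        for name in ('m1.off', 'm2.off'):
            open(os.path.join(top, 'category', 'chair', name), 'w').close()
        root = build_tree(top)
        return sorted(leaf.name for leaf in collect_leaves(root))
    finally:
        shutil.rmtree(top)
```

Keeping the on-disk path next to each queued node avoids rebuilding it from parent links, which is the job `getFullPathOfNode` does in the code above. Either approach works; this one simply trades a little memory for fewer string concatenations.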

For example, to compute the validation set for sketch retrieval, I can append the following code after the code above:

if __name__ == '__main__':
    root = crtFileTreeFromPath('D:\\sketchDataset\\')

    # Map each category folder to the model files it contains.
    categoryNode = addNodeUnderPathRecur(root, 'D:\\sketchDataset\\category\\')
    leafs = searchLeafNodeUnderGivenNode(root, 'D:\\sketchDataset\\category\\')
    containModel_t = {}
    for leaf in leafs:
        modelId = strExt.extractModelIdWithSuffix(leaf.fileName_s, suffix_s='.off')
        containModel_t.setdefault(leaf.parent.fileName_s, []).append(modelId)

    # Map each sketch file to the category folder it sits in.
    categoryNode = addNodeUnderPathRecur(root, 'D:\\sketchDataset\\all_categorized_sketches\\')
    sketchToCate_t = {}
    for son in categoryNode.sons:
        for sketchNode in son.sons:
            sketchName = strExt.extractSketchNameWithSuffix(sketchNode.fileName_s, suffix_s='.txt')
            if sketchName not in sketchToCate_t:
                sketchToCate_t[sketchName] = son.fileName_s

    # Join the two tables: sketch -> category -> models.
    wanted = tblConcat.concatTableByKey_ValAndVal_Vals(sketchToCate_t, containModel_t)
    print(wanted)

The result is shown below. For example, the validation models for sketch 165 are 'm1646.off', 'm1647.off', and so on.

{'s165.txt': ['m1646.off', 'm1647.off', 'm1648.off', 'm1649.off', 'm1650.off', 'm1651.off', 'm1652.off', 'm1653.off', 'm1654.off', 'm1655.off', 'm1656.off', 'm1657.off', 'm1658.off', 'm1659.off', 'm1660.off', 'm1661.off', 'm1662.off', 'm1663.off', 'm1664.off', 'm1665.off'] ......}
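The `tblOp` module is not listed in this post, so the exact behavior of `tblConcat.concatTableByKey_ValAndVal_Vals` is not shown. Judging only from its name and from the output above, it appears to join a key-to-value table with a value-to-list table. The sketch below is a guess at that behavior under this assumption; `concat_tables` is a hypothetical name and the sample keys are made up.

```python
def concat_tables(key_to_val, val_to_vals):
    """Join {key: val} with {val: [vals]} into {key: [vals]}.

    Assumption: keys whose value has no entry in val_to_vals are skipped.
    """
    result = {}
    for key, val in key_to_val.items():
        if val in val_to_vals:
            # Copy the list so the result does not alias the input table.
            result[key] = list(val_to_vals[val])
    return result


if __name__ == '__main__':
    # Made-up sample data: sketch -> category, and category -> models.
    sketch_to_cate = {'sA.txt': 'catX', 'sB.txt': 'catY', 'sC.txt': 'catZ'}
    cate_to_models = {'catX': ['m1.off', 'm2.off'], 'catY': ['m3.off']}
    print(concat_tables(sketch_to_cate, cate_to_models))
```

Copying each list means that later edits to one sketch's models do not silently change the category table or other sketches in the same category.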

Reposted from: https://www.cnblogs.com/Key-Ky/p/4461700.html
