Python code for processing experimental data scraped from the GWDAC website
I wrote a few Python scripts to organize the scattering experiment data scraped from GWDAC. I'm saving them here so I can find them again later.
Code 1
Convert the experimental data, which the site groups by data set, into one data point per line.
import os

# All observable type codes used by the GWDAC (SAID) np database.
obslists = (["DSG","P","D","DT","AYY","AXX","AZZ","AZX","CKP","CKK","CPP",
             "R","RP","A","AP","RT","RPT","AT","APT","MSSN","MKSN","MSKN",
             "MKKN","SGTE","ALFA","REF3","REF2","SGT","SGTL","SGTT","DTRT",
             "AMPL","SGEL","SGTR","DELT","MTXR","D0SS","D0KS","D0SK","D0KK",
             "NKNK","NSNK","NKNS","NSNS","NNKK","MSNS","MKNS","MSNK","MKNK",
             "NSSN","NKSN","NSKN","NKKN","NNSK","NNKS","BAMP","S00S","K00S",
             "S00K","K00K","A0ST","A0KT","KS0T","KK0T","MSNT","MNST","MNKT",
             "MKNT","DSGL"])
txt = ".txt"
dat = ".dat"
retype = "NP_"
errtype = "S"

for obslist in obslists:
    file = retype + obslist + txt
    print(file)
    if os.path.isfile(retype + obslist + txt):
        print(retype + obslist + txt, " exist")
        fr = open(retype + obslist + txt, "r", encoding="utf-8")
        # skip the four header lines
        fread = fr.readline()
        fread = fr.readline()
        fread = fr.readline()
        fread = fr.readline()
        string = fr.read(5)
        if string != " Summ":  # "Summ" means this obs is empty
            print(file, "created")
            fw = open("out/" + retype + obslist + dat, "w", encoding="utf-8")
            while True:
                dele = " T "
                num = int(string)  # number of data points in this group  #print(num)
                energy = fr.read(14)
                energy = fr.read(8)      # lab energy of this group
                fread = fr.readline()
                ref = fr.read(8)
                ref = ref[0:4] + ref[5:7]  # short reference tag
                fread = fr.read(1)
                ref2 = fr.read(41)       # full reference string
                fread = fr.readline()
                fread = fr.read(8)
                if fread == " DELETED":  # group is flagged as deleted
                    dele = " F "
                    fread = fr.readline()
                    fread = fr.read(8)
                if fread == " N(theo)":
                    fread = fr.read(16)
                    syserr = fr.read(7)  # systematic (normalization) error
                    if float(syserr) > 0.99:
                        errtype = "F"
                    else:
                        errtype = "S"
                    fread = fr.readline()
                    fread = fr.readline()
                elif fread == " A ":
                    errtype = "N"        # no normalization information
                    fread = fr.readline()
                    syserr = " 0.000"
                else:
                    print("wrong")
                #while fread != " A":
                #    fread = fr.readline()
                #    fread = fr.read(6)
                #fread = fr.readline()
                #print(num)
                index = 1
                while index < num + 1:   # one output line per data point
                    fw.write(energy)
                    acm = fr.read(10)    # c.m. angle
                    fw.write(acm)
                    fread = fr.read(13)
                    obs = fr.read(13)    # observable value
                    fw.write(obs)
                    err = fr.read(13)    # statistical error
                    fw.write(err)
                    fw.write(ref)
                    fw.write(" " + errtype)
                    fw.write(syserr)
                    fw.write(dele)
                    fw.write(string)
                    fw.write(ref2)
                    fw.write("\n")
                    fread = fr.readline()
                    index = index + 1
                    #print("acm" + acm)
                fread = fr.readline()
                string = fr.read(5)
                #print(string)
                if string == " Summ":    # end of this observable's data
                    fr.close()
                    fw.close()
                    break
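For later analysis it helps to know the column layout of the lines Code 1 writes. As a rough sketch (the offsets below are inferred from the sequence of fw.write() calls above, not from any official GWDAC format spec, so treat them as assumptions), one output line could be split back into its fields like this:

```python
def parse_dat_line(line):
    """Split one fixed-width line produced by Code 1 back into fields.

    Offsets are inferred from the fw.write() calls (an assumption):
    energy(8) angle(10) obs(13) err(13) ref(6) errtype(2) syserr(7)
    dele(3) npts(5) ref2(41).
    """
    return {
        "energy":  float(line[0:8]),     # lab energy
        "angle":   float(line[8:18]),    # c.m. angle
        "obs":     float(line[18:31]),   # observable value
        "err":     float(line[31:44]),   # statistical error
        "ref":     line[44:50].strip(),  # short reference tag
        "errtype": line[50:52].strip(),  # S / F / N normalization flag
        "syserr":  float(line[52:59]),   # systematic error
        "dele":    line[59:62].strip(),  # deletion flag as written by Code 1
        "npts":    int(line[62:67]),     # group size field
        "ref2":    line[67:108].strip(), # full reference string
    }
```

This is only a round-trip of Code 1's own output, so if the widths in Code 1 change, these offsets must change with them.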
Code 2
Merge all the data into a single file.
import os

# Observable type codes to merge; the commented-out entries are skipped.
obslists = (["DSG","P","D","DT","AYY","AXX","AZZ","AZX","CKP","CKK","CPP",
             "R","RP","A","AP","RT","RPT","AT","APT","MSSN","MKSN","MSKN",
             "MKKN","SGTE",
             #"ALFA","REF3","REF2","SGT","SGTL","SGTT",
             #"DTRT","AMPL","SGEL","SGTR","DELT","MTXR","D0SS","D0KS","D0SK","D0KK",
             "NKNK","NSNK","NKNS","NSNS","NNKK","MSNS","MKNS","MSNK","MKNK",
             "NSSN","NKSN","NSKN","NKKN","NNSK","NNKS","BAMP","S00S","K00S",
             "S00K","K00K","A0ST","A0KT","KS0T","KK0T","MSNT","MNST","MNKT",
             "MKNT","DSGL"])
txt = ".txt"
dat = ".dat"
retype = "NP_"

#fw = open("out/alldata.dat", "w", encoding="utf-8")
fw = open("out/300mev.dat", "w", encoding="utf-8")
for obslist in obslists:
    file = retype + obslist + dat
    print(file)
    if os.path.isfile(retype + obslist + dat):
        print(retype + obslist + dat, " exist")
        fr = open(retype + obslist + dat, "r", encoding="utf-8")
        for line in fr.readlines():
            # keep only data points below 300 MeV lab energy
            if float(line[0:7]) < 300.001:
                fw.write(obslist.ljust(5) + line)
            #fw.write(obslist.ljust(5) + line)
        fr.close()
fw.close()
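The per-line filter in Code 2 can be factored into a small helper, which makes the 300 MeV cut easy to test in isolation. This is just a sketch: the helper name is mine, and the only logic beyond what Code 2 already does is the `emax` parameter.

```python
def filter_by_energy(lines, obslist, emax=300.0):
    """Return the lines whose leading 7-character energy field is below
    emax (plus the same 0.001 tolerance Code 2 uses), each prefixed with
    the observable tag padded to 5 characters, mirroring Code 2's loop."""
    kept = []
    for line in lines:
        if float(line[0:7]) < emax + 0.001:
            kept.append(obslist.ljust(5) + line)
    return kept
```

For example, `filter_by_energy(["  50.00 ...\n", " 350.00 ...\n"], "DSG")` keeps only the 50 MeV line, with `"DSG  "` prepended.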