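Reading notes on AlphaFold2/docker/run_docker.py

I wanted to read the source to see how AlphaFold2 is implemented, but this file turns out to contain no implementation details. It only sets up the arguments passed to the container. Still, here are my notes from reading it.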

1. Check the required flags: the database directory, the FASTA files, and the maximum template date

Then run the main function

if __name__ == '__main__':
    flags.mark_flags_as_required([
        'data_dir',
        'fasta_paths',
        'max_template_date',
    ])
    app.run(main)
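The snippet above relies on absl, which is a third-party package. As an analogy only, the same "fail fast when a required flag is missing" pattern can be sketched with the standard library's argparse. The helper name `parse_required` and the example values are hypothetical, and this is not how run_docker.py itself parses flags:

```python
import argparse


def parse_required(argv):
    """Parse three required flags; argparse exits if any of them is missing."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--data_dir', required=True)
    # A comma-separated list of FASTA files, split into a Python list.
    parser.add_argument('--fasta_paths', required=True,
                        type=lambda s: s.split(','))
    parser.add_argument('--max_template_date', required=True)
    return parser.parse_args(argv)


args = parse_required(['--data_dir=/data',
                       '--fasta_paths=a.fasta,b.fasta',
                       '--max_template_date=2022-01-01'])
print(args.fasta_paths)  # ['a.fasta', 'b.fasta']
```

Calling `parse_required(['--data_dir=/data'])` raises SystemExit, which is roughly the behaviour the required-flag check gives before main() runs.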

2. The main() function

Set up the database paths (eight of them), the template paths, and the output path

if len(argv) > 1:
    raise app.UsageError('Too many command-line arguments.')

# You can individually override the following paths if you have placed the
# data in locations other than the FLAGS.data_dir.

# Path to the Uniref90 database for use by JackHMMER.
uniref90_database_path = os.path.join(
    FLAGS.data_dir, 'uniref90', 'uniref90.fasta')

# Path to the Uniprot database for use by JackHMMER.
uniprot_database_path = os.path.join(
    FLAGS.data_dir, 'uniprot', 'uniprot.fasta')

# Path to the MGnify database for use by JackHMMER.
mgnify_database_path = os.path.join(
    FLAGS.data_dir, 'mgnify', 'mgy_clusters_2018_12.fa')

# Path to the BFD database for use by HHblits.
bfd_database_path = os.path.join(
    FLAGS.data_dir, 'bfd',
    'bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt')

# Path to the Small BFD database for use by JackHMMER.
small_bfd_database_path = os.path.join(
    FLAGS.data_dir, 'small_bfd', 'bfd-first_non_consensus_sequences.fasta')

# Path to the Uniclust30 database for use by HHblits.
uniclust30_database_path = os.path.join(
    FLAGS.data_dir, 'uniclust30', 'uniclust30_2018_08', 'uniclust30_2018_08')

# Path to the PDB70 database for use by HHsearch.
pdb70_database_path = os.path.join(FLAGS.data_dir, 'pdb70', 'pdb70')

# Path to the PDB seqres database for use by hmmsearch.
pdb_seqres_database_path = os.path.join(
    FLAGS.data_dir, 'pdb_seqres', 'pdb_seqres.txt')

# Path to a directory with template mmCIF structures, each named <pdb_id>.cif.
template_mmcif_dir = os.path.join(FLAGS.data_dir, 'pdb_mmcif', 'mmcif_files')

# Path to a file mapping obsolete PDB IDs to their replacements.
obsolete_pdbs_path = os.path.join(FLAGS.data_dir, 'pdb_mmcif', 'obsolete.dat')

alphafold_path = pathlib.Path(__file__).parent.parent
data_dir_path = pathlib.Path(FLAGS.data_dir)
if alphafold_path == data_dir_path or alphafold_path in data_dir_path.parents:
    raise app.UsageError(
        f'The download directory {FLAGS.data_dir} should not be a subdirectory '
        f'in the AlphaFold repository directory. If it is, the Docker build is '
        f'slow since the large databases are copied during the image creation.')
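The final check in this block compares the repository directory with the data directory. Below is a minimal standalone sketch of that comparison. It uses pure paths so that nothing touches the filesystem; the function name `is_inside_repo` and the example paths are hypothetical and do not appear in run_docker.py:

```python
import pathlib


def is_inside_repo(repo_dir: str, data_dir: str) -> bool:
    """Return True if data_dir is repo_dir itself or lies anywhere below it."""
    repo = pathlib.PurePosixPath(repo_dir)
    data = pathlib.PurePosixPath(data_dir)
    # `parents` lists every ancestor of `data`, so membership means "below repo".
    return repo == data or repo in data.parents


print(is_inside_repo('/home/user/alphafold', '/home/user/alphafold/data'))  # True
print(is_inside_repo('/home/user/alphafold', '/data/alphafold_dbs'))        # False
```

Comparing against `parents` rather than using a string prefix test avoids false positives such as `/home/user/alphafold_dbs`, which merely starts with the same characters.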

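Create a mount point for every file and directory the model uses, and rewrite the command-line arguments so that they point to the mounted paths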

for i, fasta_path in enumerate(FLAGS.fasta_paths):
    mount, target_path = _create_mount(f'fasta_path_{i}', fasta_path)
    mounts.append(mount)
    target_fasta_paths.append(target_path)
command_args.append(f'--fasta_paths={",".join(target_fasta_paths)}')

Each FASTA file's path is mapped under /mnt/fasta_path_{i}. The function returns the Mount object and the path as seen inside the container

def _create_mount(mount_name: str, path: str) -> Tuple[types.Mount, str]:
    """Create a mount point for each file and directory used by the model."""
    path = pathlib.Path(path).absolute()
    target_path = pathlib.Path(_ROOT_MOUNT_DIRECTORY, mount_name)

    if path.is_dir():
        source_path = path
        mounted_path = target_path
    else:
        source_path = path.parent
        mounted_path = pathlib.Path(target_path, path.name)
    if not source_path.exists():
        raise ValueError(f'Failed to find source directory "{source_path}" to '
                         'mount in Docker container.')
    logging.info('Mounting %s -> %s', source_path, target_path)
    mount = types.Mount(target=str(target_path), source=str(source_path),
                        type='bind', read_only=True)
    return mount, str(mounted_path)

Convert a relative path to an absolute path

path = pathlib.Path(path).absolute()

The target path is /mnt/fasta_path_{i}, with i = 0, 1, 2, ... (enumerate starts counting at 0)

target_path = pathlib.Path(_ROOT_MOUNT_DIRECTORY, mount_name)
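To make the file-versus-directory branch in _create_mount concrete, here is a small sketch of the path mapping alone, without Docker. It assumes that _ROOT_MOUNT_DIRECTORY is '/mnt/', which is inferred from the target paths quoted in these notes. The function name `plan_mount` and the explicit `is_dir` argument are hypothetical simplifications; the real function checks the filesystem instead:

```python
import pathlib
from typing import Tuple

ROOT_MOUNT_DIRECTORY = '/mnt/'  # assumed value of _ROOT_MOUNT_DIRECTORY


def plan_mount(mount_name: str, path: str, is_dir: bool) -> Tuple[str, str, str]:
    """Return (host source dir, container target dir, path seen in container)."""
    host_path = pathlib.PurePosixPath(path)
    target_path = pathlib.PurePosixPath(ROOT_MOUNT_DIRECTORY, mount_name)
    if is_dir:
        # A directory is mounted as-is and is seen at the target path.
        return str(host_path), str(target_path), str(target_path)
    # For a file, the parent directory is mounted and the file name is appended.
    return (str(host_path.parent), str(target_path),
            str(target_path / host_path.name))


print(plan_mount('fasta_path_0', '/home/user/seqs/query.fasta', is_dir=False))
# ('/home/user/seqs', '/mnt/fasta_path_0', '/mnt/fasta_path_0/query.fasta')
```

One consequence of mounting the parent directory is that every file next to the FASTA file becomes visible inside the container, although only read-only.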

Choose different databases depending on the model preset that was passed in (multimer or not)

if FLAGS.model_preset == 'multimer':
    database_paths.append(('uniprot_database_path', uniprot_database_path))
    database_paths.append(('pdb_seqres_database_path',
                           pdb_seqres_database_path))
else:
    database_paths.append(('pdb70_database_path', pdb70_database_path))

Mount the database paths

for name, path in database_paths:
    if path:
        mount, target_path = _create_mount(name, path)
        mounts.append(mount)
        command_args.append(f'--{name}={target_path}')
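The two steps above (pick the preset-specific databases, then turn each one into a `--name=path` argument) can be sketched together without Docker. This is a simplified illustration: it assumes every database path is treated as a file whose parent directory is mounted at /mnt/<name>, and the function name `database_flags` and the example host paths are hypothetical:

```python
import pathlib
from typing import Dict, List


def database_flags(model_preset: str, host_paths: Dict[str, str]) -> List[str]:
    """Build --name=/mnt/name/<file> arguments for the preset's databases."""
    if model_preset == 'multimer':
        names = ['uniprot_database_path', 'pdb_seqres_database_path']
    else:
        names = ['pdb70_database_path']

    args = []
    for name in names:
        path = host_paths.get(name)
        if path:  # Databases without a configured path are skipped.
            file_name = pathlib.PurePosixPath(path).name
            args.append(f'--{name}=/mnt/{name}/{file_name}')
    return args


print(database_flags('monomer', {'pdb70_database_path': '/data/pdb70/pdb70'}))
# ['--pdb70_database_path=/mnt/pdb70_database_path/pdb70']
```

The container therefore never sees the host paths; it only receives arguments that already point below /mnt.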

Mount the output path at /mnt/output/

output_target_path = os.path.join(_ROOT_MOUNT_DIRECTORY, 'output')
mounts.append(types.Mount(output_target_path, FLAGS.output_dir, type='bind'))

Initialize the Docker client

client = docker.from_env()

Run the Docker container. Because detach=True is set, the call returns a Container object

container = client.containers.run(
    image=FLAGS.docker_image_name,
    command=command_args,
    device_requests=device_requests,
    remove=True,
    detach=True,
    mounts=mounts,
    user=FLAGS.docker_user,
    environment={
        'NVIDIA_VISIBLE_DEVICES': FLAGS.gpu_devices,
        # The following flags allow us to make predictions on proteins that
        # would typically be too long to fit into GPU memory.
        'TF_FORCE_UNIFIED_MEMORY': '1',
        'XLA_PYTHON_CLIENT_MEM_FRACTION': '4.0',
    })

environment: the flags set here allow predictions on proteins that would typically be too long to fit into GPU memory.
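As a rough illustration of what these two variables amount to, the sketch below builds the same environment dictionary and estimates a memory budget. My understanding, which is an assumption rather than something this file states, is that unified memory lets allocations spill over into host RAM and that a fraction of 4.0 lets the XLA client claim up to four times the GPU's own memory. The helper names and the 16 GB figure are hypothetical:

```python
from typing import Dict


def build_environment(gpu_devices: str) -> Dict[str, str]:
    """Environment variables handed to the container; every value is a string."""
    return {
        'NVIDIA_VISIBLE_DEVICES': gpu_devices,
        'TF_FORCE_UNIFIED_MEMORY': '1',
        'XLA_PYTHON_CLIENT_MEM_FRACTION': '4.0',
    }


def memory_budget_gb(gpu_memory_gb: float, env: Dict[str, str]) -> float:
    """Rough upper bound on memory the XLA client may claim (an assumption)."""
    return gpu_memory_gb * float(env['XLA_PYTHON_CLIENT_MEM_FRACTION'])


env = build_environment('all')
print(memory_budget_gb(16.0, env))  # 64.0
```

Note that the values are passed as strings ('1', '4.0'), because environment variables carry no numeric type.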

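For a breakdown of the run() parameters, see http://t.csdn.cn/Qwqel

Some parameters are not covered there; they are presumably just settings applied at run time.

For more on the third-party Python library for controlling Docker, see the official documentation: https://docker-py.readthedocs.io/en/stable/index.html

After checking the official documentation, the remaining parameters are: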

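device_requests (list): host resources, such as GPUs, to expose to the container.

mounts (list): specification of the mounts to add to the container.

environment (dict or list): environment variables to set inside the container.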
