Running Trinity in multiple steps
Trinity (trinityrnaseq.sourceforge.net) is a software package combining three independent software modules (Inchworm, Chrysalis, Butterfly) to process large volumes of RNA-seq reads. Running Trinity from beginning to end on large data sets may exceed the walltime limit for a single job. Trinity provides a mechanism to run the workflow in four separate steps. Each step may be run as its own job, providing a workaround for the single job walltime limit. This page describes how to run Trinity in this manner under the SLURM scheduler and provides example submit scripts.
Generally, the same Trinity command is run for each step, aside from one option that determines how far Trinity will progress before stopping. On the last step, the Trinity command is run as normal. For example,
# Step 1
Trinity.pl <options> --no_run_chrysalis
# Step 2
Trinity.pl <options> --no_run_quantifygraph
# Step 3
Trinity.pl <options> --no_run_butterfly
# Step 4
Trinity.pl <options>
The example SLURM submit scripts below, one per step, each request 16 CPUs and 200 GB of RAM. They are saved as trinity_step1.submit through trinity_step4.submit.
#!/bin/sh
#SBATCH --job-name=trinity_step1
#SBATCH --time=168:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=200gb
#SBATCH --output=trinity_step1.stdout
#SBATCH --error=trinity_step1.stderr
module load trinity/r2013-02-25 bowtie/1.0.0
Trinity.pl --output trinity_out --seqType fq --JM 200G --left leftreads.fastq \
--right rightreads.fastq --CPU $SLURM_NTASKS_PER_NODE --inchworm_cpu $SLURM_NTASKS_PER_NODE \
--bflyCPU $SLURM_NTASKS_PER_NODE --no_run_chrysalis
#!/bin/sh
#SBATCH --job-name=trinity_step2
#SBATCH --time=168:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=200gb
#SBATCH --output=trinity_step2.stdout
#SBATCH --error=trinity_step2.stderr
module load trinity/r2013-02-25 bowtie/1.0.0
Trinity.pl --output trinity_out --seqType fq --JM 200G --left leftreads.fastq \
--right rightreads.fastq --CPU $SLURM_NTASKS_PER_NODE --inchworm_cpu $SLURM_NTASKS_PER_NODE \
--bflyCPU $SLURM_NTASKS_PER_NODE --no_run_quantifygraph
#!/bin/sh
#SBATCH --job-name=trinity_step3
#SBATCH --time=168:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=200gb
#SBATCH --output=trinity_step3.stdout
#SBATCH --error=trinity_step3.stderr
module load trinity/r2013-02-25 bowtie/1.0.0
Trinity.pl --output trinity_out --seqType fq --JM 200G --left leftreads.fastq \
--right rightreads.fastq --CPU $SLURM_NTASKS_PER_NODE --inchworm_cpu $SLURM_NTASKS_PER_NODE \
--bflyCPU $SLURM_NTASKS_PER_NODE --no_run_butterfly
#!/bin/sh
#SBATCH --job-name=trinity_step4
#SBATCH --time=168:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=200gb
#SBATCH --output=trinity_step4.stdout
#SBATCH --error=trinity_step4.stderr
module load trinity/r2013-02-25 bowtie/1.0.0
Trinity.pl --output trinity_out --seqType fq --JM 200G --left leftreads.fastq \
--right rightreads.fastq --CPU $SLURM_NTASKS_PER_NODE --inchworm_cpu $SLURM_NTASKS_PER_NODE \
--bflyCPU $SLURM_NTASKS_PER_NODE
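Since the four scripts differ only in the job name and the stop option, they can be generated from one template. The following is a minimal sketch, not part of Trinity or SLURM: write_step_scripts is a hypothetical helper name, and the module line assumes the Trinity and Bowtie modules are loaded as two separate modules.

```shell
#!/bin/sh
# Sketch: write trinity_step1.submit .. trinity_step4.submit from one template.
# write_step_scripts is a hypothetical helper, not part of Trinity or SLURM.
write_step_scripts() {
    step=1
    # The empty fourth entry means "no stop option": Trinity runs to the end.
    for flag in --no_run_chrysalis --no_run_quantifygraph --no_run_butterfly ""; do
        # In the template, \$ leaves the variable for the job itself to expand
        # and \\ writes a literal backslash (line continuation) into the file.
        cat > "trinity_step${step}.submit" <<EOF
#!/bin/sh
#SBATCH --job-name=trinity_step${step}
#SBATCH --time=168:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=200gb
#SBATCH --output=trinity_step${step}.stdout
#SBATCH --error=trinity_step${step}.stderr
module load trinity/r2013-02-25 bowtie/1.0.0
Trinity.pl --output trinity_out --seqType fq --JM 200G --left leftreads.fastq \\
--right rightreads.fastq --CPU \$SLURM_NTASKS_PER_NODE --inchworm_cpu \$SLURM_NTASKS_PER_NODE \\
--bflyCPU \$SLURM_NTASKS_PER_NODE ${flag}
EOF
        step=$((step + 1))
    done
}
```

Calling write_step_scripts in the working directory writes the four files. Keeping a single template means a change to the resource request or input file names only has to be made once.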
The job dependency feature of SLURM can be used to run each step sequentially as the previous step completes. All four jobs can be submitted at once and they will run in the proper order without needing any further interaction from the user. The job ID of each step is used in the submit command for the next to order the jobs. Assuming the four scripts above are saved in the working directory with the input dataset, they would be submitted as follows:
$ sbatch trinity_step1.submit
Submitted batch job 366910
$ sbatch -d afterok:366910 trinity_step2.submit
Submitted batch job 366911
$ sbatch -d afterok:366911 trinity_step3.submit
Submitted batch job 366912
$ sbatch -d afterok:366912 trinity_step4.submit
Submitted batch job 366913
The -d afterok option instructs SLURM to only run the submitted job if the existing specified job completes successfully. If for some reason Trinity exits with an error code for one step, SLURM will not run the next step.
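Copying job IDs by hand can be avoided by capturing them in a small wrapper. The sketch below assumes a SLURM version whose sbatch supports the --parsable option, which prints only the job ID (optionally followed by ";" and a cluster name). submit_chain is a hypothetical helper name, not a SLURM command.

```shell
#!/bin/sh
# Sketch: submit scripts as a chain; each job starts only if the previous
# one completed successfully (afterok). submit_chain is a hypothetical helper.
submit_chain() {
    prev=""
    for script in "$@"; do
        if [ -z "$prev" ]; then
            # First job: no dependency.
            jobid=$(sbatch --parsable "$script") || return 1
        else
            jobid=$(sbatch --parsable -d "afterok:$prev" "$script") || return 1
        fi
        # --parsable may print "jobid;cluster"; keep only the job ID.
        jobid=${jobid%%;*}
        echo "$script -> job $jobid"
        prev=$jobid
    done
}
```

It would be called as submit_chain trinity_step1.submit trinity_step2.submit trinity_step3.submit trinity_step4.submit. If any submission fails, the function stops, so later steps are never queued without a valid dependency.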
Tips: commands for checking your jobs
1. Check the status of your jobs:
$ squeue -u <username>
Output:
[<username>@login.tusker ~]$ squeue -u <username>
  JOBID PARTITION     NAME       USER ST       TIME  NODES NODELIST(REASON)
 426290     batch trinity_ <username> PD       0:00      1 (Dependency)
 426291     batch trinity_ <username> PD       0:00      1 (Dependency)
 426289     batch trinity_ <username>  R   10:33:59      1 c2417
2. Check a specific job, for example job ID 426289:
$ scontrol show job 426289
Output:
[<username>@login.tusker ~]$ scontrol show job 426289
JobId=426289 Name=trinity_step2
   UserId=<username>(3557) GroupId=<groupname>(11156)
   Priority=30208 Account=<groupname> QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
   RunTime=10:38:38 TimeLimit=7-00:00:00 TimeMin=N/A
   SubmitTime=2013-08-19T15:12:44 EligibleTime=2013-08-21T00:36:51
   StartTime=2013-08-21T00:37:09 EndTime=2013-08-28T00:37:09
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=batch AllocNode:Sid=login:62036
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=c2417
   BatchHost=c2417
   NumNodes=1 NumCPUs=16 CPUs/Task=1 ReqS:C:T=*:*:*
   MinCPUsNode=16 MinMemoryNode=250G MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/lustre/work/entomology/hwang4/WCR_RNAseq_2013/Fallarmyworm/trinity_step2.submit
   WorkDir=/lustre/work/entomology/hwang4/WCR_RNAseq_2013/Fallarmyworm
3. Check your job history after a specific date, for example all jobs run since 08-14-2013:
$ sacct -u <username> -S 081413 -o JobID,JobName%30,State,ExitCode,Start,End,Elapsed
Output:
JobID JobName State ExitCode Start End Elapsed
------------ ------------------------------ ---------- -------- ------------------- ------------------- ----------
382339 trinity_step1 COMPLETED 0:0 2013-08-13T09:47:18 2013-08-13T22:03:39 12:16:21
382339.batc+ batch COMPLETED 0:0 2013-08-13T09:47:18 2013-08-13T22:03:39 12:16:21
382846 trinity_step2 CANCELLED+ 0:0 2013-08-13T22:03:39 2013-08-14T15:40:45 17:37:06
426288 trinity_step1 RUNNING 0:0 2013-08-20T15:24:23 Unknown 00:14:21
426289 trinity_step2 PENDING 0:0 Unknown Unknown 00:00:00
426290 trinity_step3 PENDING 0:0 Unknown Unknown 00:00:00
426291 trinity_step4 PENDING 0:0 Unknown Unknown 00:00:00
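For a long history it can help to list only the steps that have not completed. The sketch below is one possible filter, assuming sacct is run with -n (no header) and -P (pipe-delimited output) and with the fields JobID,JobName,State in that order. unfinished_steps is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read pipe-delimited "JobID|JobName|State" lines on stdin and print
# every job whose state is not COMPLETED. Sub-steps such as "382339.batch"
# (job IDs containing a dot) are skipped. unfinished_steps is a hypothetical helper.
unfinished_steps() {
    awk -F'|' '$1 !~ /\./ && $3 !~ /^COMPLETED/ { print $2 ": " $3 }'
}
```

It would be used as: sacct -u <username> -S 081413 -n -P -o JobID,JobName,State | unfinished_steps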
Reposted from: https://my.oschina.net/u/727594/blog/191124