
Running Trinity in multiple steps

Trinity (trinityrnaseq.sourceforge.net) is a software package combining three independent software modules (Inchworm, Chrysalis, Butterfly) to process large volumes of RNA-seq reads.  Running Trinity from beginning to end on large data sets may exceed the walltime limit for a single job.  Trinity provides a mechanism to run the workflow in four separate steps.  Each step may be run as its own job, providing a workaround for the single job walltime limit.   This page describes how to run Trinity in this manner under the SLURM scheduler and provides example submit scripts.

Generally, the same Trinity command is run for each step, aside from one option that determines how far Trinity will progress before stopping.  On the last step, the Trinity command is run as normal.  For example,

# Step 1
Trinity.pl <options> --no_run_chrysalis
# Step 2
Trinity.pl <options> --no_run_quantifygraph
# Step 3
Trinity.pl <options> --no_run_butterfly
# Step 4
Trinity.pl <options>
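
Because the four commands differ only in the stop option, the mapping from step number to option can be kept in one place. The following is a minimal sketch, not part of Trinity or SLURM: the function name stop_flag_for_step is hypothetical, and the flags are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: print the Trinity stop option for a given step number.
# Step 4 prints nothing, because the last step runs Trinity as normal.
stop_flag_for_step() {
    case "$1" in
        1) echo "--no_run_chrysalis" ;;
        2) echo "--no_run_quantifygraph" ;;
        3) echo "--no_run_butterfly" ;;
        4) : ;;
        *) echo "unknown step: $1" >&2; return 1 ;;
    esac
}

# Example use inside a submit script that receives the step number as $1:
#   Trinity.pl <options> $(stop_flag_for_step "$1")
```

With a helper like this, a single submit script could in principle serve all four steps, since sbatch passes any arguments that follow the script name on to the script.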

SLURM submit scripts that will request 16 CPUs and 200GB of RAM for each step are given as examples.

trinity_step1.submit
#!/bin/sh
#SBATCH --job-name=trinity_step1
#SBATCH --time=168:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=200gb
#SBATCH --output=trinity_step1.stdout
#SBATCH --error=trinity_step1.stderr
module load trinity/r2013-02-25 bowtie/1.0.0
Trinity.pl --output trinity_out  --seqType fq --JM 200G --left leftreads.fastq \
--right rightreads.fastq --CPU $SLURM_NTASKS_PER_NODE  --inchworm_cpu $SLURM_NTASKS_PER_NODE \
--bflyCPU $SLURM_NTASKS_PER_NODE --no_run_chrysalis

trinity_step2.submit
#!/bin/sh
#SBATCH --job-name=trinity_step2
#SBATCH --time=168:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=200gb
#SBATCH --output=trinity_step2.stdout
#SBATCH --error=trinity_step2.stderr
module load trinity/r2013-02-25 bowtie/1.0.0
Trinity.pl --output trinity_out  --seqType fq --JM 200G --left leftreads.fastq \
--right rightreads.fastq --CPU $SLURM_NTASKS_PER_NODE  --inchworm_cpu $SLURM_NTASKS_PER_NODE \
--bflyCPU $SLURM_NTASKS_PER_NODE --no_run_quantifygraph

trinity_step3.submit
#!/bin/sh
#SBATCH --job-name=trinity_step3
#SBATCH --time=168:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=200gb
#SBATCH --output=trinity_step3.stdout
#SBATCH --error=trinity_step3.stderr
module load trinity/r2013-02-25 bowtie/1.0.0
Trinity.pl --output trinity_out  --seqType fq --JM 200G --left leftreads.fastq \
--right rightreads.fastq --CPU $SLURM_NTASKS_PER_NODE  --inchworm_cpu $SLURM_NTASKS_PER_NODE \
--bflyCPU $SLURM_NTASKS_PER_NODE --no_run_butterfly

trinity_step4.submit
#!/bin/sh
#SBATCH --job-name=trinity_step4
#SBATCH --time=168:00:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
#SBATCH --mem=200gb
#SBATCH --output=trinity_step4.stdout
#SBATCH --error=trinity_step4.stderr
module load trinity/r2013-02-25 bowtie/1.0.0
Trinity.pl --output trinity_out  --seqType fq --JM 200G --left leftreads.fastq \
--right rightreads.fastq --CPU $SLURM_NTASKS_PER_NODE  --inchworm_cpu $SLURM_NTASKS_PER_NODE \
--bflyCPU $SLURM_NTASKS_PER_NODE

The job dependency feature of SLURM can be used to run each step sequentially as the previous step completes.  All four jobs can be submitted at once and they will run in the proper order without needing any further interaction from the user.  The job ID of each step is used in the submit command for the next to order the jobs.  Assuming the four scripts above are saved in the working directory with the input dataset, they would be submitted as follows:

Example Trinity submission
$ sbatch trinity_step1.submit
Submitted batch job 366910
$ sbatch -d afterok:366910 trinity_step2.submit
Submitted batch job 366911
$ sbatch -d afterok:366911 trinity_step3.submit
Submitted batch job 366912
$ sbatch -d afterok:366912 trinity_step4.submit
Submitted batch job 366913

The -d afterok:<jobid> option (short for --dependency=afterok:<jobid>) tells SLURM to start the submitted job only after the specified job has completed successfully, that is, with an exit code of zero.  If Trinity exits with an error code at any step, SLURM will not run the steps that depend on it.
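
Copying each job ID into the next submit command by hand can be scripted. The following is a minimal sketch, assuming your sbatch supports the --parsable option, which prints only the job ID (optionally followed by ";" and a cluster name). The function name submit_chain and the SBATCH variable are illustrative and not part of SLURM or Trinity.

```shell
#!/bin/sh
# Command used to submit jobs; overridable so the function can be tried
# without a real scheduler.
SBATCH=${SBATCH:-sbatch}

# Submit each script given as an argument so that it starts only after
# the previous one has completed successfully (afterok dependency).
submit_chain() {
    prev=""
    for script in "$@"; do
        if [ -z "$prev" ]; then
            jobid=$("$SBATCH" --parsable "$script") || return 1
        else
            jobid=$("$SBATCH" --parsable -d "afterok:$prev" "$script") || return 1
        fi
        # --parsable may print "jobid;cluster"; keep only the job ID.
        jobid=${jobid%%;*}
        echo "$script -> job $jobid"
        prev=$jobid
    done
}
```

It would be called as: submit_chain trinity_step1.submit trinity_step2.submit trinity_step3.submit trinity_step4.submit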

Tips: Commands for Checking Jobs

1. Check the status of your jobs:

Example: Check Your Job Status
$ squeue -u <username>

Output:

[<username>@login.tusker ~]$ squeue -u <username>
JOBID  PARTITION  NAME      USER        ST  TIME      NODES  NODELIST(REASON)
426290 batch      trinity_  <username>  PD  0:00      1      (Dependency)
426291 batch      trinity_  <username>  PD  0:00      1      (Dependency)
426289 batch      trinity_  <username>  R   10:33:59  1      c2417
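
Steps that are still waiting on an earlier step show the state PD with the reason (Dependency). A minimal sketch for listing just those job IDs, assuming the default squeue column layout shown above; the function name jobs_waiting_on_dependency is hypothetical.

```shell
#!/bin/sh
# Read default-format squeue output on stdin and print the IDs of jobs
# that are pending (ST column is PD) because of an unmet dependency.
jobs_waiting_on_dependency() {
    awk 'NR > 1 && $5 == "PD" && $NF == "(Dependency)" { print $1 }'
}

# Example:
#   squeue -u <username> | jobs_waiting_on_dependency
```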

2. Check a specific job, for example the job with ID 426289:

Example: Check Job 426289
$ scontrol show job 426289

[<username>@login.tusker ~]$ scontrol show job 426289
JobId=426289 Name=trinity_step2
   UserId=<username>(3557) GroupId=<groupname>(11156)
   Priority=30208 Account=<groupname> QOS=normal
   JobState=RUNNING Reason=None Dependency=(null)
   Requeue=1 Restarts=0 BatchFlag=1 ExitCode=0:0
   RunTime=10:38:38 TimeLimit=7-00:00:00 TimeMin=N/A
   SubmitTime=2013-08-19T15:12:44 EligibleTime=2013-08-21T00:36:51
   StartTime=2013-08-21T00:37:09 EndTime=2013-08-28T00:37:09
   PreemptTime=None SuspendTime=None SecsPreSuspend=0
   Partition=batch AllocNode:Sid=login:62036
   ReqNodeList=(null) ExcNodeList=(null)
   NodeList=c2417
   BatchHost=c2417
   NumNodes=1 NumCPUs=16 CPUs/Task=1 ReqS:C:T=*:*:*
   MinCPUsNode=16 MinMemoryNode=250G MinTmpDiskNode=0
   Features=(null) Gres=(null) Reservation=(null)
   Shared=OK Contiguous=0 Licenses=(null) Network=(null)
   Command=/lustre/work/entomology/hwang4/WCR_RNAseq_2013/Fallarmyworm/trinity_step2.submit
   WorkDir=/lustre/work/entomology/hwang4/WCR_RNAseq_2013/Fallarmyworm

3. Check your job history since a specific date, for example all jobs run since 08-14-2013:

Example: Check Your Job History After A Specific Date
$ sacct -u <username> -S 081413 -o JobID,JobName%30,State,ExitCode,Start,End,Elapsed

Output:

JobID                        JobName      State ExitCode               Start                 End    Elapsed
------------ ------------------------------ ---------- -------- ------------------- ------------------- ----------
382339                        trinity_step1  COMPLETED      0:0 2013-08-13T09:47:18 2013-08-13T22:03:39   12:16:21
382339.batc+                          batch  COMPLETED      0:0 2013-08-13T09:47:18 2013-08-13T22:03:39   12:16:21
382846                        trinity_step2 CANCELLED+      0:0 2013-08-13T22:03:39 2013-08-14T15:40:45   17:37:06
426288                        trinity_step1    RUNNING      0:0 2013-08-20T15:24:23             Unknown   00:14:21
426289                        trinity_step2    PENDING      0:0             Unknown             Unknown   00:00:00
426290                        trinity_step3    PENDING      0:0             Unknown             Unknown   00:00:00
426291                        trinity_step4    PENDING      0:0             Unknown             Unknown   00:00:00
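
When a chain stalls, it helps to pick out the step that did not finish cleanly. A minimal sketch, assuming sacct is run with -n (no header) and -P (fields separated by "|") and the fields JobID,JobName,State. The function name failed_steps is hypothetical, and the list of states it matches is a chosen subset of SLURM job states, not an exhaustive one.

```shell
#!/bin/sh
# Read "JobID|JobName|State" lines on stdin and print the top-level jobs
# (job IDs without a ".step" suffix) whose state indicates an unsuccessful end.
failed_steps() {
    awk -F'|' '$1 !~ /\./ && $3 ~ /^(FAILED|CANCELLED|TIMEOUT|NODE_FAIL|OUT_OF_MEMORY)/ { print $1, $2, $3 }'
}

# Example:
#   sacct -u <username> -S 081413 -n -P -o JobID,JobName,State | failed_steps
```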

Reposted from: https://my.oschina.net/u/727594/blog/191124
