For more code, see: https://github.com/xubo245

Genomic Data Processing Series: SparkBWA

1. Explanation

1.1 Summary

When the number of partitions exceeds the number of nodes, executors are lost ("Lost executor"). This has been reported to the SparkBWA project: https://github.com/citiususc/SparkBWA/issues/35

In addition, temporary files under the tmp directory are left behind without being deleted, and the affected stage is retried.

Unresolved.
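The lost-executor symptom is consistent with oversubscribing the cluster: each SparkBWA partition forks a native `bwa mem` process (here with `-t 2`) that loads its own copy of the reference index, so per-node memory and CPU use grow with the number of concurrent partitions. As a hedged sketch (the node counts, memory figures, and the helper function below are illustrative assumptions, not part of SparkBWA), one way to reason about a safe partition count is:

```python
def safe_partitions(num_nodes, mem_per_node_gb, bwa_mem_gb,
                    threads_per_task, cores_per_node):
    """Upper-bound the number of concurrent BWA partitions: each task
    needs its own copy of the reference index (memory bound) and
    threads_per_task cores (CPU bound); take the tighter bound per
    node, then scale by the node count."""
    by_mem = mem_per_node_gb // bwa_mem_gb
    by_cpu = cores_per_node // threads_per_task
    per_node = max(1, min(by_mem, by_cpu))
    return num_nodes * per_node

# e.g. 4 nodes with 16 GB each, ~5 GB per bwa-mem task, -t 2 on 8-core nodes
print(safe_partitions(4, 16, 5, 2, 8))  # -> 12
```

Capping the partition count this way (for example via `coalesce` before the alignment phase) keeps any single node from running more simultaneous `bwa mem` processes than it has memory for, which is one plausible mitigation for the lost executors reported in issue #35.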

2. Log

Full error output:

[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 0 'bwa'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Algorithm found 1 'mem'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 1 'mem'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 2 '-t'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 3 '2'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename parameter -f found 4 '-f'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 4 '-f'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename found 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 6 '/home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 7 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_1'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 8 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_2'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[0]: bwa.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[1]: mem.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[2]: -t.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[3]: 2.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[4]: /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[5]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_1.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[6]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_2.
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 10284 sequences (1028400 bp)...
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 0 'bwa'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Algorithm found 1 'mem'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 1 'mem'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 2 '-t'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 3 '2'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename parameter -f found 4 '-f'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 4 '-f'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename found 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-12.sam'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-12.sam'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 6 '/home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 7 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_1'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 8 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_2'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[0]: bwa.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[1]: mem.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[2]: -t.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[3]: 2.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[4]: /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[5]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_1.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[6]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_2.
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 10286 sequences (1028600 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 4579, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (191, 198, 205)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (163, 233)
[M::mem_pestat] mean and std.dev: (198.15, 10.05)
[M::mem_pestat] low and high boundaries for proper pairs: (149, 247)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 10284 reads in 3.555 CPU sec, 1.835 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 2 /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_1 /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_2
[main] Real time: 2.076 sec; CPU: 16.998 sec
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Return code from BWA 0.
17/02/13 19:24:28 ERROR BwaAlignmentBase: getOutputFile:/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam
17/02/13 19:24:28 ERROR BwaAlignmentBase: outputSamFileName:/xubo/project/alignment/sparkBWA/output/g38chr1/standaloneT20170213192403L100c100000Nhs20Paired12/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam
17/02/13 19:24:29 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to Mcnode1/219.219.220.180:44294
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:193)
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
	at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:88)
	at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
	at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
	at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:97)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.sendRequest(ShuffleBlockFetcherIterator.scala:152)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.initialize(ShuffleBlockFetcherIterator.scala:265)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.<init>(ShuffleBlockFetcherIterator.scala:112)
	at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:43)
	at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:96)
	at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:95)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:29)
	at com.github.sparkbwa.BwaPairedAlignment.call(BwaPairedAlignment.java:94)
	at com.github.sparkbwa.BwaPairedAlignment.call(BwaPairedAlignment.java:33)
	at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction2$1.apply(JavaPairRDD.scala:1024)
	at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
	at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:727)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:727)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:88)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: Mcnode1/219.219.220.180:44294
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	... 1 more
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 4601, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (192, 199, 205)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (166, 231)
[M::mem_pestat] mean and std.dev: (198.69, 9.94)
[M::mem_pestat] low and high boundaries for proper pairs: (153, 244)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 10286 reads in 6.262 CPU sec, 4.261 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 2 /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_1 /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_2
[main] Real time: 4.478 sec; CPU: 20.226 sec
*** Error in `/usr/lib/jvm/jdk1.7.0/bin/java': munmap_chunk(): invalid pointer: 0x00007fc3702d3710 ***

