Spark Issue 14: Spark stage retry
More code: https://github.com/xubo245
Genomic data processing series: SparkBWA
1. Explanation
1.1 Summary
When the number of partitions exceeds the number of cluster nodes, executors are lost. The issue has been reported to the SparkBWA project: https://github.com/citiususc/SparkBWA/issues/35
In addition, temporary files are left behind in the tmp directory without being deleted, and the stage is retried.
Status: unresolved.
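One hedged workaround (my own sketch, not part of SparkBWA) is to cap the partition count at the cluster's total task slots before the alignment map runs, so the number of concurrent native BWA processes never exceeds what the nodes can host. The helper name `safePartitions` and the executor/core counts below are hypothetical:

```java
// Hypothetical helper: cap a requested partition count at the cluster's
// total task slots (executors * cores per executor). Calling
// rdd.coalesce(safePartitions(...)) before the BWA map would keep the
// number of concurrent native BWA invocations within cluster capacity.
public class PartitionCap {
    static int safePartitions(int requested, int numExecutors, int coresPerExecutor) {
        int slots = numExecutors * coresPerExecutor;
        // Never go below one partition, never above the available slots.
        return Math.max(1, Math.min(requested, slots));
    }

    public static void main(String[] args) {
        // 18 partitions requested on 4 executors with 2 cores each.
        System.out.println(safePartitions(18, 4, 2));
    }
}
```

In the run logged below, 18 SAM parts were produced, so on a small cluster a cap like this would reduce the per-node oversubscription.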
2. Log
Full error output:
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 0 'bwa'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Algorithm found 1 'mem'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 1 'mem'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 2 '-t'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 3 '2'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename parameter -f found 4 '-f'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 4 '-f'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename found 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 6 '/home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 7 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_1'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 8 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_2'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[0]: bwa.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[1]: mem.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[2]: -t.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[3]: 2.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[4]: /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[5]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_1.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[6]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_2.
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 10284 sequences (1028400 bp)...
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 0 'bwa'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Algorithm found 1 'mem'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 1 'mem'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 2 '-t'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 3 '2'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename parameter -f found 4 '-f'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 4 '-f'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Filename found 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-12.sam'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 5 '/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-12.sam'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 6 '/home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 7 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_1'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Arg 8 '/home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_2'
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[0]: bwa.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[1]: mem.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[2]: -t.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[3]: 2.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[4]: /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[5]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_1.
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] option[6]: /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_2.
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[M::process] read 10286 sequences (1028600 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 4579, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (191, 198, 205)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (163, 233)
[M::mem_pestat] mean and std.dev: (198.15, 10.05)
[M::mem_pestat] low and high boundaries for proper pairs: (149, 247)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 10284 reads in 3.555 CPU sec, 1.835 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 2 /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_1 /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD4_2
[main] Real time: 2.076 sec; CPU: 16.998 sec
[Java_com_github_sparkbwa_BwaJni_bwa_1jni] Return code from BWA 0.
17/02/13 19:24:28 ERROR BwaAlignmentBase: getOutputFile:/home/hadoop/cloud/workspace/tmps/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam
17/02/13 19:24:28 ERROR BwaAlignmentBase: outputSamFileName:/xubo/project/alignment/sparkBWA/output/g38chr1/standaloneT20170213192403L100c100000Nhs20Paired12/SparkBWA_g38L100c100000Nhs20Paired1.fastq-18-NoSort-app-20170213192407-1107-4.sam
17/02/13 19:24:29 ERROR RetryingBlockFetcher: Exception while beginning fetch of 1 outstanding blocks
java.io.IOException: Failed to connect to Mcnode1/219.219.220.180:44294
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:193)
	at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:156)
	at org.apache.spark.network.netty.NettyBlockTransferService$$anon$1.createAndStart(NettyBlockTransferService.scala:88)
	at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:140)
	at org.apache.spark.network.shuffle.RetryingBlockFetcher.start(RetryingBlockFetcher.java:120)
	at org.apache.spark.network.netty.NettyBlockTransferService.fetchBlocks(NettyBlockTransferService.scala:97)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.sendRequest(ShuffleBlockFetcherIterator.scala:152)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.initialize(ShuffleBlockFetcherIterator.scala:265)
	at org.apache.spark.storage.ShuffleBlockFetcherIterator.<init>(ShuffleBlockFetcherIterator.scala:112)
	at org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:43)
	at org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:90)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:96)
	at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:95)
	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
	at scala.collection.convert.Wrappers$IteratorWrapper.hasNext(Wrappers.scala:29)
	at com.github.sparkbwa.BwaPairedAlignment.call(BwaPairedAlignment.java:94)
	at com.github.sparkbwa.BwaPairedAlignment.call(BwaPairedAlignment.java:33)
	at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction2$1.apply(JavaPairRDD.scala:1024)
	at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
	at org.apache.spark.api.java.JavaRDDLike$$anonfun$mapPartitionsWithIndex$1.apply(JavaRDDLike.scala:102)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:727)
	at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$18.apply(RDD.scala:727)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
	at org.apache.spark.scheduler.Task.run(Task.scala:88)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.ConnectException: Connection refused: Mcnode1/219.219.220.180:44294
	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
	at io.netty.channel.socket.nio.NioSocketChannel.doFinishConnect(NioSocketChannel.java:224)
	at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:289)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:528)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	... 1 more
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 4601, 0, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (192, 199, 205)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (166, 231)
[M::mem_pestat] mean and std.dev: (198.69, 9.94)
[M::mem_pestat] low and high boundaries for proper pairs: (153, 244)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 10286 reads in 6.262 CPU sec, 4.261 real sec
[main] Version: 0.7.15-r1140
[main] CMD: bwa mem -t 2 /home/hadoop/disk2/xubo/ref/GRCH38L1Index/GRCH38chr1L3556522.fasta /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_1 /home/hadoop/cloud/workspace/tmps/app-20170213192407-1107-RDD12_2
[main] Real time: 4.478 sec; CPU: 20.226 sec
*** Error in `/usr/lib/jvm/jdk1.7.0/bin/java': munmap_chunk(): invalid pointer: 0x00007fc3702d3710 ***
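Reading the log end-to-end, a plausible chain of events (my interpretation, not confirmed in the source) is: the native bwa code invoked through JNI corrupts the heap and crashes the executor JVM (the `munmap_chunk(): invalid pointer` abort above); the shuffle server hosted by that executor disappears, so tasks on other nodes hit the `Connection refused` fetch failure; Spark then marks the shuffle output as lost and retries the stage. Because the JVM dies abruptly, the intermediate files under `/home/hadoop/cloud/workspace/tmps` are never cleaned up. While debugging, retries can be limited so the job fails fast instead of looping; `spark.task.maxFailures` is a standard Spark property, but the value and the class/jar names below are illustrative placeholders:

```shell
# Hedged sketch: fail fast instead of retrying a crashing native call.
# spark.task.maxFailures is a standard Spark setting; setting it to 1 is
# only sensible while debugging. Class name, jar, and args are placeholders.
spark-submit \
  --conf spark.task.maxFailures=1 \
  --class com.github.sparkbwa.SparkBWA \
  SparkBWA.jar <args>
```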
References
【1】https://github.com/xubo245
【2】http://blog.csdn.net/xubo245/