CESM mpirun noticed that process rank 1 with PID 0 on node ubuntu exited on signal 11
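This is a note to myself.
While porting CESM, I figured my server was fairly powerful, so I wanted to run two cases at the same time.
When I ran "./case.submit" for the second case, it failed with the error shown below.
For reference, this is how the case was created, including the compiler: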
./create_newcase --case 1850CLM50Bgc_gnu_cesm --res f19_g16 --compset I1850Clm50Bgc --run-unsupported --compiler gnu --mach mygnu
2021-08-08 12:21:36 MODEL EXECUTION BEGINS HERE
run command is mpirun -np 4 /home/ubuntu/cesm/scratch/1850CLM50Bgc_gnu_cesm/bld/cesm.exe >> cesm.log.$LID 2>&1
ERROR: RUN FAIL: Command 'mpirun -np 4 /home/ubuntu/cesm/scratch/1850CLM50Bgc_gnu_cesm/bld/cesm.exe >> cesm.log.$LID 2>&1 ' failed
See log file for details: /home/ubuntu/cesm/scratch/1850CLM50Bgc_gnu_cesm/run/cesm.log.210808-122133
Run "cat /home/ubuntu/cesm/scratch/1850CLM50Bgc_gnu_cesm/run/cesm.log.210808-122133" to view the CESM log:
Invalid PIO rearranger comm max pend req (comp2io), 0
Resetting PIO rearranger comm max pend req (comp2io) to 64
PIO rearranger options:
comm type =p2p
comm fcd =2denable
max pend req (comp2io) = 0
enable_hs (comp2io) = T
enable_isend (comp2io) = F
max pend req (io2comp) = 64
enable_hs (io2comp) = F
enable_isend (io2comp) = T
(seq_comm_setcomm) init ID ( 1 GLOBAL ) pelist = 0 3 1 ( npes = 4) ( nthreads = 1)( suffix =)
(seq_comm_setcomm) init ID ( 2 CPL ) pelist = 0 3 1 ( npes = 4) ( nthreads = 1)( suffix =)
(seq_comm_setcomm) init ID ( 5 ATM ) pelist = 0 3 1 ( npes = 4) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 6 CPLATM ) join IDs = 2 5 ( npes = 4) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 3 ALLATMID ) join multiple comp IDs ( npes = 4) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 4 CPLALLATMID ) join IDs = 2 3 ( npes = 4) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 9 LND ) pelist = 0 3 1 ( npes = 4) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 10 CPLLND ) join IDs = 2 9 ( npes = 4) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 7 ALLLNDID ) join multiple comp IDs ( npes = 4) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 8 CPLALLLNDID ) join IDs = 2 7 ( npes = 4) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 13 ICE ) pelist = 0 3 1 ( npes = 4) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 14 CPLICE ) join IDs = 2 13 ( npes = 4) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 11 ALLICEID ) join multiple comp IDs ( npes = 4) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 12 CPLALLICEID ) join IDs = 2 11 ( npes = 4) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 17 OCN ) pelist = 0 3 1 ( npes = 4) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 18 CPLOCN ) join IDs = 2 17 ( npes = 4) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 15 ALLOCNID ) join multiple comp IDs ( npes = 4) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 16 CPLALLOCNID ) join IDs = 2 15 ( npes = 4) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 21 ROF ) pelist = 0 3 1 ( npes = 4) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 22 CPLROF ) join IDs = 2 21 ( npes = 4) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 19 ALLROFID ) join multiple comp IDs ( npes = 4) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 20 CPLALLROFID ) join IDs = 2 19 ( npes = 4) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 25 GLC ) pelist = 0 3 1 ( npes = 4) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 26 CPLGLC ) join IDs = 2 25 ( npes = 4) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 23 ALLGLCID ) join multiple comp IDs ( npes = 4) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 24 CPLALLGLCID ) join IDs = 2 23 ( npes = 4) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 29 WAV ) pelist = 0 3 1 ( npes = 4) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 30 CPLWAV ) join IDs = 2 29 ( npes = 4) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 27 ALLWAVID ) join multiple comp IDs ( npes = 4) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 28 CPLALLWAVID ) join IDs = 2 27 ( npes = 4) ( nthreads = 1)
(seq_comm_setcomm) init ID ( 33 ESP ) pelist = 0 0 1 ( npes = 1) ( nthreads = 1)( suffix =)
(seq_comm_joincomm) init ID ( 34 CPLESP ) join IDs = 2 33 ( npes = 4) ( nthreads = 1)
(seq_comm_jcommarr) init ID ( 31 ALLESPID ) join multiple comp IDs ( npes = 1) ( nthreads = 1)
(seq_comm_joincomm) init ID ( 32 CPLALLESPID ) join IDs = 2 31 ( npes = 4) ( nthreads = 1)
(seq_comm_printcomms) 1 0 4 1 GLOBAL:
(seq_comm_printcomms) 2 0 4 1 CPL:
(seq_comm_printcomms) 3 0 4 1 ALLATMID:
(seq_comm_printcomms) 4 0 4 1 CPLALLATMID:
(seq_comm_printcomms) 5 0 4 1 ATM:
(seq_comm_printcomms) 6 0 4 1 CPLATM:
(seq_comm_printcomms) 7 0 4 1 ALLLNDID:
(seq_comm_printcomms) 8 0 4 1 CPLALLLNDID:
(seq_comm_printcomms) 9 0 4 1 LND:
(seq_comm_printcomms) 10 0 4 1 CPLLND:
(seq_comm_printcomms) 11 0 4 1 ALLICEID:
(seq_comm_printcomms) 12 0 4 1 CPLALLICEID:
(seq_comm_printcomms) 13 0 4 1 ICE:
(seq_comm_printcomms) 14 0 4 1 CPLICE:
(seq_comm_printcomms) 15 0 4 1 ALLOCNID:
(seq_comm_printcomms) 16 0 4 1 CPLALLOCNID:
(seq_comm_printcomms) 17 0 4 1 OCN:
(seq_comm_printcomms) 18 0 4 1 CPLOCN:
(seq_comm_printcomms) 19 0 4 1 ALLROFID:
(seq_comm_printcomms) 20 0 4 1 CPLALLROFID:
(seq_comm_printcomms) 21 0 4 1 ROF:
(seq_comm_printcomms) 22 0 4 1 CPLROF:
(seq_comm_printcomms) 23 0 4 1 ALLGLCID:
(seq_comm_printcomms) 24 0 4 1 CPLALLGLCID:
(seq_comm_printcomms) 25 0 4 1 GLC:
(seq_comm_printcomms) 26 0 4 1 CPLGLC:
(seq_comm_printcomms) 27 0 4 1 ALLWAVID:
(seq_comm_printcomms) 28 0 4 1 CPLALLWAVID:
(seq_comm_printcomms) 29 0 4 1 WAV:
(seq_comm_printcomms) 30 0 4 1 CPLWAV:
(seq_comm_printcomms) 31 0 1 1 ALLESPID:
(seq_comm_printcomms) 32 0 4 1 CPLALLESPID:
(seq_comm_printcomms) 33 0 1 1 ESP:
(seq_comm_printcomms) 34 0 4 1 CPLESP:
(t_initf) Read in prof_inparm namelist from: drv_in
(t_initf) Using profile_disable= F
(t_initf) profile_timer= 4
(t_initf) profile_depth_limit= 4
(t_initf) profile_detail_limit= 2
(t_initf) profile_barrier= F
(t_initf) profile_outpe_num= 1
(t_initf) profile_outpe_stride= 0
(t_initf) profile_single_file= F
(t_initf) profile_global_stats= T
(t_initf) profile_ovhd_measurement= F
(t_initf) profile_add_detail= F
(t_initf) profile_papi_enable= F
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x151e7878232a
#1 0x151e78781503
#2 0x151e77dff03f
#3 0x55d58d7cc3ed
#4 0x55d58d7c29da
#5 0x55d58d7bf712
#6 0x55d58d67cb71
#7 0x55d58d6e40d0
#8 0x55d58cf41888
#9 0x55d58cf3bab7
#10 0x55d58cec4b2e
#11 0x55d58ceb5d8f
#12 0x55d58cec20e0
#13 0x151e77de1bf6
#14 0x55d58cea84d9
#15 0xffffffffffffffff
#0 0x155370fb532a
#1 0x155370fb4503
#2 0x15537063203f
#3 0x5590caf743c1
#4 0x5590caf6a9da
#5 0x5590caf67712
#6 0x5590cae2619f
#7 0x5590cae8c0d0
#8 0x5590ca6e9888
#9 0x5590ca6e3ab7
#10 0x5590ca66cb2e
#11 0x5590ca65dd8f
#12 0x5590ca66a0e0
#13 0x155370614bf6
#14 0x5590ca6504d9
#15 0xffffffffffffffff
#0 0x14d229d6232a
#1 0x14d229d61503
#2 0x14d2293df03f
#3 0x55bbac8b93c1
#4 0x55bbac8af9da
#5 0x55bbac8ac712
#6 0x55bbac769b71
#7 0x55bbac7d10d0
#8 0x55bbac02e888
#9 0x55bbac028ab7
#10 0x55bbabfb1b2e
#11 0x55bbabfa2d8f
#12 0x55bbabfaf0e0
#13 0x14d2293c1bf6
#14 0x55bbabf954d9
#15 0xffffffffffffffff
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 0 on node ubuntu exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
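The log is long, and only a handful of lines say what actually went wrong. As a rough sketch, the helper below prints just those lines with their line numbers. The function name crash_lines is made up for this note, and the two search strings are simply the messages that appear in the log above (the gfortran runtime signal message and the mpirun summary); other compilers or MPI libraries may word them differently.

```shell
# crash_lines is a hypothetical helper name, not part of CESM or CIME.
# Print, with line numbers, only the lines of a log file that report a
# runtime signal or the mpirun summary of the failed rank.
crash_lines() {
    grep -n -E 'Program received signal|mpirun noticed that process' "$1"
}

# Possible use:
#   crash_lines /home/ubuntu/cesm/scratch/1850CLM50Bgc_gnu_cesm/run/cesm.log.210808-122133
```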
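Solution
There was actually nothing wrong with the case itself: my workstation can only have one case submitted at a time. Stop the previous case, run this one, and it works. I wrote this post first as a note to myself, and second so that others do not, like me, go hunting for a fix and still end up without one.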
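To avoid repeating the mistake, one option is to check whether a CESM executable is already running before submitting. The sketch below is only an illustration under my own assumptions: list_cesm_pids and cesm_is_running are made-up names, the match is purely on the string cesm.exe in the command line (the executable name from the run command above), and it relies on the Linux procps form "ps -eo pid=,args=".

```shell
#!/bin/sh
# Sketch only: list_cesm_pids and cesm_is_running are hypothetical helper
# names, not part of CESM or CIME.

# Read process-list lines ("PID COMMAND...") on stdin and print the PID of
# every line whose command mentions cesm.exe.
list_cesm_pids() {
    awk '/cesm\.exe/ { print $1 }'
}

# Succeed (exit status 0) if any process on this machine mentions cesm.exe.
cesm_is_running() {
    [ -n "$(ps -eo pid=,args= | list_cesm_pids)" ]
}

# Possible use from the case directory:
#   if cesm_is_running; then
#       echo "another CESM run is still active; wait for it or stop it first"
#   else
#       ./case.submit
#   fi
```

Because the check matches by name only, it also counts the mpirun launcher line, which is harmless here since any match means a run is still active.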