Running ceph health detail shows that the following PGs are in the incomplete state:
pg 7.c is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.11 is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.15 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.17 is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.1a is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.1f is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.22 is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.25 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.27 is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.29 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.39 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.3f is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.41 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.43 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.48 is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.4d is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.4f is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.55 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.5b is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.5d is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.61 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.66 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.67 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.6e is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.76 is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.78 is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.7b is incomplete, acting [3,2] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
    pg 7.7e is incomplete, acting [2,3] (reducing pool ceph-kvm-pool min_size from 2 may help; search ceph.com/docs for 'incomplete')
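
To check at a glance which OSDs the incomplete PGs map to, the output above can be summarized with a small filter. This is only a sketch: the function names incomplete_pgs and incomplete_osds are made up for illustration, and the parsing assumes the exact line layout shown above (pg ID in the second field, acting set in the sixth).

```shell
# Summarize `ceph health detail` lines of the form
#   pg 7.c is incomplete, acting [2,3] (...)
# read from standard input.

# Print the ID of every incomplete PG, one per line.
incomplete_pgs() {
    awk '$1 == "pg" && $3 == "is" && $4 == "incomplete," { print $2 }'
}

# Print each distinct OSD ID found in the acting set of an incomplete PG.
incomplete_osds() {
    awk '$1 == "pg" && $4 == "incomplete," && $5 == "acting" {
             s = $6
             gsub(/\[/, "", s)             # strip the opening bracket
             gsub(/\]/, "", s)             # strip the closing bracket
             n = split(s, ids, ",")
             for (k = 1; k <= n; k++) print ids[k]
         }' | sort -n -u
}
```

For example, ceph health detail | incomplete_osds should print only 2 and 3 for the output shown here.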

All of them appear to be on osd.2 and osd.3. First, set flags so that the cluster does not rebalance:
ceph osd set noout
ceph osd set nodown
ceph osd set norebalance
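
Before touching the OSDs it can be worth confirming that the flags really took effect. The sketch below assumes that ceph osd dump prints a line starting with "flags" followed by a comma-separated list; the function name osd_flags_set is hypothetical.

```shell
# Succeed only if every flag given as an argument appears in the
# "flags ..." line of `ceph osd dump` output read from standard input.
osd_flags_set() {
    flags=$(awk '$1 == "flags" { print $2; exit }')
    for want in "$@"; do
        case ",$flags," in
            *",$want,"*) ;;                                  # flag is present
            *) echo "flag not set: $want" >&2; return 1 ;;
        esac
    done
    return 0
}
```

Usage: ceph osd dump | osd_flags_set noout nodown norebalance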
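Stop osd.2 and osd.3. ceph-objectstore-tool can only work on an OSD whose daemon is not running, so stop the services first and then mark the OSDs down:
systemctl stop ceph-osd@2
systemctl stop ceph-osd@3
ceph osd down osd.2
ceph osd down osd.3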
List the PGs on osd.2:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-2/ --op list-pgs
The PGs reported as incomplete above are indeed in the list.
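
Rather than comparing the two lists by eye, they can be diffed. A minimal sketch, assuming both lists have been saved to files with one PG ID per line (the incomplete PGs in one file, the --op list-pgs output in the other); the function name pgs_missing_on_osd is made up for illustration.

```shell
# Print every PG ID from the file $1 (the incomplete PGs) that does not
# appear in the file $2 (the PGs listed on the OSD).
# Prints nothing, and returns non-zero, when no PG is missing.
pgs_missing_on_osd() {
    grep -F -x -v -f "$2" "$1"
}
```

An empty result means that every incomplete PG has a copy on that OSD.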

Use the following script to mark every PG on the down OSDs as complete:
#!/bin/bash
# Collect the IDs of the down OSDs on this node
for i in $(ceph osd tree down 2>/dev/null | grep -w -A 4 node1 | grep -v node | awk '{print $4}' | sed 's/osd\.//g')
do
    ceph osd in "osd.$i"   # mark the OSD as "in" to prevent data migration
    # Save the list of all PGs on this OSD to the file pg.<id>
    ceph-objectstore-tool --data-path "/var/lib/ceph/osd/ceph-$i/" --op list-pgs > "pg.$i" 2>/dev/null
    for j in $(cat "pg.$i")
    do
        # Mark each PG as complete
        ceph-objectstore-tool --pgid "$j" --op mark-complete --data-path "/var/lib/ceph/osd/ceph-$i/" --type bluestore
    done
done
Note: in ceph osd tree down 2>/dev/null |grep -w -A 4 node1 |grep -v node, replace node1 and node with the host names used in your own environment, and make sure the pipeline returns the correct ceph-osd IDs.
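
The grep -A 4 pipeline depends on how many OSDs a host has and on the host naming pattern. As a possible alternative, the sketch below picks the OSD IDs under one host by tracking the current host bucket. It assumes the usual ceph osd tree column layout with a CLASS column (ID, CLASS, WEIGHT, TYPE NAME for OSD rows; negative ID, WEIGHT, TYPE, NAME for bucket rows); the function name down_osd_ids is hypothetical.

```shell
# Print the numeric IDs of the OSDs listed under host $1 in
# `ceph osd tree down` output read from standard input.
down_osd_ids() {
    awk -v host="$1" '
        # Bucket rows have a negative ID; remember the name only for host buckets.
        $1 ~ /^-/ { cur = ($3 == "host") ? $4 : ""; next }
        # OSD rows under the requested host: strip the "osd." prefix.
        cur == host && $4 ~ /^osd\.[0-9]+$/ { sub(/^osd\./, "", $4); print $4 }
    '
}
```

Usage: ceph osd tree down 2>/dev/null | down_osd_ids node1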

Start the OSDs:
systemctl enable ceph-osd@2
systemctl enable ceph-osd@3
systemctl start ceph-osd@2
systemctl start ceph-osd@3
Clear the flags:
ceph osd unset noout
ceph osd unset nodown
ceph osd unset norebalance

If some PGs are still reported as not deep-scrubbed in time, deep-scrub them manually:
ceph pg deep-scrub 7.66
ceph pg deep-scrub 7.5b
ceph pg deep-scrub 7.15
ceph pg deep-scrub 7.1f
ceph pg deep-scrub 7.25
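
When many PGs are affected, the deep-scrub commands can be generated from the health output instead of being typed by hand. This sketch assumes that ceph health detail reports each affected PG on a line of the form "pg <pgid> not deep-scrubbed since <timestamp>"; the function name deep_scrub_commands is made up for illustration. It only prints the commands, so they can be reviewed before being run.

```shell
# Read `ceph health detail` output on standard input and print one
# `ceph pg deep-scrub <pgid>` command per PG reported as not deep-scrubbed.
deep_scrub_commands() {
    awk '$1 == "pg" && $3 == "not" && $4 == "deep-scrubbed" { print "ceph pg deep-scrub " $2 }'
}
```

Usage: ceph health detail | deep_scrub_commands, then pipe the result to sh once it looks right.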

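After this, Ceph is still in the HEALTH_WARN state because of the warning "91 daemons have recently crashed".
List the crash reports that have not been archived yet: ceph crash ls-new
Archive the crash reports: ceph crash archive-all
Ceph then returns to the HEALTH_OK state.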
