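DM primary-standby cluster: recovery after part of the primary's redo logs were deleted or damaged
This is a record of installing a DM (Dameng) primary-standby cluster on virtual machines. After the cluster was built, three new 2 GB redo log files were added on the primary, which left the VM short of disk space so that archiving failed. The three redo files were then deleted, after which the instance would not start, and that led to the recovery procedure described here. The detailed steps follow:
Error reported by the primary: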
2022-06-22 08:26:04.462 [FATAL] database P0000003551 T0000000000000003551 dmserver startup failed, code = -523 [Out of space]
2022-06-22 08:26:04.462 [FATAL] database P0000003551 T0000000000000003551 nsvr_ini_file_read failed, [code: -523]
2022-06-22 08:26:35.638 [INFO] database P0000003562 T0000000000000003562 INI parameter DW_PORT changed, the original value 0, new value 33141
2022-06-22 08:26:35.638 [ERROR] database P0000003562 T0000000000000003562 ARCH_DEST[/dmarch] will be out of space.
2022-06-22 08:26:35.638 [INFO] database P0000003562 T0000000000000003562 INI parameter DPC_2PC changed, the original value 1, new value 0
2022-06-22 08:26:35.638 [INFO] database P0000003562 T0000000000000003562 INI parameter TRX_VIEW_SIZE changed, the original value 512, new value 50
2022-06-22 08:26:35.639 [FATAL] database P0000003562 T0000000000000003562 dmserver startup failed, code = -523 [Out of space]
2022-06-22 08:26:35.639 [FATAL] database P0000003562 T0000000000000003562 nsvr_ini_file_read failed, [code: -523]
2022-06-22 08:27:06.875 [INFO] database P0000003564 T0000000000000003564 INI parameter DW_PORT changed, the original value 0, new value 33141
2022-06-22 08:27:06.875 [ERROR] database P0000003564 T0000000000000003564 ARCH_DEST[/dmarch] will be out of space.
2022-06-22 08:27:06.875 [INFO] database P0000003564 T0000000000000003564 INI parameter DPC_2PC changed, the original value 1, new value 0
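The -523 [Out of space] failure in the log above could have been caught before the redo files were added. A minimal sketch of such a pre-check follows. The function names and the idea of a reserve are assumptions made for illustration, not part of any DM tool. The directories `/dmdata/5236/DM01` and `/dmarch` are the ones that appear in the logs in this article.

```python
import shutil

GIB = 1024 ** 3


def has_room_for_redo(free_bytes, n_files, file_size_bytes, reserve_bytes=0):
    """Return True if n_files redo logs of file_size_bytes each fit into
    free_bytes while still leaving reserve_bytes unused (for archives etc.)."""
    needed = n_files * file_size_bytes + reserve_bytes
    return free_bytes >= needed


def check_path(path, n_files, file_size_bytes, reserve_bytes=0):
    """Same check, using the real free space of the filesystem holding path."""
    free = shutil.disk_usage(path).free
    return has_room_for_redo(free, n_files, file_size_bytes, reserve_bytes)


if __name__ == "__main__":
    # Three 2 GiB redo files, as in this article; the 1 GiB reserve is an assumption.
    print(check_path("/", 3, 2 * GIB, reserve_bytes=1 * GIB))
```

If the data directory and the archive directory share one filesystem, as they may on a small VM, the reserve should cover the archive growth expected before the next cleanup.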
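1. On the primary host, the three newly added redo log files were deleted to free space.
2. The primary then reported another error: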
2022-06-22 08:32:22.792 [INFO] database P0000003884 T0000000000000003884 fil_sys_init
2022-06-22 08:32:22.971 [INFO] database P0000003884 T0000000000000003884 Database mode = 1, oguid = 453331
2022-06-22 08:32:22.979 [FATAL] database P0000003884 T0000000000000003884 /dmdata/5236/DM01/DM0103.log not exist,can not startup
2022-06-22 08:32:53.668 [INFO] database P0000003928 T0000000000000003928 INI parameter DW_PORT changed, the original value 0, new value 33141
2022-06-22 08:32:53.678 [INFO] database P0000003928 T0000000000000003928 INI parameter DPC_2PC changed, the original value 1, new value 0
2022-06-22 08:32:53.680 [INFO] database P0000003928 T0000000000000003928 INI parameter TRX_VIEW_SIZE changed, the original value 512, new value 50
2022-06-22 08:32:53.710 [INFO] database P0000003928 T0000000000000003928 version info: develop
2022-06-22 08:32:53.711 [INFO] database P0000003928 T0000000000000003928 os_sema2_create_low, create and inc sema success, key:155910874, sem_id:524288, sem_value:1!
2022-06-22 08:32:53.730 [INFO] database P0000003928 T0000000000000003928 Database's huge_with_delta is 1, and rlog_gen_for_huge is 0!
2022-06-22 08:32:53.745 [INFO] database P0000003928 T0000000000000003928 DM Database Server 64 V8 03134283890-20220304-158322-10045 startup...
2022-06-22 08:32:53.753 [INFO] database P0000003928 T0000000000000003928 INI parameter ROLLSEG_POOLS changed, the original value 19, new value 1
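When the log is long, the FATAL lines and the missing file are easy to overlook. The sketch below pulls them out with a regular expression. It assumes the line layout seen in the excerpts in this article (timestamp, level in brackets, module, process id, thread id, message). The function names are hypothetical.

```python
import re

# Layout assumed from the log excerpts in this article.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"\[(?P<level>[A-Z]+)\]\s+"
    r"(?P<module>\S+)\s+(?P<pid>P\d+)\s+(?P<tid>T\d+)\s+(?P<msg>.*)$"
)
MISSING_RE = re.compile(r"(?P<path>\S+) not exist")


def fatal_messages(lines):
    """Return (timestamp, message) for every FATAL line."""
    found = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") == "FATAL":
            found.append((m.group("ts"), m.group("msg")))
    return found


def missing_files(lines):
    """Return the paths named in FATAL '... not exist' messages."""
    paths = []
    for _, msg in fatal_messages(lines):
        m = MISSING_RE.search(msg)
        if m:
            paths.append(m.group("path"))
    return paths
```

Applied to the log above, `missing_files` returns the path of `DM0103.log`, the first file the server fails to find. The server stops at the first missing file, so the other deleted files do not show up until the control file is fixed.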
3. The monitor reports an error state
2022-06-22 08:36:22
#================================================================================#
GROUP OGUID MON_CONFIRM MODE MPP_FLAG
GRP1 453331 TRUE MANUAL FALSE

<<DATABASE GLOBAL INFO:>>
DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
192.168.0.102 52141 2022-06-22 08:36:22 GLOBAL VALID OPEN DM02 OK 1 1 OPEN STANDBY DSC_OPEN REALTIME VALID

ERROR DATABASE:
DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT
192.168.0.101 52141 2022-06-22 08:36:21 GLOBAL VALID STARTUP DM01 ERROR 1 1 OPEN PRIMARY DSC_OPEN REALTIME VALID
The primary cannot start.
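The monitor output above is a whitespace-separated table, so the failing instance can be picked out mechanically. The sketch below assumes the column header shown in this article and that WTIME is printed as two tokens (date and time). The function names are hypothetical. It would not handle a status value that itself contains a space, such as `UNIFY EP`.

```python
HEADER = (
    "DW_IP MAL_DW_PORT WTIME WTYPE WCTLSTAT WSTATUS INAME INST_OK "
    "N_EP N_OK ISTATUS IMODE DSC_STATUS RTYPE RSTAT"
).split()


def parse_status_row(row, header=HEADER):
    """Turn one data row of the monitor table into a dict keyed by column name."""
    tokens = row.split()
    # WTIME is "date time", i.e. two whitespace-separated tokens; merge them.
    i = header.index("WTIME")
    tokens[i:i + 2] = [" ".join(tokens[i:i + 2])]
    if len(tokens) != len(header):
        raise ValueError("unexpected number of columns: %r" % row)
    return dict(zip(header, tokens))


def unhealthy_instances(rows):
    """Return the names of instances whose INST_OK column is not OK."""
    parsed = (parse_status_row(r) for r in rows)
    return [r["INAME"] for r in parsed if r["INST_OK"] != "OK"]
```

For the two rows above this reports DM01, which matches the ERROR DATABASE section of the monitor output.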
4. Use the monitor to switch the primary and standby roles
choose takeover
Can choose one of the following instances to do takeover:
1: DM02
login
username:SYSDBA
password:
[monitor] 2022-06-22 08:37:40: Login dmmonitor success!
takeover dm02
[monitor] 2022-06-22 08:38:25: Start to takeover use instance DM02
[monitor] 2022-06-22 08:38:25: Notify dmwatcher(DM02) switch to TAKEOVER status
[monitor] 2022-06-22 08:38:25: Dmwatcher process DM02 status switching [OPEN-->TAKEOVER]
[monitor] 2022-06-22 08:38:26: Switch dmwatcher DM02 to TAKEOVER status success
[monitor] 2022-06-22 08:38:26: Instance DM02 start to execute sql SP_SET_GLOBAL_DW_STATUS(0, 7)
[monitor] 2022-06-22 08:38:26: Instance DM02 execute sql SP_SET_GLOBAL_DW_STATUS(0, 7) success
[monitor] 2022-06-22 08:38:26: Instance DM02 start to execute sql SP_APPLY_KEEP_PKG()
[monitor] 2022-06-22 08:38:27: Instance DM02 execute sql SP_APPLY_KEEP_PKG() success
[monitor] 2022-06-22 08:38:27: Instance DM02 start to execute sql ALTER DATABASE MOUNT
[monitor] 2022-06-22 08:38:29: Instance DM02 execute sql ALTER DATABASE MOUNT success
[monitor] 2022-06-22 08:38:29: Instance DM02 start to execute sql ALTER DATABASE PRIMARY
[monitor] 2022-06-22 08:38:30: Instance DM02 execute sql ALTER DATABASE PRIMARY success
[monitor] 2022-06-22 08:38:30: Notify instance DM02 to change all arch status to be invalid
[monitor] 2022-06-22 08:38:30: Succeed to change all instances arch status to be invalid
[monitor] 2022-06-22 08:38:30: Instance DM02 start to execute sql ALTER DATABASE OPEN FORCE
[monitor] 2022-06-22 08:38:54: Instance DM02 execute sql ALTER DATABASE OPEN FORCE success
[monitor] 2022-06-22 08:38:54: Instance DM02 start to execute sql SP_SET_GLOBAL_DW_STATUS(7, 0)
[monitor] 2022-06-22 08:38:55: Instance DM02 execute sql SP_SET_GLOBAL_DW_STATUS(7, 0) success
[monitor] 2022-06-22 08:38:55: Notify dmwatcher(DM02) switch to OPEN status
[monitor] 2022-06-22 08:38:55: Dmwatcher process DM02 status switching [TAKEOVER-->OPEN]
[monitor] 2022-06-22 08:38:56: Switch dmwatcher DM02 to OPEN status success
[monitor] 2022-06-22 08:38:56: Notify group(GRP1)'s dmwatcher to do clear
[monitor] 2022-06-22 08:38:56: Clean request of dmwatcher processer DM01 success
5. Use the monitor to shut down the original primary
[monitor] 2022-06-22 08:42:49: Stop dmwatcher process of instance DM01[PRIMARY, OPEN, ISTAT_SAME:TRUE]
[monitor] 2022-06-22 08:42:49: Dmwatcher process DM01 status switching [STARTUP-->SHUTDOWN]
[monitor] 2022-06-22 08:42:49: Stop dmwatcher process of instance DM01[PRIMARY, OPEN, ISTAT_SAME:TRUE] success
[monitor] 2022-06-22 08:42:49: Notify group(GRP1)'s dmwatcher to do clear
[monitor] 2022-06-22 08:42:50: Clean request of dmwatcher processer DM01 success
[monitor] 2022-06-22 08:42:51: Clean request of dmwatcher processer DM02 success
6. Use the dmctlcvt tool to convert the control file to a text file
7. Edit the text file and delete the entries for DM0103.log, DM0104.log and DM0105.log
fil_path=/dmdata/5236/DM01/DM0103.log
# mirror path
mirror_path=
# file id
fil_id=2
# whether the file is auto extend
autoextend=1
# file create time
fil_create_time=DATETIME '2022-6-21 14:47:51'
# file modify time
fil_modify_time=DATETIME '2022-6-21 14:47:51'
# the max size of file
fil_max_size=0
# next size of file
fil_next_size=0
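Before deleting entries by hand, it helps to confirm which files named in the text control file are really gone. The read-only sketch below relies only on the `fil_path=` lines visible in the excerpt above. The rest of the file layout is not assumed, and the function names are hypothetical.

```python
import os


def control_file_paths(text):
    """Return every value of a 'fil_path=' line in the text control file."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("fil_path="):
            paths.append(line[len("fil_path="):].strip())
    return paths


def missing_paths(text, exists=os.path.exists):
    """Return the fil_path values that do not exist on disk.

    `exists` can be replaced for testing; by default the real filesystem
    is checked.
    """
    return [p for p in control_file_paths(text) if p and not exists(p)]
```

Only the entries reported by `missing_paths` should be removed in this step. The script itself changes nothing, so it can be rerun after the edit to confirm that no missing file is still referenced.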
8. Use the dmctlcvt tool to convert the edited text file back into a control file
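Steps 6 and 8 are the two directions of the same tool, and the article does not show the commands it used. The sketch below builds the two command lines, assuming the `TYPE=1` (control file to text) and `TYPE=2` (text to control file) parameters with `SRC` and `DEST` as described for dmctlcvt in the DM8 manuals. Verify this against the manual for the installed version before running anything. The helper name and both file paths are hypothetical.

```python
def dmctlcvt_args(direction, src, dest, tool="dmctlcvt"):
    """Build an argument list for dmctlcvt.

    direction: "ctl_to_text" (assumed TYPE=1) or "text_to_ctl" (assumed TYPE=2).
    """
    types = {"ctl_to_text": "1", "text_to_ctl": "2"}
    if direction not in types:
        raise ValueError("direction must be one of %s" % sorted(types))
    return [tool, "TYPE=" + types[direction], "SRC=" + src, "DEST=" + dest]


if __name__ == "__main__":
    # Hypothetical paths; the real control file location depends on the instance.
    ctl = "/dmdata/5236/DM01/dm.ctl"
    txt = "/tmp/dmctl.txt"
    print(" ".join(dmctlcvt_args("ctl_to_text", ctl, txt)))   # step 6
    print(" ".join(dmctlcvt_args("text_to_ctl", txt, ctl)))   # step 8
```

Back up the original control file before step 8 overwrites it, and run the conversion only while the instance is shut down, as it is after step 5.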
9. Start DM01 from the monitor
[monitor] 2022-06-22 08:54:18: Notify group(GRP1)'s active dmwatcher to set MID
[monitor] 2022-06-22 08:54:18: Notify group(GRP1)'s active dmwatcher to set MID success
[monitor] 2022-06-22 08:54:18: Startup dmwatcher process of instance DM01[PRIMARY, OPEN, ISTAT_SAME:TRUE]
[monitor] 2022-06-22 08:54:18: Dmwatcher process DM01 status switching [SHUTDOWN-->STARTUP]
[monitor] 2022-06-22 08:54:19: Startup dmwatcher process of instance DM01[PRIMARY, OPEN, ISTAT_SAME:TRUE] success
[monitor] 2022-06-22 08:54:20: Notify group(GRP1)'s dmwatcher to do clear
[monitor] 2022-06-22 08:54:20: Clean request of dmwatcher processer DM01 success
[monitor] 2022-06-22 08:54:20: Clean request of dmwatcher processer DM02 success
startup dmwatcher database dm01
[monitor] 2022-06-22 09:06:11: Instance DM01[PRIMARY, MOUNT, ISTAT_SAME:TRUE] recover to OK
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2022-06-22 09:06:11 STARTUP OK DM01 MOUNT PRIMARY VALID 2 99878 99878
[monitor] 2022-06-22 09:06:11: Dmwatcher process DM01 status switching [STARTUP-->UNIFY EP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2022-06-22 09:06:11 UNIFY EP OK DM01 MOUNT STANDBY INVALID 2 99878 99878
[monitor] 2022-06-22 09:06:12: Dmwatcher process DM01 status switching [UNIFY EP-->STARTUP]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2022-06-22 09:06:12 STARTUP OK DM01 OPEN STANDBY INVALID 2 99878 99878
[monitor] 2022-06-22 09:06:12: Dmwatcher process DM01 status switching [STARTUP-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2022-06-22 09:06:12 OPEN OK DM01 OPEN STANDBY INVALID 2 99878 99878
[monitor] 2022-06-22 09:06:13: Dmwatcher process DM02 status switching [OPEN-->RECOVERY]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2022-06-22 09:06:12 RECOVERY OK DM02 OPEN PRIMARY VALID 3 1033149 1033149
[monitor] 2022-06-22 09:08:28: Dmwatcher process DM02 status switching [RECOVERY-->OPEN]
WTIME WSTATUS INST_OK INAME ISTATUS IMODE RSTAT N_OPEN FLSN CLSN
2022-06-22 09:08:28 OPEN OK DM02 OPEN PRIMARY VALID 3 1033221 1033222
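The original primary is back to normal and has rejoined the cluster as the standby. What remains is to add redo logs again as disk space allows and to decide whether to switch the primary role back.
This was a test environment. In production, the customer must be told that when the primary's redo logs are damaged, some data may be lost, namely the changes that had been written to redo but not yet written to the data files.
The primary and standby use real-time archiving, for which the documentation says: "The primary generates online redo logs. When a log flush to file is triggered, the log thread first sends the RLOG_PKG to the standby. On receipt the standby validates it (including whether the log is continuous and whether the standby's state is Open). If the package is invalid, an error is returned. If it is valid, it is kept in memory as the KEEP_PKG, the redo of the previous KEEP_PKG is added to the Apply task queue for replay, and the standby acknowledges to the primary that the log was received."
"If the standby takes over automatically, or the user issues a takeover command, the standby's KEEP_PKG is replayed. Regardless of whether the primary has already written the redo corresponding to the KEEP_PKG to its online log file, the standby's APPLY_LSN at takeover is always greater than or equal to the primary's FILE_LSN. When the failed primary restarts, it can still act as a standby and rejoin the data watch system automatically."
This suggests that, in theory, no data is lost. Loss is therefore only a possibility, and whether data is actually lost remains to be tested.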
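The quoted ordering (send to the standby first, write locally second) can be pictured with a toy model. The classes below are illustrative only and are not DM code. Replay is treated as instantaneous, whereas the real standby queues it, and each package is reduced to the LSN it ends at. The model covers a primary that fails before its local write. It does not cover redo files that are deleted later, which is the case in this article.

```python
class Standby:
    """Toy standby: keeps the newest package in memory as KEEP_PKG."""

    def __init__(self):
        self.keep_pkg = None  # end LSN of the package held in memory
        self.apply_lsn = 0    # end LSN of everything replayed so far

    def receive(self, pkg_end_lsn):
        # The previous KEEP_PKG is handed to replay; the new package is kept.
        if self.keep_pkg is not None:
            self.apply_lsn = self.keep_pkg
        self.keep_pkg = pkg_end_lsn
        return True  # acknowledge receipt to the primary

    def takeover(self):
        # On takeover the KEEP_PKG is replayed as well.
        if self.keep_pkg is not None:
            self.apply_lsn = self.keep_pkg
            self.keep_pkg = None
        return self.apply_lsn


class Primary:
    """Toy primary: sends each package to the standby before writing it locally."""

    def __init__(self, standby):
        self.standby = standby
        self.file_lsn = 0  # end LSN written to the local online log

    def flush(self, pkg_end_lsn, crash_before_local_write=False):
        acked = self.standby.receive(pkg_end_lsn)
        if acked and not crash_before_local_write:
            self.file_lsn = pkg_end_lsn


if __name__ == "__main__":
    standby = Standby()
    primary = Primary(standby)
    primary.flush(100)
    primary.flush(200)
    primary.flush(300, crash_before_local_write=True)
    print(primary.file_lsn, standby.takeover())
```

In this model the standby's APPLY_LSN after takeover can never be lower than the primary's FILE_LSN, because the primary advances FILE_LSN only after the standby has acknowledged the same package.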
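That concludes the article. Corrections are welcome wherever it falls short.
For more DM technical information, visit the DM technical community:
Dameng Database - a new-generation large-scale general-purpose relational database | Dameng Cloud Adaptation Center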
https://eco.dameng.com/