I. Origin of the Problem

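On 2014/10/14 a customer reported that the crontab job that backs up the database had failed. Remote analysis showed that the Data Guard parameters had not been restored correctly after the disaster-recovery drill on 2014/09/13, so archived logs were no longer being cleaned up. Too many archived logs accumulated, and the archived log backup failed for lack of space. The details follow.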

II. Log Analysis

1. After logging in, the backup log shows that the datafile backup succeeded but the archived log backup failed:

including current SPFILE in backup set
channel c1: starting piece 1 at 13-OCT-14
channel c1: finished piece 1 at 13-OCT-14
piece handle=/backup/addrrman/full_ADDRPROD_20141013_14004_1 tag=TAG20141013T220005 comment=NONE
channel c1: backup set complete, elapsed time: 00:00:01
channel c2: finished piece 1 at 13-OCT-14
piece handle=/backup/addrrman/full_ADDRPROD_20141013_14001_1 tag=TAG20141013T220005 comment=NONE
channel c2: backup set complete, elapsed time: 01:45:12
channel c3: finished piece 1 at 13-OCT-14
piece handle=/backup/addrrman/full_ADDRPROD_20141013_14002_1 tag=TAG20141013T220005 comment=NONE
channel c3: backup set complete, elapsed time: 01:46:01
Finished backup at 13-OCT-14
sql statement: alter system archive log current
.... skip ....
released channel: c1
released channel: c2
released channel: c3
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of backup command on c3 channel at 10/14/2014 00:30:34
ORA-19502: write error on file "/backup/addrrman/arch_ADDRPROD_20141014_14093_1", block number 442369 (block size=512)
ORA-27063: number of bytes read/written is incorrect
IBM AIX RISC System/6000 Error: 28: No space left on device
Additional information: -1
Additional information: 1048576

2. The sizes of the datafile backup pieces show that the data volume has not grown sharply

oracle@p740a:/backup/addrrman[addr11g1]$ls -ltr
total 143197088
-rw-------    1 oracle   oinstall         98 Aug 21 18:53 nohup.out
-rw-r--r--    1 oracle   oinstall       7702 Oct 13 22:00 analyze.lst
-rw-r-----    1 oracle   asmadmin 23931797504 Oct 13 23:44 full_ADDRPROD_20141013_14000_1
-rw-r-----    1 oracle   asmadmin    7847936 Oct 13 23:44 full_ADDRPROD_20141013_14003_1
-rw-r-----    1 oracle   asmadmin      98304 Oct 13 23:44 full_ADDRPROD_20141013_14004_1
-rw-r-----    1 oracle   asmadmin 23550468096 Oct 13 23:45 full_ADDRPROD_20141013_14001_1
-rw-r-----    1 oracle   asmadmin 25820962816 Oct 13 23:46 full_ADDRPROD_20141013_14002_1
-rw-r--r--    1 oracle   oinstall    2659758 Oct 14 00:34 rman_delete.log
-rw-r--r--    1 oracle   oinstall     803655 Oct 14 00:37 delete_local_std_arch.log
-rw-r--r--    1 oracle   oinstall    1210456 Oct 14 00:38 rman_bk.log
-rw-r--r--    1 oracle   oinstall        527 Oct 14 00:38 delete_cd_std_arch.log

3. The archived log deletion log shows that the archived logs from 9/13 were not deleted because they had not been applied on all standbys

RMAN-08120: WARNING: archived log not deleted, not yet applied by standby
archived log file name=+ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13079.1905.858179699 thread=1 sequence=13079
RMAN-08120: WARNING: archived log not deleted, not yet applied by standby
archived log file name=+ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13080.1618.858181499 thread=1 sequence=13080
RMAN-08120: WARNING: archived log not deleted, not yet applied by standby
archived log file name=+ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13081.1619.858182367 thread=1 sequence=13081
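Because these are only warnings, the deletion job itself does not fail and the problem is easy to miss. As a minimal sketch, the deletion log could be scanned for RMAN-08120 after each run. The helper name `count_unapplied_warnings` is hypothetical; the log path in the usage comment is the one used by the deletion script in this article.

```shell
#!/bin/sh
# Hypothetical helper: count RMAN-08120 warnings ("archived log not deleted,
# not yet applied by standby") in an RMAN deletion log passed as $1.
count_unapplied_warnings() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, so "|| true" keeps the function's exit status at 0.
    grep -c 'RMAN-08120' "$1" || true
}

# Example usage:
# n=$(count_unapplied_warnings /backup/addrrman/rman_delete.log)
# if [ "$n" -gt 0 ]; then
#     echo "WARNING: $n archived logs were skipped by the deletion policy"
# fi
```

A non-zero count on several consecutive nights would have pointed at the stuck destination well before the backup file system filled up.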

4. This is consistent with the archivelog deletion policy set in the archived log deletion script

rman target / nocatalog log /backup/addrrman/rman_delete.log<<EOF
allocate channel for maintenance type disk connect 'sys/xxxx@addr11g1';
allocate channel for maintenance type disk connect 'sys/xxxx@addr11g2';
CONFIGURE RETENTION POLICY TO REDUNDANCY 1;
CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY; # logs can be deleted only after they have been applied on all standbys
crosscheck backup;
crosscheck archivelog all;
delete noprompt archivelog until time 'sysdate-7';
delete noprompt obsolete;
delete noprompt expired backup;
exit
EOF

5. Checking log_archive_dest and log_archive_dest_state reveals a LAD (log archive destination) in the defer state

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
log_archive_dest                     string
log_archive_dest_1                   string      LOCATION=+ARCHDG VALID_FOR=(ALL_LOGFILES,ALL_ROLES) DB_UNIQUE_NAME=addrprod
log_archive_dest_3                   string      service=ADDRCD arch async valid_for=(ONLINE_LOGFILES,PRIMARY_ROLE) reopen=60 db_unique_name=ADDRCD
log_archive_dest_4                   string      service=ADDRPROD_STD arch async valid_for=(ONLINE_LOGFILES,PRIMARY_ROLE) reopen=60 db_unique_name=ADDRPROD_STD
log_archive_dest_state_1             string      ENABLE
log_archive_dest_state_3             string      defer
log_archive_dest_state_4             string      enable
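A check like this one can be scripted. The sketch below assumes the `show parameter log_archive_dest_state` output is piped in with one parameter per line (name, type, value), as in the listing above; the function name `list_deferred_dests` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "show parameter log_archive_dest_state" output
# on stdin and print the name of every destination state set to defer.
list_deferred_dests() {
    # Field 1 is the parameter name, field 3 the value; the comparison is
    # case-insensitive because the value may be stored as "defer" or "DEFER".
    awk 'tolower($1) ~ /^log_archive_dest_state_[0-9]+$/ && tolower($3) == "defer" { print $1 }'
}

# Example usage (assumes OS authentication as sysdba works on this host):
# sqlplus -s "/ as sysdba" <<EOF | list_deferred_dests
# show parameter log_archive_dest_state
# EOF
```

For the listing above this would print only `log_archive_dest_state_3`.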

III. Resolution

After clearing log_archive_dest_3, manually deleting the archived logs again succeeded:

SQL> show parameter log_archive_dest_3;
NAME                                 TYPE       VALUE
------------------------------------ ---------- ------------------------------
log_archive_dest_3                   string     service=ADDRCD arch async valid_for=(ONLINE_LOGFILES,PRIMARY_ROLE) reopen=60 db_unique_name=ADDRCD
log_archive_dest_30                  string
log_archive_dest_31                  string
SQL> alter system set log_archive_dest_3='' scope=both sid='*';
System altered.
SQL> show parameter log_archive_dest_3;
NAME                                 TYPE       VALUE
------------------------------------ ---------- ------------------------------
log_archive_dest_3                   string
log_archive_dest_30                  string
log_archive_dest_31                  string
The archived log deletion no longer reported the warning:
RMAN> CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;
delete noprompt archivelog until time 'sysdate-7';
using target database control file instead of recovery catalog
old RMAN configuration parameters:
CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;
new RMAN configuration parameters:
CONFIGURE ARCHIVELOG DELETION POLICY TO APPLIED ON ALL STANDBY;
new RMAN configuration parameters are successfully stored
RMAN>
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=963 instance=addr11g1 device type=DISK
allocated channel: ORA_DISK_2
channel ORA_DISK_2: SID=1717 instance=addr11g1 device type=DISK
allocated channel: ORA_DISK_3
channel ORA_DISK_3: SID=1908 instance=addr11g1 device type=DISK
allocated channel: ORA_DISK_4
channel ORA_DISK_4: SID=2189 instance=addr11g1 device type=DISK
List of Archived Log Copies for database with db_unique_name ADDRPROD
=====================================================================
Key     Thrd Seq     S Low Time
------- ---- ------- - ---------
168624  1    13079   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13079.1905.858179699
168643  1    13080   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13080.1618.858181499
168646  1    13081   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13081.1619.858182367
168648  1    13082   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13082.1620.858182411
168656  1    13083   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13083.1625.858182901
168658  1    13084   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13084.1624.858182903
168662  1    13085   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13085.1627.858182967
168666  1    13086   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13086.1629.858184767
168670  1    13087   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13087.1631.858186569
168674  1    13088   A 13-SEP-14
        Name: +ARCHDG/addrprod/archivelog/2014_09_13/thread_1_seq_13088.1633.858188367

IV. Summary

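Problems caused by incomplete cleanup after a temporary operation like this one are probably not rare. This time it did not lead to a major incident (which of course does not mean it never will). So in day-to-day work we still need to approach things from several angles to keep the system running normally, for example:

1). Know the system environment well enough to understand exactly how to revert each temporary operation;

2). Of course, the point above is not reliable on its own. As the saying goes, the palest ink beats the best memory, so it is better to have a standardized, written OM procedure;

3). Once a temporary operation is complete, run a full check of the system.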
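
One way to make points 1) and 3) mechanical is to save the relevant parameters before the temporary operation and compare them afterwards. The following is only a sketch under stated assumptions: the function name `check_params_restored` and the snapshot file names are hypothetical, and each snapshot is assumed to be plain text such as saved `show parameter log_archive_dest` output.

```shell
#!/bin/sh
# Hypothetical helper: compare a parameter snapshot taken before a temporary
# operation ($1) with one taken after the rollback ($2).
# Returns 0 when they are identical, 1 otherwise.
check_params_restored() {
    if diff "$1" "$2" >/dev/null 2>&1; then
        echo "OK: parameters match the pre-change snapshot"
    else
        echo "MISMATCH: review the differences below"
        # Print the differences; diff exits non-zero here by design.
        diff "$1" "$2" || true
        return 1
    fi
}

# Example usage (file names are placeholders):
# check_params_restored /tmp/lad_before_drill.txt /tmp/lad_after_drill.txt
```

In this incident, such a comparison would have flagged `log_archive_dest_state_3` still set to defer right after the drill.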
