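I. Problem Description

A disk failed in the customer's VA7110 and the array's automatic rebuild failed. The disk was replaced on November 25, but the rebuild failed again. The VA then started a balance operation that had still not finished after a week. I/O was slow, database checkpoints took as long as 200-plus seconds, and the business was severely affected.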

II. Alarm Information

#armdsp -a va

Vendor ID:______________________________HP

Product ID:_____________________________A6189B

Array World Wide Name:__________________50060b00001535a8

Array Serial Number:____________________00SG324J0103

Alias:__________________________________va

Software Revision:______________________1.09.02 - 0191 - 060113

Command execution timestamp:____________Nov 24, 2008 5:31:30 PM

------------------------------------------------------------

ARRAY INFORMATION

Array Status:_________________________Warning

Firmware Revision:____________________38370A140P0513051631

Product Revision:_____________________A140

Local Controller Product Revision:____A140

Remote Controller Product Revision:___A140

Last Event Log Entry for Page 1:______140908

Last Event Log Entry for Page 2:______140896

Last Event Log Entry for Page 5:______131219

ENCLOSURES

Enclosure at M

Enclosure ID__________________________0

Enclosure Status______________________Failed

Enclosure Type________________________HP StorageWorks Virtual Array 7110

Node WWN______________________________50060b00001535a8

FRU HW COMPONENT IDENTIFICATION ID STATUS

===========================================================================

M Enclosure 00SG324J0103 Failed

M/P1 Power Supply 94030JD01148 Good

M/P2 Power Supply 94030JD01145 Good

M/MP1 MidPlane 000617570117 Good

M/C2 Controller 00PR00D50084 Good

M/C2.H1 Host Port Good

M/C2.J1 BackEnd Port Good

M/C2.B1 Battery 44298:MOLTECHPS:NI2040:2003/3/12 Good

M/C2.PM1 Processor HP:A6189B:A140 Good

M/C2.M1 DIMM 512 Good

M/C2.M2 DIMM 512 Good

M/C1 Controller 00PR00D50060 Good

M/C1.H1 Host Port Good

M/C1.J1 BackEnd Port Good

M/C1.B1 Battery 44304:MOLTECHPS:NI2040:2003/3/12 Good

M/C1.PM1 Processor HP:A6189B:A140 Good

M/C1.M1 DIMM 512 Good

M/C1.M2 DIMM 512 Good

M/D1 Disk 3HX0WX3G Good

M/D2 Disk 3HX0X2RN Good

M/D3 Disk 3HX0XBBG Failed

M/D4 Disk 3HX0X625 Good

CONTROLLERS

Controller At M/C2:

Status:_______________________________Good

Serial Number:________________________00PR00D50084

Vendor ID:____________________________HP

Product ID:___________________________A6189B

Product Revision:_____________________A140

Firmware Revision:____________________38370A140P0513051631

Manufacturing Product Code:___________IJMTU00016

Controller Type:______________________HP StorageWorks Virtual Array 7110

Battery Charger Firmware Revision:____5.0

Front Port At M/C2.H1:

Status:_____________________________Good

Port Instance:______________________0

Hard Address:_______________________126

Link State:_________________________Link Up

Node WWN:___________________________50060b00001535a8

Port WWN:___________________________50060b00001b7a28

Topology:___________________________Point To Point, Fabric Attached

Data Rate:__________________________2 GBit/sec

Port ID:____________________________0x10000

Device Host Name:___________________zhsmp1

Hardware Path:______________________0/6/2/0.1.0.0.0.0.0

Device Path:________________________/dev/dsk/c6t0d0

Back Port At M/C2.J1:

Status:_____________________________Good

Port Instance:______________________0

Hard Address:_______________________125

Link State:_________________________Link Up

Node WWN:___________________________50060b00001535a8

Port WWN:___________________________50060b00001b7a29

Topology:___________________________Private Loop

Data Rate:__________________________2 GBit/sec

Port ID:____________________________125

Battery at M/C2.B1:

Status:_____________________________Good

Identification:_____________________44298:MOLTECHPS:NI2040:2003/3/12

Manufacturer Name:__________________MOLTECHPS

Device Name:________________________NI2040

Manufacturer Date:__________________March 12, 2003

Remaining Capacity:_________________5700 mAh

Remaining Capacity:_________________95 %

Voltage:____________________________12349 mVolts

Discharge Cycles:___________________2

Processor at M/C2.PM1:

Status:_____________________________Good

Identification:_____________________HP:A6189B:A140

DIMM at M/C2.M1:

Status:_____________________________Good

Identification:_____________________512

Capacity:___________________________512 MB

DIMM at M/C2.M2:

Status:_____________________________Good

Identification:_____________________512

Capacity:___________________________512 MB

Controller At M/C1:

Status:_______________________________Good

Serial Number:________________________00PR00D50060

Vendor ID:____________________________HP

Product ID:___________________________A6189B

Product Revision:_____________________A140

Firmware Revision:____________________38370A140P0513051631

Manufacturing Product Code:___________IJMTU00016

Controller Type:______________________HP StorageWorks Virtual Array 7110

Battery Charger Firmware Revision:____5.0

Front Port At M/C1.H1:

Status:_____________________________Good

Port Instance:______________________0

Hard Address:_______________________126

Link State:_________________________Link Up

Node WWN:___________________________50060b00001535a8

Port WWN:___________________________50060b00001b67ac

Topology:___________________________Point To Point, Fabric Attached

Data Rate:__________________________2 GBit/sec

Port ID:____________________________0x10000

Device Host Name:___________________zhsmp1

Hardware Path:______________________0/4/0/0.1.0.0.0.0.0

Device Path:________________________/dev/dsk/c4t0d0

Back Port At M/C1.J1:

Status:_____________________________Good

Port Instance:______________________0

Hard Address:_______________________125

Link State:_________________________Link Up

Node WWN:___________________________50060b00001535a8

Port WWN:___________________________50060b00001b67ad

Topology:___________________________Private Loop

Data Rate:__________________________2 GBit/sec

Port ID:____________________________125

Battery at M/C1.B1:

Status:_____________________________Good

Identification:_____________________44304:MOLTECHPS:NI2040:2003/3/12

Manufacturer Name:__________________MOLTECHPS

Device Name:________________________NI2040

Manufacturer Date:__________________March 12, 2003

Remaining Capacity:_________________5821 mAh

Remaining Capacity:_________________97 %

Voltage:____________________________12575 mVolts

Discharge Cycles:___________________2

Processor at M/C1.PM1:

Status:_____________________________Good

Identification:_____________________HP:A6189B:A140

DIMM at M/C1.M1:

Status:_____________________________Good

Identification:_____________________512

Capacity:___________________________512 MB

DIMM at M/C1.M2:

Status:_____________________________Good

Identification:_____________________512

Capacity:___________________________512 MB

PORTS

Settings for port M/C2.H1:

Port ID:______________________________108

Behavior:_____________________________HPUX

Topology:_____________________________Point To Point, Fabric Attached

Queue Full Threshold:_________________4

Data Rate:____________________________2 GBit/sec

Settings for port M/C2.J1:

Data Rate:____________________________2 GBit/sec

Settings for port M/C1.H1:

Port ID:______________________________110

Behavior:_____________________________HPUX

Topology:_____________________________Point To Point, Fabric Attached

Queue Full Threshold:_________________4

Data Rate:____________________________2 GBit/sec

Settings for port M/C1.J1:

Data Rate:____________________________2 GBit/sec

DISKS

Disk at M/D1:

Status:_______________________________Good

Disk State:___________________________Included

Vendor ID:____________________________HP 36.4G

Product ID:___________________________ST336753FC

Product Revision:_____________________HP03

Data Capacity:________________________33.378 GB (70000000 blocks)

Block Length:_________________________520 bytes

Address:______________________________111

Node WWN:_____________________________2000000c5029bcf8

Initialize State:_____________________Ready

Redundancy Group:_____________________1

Volume Set Serial Number:_____________0000C38C0000000A

Serial Number:________________________3HX0WX3G

Firmware Revision:____________________HP03

Recovery Maps are on this disk.

Space is reserved on this disk for subsystem metadata and

may be a map disk.

Disk at M/D2:

Status:_______________________________Good

Disk State:___________________________Included

Vendor ID:____________________________HP 36.4G

Product ID:___________________________ST336753FC

Product Revision:_____________________HP03

Data Capacity:________________________33.378 GB (70000000 blocks)

Block Length:_________________________520 bytes

Address:______________________________112

Node WWN:_____________________________2000000c5029bcca

Initialize State:_____________________Ready

Redundancy Group:_____________________1

Volume Set Serial Number:_____________0000C38C0000000A

Serial Number:________________________3HX0X2RN

Firmware Revision:____________________HP03

Recovery Maps are on this disk.

Space is reserved on this disk for subsystem metadata and

may be a map disk.

Disk at M/D3:

Status:_______________________________Failed

Disk State:___________________________Failed

Vendor ID:____________________________HP 36.4G

Product ID:___________________________ST336753FC

Product Revision:_____________________HP03

Data Capacity:________________________33.378 GB (70000000 blocks)

Block Length:_________________________520 bytes

Address:______________________________113

Node WWN:_____________________________2000000c5029886b

Initialize State:_____________________Ready

Redundancy Group:_____________________1

Volume Set Serial Number:_____________0000C38C0000000A

Serial Number:________________________3HX0XBBG

Firmware Revision:____________________HP03

Disk at M/D4:

Status:_______________________________Good

Disk State:___________________________Included

Vendor ID:____________________________HP 36.4G

Product ID:___________________________ST336753FC

Product Revision:_____________________HP03

Data Capacity:________________________33.378 GB (70000000 blocks)

Block Length:_________________________520 bytes

Address:______________________________114

Node WWN:_____________________________2000000c5029924c

Initialize State:_____________________Ready

Redundancy Group:_____________________1

Volume Set Serial Number:_____________0000C38C0000000A

Serial Number:________________________3HX0X625

Firmware Revision:____________________HP03

LUNS

LUN 0:

Redundancy Group:_____________________1

Active:_______________________________True

Data Capacity:________________________20 MB

WWN:__________________________________60060b00001535a80000000000000010

Number Of Business Copies:____________0

LUN 1:

Redundancy Group:_____________________1

Active:_______________________________True

Data Capacity:________________________11 GB

WWN:__________________________________60060b00001535a80001000000000011

Number Of Business Copies:____________0

LUN 2:

Redundancy Group:_____________________1

Active:_______________________________True

Data Capacity:________________________16 GB

WWN:__________________________________60060b00001535a80002000000000012

Number Of Business Copies:____________0

LUN 3:

Redundancy Group:_____________________1

Active:_______________________________True

Data Capacity:________________________11 GB

WWN:__________________________________60060b00001535a80003000000000013

Number Of Business Copies:____________0

LUN 4:

Redundancy Group:_____________________1

Active:_______________________________True

Data Capacity:________________________16 GB

WWN:__________________________________60060b00001535a80004000000000014

Number Of Business Copies:____________0

CAPACITY Totals for Redundancy Group 1:

REGULAR LUNs:_________________________54.019 GB

BUSINESS COPIES:______________________0 bytes

CAPACITY USAGE

Total Disk Enclosures:________________1

Redundancy Group:_____________________1

Total Disks:________________________3

Total Physical Size:________________100.135 GB

Allocated to Regular LUNs:__________54.019 GB

Allocated as Business Copies:_______0 bytes

Used as Active Hot Spare:___________0 bytes

Used for Redundancy:________________46.116 GB

Unallocated (Available for LUNs):___0 bytes

Used by Non-Included Disks:___________33.378 GB

VFP

Settings for VFP Serial Port M/C1.VFP:

VFP Baud Rate:________________________9600

VFP Paging Value:_____________________24

Settings for VFP Serial Port M/C2.VFP:

VFP Baud Rate:________________________9600

VFP Paging Value:_____________________24

SUB-SYSTEM SETTINGS

RAID Level:___________________________RAID1+0

Auto Format Drive:____________________On

Hang Detection:_______________________On

Capacity Depletion Threshold:_________100%

Queue Full Threshold Maximum:_________4096

Enable Optimize Policy:_______________True

Enable Manual Override:_______________False

Manual Override Destination:__________False

Read Cache Disable:___________________False

Rebuild Priority:_____________________Low

Security Enabled:_____________________False

Shutdown Completion:__________________0

Subsystem Type ID:____________________0

Unit Attention:_______________________True

Volume Set Partition (VSpart):________False

Write Cache Enable:___________________True

Write Working Set Interval:___________8640

Enable Prefetch:______________________True

Disable Secondary Path Presentation:__False

Backend Diagnostics:__________________On

RESILIENCY SETTINGS

Simplified Resiliency Setting:________Normal Performance (Default)

Enable Secure Mode:___________________True

Disable NVRAM on UPS Absent:__________False

Disable NVRAM on WCE False:___________False

Disable Read Hits:____________________False

Force Unit Access Response:___________1

Lock Write Cache On:__________________True

Performance Goal Configuration:_______Normal Performance

Resiliency Threshold:_________________4

Single Controller Warning:____________True

DISK SETTINGS

Auto Include:_________________________On

Auto Rebuild:_________________________On

Hot Spare:____________________________None

Max Drives per Loop Pair:_____________45

Max Drives per Subsystem:_____________45

ENCLOSURE SETTINGS

Max Enclosures per Loop Pair:_________2

Max Enclosures per Subsystem:_________2

LUN SETTINGS

LUN Creation Limit:___________________1024

Maximum LUN Creation Limit:___________1024

Migrating Write Destination:__________False

SCRUB SETTINGS

Scrub Restart Period:_________________0 minutes

Scrub State:__________________________Not running, system in warning state

OPERATIONS IN PROGRESS

None

WARNINGS

WARNING: Unallocated capacity has fallen below the threshold specified by the capacity threshold mode parameter.

WARNING: A physical drive has failed, failed initialization, been downed or is in the previously used state.

WARNING: A rebuild operation failed.

WARNING: Some data in the device lacks redundancy, and is exposed to becoming unavailable if further drive removals or failures occur.
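When the report is saved to a file, the failed components and the warnings can be pulled out with a short filter. This is only a minimal sketch: it assumes the FRU table and the WARNING lines look exactly as in the listing above, and `armdsp_summary` is a hypothetical helper name, not part of the HP array utilities.

```shell
#!/bin/sh
# armdsp_summary (hypothetical helper): read an `armdsp -a` report on stdin,
# print every FRU table row whose last column is not "Good",
# followed by each WARNING line as it appears.
armdsp_summary() {
    awk '
        # FRU rows start with "M" or "M/<location>" followed by a space.
        /^M(\/[A-Za-z0-9.]+)? / && $NF != "Good" { print "FRU: " $1 " " $NF }
        /^WARNING:/                              { print }
    '
}

# Small demonstration on three lines taken from the report above.
armdsp_summary <<'EOF'
M/D2 Disk 3HX0X2RN Good
M/D3 Disk 3HX0XBBG Failed
WARNING: A rebuild operation failed.
EOF
```

On a live system the same filter could be fed directly, for example `armdsp -a va | armdsp_summary`.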

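III. Root Cause Analysis and Solutions

Cause:

The rebuild failed because the VA had too little free space. The array had four disks in RAID1+0 and no hot spare, so the rebuild could not succeed even after the failed disk was replaced. The VA then started a balance operation, which pauses whenever there is front-end I/O, so it was still unfinished after a week.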
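The capacity figures in the armdsp report make the shortage concrete. The sketch below is a back-of-the-envelope check, not the array's own accounting: it assumes every LUN block is stored as a two-way mirror and ignores the metadata space the VA reserves. `mirror_headroom` is a hypothetical helper name; the disk size (33.378 GB) and LUN total (54.019 GB) are copied from the report.

```shell
#!/bin/sh
# mirror_headroom DISKS DISK_GB LUN_GB   (hypothetical helper)
# Prints the raw capacity of DISKS disks, the space a full two-way mirror
# of the LUNs would need, and the difference between the two.
# A negative headroom means a complete mirror cannot fit.
mirror_headroom() {
    awk -v n="$1" -v disk="$2" -v luns="$3" 'BEGIN {
        raw  = n * disk      # capacity of the disks that are usable
        need = 2 * luns      # assumption: each LUN block is stored twice
        printf "disks=%d raw=%.3f need=%.3f headroom=%.3f\n", n, raw, need, raw - need
    }'
}

mirror_headroom 4 33.378 54.019   # all four disks healthy
mirror_headroom 3 33.378 54.019   # one disk failed (the state in the report)
mirror_headroom 5 33.378 54.019   # six disks installed, one failed
```

Under these assumptions the three surviving disks cannot hold a full mirror of the LUNs, which is consistent with the report showing 0 bytes unallocated and the warning that some data lacks redundancy. With six disks installed, a single failure would still leave room.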

Solutions:

Because the rebuild failed for lack of free space on the VA itself, HP suggested the following options:

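1. Back up the data, format the VA, recreate all LUNs, and restore the data.

2. Delete unused LUNs to free space so that the balance completes sooner.

3. Stop front-end I/O so that the balance completes sooner, then rebuild manually.

4. Add two disks to provide more free space so that the balance completes sooner.

5. Convert RAID1+0 to RAID5 to free space so that the balance completes sooner.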

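IV. Resolution Procedure

Of the five options above, options 2 and 5 were clearly not feasible. Because the business could be stopped temporarily, option 3 was tried first: the application and the database were shut down. After a full night on December 3 the balance had still not finished, so the effect was negligible and the business remained affected. On December 4, option 1 was therefore adopted. The procedure was as follows: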

1. Back up the volume group configuration

#vgcfgbackup vg01

#vgcfgbackup vg02

#vgcfgbackup vglock

2. Stop the application and the database

#cmhaltcl

3. Format the VA

#armfmt -f va

4. Recreate the VA LUNs with the same sizes as before

#armcfg -L 0 -a 20M -g 1 va

#armcfg -L 1 -a 11G -g 1 va

#armcfg -L 2 -a 16G -g 1 va

#armcfg -L 3 -a 11G -g 1 va

#armcfg -L 4 -a 16G -g 1 va

5. Rescan the disks so that the LUNs above are recognized

#insf -C disk

6. Restore the volume group and logical volume information

#vgcfgrestore -n vg01 /dev/rdsk/c4t0d1

#vgcfgrestore -n vg01 /dev/rdsk/c4t0d3

#vgcfgrestore -n vg02 /dev/rdsk/c4t0d2

#vgcfgrestore -n vg02 /dev/rdsk/c4t0d4

#vgcfgrestore -n vglock /dev/rdsk/c4t0d0

7. Start the cluster services and activate the volume groups

#cmruncl -v

#vgchange -a y vg01

#vgchange -a y vg02

#vgchange -a y vglock

8. Restore the database

#su - informix

$ontape -r

y

n

n

n

9. Start the cluster package and run a failover test

#cmrunpkg -e -v pkg_smp
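Step 4 depends on recreating every LUN with its original number and size. One way to avoid retyping them is to derive the `armcfg` lines from an `armdsp -a` report saved before the format. The sketch below only prints the commands and does not run them. It assumes the LUNS section is laid out as in the report above and that sizes map to the `M` and `G` suffixes the way the commands in step 4 do; `gen_armcfg` is a hypothetical helper name.

```shell
#!/bin/sh
# gen_armcfg ALIAS   (hypothetical helper)
# Read a saved `armdsp -a` report on stdin and print one armcfg command
# per LUN, in the form used in step 4. Nothing is executed.
gen_armcfg() {
    awk -v alias="$1" '
        # "LUN 3:" opens a LUN entry; remember its number.
        /^ *LUN [0-9]+:/ { lun = $2; sub(/:$/, "", lun); in_lun = 1; next }
        # Inside a LUN entry the value follows the run of underscores.
        in_lun && /^ *Redundancy Group:/ { n = split($0, f, "_"); rg = f[n] }
        in_lun && /^ *Data Capacity:/ {
            n = split($0, f, "_")
            split(f[n], c, " ")      # c[1] = number, c[2] = unit (MB or GB)
            printf "armcfg -L %s -a %s%s -g %s %s\n", lun, c[1], substr(c[2], 1, 1), rg, alias
            in_lun = 0               # skip Data Capacity lines outside LUN entries
        }
    '
}

# Demonstration with two LUN entries copied from the report above.
gen_armcfg va <<'EOF'
LUN 0:
Redundancy Group:_____________________1
Data Capacity:________________________20 MB
LUN 1:
Redundancy Group:_____________________1
Data Capacity:________________________11 GB
EOF
```

The printed lines should be compared with the original configuration before they are run. How fractional capacities are reported and accepted has not been checked here.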


Source: ITPUB Blog, http://blog.itpub.net/9479798/viewspace-1050072/. Please credit the source when reposting; otherwise legal action may be pursued.
