I. Fault Information
1.1. Fault Information
Log summary

System Configuration: Oracle Corporation  sun4u Sun Fire 880
System clock frequency: 150 MHz
Memory size: 8192 Megabytes

========================= CPUs =========================

Brd  CPU  Run MHz  E$ MB  CPU Impl.  CPU Mask
---  ---  -------  -----  ---------  --------
A    0    900      8.0    US-III+    2.3
B    1    900      8.0    US-III+    2.3
A    2    900      8.0    US-III+    2.3
B    3    900      8.0    US-III+    2.3

========================= Memory Configuration =========================

Brd  MC ID  Logical Bank num  Logical Bank size  Logical Bank Status  DIMM Size  Interleave Factor  Interleaved with
---  -----  ----------------  -----------------  -------------------  ---------  -----------------  ----------------
A    0      0                 512MB              no_status            256MB      8-way              0
A    0      1                 512MB              no_status            256MB      8-way              0
A    0      2                 512MB              no_status            256MB      8-way              0
A    0      3                 512MB              no_status            256MB      8-way              0
B    1      0                 512MB              no_status            256MB      8-way              1
B    1      1                 512MB              no_status            256MB      8-way              1
B    1      2                 512MB              no_status            256MB      8-way              1
B    1      3                 512MB              no_status            256MB      8-way              1
A    2      0                 512MB              no_status            256MB      8-way              0
A    2      1                 512MB              no_status            256MB      8-way              0
A    2      2                 512MB              no_status            256MB      8-way              0
A    2      3                 512MB              no_status            256MB      8-way              0
B    3      0                 512MB              no_status            256MB      8-way              1
B    3      1                 512MB              no_status            256MB      8-way              1
B    3      2                 512MB              no_status            256MB      8-way              1
B    3      3                 512MB              no_status            256MB      8-way              1

========================= IO Cards =========================

Brd  IO Type  Port ID  Bus Side  Slot  Bus Freq MHz  Max Bus Freq  Dev,Func  State  Name / Model
---  -------  -------  --------  ----  ------------  ------------  --------  -----  ---------------------------------
I/O  PCI      8        B         3     33            33            2,0       ok     SUNW,qlc-pci1077,2300.1077.106.1+
I/O  PCI      9        B         6     33            33            2,0       ok     SUNW,qlc-pci1077,2300.1077.106.1+
I/O  PCI      9        B         5     33            33            3,0       ok     SUNW,qlc-pci1077,2300.1077.106.1+
I/O  PCI      9        B         4     33            33            4,0       ok     SUNW,qlc-pci1077,2300.1077.106.1+

No failures found in System
===========================

========================= Environmental Status =========================

System Temperatures (Celsius):
-------------------------------
Device  Temperature  Status
------  -----------  ------
CPU0    69           OK
CPU1    69           OK
CPU2    70           OK
CPU3    66           OK
MB      32           OK
IOB     29           OK
DBP0    30           OK

=================================
Front Status Panel:
-------------------
Keyswitch position: NORMAL

System LED Status:
GEN FAULT           [ ON]    REMOVE               [OFF]
DISK FAULT          [OFF]    POWER FAULT          [OFF]
LEFT THERMAL FAULT  [ ON]    RIGHT THERMAL FAULT  [OFF]
LEFT DOOR           [OFF]    RIGHT DOOR           [OFF]

=================================
Disk Status:
         Presence   Fault LED  Remove LED
DISK0:   [PRESENT]  [OFF]      [OFF]
DISK1:   [PRESENT]  [OFF]      [OFF]
DISK2:   [PRESENT]  [OFF]      [OFF]
DISK3:   [PRESENT]  [OFF]      [OFF]
DISK4:   [PRESENT]  [OFF]      [OFF]
DISK5:   [PRESENT]  [OFF]      [OFF]
DISK6:   [EMPTY]
DISK7:   [EMPTY]
DISK8:   [EMPTY]
DISK9:   [EMPTY]
DISK10:  [EMPTY]
DISK11:  [EMPTY]

=================================
Fan Bank:
----------
Bank                Speed (RPMS)  Status      Fan State
------------------  ------------  ----------  ---------
CPU0_PRIM_FAN       2189          [ENABLED]   OK
CPU1_PRIM_FAN       2290          [ENABLED]   OK
CPU0_SEC_FAN        0             [DISABLED]  OK
CPU1_SEC_FAN        0             [DISABLED]  OK
IO0_PRIM_FAN        0             [DISABLED]  ERROR
IO1_PRIM_FAN        0             [DISABLED]  ERROR
IO0_SEC_FAN         3947          [ENABLED]   OK
IO1_SEC_FAN         3896          [ENABLED]   OK
IO_BRIDGE_PRIM_FAN  3333          [ENABLED]   OK
IO_BRIDGE_SEC_FAN   0             [DISABLED]  OK

=================================
Power Supplies:
---------------
                                                 Current Drain:
Supply  Status  Fan Fail  Temp Fail  CS Fail  3.3V  5V  12V  48V
------  ------  --------  ---------  -------  ----  --  ---  ---
PS0     GOOD                                  6     3   2    3
PS1     GOOD                                  6     3   2    3
PS2     GOOD                                  6     3   2    3

========================= HW Revisions =========================

System PROM revisions:
----------------------
OBP 4.6.7 2002/07/24 15:42

IO ASIC revisions:
------------------
Model   Port ID  Status  Version
------  -------  ------  -------
Schizo  8        ok      4
Schizo  9        ok      4

# dmesg
Jan 15 11:39:50 csu picld[203]: [ID 625010 daemon.error] WARNING: Device IO0_PRIM_FAN failure detected
Jan 15 13:16:53 csu picld[208]: [ID 625010 daemon.error] WARNING: Device IO0_PRIM_FAN failure detected
Jan 15 14:46:34 csu picld[204]: [ID 625010 daemon.error] WARNING: Device IO0_PRIM_FAN failure detected
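The picld warnings at the end of the log can be tallied per device to see which fan is reported and how often. The following is a minimal Python sketch, assuming the log text has already been saved from the server; the function name `count_device_failures` and the constant `SAMPLE_LOG` are illustrative names, not part of any Sun tool.

```python
import re
from collections import Counter

# Matches the picld warning format seen in the dmesg excerpt of this report.
FAILURE_RE = re.compile(r"WARNING: Device (\S+) failure detected")


def count_device_failures(log_text: str) -> Counter:
    """Count 'failure detected' warnings per device name."""
    return Counter(FAILURE_RE.findall(log_text))


# The three dmesg lines quoted in the log summary.
SAMPLE_LOG = """\
Jan 15 11:39:50 csu picld[203]: [ID 625010 daemon.error] WARNING: Device IO0_PRIM_FAN failure detected
Jan 15 13:16:53 csu picld[208]: [ID 625010 daemon.error] WARNING: Device IO0_PRIM_FAN failure detected
Jan 15 14:46:34 csu picld[204]: [ID 625010 daemon.error] WARNING: Device IO0_PRIM_FAN failure detected
"""

if __name__ == "__main__":
    for device, hits in count_device_failures(SAMPLE_LOG).items():
        print(f"{device}: {hits} warning(s)")
```

Only IO0_PRIM_FAN appears in these warnings, although the fan bank table also marks IO1_PRIM_FAN as ERROR, so both sources should be read together.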
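1.2. Fault Localization
The customer reported that the thermal fault LED on the front panel of the Sun V880 was lit amber.
The system error output shows:
IO0_PRIM_FAN  0  [DISABLED]  ERROR
IO1_PRIM_FAN  0  [DISABLED]  ERROR
The dmesg output confirms that a failure was detected on IO0_PRIM_FAN. A fault in IO1_PRIM_FAN, or excessive dust inside the chassis, cannot be ruled out as a contributing cause.
Therefore, prepare two PCI I/O fan trays and a vacuum cleaner (dust buildup inside the chassis may be triggering the fan alarms) so that the fault can be resolved in a single visit.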
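The localization above amounts to reading the fan bank rows and picking out every fan whose state is not OK. A minimal Python sketch of that check follows, assuming each row has the form `NAME RPM [STATUS] STATE` separated by whitespace; `parse_fan_rows`, `failed_fans`, and `SAMPLE_FAN_BANK` are hypothetical names introduced only for illustration.

```python
import re
from typing import List, NamedTuple


class FanRow(NamedTuple):
    bank: str    # e.g. IO0_PRIM_FAN
    rpm: int     # reported speed
    status: str  # ENABLED or DISABLED
    state: str   # OK or ERROR


# Assumed row layout: NAME  RPM  [STATUS]  STATE
FAN_RE = re.compile(r"^\s*(\S+_FAN)\s+(\d+)\s+\[\s*(\w+)\s*\]\s+(\w+)\s*$")


def parse_fan_rows(text: str) -> List[FanRow]:
    """Extract fan rows from the fan bank section, skipping other lines."""
    rows = []
    for line in text.splitlines():
        match = FAN_RE.match(line)
        if match:
            bank, rpm, status, state = match.groups()
            rows.append(FanRow(bank, int(rpm), status, state))
    return rows


def failed_fans(rows: List[FanRow]) -> List[str]:
    """Return the names of fans whose state is anything other than OK."""
    return [row.bank for row in rows if row.state != "OK"]


# Fan bank values taken from the log summary in section 1.1.
SAMPLE_FAN_BANK = """\
CPU0_PRIM_FAN       2189  [ENABLED]   OK
CPU1_PRIM_FAN       2290  [ENABLED]   OK
CPU0_SEC_FAN        0     [DISABLED]  OK
CPU1_SEC_FAN        0     [DISABLED]  OK
IO0_PRIM_FAN        0     [DISABLED]  ERROR
IO1_PRIM_FAN        0     [DISABLED]  ERROR
IO0_SEC_FAN         3947  [ENABLED]   OK
IO1_SEC_FAN         3896  [ENABLED]   OK
IO_BRIDGE_PRIM_FAN  3333  [ENABLED]   OK
IO_BRIDGE_SEC_FAN   0     [DISABLED]  OK
"""

if __name__ == "__main__":
    print(failed_fans(parse_fan_rows(SAMPLE_FAN_BANK)))
```

A DISABLED fan at 0 RPM is not flagged by itself, because the secondary fans in this log are idle yet reported OK; only the state column decides.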
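II. Fault Handling
2.1. Prerequisites
Caution |
Make sure the system is shut down and the power is disconnected. Wear an antistatic wrist strap while working. Back up data before adding or replacing hardware components; an incorrectly installed part may cause data loss. |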
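2.2. Preparation Items
Preparation checklist | ||
Type | Item | Status |
Hardware | One laptop | Ready |
Hardware | One serial cable | Ready |
Hardware | One flat-head and one Phillips screwdriver | Ready |
Hardware | One antistatic wrist strap | Ready |
Hardware | Two new IO_PRIM_FAN units | Ready |
Software | | |
Other | | |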
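2.3. Operation Items
Operation list | |||
No. | Operation | Details | Status |
1 | Confirm the system is shut down | Advise the customer to back up application and business data | |
2 | Run POST diagnostics | Pinpoint the exact location of the fault | |
3 | Wear the antistatic wrist strap | Confirm the wrist strap is worn and attached to an unpainted part of the cabinet | |
4 | Disconnect power | Disconnect the primary and secondary power sources | |
5 | Remove the service access cover | | |
6 | Remove the processor board | | |
7 | Place the removed processor board on an antistatic surface | | |
8 | Open and remove the processor front cover | | |
9 | Locate the IO_PRIM_FAN | | |
10 | Take the IO_PRIM_FAN out of its antistatic packaging | | |
11 | Install the IO_PRIM_FAN | | |
12 | Reinstall the processor board | | |
13 | Confirm the fault has cleared | Confirm the newly replaced hardware reports no alarms | |
13 | Confirm the fault has cleared | Confirm the new hardware is ready in the system | |
13 | Confirm the fault has cleared | Have the customer confirm that applications and business data are unaffected | |
14 | Wrap up | Clean up the site and finish the job | |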
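The check in step 13 (no alarms on the replaced hardware, and the hardware ready in the system) can be written down as a small script that is run against the fan rows collected after the swap. The Python sketch below makes several assumptions: rows are supplied as `(bank, rpm, status, state)` tuples, a healthy replaced primary fan is expected to show ENABLED with a non-zero speed, and the `AFTER_ROWS` values are hypothetical placeholders, not measurements from this server.

```python
from typing import Iterable, List, Tuple

FanRow = Tuple[str, int, str, str]  # (bank, rpm, status, state)


def replacement_problems(rows: Iterable[FanRow], replaced: Iterable[str]) -> List[str]:
    """List reasons why the replaced fans cannot yet be considered healthy."""
    by_bank = {bank: (rpm, status, state) for bank, rpm, status, state in rows}
    problems = []
    for bank in list(replaced):
        if bank not in by_bank:
            problems.append(f"{bank}: not listed in fan bank output")
            continue
        rpm, status, state = by_bank[bank]
        if state != "OK":
            problems.append(f"{bank}: state is {state}")
        # Assumption: a working primary fan is ENABLED and spinning.
        if status != "ENABLED" or rpm <= 0:
            problems.append(f"{bank}: not spinning ({status}, {rpm} RPM)")
    return problems


REPLACED = ["IO0_PRIM_FAN", "IO1_PRIM_FAN"]

# Values before the replacement, as reported in section 1.1.
BEFORE_ROWS = [
    ("IO0_PRIM_FAN", 0, "DISABLED", "ERROR"),
    ("IO1_PRIM_FAN", 0, "DISABLED", "ERROR"),
]

# HYPOTHETICAL values standing in for output collected after the swap.
AFTER_ROWS = [
    ("IO0_PRIM_FAN", 3900, "ENABLED", "OK"),
    ("IO1_PRIM_FAN", 3900, "ENABLED", "OK"),
]

if __name__ == "__main__":
    print("before:", replacement_problems(BEFORE_ROWS, REPLACED))
    print("after: ", replacement_problems(AFTER_ROWS, REPLACED))
```

An empty list means the replaced fans pass this check; confirmation from the customer that applications and data are unaffected still has to be obtained separately.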