Channel: DELL-Daniel My's Activities

R720XD spontaneous shutdown and controller/drive predictive failure warnings

The server has a PERC H710P Mini controller with a RAID-10 virtual disk composed of three spans of two drives each. It has shut down on its own twice, with no log evidence of distress either time. The first shutdown was on Dec 16 2016, and we were able to start the server from the iDRAC. After the second, on Jan 26 2017, it would not start from the iDRAC, and we had to disconnect power so the power supplies could discharge before it would run again.

For the past year the server's Windows event log has shown periodic predictive failure warnings for both the controller and some of the disks in the array. They used to arrive once or twice a month but have recently increased to one to three a day. The disk warnings started on one disk (0:1:6) but now appear for three (0:1:6, 0:1:7, 0:1:10), one in each span. Immediately after the restart from each shutdown, the log showed a large number of controller warnings over the next 15-45 seconds, then returned to the rate of one to three per day.
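As a rough sketch of why "one in each span" matters: in a RAID-10 built from two-drive spans, each span is a mirrored pair, so the virtual disk stays online only while every span keeps at least one working drive. The snippet below models that rule. The partner disk names (`partner-A` and so on) are placeholders, since the post does not say which drives are paired, and `array_survives` is a hypothetical helper, not a Dell tool.

```python
def array_survives(spans, failed):
    """Return True while every mirrored span still has a working disk."""
    return all(any(disk not in failed for disk in span) for span in spans)


# The three flagged disks come from the event log; their mirror
# partners are NOT named in the post, so placeholders are used here.
SPANS = [
    ("0:1:6", "partner-A"),
    ("0:1:7", "partner-B"),
    ("0:1:10", "partner-C"),
]
FLAGGED = {"0:1:6", "0:1:7", "0:1:10"}

# All three flagged disks failing still leaves one disk per span.
print(array_survives(SPANS, FLAGGED))

# Losing any one partner on top of that takes the virtual disk offline.
print(array_survives(SPANS, FLAGGED | {"partner-A"}))
```

Under these assumptions the array would tolerate all three flagged disks failing, but it would then have no redundancy left in any span.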

The events themselves don't look particularly illuminating. Below are a sample controller event and a sample disk event, each followed by the descriptions that cover the content of all the recent events:

controller:

Date:          01/27/2017 00:56:28
Event ID:      2335
Task Category: Storage Service
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      DB3.databank.loc
Description:
Controller event log: Predictive failure: PD 0a(e0x20/s10):  Controller 0 (PERC H710P Mini)
Controller event log: Predictive failure: PD 07(e0x20/s7):  Controller 0 (PERC H710P Mini)
Controller event log: Predictive failure: PD 06(e0x20/s6):  Controller 0 (PERC H710P Mini)

disk:

Event ID:      2094
Task Category: Storage Service
Level:         Warning
Keywords:      Classic
User:          N/A
Computer:      DB3.databank.loc
Description:
Predictive Failure reported:  Physical Disk 0:1:10 Controller 0, Connector 0
Predictive Failure reported:  Physical Disk 0:1:7 Controller 0, Connector 0
Predictive Failure reported:  Physical Disk 0:1:6 Controller 0, Connector 0

Both the controller and the disk warnings refer to the same numbers, 6, 7 and 10 (PD 0a in the controller events is hexadecimal for 10). That means the problem could be just the three disks or just the controller. If I had to proceed without further information, I'd replace the controller.
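The cross-check between the two event types can be sketched in a few lines of Python. This is only an illustration: the regular expressions are written against the six description lines quoted above, the function names are made up, and it is an assumption that the `s10` field in the controller message and the last field of `0:1:10` both denote the drive slot.

```python
import re

# Patterns fitted to the description lines quoted in this post only.
CONTROLLER_PD = re.compile(r"PD ([0-9a-fA-F]+)\(e0x[0-9a-fA-F]+/s(\d+)\)")
DISK_ID = re.compile(r"Physical Disk \d+:\d+:(\d+)")

controller_lines = [
    "Controller event log: Predictive failure: PD 0a(e0x20/s10):  Controller 0 (PERC H710P Mini)",
    "Controller event log: Predictive failure: PD 07(e0x20/s7):  Controller 0 (PERC H710P Mini)",
    "Controller event log: Predictive failure: PD 06(e0x20/s6):  Controller 0 (PERC H710P Mini)",
]
disk_lines = [
    "Predictive Failure reported:  Physical Disk 0:1:10 Controller 0, Connector 0",
    "Predictive Failure reported:  Physical Disk 0:1:7 Controller 0, Connector 0",
    "Predictive Failure reported:  Physical Disk 0:1:6 Controller 0, Connector 0",
]


def controller_slots(lines):
    """Collect the slot numbers (assumed meaning of 's<N>') from controller events."""
    return {int(m.group(2)) for m in map(CONTROLLER_PD.search, lines) if m}


def controller_pd_ids(lines):
    """Collect the PD ids from controller events, read as hexadecimal."""
    return {int(m.group(1), 16) for m in map(CONTROLLER_PD.search, lines) if m}


def disk_slots(lines):
    """Collect the last field of the disk ID (assumed to be the slot) from disk events."""
    return {int(m.group(1)) for m in map(DISK_ID.search, lines) if m}


# True when both event types name the same set of drives.
same_drives = controller_slots(controller_lines) == disk_slots(disk_lines)
print(same_drives, sorted(disk_slots(disk_lines)))
```

If the two sets match, as they do for the lines quoted here, the controller events and the disk events are most likely two reports of the same three drives rather than independent faults.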

Could these controller/disk warnings be related to the shutdowns? What are your recommendations?

Other information:

H710P:
Firmware Version 21.2.0-0007
Driver Version 6.801.05.00
Storport Driver Version 6.1.7601.18386

disk 0:1:6:
Product ID ST3600057SS
Serial No. 6SL7R16J

disk 0:1:7:
Product ID ST3600057SS
Serial No. 6SL7R3W4

disk 0:1:10:
Product ID ST3600057SS
Serial No. 6SL7NM3B

