Re: [ALUG] Hard disk error

26 Oct 2004


      On Tuesday 26 October 2004 6:24 pm, Mark Rogers wrote:
...
...
Yes, I've removed the new RAM now (and I gave all the cables a good
shove while I was in there).
I get exactly the same problem.
Well, it's looking like the good ol' lightbulb thing.
Lightbulbs fail most often at the moment you switch them on.
Thermal cycles can put more strain on hardware than continuous running does.
It may well be that hdb was already failing and the power off/fit RAM/power on
cycle was the last straw. I'd get smartmontools (smartctl) on there, read the
man page, and tell the drive to run an offline test. (Confusingly enough, you
can run an "offline" test while the drive is mounted; it just slows access down
a lot.) smartctl will give you an estimate of the test run time; check back
when the test has finished. All of this assumes that your drives are SMART
capable (most are). SMART does not have to be enabled in the BIOS; that setting
just turns on a basic SMART check during POST.
But before you do any of that, and before you unpower/repower that server any 
more I would carefully check that you have the contents of /home backed up 
safely somewhere.
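If there is no recent backup, something along these lines would do as a stopgap. This is only a sketch: the function name backup_dir and the destination are made up for illustration, and the destination is assumed to be an existing directory on a different disk from the suspect one.

```shell
#!/bin/sh
# Sketch: archive one directory into a dated tarball somewhere else.
# backup_dir is a hypothetical helper name, not a standard tool.
backup_dir() {
    src="$1"     # directory to save, e.g. /home
    dest="$2"    # existing directory on a DIFFERENT disk (assumption)
    name=$(basename "$src")
    out="$dest/$name-$(date +%Y%m%d).tar.gz"
    # -C changes into the parent first so the archive holds "home/...",
    # not an absolute path.
    tar -czf "$out" -C "$(dirname "$src")" "$name" || return 1
    # Listing the archive back is a cheap check that it is readable.
    tar -tzf "$out" >/dev/null || return 1
    echo "$out"
}
```

For example, `backup_dir /home /mnt/otherdisk` (where /mnt/otherdisk is a placeholder for wherever a healthy disk is mounted) prints the path of the archive it wrote.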
Useful SMART commands are as follows:
    smartctl --test=offline /dev/hdb
starts the drive's offline test on hdb
    smartctl --test=conveyance /dev/hdb
starts a special (IDE only) test that checks for transit damage
    smartctl -l selftest /dev/hdb
reports the self-test log, i.e. the results of the tests (some of them can
take over an hour to complete)
    smartctl -H /dev/hdb
reports a basic health status (note that many drives only test themselves at
boot if the BIOS asks them to)
    smartctl -a /dev/hdb
reports all SMART information
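The commands above could be wrapped in a small helper so the steps are easy to repeat. Treat this as a rough sketch: smart_step and the RUN variable are names invented here, and by default it only prints the smartctl command rather than running it (running for real needs root and smartmontools installed).

```shell
#!/bin/sh
# Sketch: map a step name to one of the smartctl invocations listed above.
# smart_step and RUN are hypothetical names used for illustration.
smart_step() {
    step="$1"    # health | start | log | all
    dev="$2"     # e.g. /dev/hdb
    case "$step" in
        health) args="-H" ;;
        start)  args="--test=offline" ;;
        log)    args="-l selftest" ;;
        all)    args="-a" ;;
        *) echo "unknown step: $step" >&2; return 2 ;;
    esac
    if [ "${RUN:-0}" = "1" ]; then
        # $args is deliberately unquoted so "-l selftest" becomes two words.
        smartctl $args "$dev"
    else
        echo "smartctl $args $dev"
    fi
}
```

So `smart_step health /dev/hdb` shows the health command, `RUN=1 smart_step start /dev/hdb` would actually start the test, and `smart_step log /dev/hdb` is the one to come back to once the estimated run time has passed.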
