RE: [ALUG] Problems with tar

4 Aug 2005


      On 03-Aug-05 Stuart Bailey wrote:
...
I'm using a SCSI 72GB tape drive on a Fedora Core 3 system which
has recently been updated with the latest kernel (about 2 weeks ago).
The tape backup (using tar -cf /dev/st0 ...) had been working fine -
for about 5 months, and was used to restore data at that time.
I have just discovered that the backup has been failing recently.
When I run tar -cvf /dev/st0 ... only 2048 bytes are backed up before
this message is displayed:
tar: /dev/st0: Wrote only 2048 of 10240 bytes
tar: Error is not recoverable: exiting now
If I then run tar -tvf /dev/st0, I get all the files up to the point
at which the error message was generated.
Any ideas what may have gone wrong? Are there any tools to run
diagnostics on the tape unit?

A few questions/suggestions.
1. Did this trouble start concurrently with the kernel upgrade?
   If so, possibly the cause is there. Can you re-instate the
   previous kernel and see if it still gives trouble? If it's
   the kernel upgrade then I don't have useful ideas.
2. Do you get the same problem regardless of which tape you put
   in the drive? If it's just one tape, then there may be a
   defect on the tape itself. But if it's independent of the
   tape, and it's not the kernel, then this points to the tape
   drive itself.
3. If it's not the tape, then try putting a spare (i.e. potentially
   disposable) tape in the drive and raw-writing to it:
   a) Set up a test file with decipherable structure:

        awk 'BEGIN{for(i=1;i<=1000000;i++){printf("%07d\n",i)}}' > testfile

      which will give you a test file with 8000000 bytes (7 for each
      integer plus a newline, so 8 bytes per integer).
   b) Raw-write this to the tape in various ways, e.g.:

        dd if=testfile of=/dev/st0 bs=512 count=8

      which will write 4096 bytes to the device in 8 blocks of 512 bytes.

   c) Raw-read it back (you will need to rewind the tape first), e.g.:

        dd if=/dev/st0 bs=512 count=8

      [or e.g. bs=4096 count=1] and see how far it gets. If only 2048
      bytes of the 4096 got written to the tape, then the last line to
      be printed to the console would be "0000256".

   d) Vary the above with different values for "bs" and "count".
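
As a sanity check on the arithmetic in (a) and (c), here is a minimal
sketch that rebuilds the test file on disk and shows which line ends a
2048-byte prefix. It assumes a POSIX-like shell with awk, dd, wc and
tail available, and it only touches the local file "testfile", never
the tape device.

```shell
# Build the test file: 1000000 lines, each 7 digits plus a newline.
# The BEGIN form runs the loop exactly once, whatever is on stdin.
awk 'BEGIN{for(i=1;i<=1000000;i++){printf("%07d\n",i)}}' > testfile

# Should report 8000000 (1000000 lines x 8 bytes per line).
wc -c < testfile

# Last line inside the first 2048 bytes: 2048 / 8 = 256.
dd if=testfile bs=2048 count=1 2>/dev/null | tail -n 1
```

The second command should show 0000256, which is the line to look for
when reading back from the tape if only 2048 bytes were written.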
The fact that your tape error says that only 2048 bytes were
   written suggests that the mechanism may be using a block-size of
   2048 bytes and only one block got written. Where this failure to
   move to the next block arises, however, is not clear. It may be
   a hardware failure in the drive (internal buffer of 2048, not
   recycled); failure to communicate with the drive (e.g. the
   "handshake" from the drive would announce that the "write" had
   been cleared and it was ready for the next block, but the handshake
   was not being read and acted on); the kernel was using a 2048-byte
   block of RAM as a buffer but not re-cycling this; etc.
Trying values of "bs" both greater than 2048 (e.g. "bs=4096" or
   "bs=8192") and less than or equal to it may help to discriminate
   between these possibilities.
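
The variation over "bs" and "count" could be scripted roughly as
follows. This is only a sketch under stated assumptions: TARGET and
TOTAL are names made up for illustration, and TARGET defaults to a
scratch file ("./fake_tape") so the loop can be dry-run without a drive.
Set TARGET=/dev/st0 to run it against the real (disposable!) tape.

```shell
# TARGET is a hypothetical variable: the device or file to write to.
TARGET="${TARGET:-./fake_tape}"
# TOTAL bytes per trial; chosen as a multiple of every block size below.
TOTAL=8192

# Create the test file if it is not already there.
[ -f testfile ] || \
    awk 'BEGIN{for(i=1;i<=1000000;i++){printf("%07d\n",i)}}' > testfile

# Write the same amount of data with block sizes below, at and above 2048.
for bs in 512 1024 2048 4096 8192; do
    count=$((TOTAL / bs))
    if dd if=testfile of="$TARGET" bs="$bs" count="$count" 2>/dev/null; then
        echo "bs=$bs count=$count: write OK"
    else
        echo "bs=$bs count=$count: write FAILED"
    fi
done
```

If the writes fail only for some block sizes (say, everything above
2048), that narrows down where the problem lies. Dropping the
"2>/dev/null" shows dd's own error messages and record counts, which
may be more informative than the simple OK/FAILED summary.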
Hoping this provides a useful pointer or two!
Best wishes,
Ted.
--------------------------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@nessie.mcc.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 04-Aug-05                                       Time: 10:22:08
------------------------------ XFMail ------------------------------
