on Sat, Dec 01, 2001 at 12:10:01PM -0000, Ian Douglas wrote:
I am very much a Linux newbie and probably won't understand what you say, but is there any chance of explaining the problems e2fsck found with my filesystem, and what it did to correct them, so that I (and perhaps other list subscribers) can learn a little more about the Linux filesystem from this problem?
In summary (feel free to post any of my reply to you): mkdir'ing lost+found will have done one or both of the following:
1. Increased the size of the root directory of /dev/sda5.
2. Attached, or forced the recreation of, the lost+found directory.
There were several inconsistencies on the filesystem, including duplicate usage of blocks, where two inodes (essentially files) each claimed to use the same blocks. (foldoc.org has an explanation of inodes.)
The '.' and '..' entries of the lost+found directory were missing or misplaced, for some reason. fsck added them in the correct place.
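To make the duplicate-block problem a little more concrete, here is a toy sketch in Python. It is not how e2fsck is implemented, and the mapping of inode numbers to block lists is invented for illustration; a real ext2 inode stores its block pointers on disk in a more involved layout. The sketch only shows the idea of noticing that two inodes claim the same block.

```python
from collections import defaultdict


def find_duplicate_blocks(inode_blocks):
    """Return {block: [inodes]} for every block claimed by more than one inode.

    inode_blocks is a hypothetical mapping: inode number -> list of block numbers.
    """
    owners = defaultdict(set)
    for inode, blocks in inode_blocks.items():
        for block in blocks:
            owners[block].add(inode)
    return {
        block: sorted(inodes)
        for block, inodes in owners.items()
        if len(inodes) > 1
    }


# Toy data: inodes 12 and 13 both claim block 102.
claims = {12: [100, 101, 102], 13: [102, 103], 14: [200]}
print(find_duplicate_blocks(claims))
```

A checker that finds such a block cannot know which inode is the rightful owner, which is why this kind of damage usually means at least one of the files involved has bad contents.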
Four directories were unconnected from the filesystem: they existed, but nothing referred to them directly. These four directories also had incorrect reference counts. Each time an inode is referred to, its reference count is increased; each time a reference is removed, the count is decreased, and when it reaches 0 the inode is deleted. fsck attached these directories to the filesystem in /lost+found (relative to /dev/sda5) and updated the reference counts.
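The reference counting and the reattachment to lost+found can be sketched with a toy model in Python. Everything here is an assumption made for illustration: directories are plain dictionaries of name to inode number, inode 2 is the root and inode 11 is lost+found (matching the debugfs listing further down), and inode 30 is a made-up orphan. e2fsck itself works on the on-disk structures, not on anything like this.

```python
def count_references(directories):
    """Count how many directory entries point at each inode."""
    counts = {}
    for entries in directories.values():
        for child in entries.values():
            counts[child] = counts.get(child, 0) + 1
    return counts


def find_unconnected(directories, root=2):
    """Return directory inodes that cannot be reached by walking down from root."""
    seen = set()
    stack = [root]
    while stack:
        inode = stack.pop()
        if inode in seen:
            continue
        seen.add(inode)
        for name, child in directories[inode].items():
            # '.' and '..' point sideways or upwards, so they are not a path down.
            if name not in (".", "..") and child in directories:
                stack.append(child)
    return sorted(set(directories) - seen)


def reattach(directories, orphan, lost_found=11):
    """Link an orphaned directory into lost+found and repoint its '..' entry."""
    directories[lost_found]["#%d" % orphan] = orphan
    directories[orphan][".."] = lost_found


# Toy data: inode 30 exists, but no directory reachable from the root names it.
dirs = {
    2: {".": 2, "..": 2, "lost+found": 11, "home": 20},
    11: {".": 11, "..": 2},
    20: {".": 20, "..": 2},
    30: {".": 30, "..": 2},
}

for orphan in find_unconnected(dirs):
    reattach(dirs, orphan)

print(find_unconnected(dirs))
print(count_references(dirs))
```

After reattaching, the counts recomputed by count_references are what the stored reference counts should be; any inode whose stored count differs would need updating.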
Various accounting information (block usage, block bitmaps, inode and directory counts) was incorrect, partly because of the corrections above and partly because the filesystem was in an inconsistent state. fsck corrected this by recalculating the information from other sources.
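As a rough illustration of recalculating accounting information, the Python sketch below rebuilds a block bitmap from the blocks the inodes claim and then derives the free-block count from it. I am assuming here that the "other sources" are the inodes' own block lists; the dictionary layout and the field names total_blocks and free_blocks are hypothetical and do not match the real ext2 superblock.

```python
def rebuild_block_bitmap(inode_blocks, total_blocks):
    """Mark a block as used (True) if any inode claims it."""
    bitmap = [False] * total_blocks
    for blocks in inode_blocks.values():
        for block in blocks:
            bitmap[block] = True
    return bitmap


def fix_free_count(superblock, bitmap):
    """Return a copy of the superblock with the free count derived from the bitmap."""
    corrected = dict(superblock)
    corrected["free_blocks"] = bitmap.count(False)
    return corrected


# Toy data: the stored free count (6) disagrees with what the inodes claim.
claims = {2: [0, 1], 11: [2], 12: [5]}
stale_superblock = {"total_blocks": 8, "free_blocks": 6}

bitmap = rebuild_block_bitmap(claims, stale_superblock["total_blocks"])
print(fix_free_count(stale_superblock, bitmap))
```

The point is only that the summary figures are redundant: they can be thrown away and regenerated once the inodes and directories themselves are consistent.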
Alternatively, is there a good book/help file/web page somewhere which you could recommend to anyone who wants to learn a little more about Linux filesystems (and perhaps more about the debugfs program you mentioned) to use as a reference for future filesystem crashes?
http://www.linuxdoc.org/LDP/LG/issue21/ext2.html is extremely good. There is also the Linux 2.4 kernel internals book, but this isn't for the faint-hearted: http://www.linuxdoc.org/LDP/lki/index.html
There is also a book (which I have yet to get my hands on) which is very good. Although it is not Linux-specific (it is more Solaris/SunOS-specific), it gives a good overview and explanation of techniques. It is aptly called "PANIC! UNIX System Crash Dump Analysis Handbook".
Another book covers this and filesystem handling itself, although again for a different operating system: "The Design and Implementation of the 4.4BSD Operating System". (ext2 is loosely based on the BSD Fast File System.)
debugfs can be used to manipulate the filesystem at a very low level. With it, you can put a filesystem into the state yours was in, as well as attempt to recover it. Before using it, though, you really should have a good understanding of the underlying filesystem. The man page documents all the fun things you can do with it. I suspect the expand_dir and ln commands (from within debugfs) might have helped in this instance.
If you wish to play with debugfs without messing up a real partition, then you can create a loopback filesystem. This is done as follows:
    ~/tmp > dd if=/dev/zero of=testfs bs=400k count=1
    1+0 records in
    1+0 records out
    ~/tmp > mkfs.ext2 testfs
    mke2fs 1.15, 18-Jul-1999 for EXT2 FS 0.5b, 95/08/09
    testfs is not a block special device.
    Proceed anyway? (y,n) y
    ...
    Writing inode tables: done
    Writing superblocks and filesystem accounting information: done
    ~/tmp > debugfs -w testfs
    debugfs 1.15, 18-Jul-1999 for EXT2 FS 0.5b, 95/08/09
    debugfs: ls
    2 (12) . 2 (12) .. 11 (1000) lost+found
    debugfs:
"?" shows the available commands and "q" quits. The -w option to debugfs enables writing back to the filesystem; without -w, it will not let you do any serious damage.