ASEM-51
 
Application Note 0002 Draft 2.2 October 15, 2006

Why does ASEM-51 V1.3 hang under Linux?


asem 1.3 and hexbin 2.3 may hang when invoked

  • on Linux systems with kernel 2.6 through 2.6.10,
  • on ext3 filesystems with dir_index enabled,
  • in subdirectories of large parent directories,
  • and with file parameters with relative paths.

The reason is that the hashed b-tree implementation (htree) of the ext3 filesystem in kernel 2.6 contained severe bugs, not all of which were fixed before version 2.6.11. As a result, the system function readdir sometimes returned wrong results. For details, see the kernel 2.6.11 changelog:

ftp://ftp.kernel.org/pub/linux/kernel/v2.6/ChangeLog-2.6.11
<tytso@mit.edu>
    [PATCH] ext3 htree telldir() fix

    telldir() is broken on large ext3 dir_index'd directories because
    getdents() gives d_off==0 for the first entry

    Here's a patch which fixes the problem, but note the following warning
    from the readdir man page:

        According to POSIX, the dirent structure contains a field char d_name[]
        of unspecified size, with at most NAME_MAX characters preceding the
        terminating null character. Use of other fields will harm the porta-
        bility of your programs.

    Also, as always, telldir() and seekdir() are truly awful interfaces
    because they implicitly assume that (a) a directory is a linear data
    structure, and (b) that the position in a directory can be expressed
    in a cookie which hsa only 31 bits on 32-bit systems.

    So there will be hash colliions that will cause programs that assume
    that seekdir(dirent->d_off) will always return the next directory
    entry to sometimes lose directory entries in the
    not-as-unlikely-as-we-would wish case of a 31-bit hash collision.
    Really, any program which is using telldir/seekdir really should be
    rewritten to not use these interfaces if at all possible. So with
    these caveats....

    What we need to do is wire '.' and '..' to have hash values of (0,0) and
    (2,0), respectively, without ignoring other existing dirents with colliding
    hashes. (In those cases the programs will break, but they are statistically
    rare, and there's not much we can do in those cases anyway.)

    Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

To determine whether your system is affected, and to fix the problem, perform the following steps:

  1. Check your kernel version. Type at the shell prompt
        uname -a
    If your kernel is older than 2.6 or newer than 2.6.10 you have no problem.
     
  2. If you have a 2.6.x kernel older than version 2.6.11, list your mounted filesystems with
        df -T
    If you are not working on ext3 filesystems, you have no problem.
     
  3. If your home directory is on an ext3 filesystem, log in as root and check whether its dir_index option is enabled, e.g.
        tune2fs -l /dev/hdxy | grep features
    If the filesystem features do not include the dir_index option, you have no problem.
     
  4. If dir_index is enabled, change to your ASEM-51 working directory and invoke asem as follows:
        asem blink.a51
    Be sure to try this in a directory whose parent directory is larger than 1 block (to ensure that it is really indexed)! Specify a file parameter with a relative path.
    If asem terminates (with or without errors), the problem has not shown up (yet).
     
  5. If, however, asem runs into an endless loop, you have exactly the "readdir problem" described above. In this case a quick workaround is to specify all file parameters for asem and hexbin with absolute paths, e.g.
        asem /home/me/8051/blink.a51
        hexbin /home/me/8051/blink.hex
    This prevents asem and hexbin from calling readdir.
     
  6. For a more thorough fix, you can switch off directory indexing for the whole filesystem. First of all, ensure that it is properly unmounted. (If it happens to be your root filesystem, shut down your computer and reboot from a live CD.)
    Then type (as root)
        tune2fs -O ^dir_index /dev/hdxy
        e2fsck -Df /dev/hdxy
    Running e2fsck is reportedly not strictly necessary, but it is highly recommended in order to optimize the filesystem for traditional linear directories.
     
  7. Of course the best solution is to update to kernel 2.6.11 or later, or to update the whole Linux distribution, e.g. to
        Mandriva Linux 10.2   (kernel 2.6.11)
        Fedora Core 4         (kernel 2.6.11)
        Ubuntu 5.10           (kernel 2.6.12)
        SuSE Linux 10.0       (kernel 2.6.13)
    ... or any later version.

    The latest stable Debian release 3.1 (Sarge) still comes with kernel 2.4.27 and doesn't cause any problems. If you are using another Linux distribution, please check the kernel version yourself, and update if necessary.
     
  8. If your asem or hexbin also hangs with other kernels or filesystems, or under different conditions, please report it as soon as possible! Thanks in advance.
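
The kernel check in step 1 can be scripted. The following is only a minimal sketch: kernel_affected is a hypothetical helper (not part of ASEM-51) that tests whether a release string, as printed by uname -r, lies in the range 2.6.0 through 2.6.10 named above. It assumes that the release string starts with a dotted numeric version.

```shell
#!/bin/sh
# Sketch only: kernel_affected is a hypothetical helper, not part of ASEM-51.
# It succeeds (exit status 0) if the given kernel release string lies in
# the range 2.6.0 .. 2.6.10 described in this note.
kernel_affected() {
    # Keep only the leading numeric part, e.g. "2.6.8-24-default" -> "2.6.8"
    ver=$(printf '%s\n' "$1" | sed 's/^\([0-9][0-9.]*\).*/\1/')
    case $ver in
        [0-9]*) ;;
        *) return 1 ;;          # not a version number: treat as unaffected
    esac

    # Split the version at the dots into major, minor and patch level
    old_ifs=$IFS
    IFS=.
    set -- $ver
    IFS=$old_ifs
    major=${1:-0}
    minor=${2:-0}
    patch=${3:-0}

    [ "$major" -eq 2 ] && [ "$minor" -eq 6 ] && [ "$patch" -le 10 ]
}

# Report on the running kernel
if kernel_affected "$(uname -r)"; then
    echo "kernel $(uname -r): in the affected range, continue with step 2"
else
    echo "kernel $(uname -r): outside 2.6.0 .. 2.6.10, no problem"
fi
```

A positive result only means that the remaining conditions (ext3, dir_index, large parent directory) still have to be checked as described in steps 2 to 4.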

Thanks to Michael de Nil and Richard Stover for reporting this problem.

The next ASEM-51 version (1.4, or something) will be compiled with the latest FreePascal release, whose runtime system no longer calls readdir, so the above problem will (hopefully) be solved.
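
On systems that cannot be updated, the absolute-path workaround from step 5 can be wrapped in a small shell function. This is a sketch under stated assumptions: abspath is a hypothetical helper, and it simply prefixes the current working directory without resolving symbolic links or ".." components.

```shell
#!/bin/sh
# Sketch only: abspath is a hypothetical helper, not part of ASEM-51.
# It prints its argument unchanged if it is already an absolute path,
# and prefixes the current working directory otherwise.
abspath() {
    case $1 in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/%s\n' "${PWD%/}" "$1" ;;   # ${PWD%/} avoids "//" in "/"
    esac
}

# Show the command lines that would be used instead of relative paths
echo "asem $(abspath blink.a51)"
echo "hexbin $(abspath blink.hex)"
```

With this function defined, asem could be invoked as asem "$(abspath blink.a51)" from the working directory, so that it never receives a relative path.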

 
 

Last revised:   W.W. Heinz,   December 15, 2010