followup: stuck lmtpd processes

Etienne Goyer etienne.goyer at linuxquebec.com
Wed Sep 24 12:30:10 EDT 2003


On Wed, Sep 24, 2003 at 12:57:37PM -0300, Henrique de Moraes Holschuh wrote:
> I did check ALL the documentation already, and ALL of it says that SIGALRM
> MUST interrupt the syscall, and that it HAS to return EINTR.  So, it is a
> bug.  So, it needs to be squashed, and people have to either patch or
> upgrade their systems... or deal with diminished performance.

Please have a look at the Stevens reference I made in my reply to Rob.
According to him, BSD circa 1992 did not adhere to this behavior.  Whether
modern BSDs perpetuate this or not, I can't tell.  According to Stevens
again, SunOS 4.1.2 had yet another behavior in this regard.  Whether
these ancient OSes should be accommodated or not is a decision I am not
qualified to comment upon.

> > And please don't dismiss it as "a problem with Linux, not Cyrus".  Linux
> > may well be broken (I can't tell), but it still constitutes the vast
> > majority of Cyrus installations (I would believe), and thus merits being
> > accommodated.
> 
> Something that works in Linux, sure.  Something that works in broken Linux?
> No.  Fix the breakage in Linux, instead.  That's our strength, and I *will*
> stick to it as a Debian maintainer.

While I agree with you on a technical level and admire your commitment
to excellence, this may not be practical.  The installed base is huge
and the interested parties (Linux distributors) are numerous.  Getting
everybody to update broken packages will be quite an endeavour.
Considering this bug touches upon the kernel and glibc, expecting end
users to patch their systems themselves without support from their
distributor is not an option either.

> There is a proper Unix way to do it (using alarm().  this needs to be added
> to Cyrus IMHO) that *might* not work in certain Linux glibc/kernel
> combinations.

That's the crux of the problem: if the affected glibc/kernel
combinations account for the major part of the installed base, the bug
might continue to hurt for a long time.

> Now, if other Unixes have stupid lock and alarm() bugs, that deadlock
> testing code would be even more useful... :-)

In the case of closed-source OSes, there may be nothing we can do about it
except work around the bug.

-- 
Etienne Goyer                    Linux Québec Technologies Inc.
http://www.LinuxQuebec.com       etienne.goyer at linuxquebec.com




More information about the Info-cyrus mailing list