followup: stuck lmtpd processes

Henrique de Moraes Holschuh hmh at debian.org
Wed Sep 24 10:13:06 EDT 2003


On Wed, 24 Sep 2003, Rob Siemborski wrote:
> On Wed, 24 Sep 2003, Henrique de Moraes Holschuh wrote:
> > Agreed, but if we are going to keep the blocking-on-lock behaviour (and I
> > know we are ;-)), we really, really should have a way to timeout and kill
> > the process if that lock does not release after a while.
> >
> > Resilience IS necessary... As an admin, I want to know there are problems
> > from syslog events, not because the whole system stopped.  Right now, at
> > least in Linux (which DOES have kernel/glibc bugs in that area) that means
> > we end up needing the slow-as-hell backoff non-blocking locks stuff.
> 
> As I understand it (based on your comments to Bug 1177), just setting an
> alarm() around the flock/fcntl calls isn't good enough to solve the Linux
> problem.

It is not a general solution when you hit glibc/kernel bugs, but I can
certainly live with it IF I manage to track down a glibc and kernel
combination that won't deadlock and that we can recommend. Either that, or allow for
runtime-switchable behaviours (I am willing to code this).

Now, I need to find some time to write a fcntl/flock deadlock test case. If
anyone has one already, please send it my way...

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh



