followup: stuck lmtpd processes
Henrique de Moraes Holschuh
hmh at debian.org
Wed Sep 24 10:13:06 EDT 2003
On Wed, 24 Sep 2003, Rob Siemborski wrote:
> On Wed, 24 Sep 2003, Henrique de Moraes Holschuh wrote:
> > Agreed, but if we are going to keep the blocking-on-lock behaviour (and I
> > know we are ;-)), we really, really should have a way to timeout and kill
> > the process if that lock does not release after a while.
> >
> > Resilience IS necessary... As an admin, I want to know there are problems
> > from syslog events, not because the whole system stopped. Right now, at
> > least in Linux (which DOES have kernel/glibc bugs in that area) that means
> > we end up needing the slow-as-hell backoff non-blocking locks stuff.
>
> As I understand it (based on your comments to Bug 1177), just setting an
> alarm() around the flock/fcntl calls isn't good enough to solve the Linux
> problem.
It is not a general solution when you hit glibc/kernel bugs, but I can
certainly live with it IF I manage to track down a glibc and kernel version
that won't deadlock and that we can recommend. Either that, or we allow for
runtime-switchable locking behaviours (I am willing to code this).
Now, I need to find some time to write a fcntl/flock deadlock test case. If
anyone has one already, please send it my way...
--
"One disk to rule them all, One disk to find them. One disk to bring
them all and in the darkness grind them. In the Land of Redmond
where the shadows lie." -- The Silicon Valley Tarot
Henrique Holschuh
More information about the Info-cyrus mailing list