Problematic error checking in lmtpd.c

Michael Bacon baconm at duke.edu
Wed Oct 1 13:58:07 EDT 2003


Hello, folks,

We've got a post office that sporadically goes into a state where about 4 
or 5 lmtpd processes eat huge amounts of CPU time.  After spending some 
quality time with my debugger and a stack of core files, I think I've found 
the issue.  Somehow, the fork() call in open_sendmail() [lmtpd.c] fails and 
returns -1, but the subsequent fdopen() succeeds and returns something 
other than NULL.  The results of these two calls get passed back to the 
calling function as sm_pid and sm, respectively.  In the three places where 
open_sendmail() is called, the caller checks that sm is not NULL, but does 
not check that sm_pid is positive.  As such, in the runaway lmtpds that 
we've seen, sm_pid comes back as -1 and is subsequently passed to 
waitpid().  Because of the special semantics of waitpid(), a pid of -1 
means "wait for any child process" rather than for one specific child, so 
lmtpd is no longer waiting on the sendmail it thinks it spawned.  No wonder 
it's eating so much CPU time!
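
To make the waitpid() point concrete, here is a minimal sketch of the kind 
of guard I mean.  The helper name reap_child() is hypothetical and does not 
exist in lmtpd.c; it only illustrates refusing to call waitpid() with a pid 
that cannot be a real child.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not part of lmtpd.c): wait for one specific child.
 * Returns 0 if that child was reaped, -1 otherwise. */
static int reap_child(pid_t pid, int *status)
{
    pid_t r;

    if (pid < 1) {
        /* -1 means fork() failed; 0 is only ever seen in the child.
         * Passing either to waitpid() would select a whole set of
         * children instead of the one we meant, so bail out here. */
        errno = EINVAL;
        return -1;
    }

    /* Retry if a signal interrupts the wait. */
    do {
        r = waitpid(pid, status, 0);
    } while (r == -1 && errno == EINTR);

    return (r == pid) ? 0 : -1;
}
```

With a check like this, a failed fork() shows up as an immediate error 
return instead of a wait on whatever other children happen to exist.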

The following is a diff against 2.0.17, just because it was what I had 
handy, but the code appears to be roughly the same in the current CVS. 
This seems to fix it, but there may be better ways to go about it that I 
haven't considered.

Thanks,
Michael Bacon
OIT Systems and Core Services
Duke University


diff -u -r1.63.2.2 lmtpd.c
--- lmtpd.c     21 Aug 2001 17:50:12 -0000      1.63.2.2
+++ lmtpd.c     1 Oct 2003 17:49:56 -0000
@@ -498,7 +498,7 @@
     smbuf[4] = rejto;
     smbuf[5] = NULL;
     sm_pid = open_sendmail(smbuf, &sm);
-    if (sm == NULL) {
+    if ( (sm == NULL) || (sm_pid < 1) ) {
        return -1;
     }

@@ -582,7 +582,7 @@
     smbuf[5] = NULL;
     sm_pid = open_sendmail(smbuf, &sm);

-    if (sm == NULL) {
+    if ( (sm == NULL) || (sm_pid < 1) ) {
        return -1;
     }

@@ -827,7 +827,7 @@
     smbuf[4] = src->addr;
     smbuf[5] = NULL;
     sm_pid = open_sendmail(smbuf, &sm);
-    if (sm == NULL) {
+    if ( (sm == NULL) || (sm_pid < 1) ) {
        *errmsg = "Could not spawn sendmail process";
        return -1;
     }




