MUPDATE database problems -- the importance of thread safety

Wed Jun 17 09:44:16 EDT 2009

It turns out that this was a thread safety issue.  mupdate is a 
multi-threaded daemon, and at a critical place in the non-blocking prot 
code (in prot_flush_internal()), the behavior depends on the value of 
errno.  If it's EAGAIN, the write is retried; otherwise it sets s->error 
and quits.  Since errno is normally a plain global variable, it can't be 
trusted in multi-threaded code unless the necessary thread safety switch is 
passed to the compiler.  Hence, when thread #5 got a -1 from the write(2) 
system call, it read errno as 0 rather than the EAGAIN it should have 
seen.
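
For anyone finding this in the archives, the failure mode looks roughly 
like the sketch below.  This is not the actual Cyrus code: 
flush_nonblocking() and its parameters are made-up names for illustration.  
The point is that the retry decision depends entirely on errno being the 
calling thread's own value, read right after write(2) fails.  If errno 
reads as 0 there, the code falls straight through to the hard-error branch.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the retry logic described above.
 * Returns 0 once everything is written, or -1 on a hard error,
 * in which case *err points at a message describing it. */
int flush_nonblocking(int fd, const char *buf, size_t len, const char **err)
{
    assert(buf != NULL && err != NULL);
    *err = NULL;

    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n >= 0) {
            buf += n;
            len -= (size_t)n;
            continue;
        }

        /* Capture errno immediately, before any other call can change it.
         * This is only meaningful if errno is per-thread. */
        int saved = errno;

        if (saved == EINTR)
            continue;
        if (saved == EAGAIN || saved == EWOULDBLOCK) {
            /* Not writable yet: wait until it is, then retry the write. */
            struct pollfd p = { .fd = fd, .events = POLLOUT, .revents = 0 };
            (void)poll(&p, 1, -1);
            continue;
        }

        /* Anything else (including a bogus errno of 0) is a hard error. */
        *err = strerror(saved);
        return -1;
    }
    return 0;
}
```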

The solution, should anyone else run into this, is as simple as recompiling 
with the thread safety switch.  (In the case of Sun's SPro, it's -mt.  For 
gcc I believe it's -pthread, or -pthreads on Solaris, but I'm not sure.)  
Maddening that the fix was that simple, as I spent two solid weeks hunting 
for the dratted bug.
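
A cheap guard against this kind of build mistake is to have the code itself 
check how errno is defined.  The sketch below rests on an assumption: that 
a thread-safe <errno.h> defines errno as a macro (typically expanding to a 
function call that returns the current thread's errno), while the unsafe 
variant declares it as a plain extern int.  I believe that is how the 
Solaris header behaves depending on whether _REENTRANT is defined, but 
check your platform's header before relying on it.  errno_is_macro() and 
STRICT_ERRNO_CHECK are made-up names.

```c
#include <errno.h>

/* Hypothetical opt-in: define STRICT_ERRNO_CHECK in the build of any
 * threaded program to turn a missing thread safety switch into a
 * compile error instead of a runtime mystery. */
#if defined(STRICT_ERRNO_CHECK) && !defined(errno)
#error "errno is not a macro; was the thread safety switch left out?"
#endif

/* The same heuristic as a function, e.g. for a startup log line.
 * Returns 1 if errno is a macro on this build, 0 if it is not. */
int errno_is_macro(void)
{
#ifdef errno
    return 1;
#else
    return 0;
#endif
}
```

This is only a heuristic: the C standard leaves it unspecified whether 
errno is a macro or an identifier with external linkage, so a zero here is 
a reason to look at the compiler flags, not proof of a bug.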

I have two requests for the CVS maintainers out there.  First, the patch 
below against current CVS isn't terribly comprehensive, and doesn't narrow 
down which of about a dozen places set s->error, but it at least would 
have given SOME kind of indication on the server that something had gone 
wrong, and might have saved me about a week of hunting.

Secondly, I am very weak in the ways of autoconf, but it strikes me that 
since Cyrus now builds mupdate as multithreaded by default (good decision, 
IMO), autoconf should make some attempt to figure out what thread safety 
switch is appropriate and add it to CFLAGS.

Regards,
Michael Bacon
ITS Messaging
UNC Chapel Hill

--- prot.c      23 Apr 2009 17:10:07 -0000      1.97
+++ prot.c      17 Jun 2009 13:34:26 -0000
@@ -1038,6 +1038,8 @@
     /* If we are exiting with an error, we should clear our memory buffer
      * and set our return code */
     if(s->error) {
+        syslog(LOG_DEBUG, "prot_flush_internal: Error -- %s", s->error);
+
         s->ptr = s->buf;
         s->cnt = s->maxplain;
        return EOF;
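
Since the patch above cannot tell which of the dozen or so places set 
s->error, one possible follow-up is to record the call site at the moment 
the error is set.  The sketch below is hypothetical throughout: 
struct stream, set_error_at() and the SET_ERROR macro are illustrative 
names, not part of prot.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative only; this is not the real protstream structure. */
struct stream {
    char error[128];    /* empty string means "no error" */
};

/* Store the message together with the file and line that reported it. */
void set_error_at(struct stream *s, const char *msg,
                  const char *file, int line)
{
    assert(s != NULL && msg != NULL && file != NULL);
    snprintf(s->error, sizeof(s->error), "%s (set at %s:%d)",
             msg, file, line);
}

/* Call this wherever an error would be assigned to the stream. */
#define SET_ERROR(s, msg) set_error_at((s), (msg), __FILE__, __LINE__)
```

With something like SET_ERROR(s, strerror(errno)) at each site, a single 
log line where the error is finally noticed would also say where it came 
from.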


--On June 13, 2009 4:22:03 PM -0400 Michael Bacon <baconm at email.unc.edu> 
wrote:

> Hello all,
>
> We're in the middle of trying to move from our single server installation
> to a new murder installation on all new hardware.  We're getting into the
> late stages of setup, and we've run into a killer problem with getting
> the old server to sync up with the MUPDATE server so that we can migrate
> off of it.  We're under a deadline to get the expensive new hardware
> rolled out into production, so any help would be enormously appreciated.
>
> The test installation with a test backend of, oh, a couple dozen
> mailboxes worked flawlessly.  Syncing happened just as it was supposed
> to, and everything looked good for production.  The next step was to
> start the old server syncing its database with the MUPDATE server, and
> that's where we're stuck.
>
> The initial sync from the old backend works just fine.  During the second
> sync, however (ctl_mboxlist -m), the backend connects to the MUPDATE
> server, executes a LIST <servername>, and then the server returns
> somewhere between 2,500 and 10,000 lines (of an 830k+ mailbox database)
> and freezes.  A combination of telemetry logs and truss output shows that
> the server records itself as having sent more data than the client
> receives, but truss'ing the client shows the client expectantly waiting
> in a read state.  (The server continues to spin in a
> fstat/stat/fcntl/fcntl cycle on the mailboxes database, which as far as
> I can tell is normal behavior for the skiplist driver, but still looks
> really weird in a truss.)
>
> Now, here's where it gets even weirder: if I connect using mupdatetest
> and issue the same LIST command and let it run, the command runs to
> completion without error.  However, if I at some point use flow control
> on my ssh session and hit ^S, then a ^Q, the scrolling continues
> briefly, and then the server hangs in a very similar way as above.  To
> make things even odder, when I run a super-aggressive truss on the
> process (truss -aeflE -v all), the error never occurs.  It's as if
> slowing down the mupdate process keeps it out of whatever error state it
> gets into.
>
> To make matters stranger, when I use the berkeley-hash driver on the
> MUPDATE mboxlist, the MUPDATE server fails to return anything from a LIST
> command, even when its database is full of matching entries.  When
> ctl_mboxlist -m is run, an assert() fails and the process exits without
> performing any work.
>
> Because of all of this, I suspect something going wrong with a buffer
> filling up ungracefully somewhere.  The spot I'm attacking right now is
> the 64-bit build -- I'm spending the weekend in the office rebuilding
> everything as 32 bit instead (libraries from the ground up), in case
> there's some problem with a different interpretation of size_t or some
> such thing in the 64-bit world.  I'll share any findings in a few days,
> but I wanted to get this out earlier.
>
> We've eliminated hardware, OS, network, and compiler-specific errors by
> trying to upload the same database from numerous different clients to
> numerous different servers.  (See the combinations tried below.)  I'm
> open to any and all suggestions at this point.
>
> Michael Bacon
> ITS Messaging
> UNC Chapel Hill
>
>
>
> Current system information:
> Hardware:  Sun T5220s (Sparc CoolThreads architecture) running Solaris 10
> Build: 64-bit binaries built using the Sun SPro compiler (to get
> CoolThreads optimizations)
> Configuration: tlscache, duplicate, and mboxlist_db all defined to
> skiplist
>
>
> Combinations tried: (backend client -> mupdate server)
> (all builds currently 64 bit 2.3.13)
>
> Sun 6800+Sol 9+gcc build -> Sun 5220+Sol 10+spro build
> Sun 6800+Sol 9+gcc build -> Sun 5120+Sol 10+spro build
> Sun 280R+Sol 9+gcc build -> Sun 5220+Sol 10+spro build
> Sun 280R+Sol 9+gcc build -> Same machine, separate cyrus install over
> localhost
> Sun 5220+Sol 10+spro build -> Sun 5220+Sol 10+spro build
> Sun 5220+Sol 10+spro build -> Sun 280R+Sol 9+gcc build
>
> We tried others too, but this covers most of the important combinations,
> I think.