MUPDATE database problems -- the importance of thread safety
Wesley Craig
wes at umich.edu
Wed Jun 17 14:16:44 EDT 2009
Please open a report in bugzilla and mark it as a "blocker". Thanks
for finding the issue.
:wes
On 17 Jun 2009, at 09:44, Michael Bacon wrote:
> It turns out that this was an issue with mupdate being a multi-
> threaded daemon, and in a critical place in the non-blocking prot
> code (in prot_flush_internal()), the behavior relies on the value
> of errno. If it's EAGAIN, the write will try again, otherwise it
> sets s->error and quits. Naturally, being a global variable
> normally, errno doesn't work terribly well in multi-threaded code
> unless the necessary thread safety switch is passed to the
> compiler. Hence, when thread #5 was getting a -1 from the write(2)
> system call, it was reading errno as 0, rather than EAGAIN as it
> should have been.
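
For readers following along, here is a minimal sketch of the pattern
described above: a write(2) that returns -1 is retried only when errno
says the descriptor would block, and anything else is treated as a hard
error. This is not the Cyrus prot_flush_internal() source; flush_sketch()
and its return codes are hypothetical names made up for illustration. The
sketch copies errno into a local right after write() fails and reports
the value, so a failure leaves some trace.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, NOT the Cyrus prot_flush_internal() code.
 * Tries to write len bytes from buf to fd.
 * Returns  0 when everything was written,
 *          1 when the descriptor would block (caller retries later),
 *         -1 on a hard error, which is reported on stderr. */
static int flush_sketch(int fd, const char *buf, size_t len)
{
    size_t off = 0;

    while (off < len) {
        ssize_t n = write(fd, buf + off, len - off);

        if (n > 0) {
            off += (size_t)n;   /* partial or full write: advance */
            continue;
        }
        if (n == 0)
            return 1;           /* no progress, let the caller retry */

        /* n == -1: copy errno before any other call can change it */
        int saved = errno;

        if (saved == EINTR)
            continue;           /* interrupted by a signal, try again */
        if (saved == EAGAIN || saved == EWOULDBLOCK)
            return 1;           /* non-blocking fd is not ready yet */

        fprintf(stderr, "flush_sketch: write failed: errno=%d (%s)\n",
                saved, strerror(saved));
        return -1;
    }
    return 0;
}
```

In a build where errno is a plain global shared by all threads, the value
read after the failed write() could belong to another thread, which
matches the symptom of seeing 0 instead of EAGAIN.
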
>
> The solution, should anyone else run into this, is as simple as
> recompiling with the thread safety switch. (In the case of Sun's
> SPro, it's -mt. I think it's -mthread for gcc, but I'm not sure.)
> Maddening that the fix was that simple, as I spent two solid weeks
> hunting for the dratted bug.
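
One way to check whether a given set of compiler options produces a
per-thread errno is to compare the address of errno in two threads. The
sketch below is a hypothetical test program fragment, not part of Cyrus;
the function names are made up. It assumes POSIX threads are available
and has to be built with the platform's thread option (with gcc on POSIX
systems that is usually -pthread).

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical check, NOT part of Cyrus.
 * Runs in a second thread. arg is the address of errno as seen by the
 * creating thread. Returns non-NULL when this thread's errno lives at a
 * different address, i.e. errno is per-thread in this build. */
static void *compare_errno_address(void *arg)
{
    int *creator_errno = arg;

    return (&errno != creator_errno) ? arg : NULL;
}

/* Returns 1 if errno is per-thread, 0 if both threads share one errno,
 * -1 if the helper thread could not be created or joined. */
static int errno_is_per_thread(void)
{
    pthread_t t;
    void *result = NULL;

    if (pthread_create(&t, NULL, compare_errno_address, &errno) != 0)
        return -1;
    if (pthread_join(t, &result) != 0)
        return -1;

    return result != NULL;
}
```

A result of 0 from errno_is_per_thread() would mean the build treats
errno as a single shared variable, which is the situation described
above.
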
>
> I have two requests to the CVS maintainers out there. First, the
> below patch to current CVS isn't terribly comprehensive, and
> doesn't narrow it down from about a dozen places s->error could be
> set, but at least would have given SOME kind of indication on the
> server that something had gone wrong, and might have saved me about
> a week of hunting.
>
> Secondly, I am very weak in the ways of autoconf, but it strikes me
> that since Cyrus now builds mupdate as multithreaded by default
> (good decision, IMO), autoconf should make some attempt to figure
> out what thread safety switch is appropriate and add it to CFLAGS.