sync_client segmentation fault when using TLS

Raphael Jaffey rjaffey at artic.edu
Wed Mar 31 13:31:25 EDT 2010


Dietmar Rieder wrote:
> Hi,
> 
> we just updated our master + replication servers from 2.3.13 to 2.3.16 
> and discovered, that the sync_client is dying with a segfault when it 
> connects to the replication server which has set "allowplaintext: no".
> 
> We managed to trace down the problem and came up with the following 
> patch against imap/backend.c to solve it.
> 
> Gernot & Didi
> 
> p.s.:
> Allowing plaintext authentication by "allowplaintext: yes" on the 
> replication server would also be an option as workarround ... but not 
> for us :-)
> 
> --- backend.c.orig	Thu Apr 23 19:10:05 2009
> +++ backend.c	Tue Jan 12 23:39:24 2010
> @@ -93,7 +93,7 @@
>  
>      resp = (automatic == AUTO_BANNER) ? prot->banner.resp : prot->capa_cmd.resp;
>  
> -    if (!automatic) {
> +    if (!automatic && (prot->capa_cmd.cmd!=NULL)) {
>  	/* no capability command */
>  	if (!prot->capa_cmd.cmd) return NULL;
>  	
> @@ -449,7 +449,8 @@
>      if ((server[0] != '/') ||
>  	(strcmp(prot->sasl_service, "lmtp") &&
>  	 strcmp(prot->sasl_service, "csync"))) {
> -	char *mlist = xstrdup(mechlist); /* backend_auth is destructive */
> +	char *mlist=NULL;
> +	if (mechlist) mlist= xstrdup(mechlist); /* backend_auth is destructive */
>  
>  	if ((r = backend_authenticate(ret, prot, &mlist, userid,
>  				      cb, auth_status))) {

As I mentioned in a previous post in this thread, I was having the same
problem at our site following a 2.3.13 to 2.3.16 upgrade, and tried the
patch included above.  As this appeared to introduce another problem, I
spent some time analyzing the situation and came up with the following,
similar patch (also included as an attachment), which only alters the
behavior of ask_capability in backend.c when invoked within sync_client 
instead of all callers.  While I realize that this is probably not the 
best approach, it does address the problem at hand without requiring a 
deeper understanding best left to the developers and without the 
potential of introducing undesirable side effects on other components 
that rely on backend.c.


--- backend.c      2009-04-23 12:10:05.000000000 -0500
+++ backend.c   2010-03-30 15:58:22.000000000 -0500
@@ -93,7 +93,7 @@

        resp = (automatic == AUTO_BANNER) ? prot->banner.resp :
prot->capa_cmd.resp;

-    if (!automatic) {
+    if (!automatic && strcmp(prot->sasl_service, "csync")) {
           /* no capability command */
           if (!prot->capa_cmd.cmd) return NULL;

@@ -449,7 +449,8 @@
        if ((server[0] != '/') ||
           (strcmp(prot->sasl_service, "lmtp") &&
            strcmp(prot->sasl_service, "csync"))) {
-       char *mlist = xstrdup(mechlist); /* backend_auth is destructive */
+        char *mlist = NULL;
+       if (mechlist) mlist = xstrdup(mechlist); /* backend_auth is
destructive */

           if ((r = backend_authenticate(ret, prot, &mlist, userid,
                                         cb, auth_status))) {




There are really two problems affecting sync_client that were introduced 
by the new version, both of which are triggered when the sync_client 
attempts to authenticate to the sync_server.  More on this later.

As I seem to be one of only two users on this list who reported this
problem, I'm led to believe that there's something about our two
installations that's unique compared to everyone else's.  To satisfy my
curiosity as to why such problem could go unnoticed by anyone else for 
several months I'd like to compare my setup with other list members who 
are using replication.

I tested 2.3.16 replication on the following platforms:

   Dell PE2950III dual quad core / 32 bit RHEL 4.7

   Dell PE2950III dual quad core / 32 bit RHEL 5.4

Relevant config snippets:

   From master /etc/cyrus.conf:

     syncclient       cmd="sync_client -r"

   From master /etc/imapd.conf:

     guid_mode: sha1
     sync_host: somehost.somedomain.tld
     sync_authname: csync
     sync_password: *********
     sync_log: 1
     sync_shutdown_file: /var/lib/imap/sync_shutdown

   From replica /etc/cyrus.conf:

      syncserver    cmd="sync_server" listen="csync"

   From replica /etc/imapd.conf:

     sasl_pwcheck_method: saslauthd
     sasl_mech_list: PLAIN
     allowplaintext: no
     sasl_minimum_layer: 128
     tls_cert_file: /etc/pki/tls/certs/cert.pem
     tls_ca_file: /etc/pki/tls/certs/chain.pem
     tls_key_file: /etc/pki/tls/private/key.pem
     tls_cipher_list: !ADH:MEDIUM:HIGH
     guid_mode: sha1

   From replica /etc/sysconfig/saslauthd:

     MECH=pam


The banner presented by sync_server:

     $ telnet localhost 2005
     Trying 127.0.0.1...
     Connected to localhost.localdomain (127.0.0.1).
     Escape character is '^]'.
     * STARTTLS
     * OK somehost.somedomain.tld Cyrus sync server v2.3.16


Observations:

   There are no mechs in the banner for ask_capability()
   to return, hence ask_capability() returns NULL mechlist.

   This causes backend_connect() to segfault when it tries to
   xstrdup() mechlist before calling backend_authenticate(), or
   rather when xstrdup() tries to strlen() a NULL string.

   This is the first of the two aforementioned problems.  Version
   2.3.13 did not copy mechlist prior to calling
   backend_authenticate(), so this problem did not exist.

   Once the xstrdup problem has been patched away, the second
   problem becomes apparent.  The behavior of ask_capability()
   has changed between 2.3.13 and 2.3.16.  Version 2.3.16
   of ask_capability() returns NULL immediately if automatic and
   prot->capa_cmd.cmd are logically false.  In
   this case, backend_authenticate() never obtains a
   mech following a STARTTLS and can therefore never authenticate.
   Version 2.3.13 ask_capability(), on the other hand, would proceed
   to parse the banner for mechs in the event that automatic and
   prot->capa_cmd.cmd are logically false.

   I felt this second issue was better addressed by limiting the patch
   in ask_capability() to affect only cases where prot->sasl_service
   is "csync", as it is less than obvious to me what
   undesireable effects might result from changing the behavior of
   ask_capability() for every caller without expending significantly
   more time studying the code and testing changes.  I leave the
   permanent fix in the far more capable hands of the developers.


How do the settings and the banner above differ from those on 2.3.16 
installations where replication is working?  Would anyone with a working 
installation be willing to compare theirs with mine?

Thanks.

-- 
___________________________________________________________________________
Raphael Jaffey                             E-mail: rjaffey at artic.edu
Director of Network Services
The Art Institute of Chicago                Voice: (312) 629-6543
111 S. Michigan Ave, Chicago, IL  60603       FAX: (312) 641-3406


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: cyrus-imapd-2.3.16-sync_client.patch
Url: http://lists.andrew.cmu.edu/pipermail/info-cyrus/attachments/20100331/83439a1c/attachment.ksh 


More information about the Info-cyrus mailing list