use valgrind / Re: SIGSEGV in cyrus-imapd 3.0.7 mupdate

Michael Menge michael.menge at zdv.uni-tuebingen.de
Thu Jul 12 05:09:08 EDT 2018


Hi,

Дилян had suggested to add some debug outputs to imap/tls.c

--------

diff --git a/imap/tls.c b/imap/tls.c
--- a/imap/tls.c
+++ b/imap/tls.c
@@ -893,7 +893,9 @@ EXPORTED int     tls_init_serverengine(const char *ident,

  #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
      /* Load DH params for DHE-* key exchanges */
+    syslog(LOG_CRIT, "dh_params will be set, current value=%p", dh_params);
      dh_params = load_dh_param(server_key_file, server_cert_file);
+    syslog(LOG_CRIT, "dh_params were set current_value=%p", dh_params);
      SSL_CTX_set_tmp_dh(s_ctx, dh_params);
  #endif

@@ -1308,7 +1310,11 @@ EXPORTED int tls_shutdown_serverengine(void)
          }

  #if (OPENSSL_VERSION_NUMBER >= 0x0090800fL)
-        if (dh_params) DH_free(dh_params);
+        if (dh_params) {
+               syslog(LOG_CRIT, "dh_params will be freed %p", dh_params);
+               DH_free(dh_params);
+               syslog(LOG_CRIT, "dh_params were freed %p", dh_params);
+       }
  #endif
      }
-----

I did run mupdate with this debug output


Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params will be set, current  
value=(nil)
Jul 12 10:17:25 mx02 mu/mupdate[6537]: inittls: Loading DH parameters  
from file
Jul 12 10:17:25 mx02 mu/mupdate[6537]: dh_params were set  
current_value=0x7fc7541b9600
Jul 12 10:17:25 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
Jul 12 10:17:25 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
[10.23.21.78] cyrus PLAIN+TLS User logged in
Jul 12 10:18:37 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
Jul 12 10:18:37 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
[10.23.21.78] cyrus PLAIN+TLS User logged in
Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params will be freed 0x7fc7541b9600
Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params were freed 0x7fc7541b9600
Jul 12 10:18:37 mx02 mu/mupdate[6537]: starttls: TLSv1.2 with cipher  
ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits new) no authentication
Jul 12 10:18:37 mx02 mu/mupdate[6537]: login: msmx02.mail.localhost  
[10.23.21.78] cyrus PLAIN+TLS User logged in
Jul 12 10:18:37 mx02 mu/mupdate[6537]: dh_params will be freed 0x7fc7541b9600
Jul 12 10:18:38 mx02 mu/master[6534]: process type:SERVICE  
name:mupdate path:/usr/local/libexec/mupdate age:89.131s pid:6537  
signaled to death by signal 11 (Segmentation fault, core dumped)
Jul 12 10:18:38 mx02 mu/master[6534]: service mupdate/ipv4 pid 6537 in  
READY state: terminated abnormally

so it seems to me that the dh_params were set once on startup but  
freed for each closed connection

Regards

     Michael Menge


Quoting Michael Menge <michael.menge at zdv.uni-tuebingen.de>:

> Hi Дилян,
>
>
> Quoting Дилян Палаузов <Dilyan.Palauzov at aegee.org>:
>
>> Hello Michael,
>>
>> this is likely either a memory mishandling issue (use after free(),
>> double free(), invalid read()/write()...), which gets evident if cyrus
>> is run under valgrind --tool=memcheck.  I run it with
>>
>> valgrind --num-callers=30 --leak-check=full --track-origins=yes --read-
>> var-info=yes --show-leak-kinds=all --trace-children=yes --track-fds=yes
>> /usr/local/cyrus/bin/master -D &> log-file-memcheck
>>
>
> thanks for the valgind options. Valgrind did find some "Invalid read  
> of size 4"
> and "size 8" in DH_free as well as "Invalid write of size 4" in  
> CRYPTO_add_lock
> and "size 8" in OPENSSL_cleanse. As far as i can tell the memory was  
> always free'd
> before by CRYPTO_free (mem.c:434).
>
> I have attached the full log
>
>>
>> Another reason can be multi-threaded inconsistencies: mutexes locked in
>> inconsistent order by differnt threads (while this is not a cause for
>> crash, it leads to deadlock), mutexes locked in one thread and unlocked
>> in another or alike.  This can be detected by valgrind/helgrind
>>
>> valgrind --tool=helgrind --num-callers=30 --leak-check=full --track-
>> origins=yes --read-var-info=yes --trace-children=yes --track-fds=yes
>> /usr/local/cyrus/bin/master -D &> log-file-helgrind
>>
>
> I had to remove some options here, as my valgrind didn't know them with
> --tool=helgrind
>
> I used /usr/bin/valgrind --tool=helgrind --num-callers=30  
> --read-var-info=yes --trace-children=yes --track-fds=yes  
> /usr/local/libexec/master -C /etc/imapd_mu.conf -M  
> /etc/cyrus_mu.conf -p /var/run/cyurs_mu.pid -D &>  
> /tmp/cyrus-mupdate-log-file-helgrind
>
> Valgrind did find some "Possible data race during read" and "This  
> conflicts with a previous write"
> I have attached the full log as well
>
>
>
>> Of course, it is useful to have debugging symbols.  Compiling with -O3
>> migh make things faster, while compiling with -O0 will make it run
>> considerably slower. If ld (=ld.bfd) is very, very recent (e.g. 2.31),
>> then link with  -Wl,-z,noseparate-code otherwise valgrind cannot read
>> the debug info (https://bugs.kde.org/show_bug.cgi?id=395682).
>>
>> Then it shall be possible to see in one or the other log files what is
>> relevant for causing SIGSERV, in particular the first place where bad
>> operation happens, which might be some time before the crash.
>>
>
> I hope this helps untangle the web of multiple threads, race conditions
> and use after free, to find the root of this problem.
>
> If there is any more that i can do to help let me know.
>
>>
>> On Mon, 2018-07-02 at 17:24 +0200, Michael Menge wrote:
>>> Hi,
>>>
>>> we are in the process of setting up our new production mailserver with
>>> cyrus-imapd 3.0.7 on RHEL 7.5 Servers.
>>>
>>> At the moment we encounter many crashes (SIGSEGV) of the mupdate process on
>>> the mupdate master instance. As soon as we issue a command that updates
>>> multiple mailboxes in short time we trigger a SIGSEGV.
>>>
>>>> sam user/zrstes* cyrus all
>>>
>>> Setting ACL on user/zrstes1...OK.
>>> Setting ACL on user/zrstes1/Mail...OK.
>>> Setting ACL on user/zrstes1/Mail/drafts...OK.
>>> Setting ACL on user/zrstes1/Mail/s-spam...OK.
>>> Setting ACL on user/zrstes1/Mail/sent...OK.
>>> Setting ACL on user/zrstes1/Mail/trash...OK.
>>> Setting ACL on user/zrstes1/Mail/v-spam...OK.
>>> Setting ACL on user/zrstes2...cyrus: lrswipkxtea: no connection to server
>>>
>>> I suspect that at this time the connection from the backend to the
>>> mupdate master
>>> is lost as the mupdate process received the SIGSEGV
>>>
>>> (gdb) bt
>>> #0  0x00007fb49c000098 in ?? ()
>>> #1  0x00007fb4a9a59a85 in DH_free (r=0x7fb49c1b9600) at dh_lib.c:194
>>> #2  0x00007fb4aa84dbef in tls_shutdown_serverengine () at imap/tls.c:1311
>>> #3  0x0000000000404075 in conn_free (C=0x7fb4840009f0) at  
>>> imap/mupdate.c:379
>>> #4  0x00000000004062c0 in thread_main (rock=0x0) at imap/mupdate.c:1330
>>> #5  0x00007fb4a771add5 in start_thread (arg=0x7fb4a22d9700) at
>>> pthread_create.c:308
>>> #6  0x00007fb4a718fb3d in clone () at
>>> ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
>>>
>>>
>>> Further information, as DH_free is in the bt, we have NOT configured
>>> dh_params in
>>> our ssl certificate/key or in the imapd.conf
>>>
>
> Kind regards
>
>      Michael Menge
>
>
>
> --------------------------------------------------------------------------------
> M.Menge                                Tel.: (49) 7071/29-70316
> Universität Tübingen                   Fax.: (49) 7071/29-5912
> Zentrum für Datenverarbeitung          mail:  
> michael.menge at zdv.uni-tuebingen.de
> Wächterstraße 76
> 72074 Tübingen



--------------------------------------------------------------------------------
M.Menge                                Tel.: (49) 7071/29-70316
Universität Tübingen                   Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung          mail:  
michael.menge at zdv.uni-tuebingen.de
Wächterstraße 76
72074 Tübingen



More information about the Cyrus-devel mailing list