The master janitor goes crazy / Re: Debugging Deadlocks

Дилян Палаузов dilyan.palauzov at aegee.org
Mon Nov 25 09:29:01 EST 2019


Hello,

> I run cyrus imap 3.0.x with some private changes.
> 
> Sometimes when stop the master process, the master process utilizes one CPU core to 100% for 5 minutes.  After the fifth
> minute, systemd enforces kill -9. When I attach to the maste process, I see that it some janitor does some work, but I
> have not checked the details.  Has anybody experienced this?

I run cyrus imap.   At some moment I recompile and reinstall the binaries, which in theory means that the binaries
detect this change and restart theirselves.  At some moment I call "systemctl stop cyrus-imap" which I guess sends
SIGTERM to the master process.   Then the CPU utilization of the master process goes to 100%.  In the systemd service
file I have TimeoutStopSec=320 . After this time, the master process continues running and systemd sends 9/SIGKILL.  It
is not necessary that on re-installing the binaries, and then shutting down the CPU goes to 100%: it is possible that
the CPU goes to 100%, without reinstalling (and thus triggering self-restarting) of the imapd/httpd binaries.

It is often, but not always, that this 100% CPU loop is entered on shutdown.

I have a webmail client and to speedup things it uses SquirrelMail's IMAP Proxy (http://www.imapproxy.org/ a Caching
IMAP proxy).  It is recommended in the installation manual of Horde/IMP.  The IMAP caching proxy connects to
127.0.0.2:143 (and is therefore permitted to skip the TLS overload).  In master conf I have a line
“imaplocal     cmd="imapd -C /usr/local/etc/cyrus/imapdlocal.conf" listen="127.0.0.2:imap" prefork=0”.

When the CPU goes to 100% on shutdown I connect with gdb to the master process.  Below is the full backtrace.  Does
somebody have an explanation why the master process enters a never ending loop?

I do not say that all above information has to be involved in the anwer.  Has somebody else experienced this effects? 
Any suggestions how to investigate this deeper?

Greetings
  Дилян

---
warning: Could not load vsyscall page because no executable was specified
Reading symbols from /usr/local/libexec/master...
Attaching to program: /usr/local/libexec/master, process 9247
Reading symbols from /usr/local/lib/libcyrus_min.so.0...
Reading symbols from /lib/libuuid.so.1...
Reading symbols from /usr/local/lib/libgssapi_krb5.so.2...
Reading symbols from /usr/local/lib/libkrb5.so.3...
Reading symbols from /usr/local/lib/libk5crypto.so.3...
Reading symbols from /usr/local/lib/libcom_err.so.3...
Reading symbols from /usr/local/lib/libkrb5support.so.0...
Reading symbols from /usr/local/lib/libpcreposix.so.0...
(No debugging symbols found in /usr/local/lib/libpcreposix.so.0)
Reading symbols from /usr/local/lib/libpcre.so.1...
(No debugging symbols found in /usr/local/lib/libpcre.so.1)
Reading symbols from /usr/local/lib/libxml2.so.2...
Reading symbols from /usr/local/lib/liblzma.so.5...
(No debugging symbols found in /usr/local/lib/liblzma.so.5)
Reading symbols from /usr/local/lib/libical.so.3...
Reading symbols from /usr/local/lib/libicalss.so.3...
Reading symbols from /usr/local/lib/libicalvcal.so.3...
Reading symbols from /usr/local/lib/libicui18n.so.63...
Reading symbols from /usr/local/lib/libicuuc.so.63...
Reading symbols from /usr/local/lib/libicudata.so.63...
(No debugging symbols found in /usr/local/lib/libicudata.so.63)
Reading symbols from /usr/local/lib/libsqlite3.so.0...
(No debugging symbols found in /usr/local/lib/libsqlite3.so.0)
Reading symbols from /usr/local/lib/libz.so.1...
(No debugging symbols found in /usr/local/lib/libz.so.1)
Reading symbols from /lib64/libm.so.6...
Reading symbols from /lib64/libdl.so.2...
Reading symbols from /lib64/libpthread.so.0...
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Reading symbols from /lib64/libc.so.6...
Reading symbols from /lib64/ld-linux-x86-64.so.2...
Reading symbols from /lib64/libresolv.so.2...
Reading symbols from /usr/local/lib/libdb-18.1.so...
Reading symbols from /usr/local/lib64/libstdc++.so.6...
Reading symbols from /usr/local/lib64/libgcc_s.so.1...
Reading symbols from /usr/local/lib64/libssl.so.1.1...
Reading symbols from /usr/local/lib64/libcrypto.so.1.1...
Reading symbols from /lib64/libnss_db.so.2...
Reading symbols from /lib64/libnss_files.so.2...
Reading symbols from /lib64/libnss_dns.so.2...
0x0000000000405406 in child_janitor (now=...) at master/master.c:1192
1192	        janitor_position = janitor_position % child_table_size;
™(gdb) bt f
  Id   Target Id                                 Frame 
* 1    Thread 0x7f6a08759780 (LWP 9247) "master" 0x0000000000405406 in child_janitor (now=...) at master/master.c:1192
#0  0x0000000000405406 in child_janitor (now=...) at master/master.c:1192
        i = 9299
        p = 0x4132e0 <ctable+16224>
        c = 0x0
#1  0x0000000000409dd7 in main (argc=4, argv=0x7ffea3075108) at master/master.c:2600
        i = 14
        ready_fds = 3
        total_children = 11
        tv = {
          tv_sec = 0,
          tv_usec = 0
        }
        msg = {
          message = 1,
          service_pid = 28219
        }
        maxfd = 41
        tvptr = 0x0
        interrupted = 0
        pidfile = 0x40c4f0 "/var/run/cyrus-master.pid"
        pidfile_lock = 0x2135ba0 "/usr/local/etc/cyrus/imapdlocal.conf"
        startup_pipe = {6, 7}
        pidlock_fd = -1
        i = 14
        opt = -1
        close_std = 1
        daemon_mode = 1
        error_log = 0x0
        alt_config = 0x0
        fd = 3
        rfds = {
          fds_bits = {266272, 0 <repeats 15 times>}
        }
        r = 1
        now = {
          tv_sec = 1574690925,
          tv_usec = 958878
        }
        p = 0x0
quit
Detaching from program: /usr/local/libexec/master, process 9247
[Inferior 1 (process 9247) detached]





More information about the Cyrus-devel mailing list