The master janitor goes crazy / Re: Debugging Deadlocks
ellie timoney
ellie at fastmail.com
Tue Nov 26 17:17:13 EST 2019
Can you strace the master process next time it's spinning at 100%? What is it doing at that time?
On Tue, Nov 26, 2019, at 1:29 AM, Дилян Палаузов wrote:
> Hello,
>
> > I run cyrus imap 3.0.x with some private changes.
> >
> > Sometimes when stop the master process, the master process utilizes one CPU core to 100% for 5 minutes. After the fifth
> > minute, systemd enforces kill -9. When I attach to the maste process, I see that it some janitor does some work, but I
> > have not checked the details. Has anybody experienced this?
>
> I run cyrus imap. At some moment I recompile and reinstall the
> binaries, which in theory means that the binaries
> detect this change and restart theirselves. At some moment I call
> "systemctl stop cyrus-imap" which I guess sends
> SIGTERM to the master process. Then the CPU utilization of the master
> process goes to 100%. In the systemd service
> file I have TimeoutStopSec=320 . After this time, the master process
> continues running and systemd sends 9/SIGKILL. It
> is not necessary that on re-installing the binaries, and then shutting
> down the CPU goes to 100%: it is possible that
> the CPU goes to 100%, without reinstalling (and thus triggering
> self-restarting) of the imapd/httpd binaries.
>
> It is often, but not always, that this 100% CPU loop is entered on shutdown.
>
> I have a webmail client and to speedup things it uses SquirrelMail's
> IMAP Proxy (http://www.imapproxy.org/ a Caching
> IMAP proxy). It is recommended in the installation manual of
> Horde/IMP. The IMAP caching proxy connects to
> 127.0.0.2:143 (and is therefore permitted to skip the TLS overload).
> In master conf I have a line
> “imaplocal cmd="imapd -C /usr/local/etc/cyrus/imapdlocal.conf"
> listen="127.0.0.2:imap" prefork=0”.
>
> When the CPU goes to 100% on shutdown I connect with gdb to the master
> process. Below is the full backtrace. Does
> somebody have an explanation why the master process enters a never
> ending loop?
>
> I do not say that all above information has to be involved in the
> anwer. Has somebody else experienced this effects?
> Any suggestions how to investigate this deeper?
>
> Greetings
> Дилян
>
> ---
> warning: Could not load vsyscall page because no executable was
> specified
> Reading symbols from /usr/local/libexec/master...
> Attaching to program: /usr/local/libexec/master, process 9247
> Reading symbols from /usr/local/lib/libcyrus_min.so.0...
> Reading symbols from /lib/libuuid.so.1...
> Reading symbols from /usr/local/lib/libgssapi_krb5.so.2...
> Reading symbols from /usr/local/lib/libkrb5.so.3...
> Reading symbols from /usr/local/lib/libk5crypto.so.3...
> Reading symbols from /usr/local/lib/libcom_err.so.3...
> Reading symbols from /usr/local/lib/libkrb5support.so.0...
> Reading symbols from /usr/local/lib/libpcreposix.so.0...
> (No debugging symbols found in /usr/local/lib/libpcreposix.so.0)
> Reading symbols from /usr/local/lib/libpcre.so.1...
> (No debugging symbols found in /usr/local/lib/libpcre.so.1)
> Reading symbols from /usr/local/lib/libxml2.so.2...
> Reading symbols from /usr/local/lib/liblzma.so.5...
> (No debugging symbols found in /usr/local/lib/liblzma.so.5)
> Reading symbols from /usr/local/lib/libical.so.3...
> Reading symbols from /usr/local/lib/libicalss.so.3...
> Reading symbols from /usr/local/lib/libicalvcal.so.3...
> Reading symbols from /usr/local/lib/libicui18n.so.63...
> Reading symbols from /usr/local/lib/libicuuc.so.63...
> Reading symbols from /usr/local/lib/libicudata.so.63...
> (No debugging symbols found in /usr/local/lib/libicudata.so.63)
> Reading symbols from /usr/local/lib/libsqlite3.so.0...
> (No debugging symbols found in /usr/local/lib/libsqlite3.so.0)
> Reading symbols from /usr/local/lib/libz.so.1...
> (No debugging symbols found in /usr/local/lib/libz.so.1)
> Reading symbols from /lib64/libm.so.6...
> Reading symbols from /lib64/libdl.so.2...
> Reading symbols from /lib64/libpthread.so.0...
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> Reading symbols from /lib64/libc.so.6...
> Reading symbols from /lib64/ld-linux-x86-64.so.2...
> Reading symbols from /lib64/libresolv.so.2...
> Reading symbols from /usr/local/lib/libdb-18.1.so...
> Reading symbols from /usr/local/lib64/libstdc++.so.6...
> Reading symbols from /usr/local/lib64/libgcc_s.so.1...
> Reading symbols from /usr/local/lib64/libssl.so.1.1...
> Reading symbols from /usr/local/lib64/libcrypto.so.1.1...
> Reading symbols from /lib64/libnss_db.so.2...
> Reading symbols from /lib64/libnss_files.so.2...
> Reading symbols from /lib64/libnss_dns.so.2...
> 0x0000000000405406 in child_janitor (now=...) at master/master.c:1192
> 1192 janitor_position = janitor_position % child_table_size;
> ™(gdb) bt f
> Id Target Id Frame
> * 1 Thread 0x7f6a08759780 (LWP 9247) "master" 0x0000000000405406 in
> child_janitor (now=...) at master/master.c:1192
> #0 0x0000000000405406 in child_janitor (now=...) at
> master/master.c:1192
> i = 9299
> p = 0x4132e0 <ctable+16224>
> c = 0x0
> #1 0x0000000000409dd7 in main (argc=4, argv=0x7ffea3075108) at
> master/master.c:2600
> i = 14
> ready_fds = 3
> total_children = 11
> tv = {
> tv_sec = 0,
> tv_usec = 0
> }
> msg = {
> message = 1,
> service_pid = 28219
> }
> maxfd = 41
> tvptr = 0x0
> interrupted = 0
> pidfile = 0x40c4f0 "/var/run/cyrus-master.pid"
> pidfile_lock = 0x2135ba0 "/usr/local/etc/cyrus/imapdlocal.conf"
> startup_pipe = {6, 7}
> pidlock_fd = -1
> i = 14
> opt = -1
> close_std = 1
> daemon_mode = 1
> error_log = 0x0
> alt_config = 0x0
> fd = 3
> rfds = {
> fds_bits = {266272, 0 <repeats 15 times>}
> }
> r = 1
> now = {
> tv_sec = 1574690925,
> tv_usec = 958878
> }
> p = 0x0
> quit
> Detaching from program: /usr/local/libexec/master, process 9247
> [Inferior 1 (process 9247) detached]
>
>
>
>
More information about the Cyrus-devel
mailing list