Cyrus 2.5.9 imapd children and tcp_keepalive & idle socket questions

ellie timoney ellie at fastmail.com
Mon Oct 3 18:07:43 EDT 2016


On Tue, Oct 4, 2016, at 12:46 AM, Andy Dorman via Info-cyrus wrote:
> On 10/02/2016 06:57 PM, ellie timoney via Info-cyrus wrote:
> > Hi Andy,
> >
> > Thanks very much for the detailed information.  It may take me a while
> > to digest this, but it looks like it will be very helpful.
> >
> > A very very initial (and, thus, possibly mistaken) assessment:
> >
> > * pselect timing out every 60 seconds sounds like what I would expect
> > while the server is in the IDLE command
> > * the fact that it keeps timing out tells us that, at least as far as
> > the server is concerned, the client is still there
> > * if the connection had dropped out, tcp_keepalive would mark it as EOF,
> > and pselect would return a number greater than zero -- not "0 (Timeout)"
> > -- to indicate that one of the selected descriptors had activity on it
> > (the EOF is a kind of activity).
> >
> > A couple of questions:
> >
> > * in your cyrus.conf, do you have "idled" enabled?  (It might be in the
> > START or DAEMON sections.)  You might have already answered this, I'll
> > check through your older messages in a minute
> > * in your syslog logs, do you get lines like "IDLE: error sending
> > message INIT to idled for mailbox <mailbox>: <error>. Falling back to
> > polling every 60 seconds."?  If so, what is the "error" section?
> >
> > If you could continue monitoring the strace output for pid 16168 and
> > send an update if any new patterns emerge, that would be great.
> >
> > Another thing that would be interesting would be if you could coordinate
> > with her to do a bunch of activity with the blackberry -- manually check
> > for new mail, manually open each folder, read some messages, change some
> > flags (such as toggling some messages as unread and then reading them
> > again), repeat for each folder, that sort of thing -- and continue
> > monitoring pid 16168 and see if the pattern changes at all.
> >
> > If you can avoid restarting for as long as possible, so we can keep
> > watching 16168 specifically, that would help.
> >
> > If you reach the point where it's going to be necessary to restart
> > cyrus, before you do so, try getting her to power cycle the blackberry,
> > and see what effect that has on 16168.  (But also don't do this until as
> > late as possible, so we can continue to look for interesting
> > patterns/deviations in the meantime.)
> >
> > Cheers,
> >
> > ellie
> >
> > On Sat, Oct 1, 2016, at 12:00 AM, Andy Dorman via Info-cyrus wrote:
> >> On 09/28/2016 09:59 AM, Andy Dorman wrote:
> >>> On 09/27/2016 07:08 PM, ellie timoney via Info-cyrus wrote:
> >>>> I wonder if we can see what the old processes are actually doing...
> >>>>
> >>>> Is there any chance you're able to obtain a backtrace of these?
> >>>>
> >>>> You can use a command something like:
> >>>>
> >>>>       gdb -p pid /path/to/imapd
> >>>>
> >>>> where pid is the one of the imapd process ids (which is the last column
> >>>> in that ls output from /run/cyrus/proc/) and where the correct path to
> >>>> wherever your imapd executable actually lives (I think this might be
> >>>> "/usr/lib/cyrus/bin/imapd" on Debian).  You should run this command as
> >>>> the same user that your cyrus server runs as (probably "cyrus").
> >>>>
> >>>> Once you're in gdb, look for output like "Reading symbols from
> >>>> /path/to/imapd...".  If this line also contains "no debugging symbols
> >>>> found", then we can't get anything useful like this, just "quit" -> "y"
> >>>> to exit gdb.  But if it does find debugging symbols, great: run "bt" to
> >>>> obtain a backtrace, then "quit" -> "y" to exit, and send through the
> >>>> output from the session (probably best to paste the output into a text
> >>>> file and then attach that to your email).
> >>>>
> >>>> Another possibility is to trace the system calls.  You can use a command
> >>>> something like this (also as the "cyrus" user):
> >>>>
> >>>>     strace -p pid
> >>>>
> >>>> Do not blindly email the output from this command, it may contain
> >>>> sensitive/private content!
> >>>>
> >>>> Let this run for 10 minutes or so, then press ctrl-c to stop it, and
> >>>> then I guess kind of try to describe the output, and we can go from
> >>>> there.  If it's very noisy, you've caught it in the middle of doing
> >>>> something. Wait a little while and try again, or try a different imapd
> >>>> pid: we're trying to see what the inactive ones are waiting for.
> >>>>
> >>>> Cheers,
> >>>>
> >>>> ellie
> >>>
> >>> Ellie, thanks for doing the above "extended debug" version of a reply.
> >>> While not a total "nube" to debugging compiled processes, I have not
> >>> done it for a long time and never with Linux, so I am unfamiliar with
> >>> the tools.  Your words are a great help.
> >>>
> >>> I installed and ran gdb on one of our "problem" servers against an
> >>> "idled" pid, 16168, and as you suspected, I found this.
> >>>
> >>> Reading symbols from /usr/lib/cyrus/bin/imapd...(no debugging symbols
> >>> found)...done.
> >>>
> >>> FYI, I confirmed this pid is indeed Idle according to this line in
> >>> /run/cyrus/proc/16168
> >>>
> >>> imap    gertrude.ironicdesign.com [192.168.0.16]    donations at cogift.co
> >>> cogift.co!user.donations    Idle
> >>>
> >>> So I have installed and started strace against idled pid 16168 and am
> >>> collecting the output to a file.
> >>>
> >>> sudo -u cyrus strace -p 16168 -t -o cyrusIdled
> >>>
> >>> 16168 is a process that was started about two hours ago (before I saw
> >>> your email) after I restarted cyrus to clear the old processes before we
> >>> reached the max. I added -t to get a time stamp on each line in case I
> >>> need to correlate anything weird with what is going on with the
> >>> Blackberry (like perhaps it is unable to connect as the owner drives
> >>> around?)
> >>>
> >>> I am tailing the log and so far I am seeing a 60-second cycle with
> >>> opening three files (donations.lock, cyrus.index, and cyrus.header) and
> >>> then ending with closing all three.
> >>>
> >>> Below is the strace output of a typical cycle I am seeing.
> >>>
> >>> Please note the first and last lines in this output snippet.  As you can
> >>> see by the time stamps, the first line is written at the end of the
> >>> previous 1 minute cycle, but what you can't see here is that it is
> >>> written without the "= 0 (Timeout)".  The next 1 min cycle begins by
> >>> adding the "= 0 (Timeout)" to the end of that line and finishes with the
> >>> same line again without the "= 0 (Timeout)".
> >>>
> >>> So it looks like the pselect call is timing out after 60 seconds and the
> >>> cycle starts over.  I have no idea if this is normal.  I may install
> >>> strace on our dev server and see what the strace output looks like with
> >>> a normal IMAP client like Thunderbird.
> >>>
> >>> 09:39:08 pselect6(16, [0 15], NULL, NULL, {60, 0}, {[], 8}) = 0 (Timeout)
> >>> 09:40:08 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> >>> 09:40:08 gettimeofday({1475073608, 554369}, NULL) = 0
> >>> 09:40:08
> >>> open("/run/cyrus/lock/domain/W/cogift.co/U/user/donations.lock",
> >>> O_RDWR|O_CREAT|O_TRUNC, 0666) = 19
> >>> 09:40:08 fcntl(19, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET,
> >>> l_start=0, l_len=0}) = 0
> >>> 09:40:08 fcntl(12, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET,
> >>> l_start=0, l_len=0}) = 0
> >>> 09:40:08 fstat(12, {st_mode=S_IFREG|0600, st_size=218344, ...}) = 0
> >>> 09:40:08 stat("/var/lib/cyrus/mailboxes.db", {st_mode=S_IFREG|0600,
> >>> st_size=218344, ...}) = 0
> >>> 09:40:08 fcntl(12, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET,
> >>> l_start=0, l_len=0}) = 0
> >>> 09:40:08
> >>> open("/var/spool/cyrus/mail/domain/W/cogift.co/U/user/donations/cyrus.index",
> >>> O_RDWR) = 20
> >>> 09:40:08 fstat(20, {st_mode=S_IFREG|0600, st_size=1688, ...}) = 0
> >>> 09:40:08 mmap(NULL, 16384, PROT_READ, MAP_SHARED, 20, 0) = 0x7f05ab7a9000
> >>> 09:40:08 fcntl(20, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET,
> >>> l_start=0, l_len=0}) = 0
> >>> 09:40:08 gettimeofday({1475073608, 555550}, NULL) = 0
> >>> 09:40:08
> >>> stat("/var/spool/cyrus/mail/domain/W/cogift.co/U/user/donations/cyrus.index",
> >>> {st_mode=S_IFREG|0600, st_size=1688, ...}) = 0
> >>> 09:40:08
> >>> open("/var/spool/cyrus/mail/domain/W/cogift.co/U/user/donations/cyrus.header",
> >>> O_RDONLY) = 21
> >>> 09:40:08 fstat(21, {st_mode=S_IFREG|0600, st_size=207, ...}) = 0
> >>> 09:40:08 mmap(NULL, 207, PROT_READ, MAP_SHARED, 21, 0) = 0x7f05ab7a8000
> >>> 09:40:08 munmap(0x7f05ab7a8000, 207)    = 0
> >>> 09:40:08 fcntl(20, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET,
> >>> l_start=0, l_len=0}) = 0
> >>> 09:40:08 gettimeofday({1475073608, 556060}, NULL) = 0
> >>> 09:40:08 open("/var/lib/cyrus/msg/shutdown", O_RDONLY) = -1 ENOENT (No
> >>> such file or directory)
> >>> 09:40:08 fcntl(14, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET,
> >>> l_start=0, l_len=0}) = 0
> >>> 09:40:08 fstat(14, {st_mode=S_IFREG|0600, st_size=336, ...}) = 0
> >>> 09:40:08 stat("/var/lib/cyrus/user_deny.db", {st_mode=S_IFREG|0600,
> >>> st_size=336, ...}) = 0
> >>> 09:40:08 fcntl(14, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET,
> >>> l_start=0, l_len=0}) = 0
> >>> 09:40:08 fcntl(20, F_SETLKW, {l_type=F_RDLCK, l_whence=SEEK_SET,
> >>> l_start=0, l_len=0}) = 0
> >>> 09:40:08 gettimeofday({1475073608, 556577}, NULL) = 0
> >>> 09:40:08
> >>> stat("/var/spool/cyrus/mail/domain/W/cogift.co/U/user/donations/cyrus.index",
> >>> {st_mode=S_IFREG|0600, st_size=1688, ...}) = 0
> >>> 09:40:08 close(21)                      = 0
> >>> 09:40:08
> >>> open("/var/spool/cyrus/mail/domain/W/cogift.co/U/user/donations/cyrus.header",
> >>> O_RDONLY) = 21
> >>> 09:40:08 fstat(21, {st_mode=S_IFREG|0600, st_size=207, ...}) = 0
> >>> 09:40:08 mmap(NULL, 207, PROT_READ, MAP_SHARED, 21, 0) = 0x7f05ab7a8000
> >>> 09:40:08 munmap(0x7f05ab7a8000, 207)    = 0
> >>> 09:40:08 fcntl(20, F_SETLKW, {l_type=F_UNLCK, l_whence=SEEK_SET,
> >>> l_start=0, l_len=0}) = 0
> >>> 09:40:08 gettimeofday({1475073608, 557191}, NULL) = 0
> >>> 09:40:08 close(21)                      = 0
> >>> 09:40:08 close(20)                      = 0
> >>> 09:40:08 munmap(0x7f05ab7a9000, 16384)  = 0
> >>> 09:40:08 close(19)                      = 0
> >>> 09:40:08 rt_sigprocmask(SIG_BLOCK, [INT QUIT ALRM TERM CHLD], [], 8) = 0
> >>> 09:40:08 pselect6(16, [0 15], NULL, NULL, {60, 0}, {[], 8}
> >>>
> >>> I have no idea if the pselect timeout is important, but I wanted to
> >>> point out that behavior since it is not apparent from just looking at
> >>> the output after the fact.
> >>>
> >>> I am going to let this run all day and will scan it for anything
> >>> different from this 60-second cycle and will post results back to the
> >>> list around this time tomorrow.
> >>>
> >>> Thanks again for all your help.
> >>>
> >>
> >> OK, after almost 24 hours, donations at cogift.com had 56 Idle imap
> >> processes.  The original pid #16168 I watched above continued to do this
> >> 60-second cycle the entire 24 hours.  I picked some of the other 56 pids
> >> at random and watched them for a bit and they were all doing the same
> >> 60-second cycle.
> >>
> >> It is probably obvious to others on this list, but for a comparison I
> >> watched a pid for a different low volume account (maybe 5 emails/day)
> >> but with a Thunderbird client.
> >>
> >> When I started the trace the proc file showed the imap process as Idle
> >> just like the problem one above, but the trace looked completely
> >> different...no timeout on the pselect6 call (until the end), and no
> >> fixed 60-second loop. And perhaps more importantly, after about two
> >> hours the first pselect6 call got a timeout and the pid turned itself
> >> off like this.
> >>
> >> 09:24:09 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> >> 09:24:09 rt_sigprocmask(SIG_BLOCK, [INT QUIT ALRM TERM CHLD], [], 8) = 0
> >> 09:24:09 pselect6(1, [0], NULL, NULL, {0, 0}, {[], 8}) = 0 (Timeout)
> >> 09:24:09 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> >> 09:24:09 sendto(6, "<20>Sep 29 09:24:09 cyrus/imap[7"..., 75,
> >> MSG_NOSIGNAL, NULL, 0) = 75
> >> 09:24:09 write(1,
> >> "\322Rp\212t\205T\254i\371E\n%\371\371\300Z8/\235\227\v\0\0\0\377\377",
> >> 27) = 27
> >> 09:24:09 unlink("/run/cyrus/proc/7426") = 0
> >> 09:24:09 munmap(0x7f56c3c5e000, 135168) = 0
> >> 09:24:09 munmap(0x7f56c3c7f000, 135168) = 0
> >> 09:24:09 open("/dev/null", O_RDWR)      = 16
> >> 09:24:09 shutdown(0, SHUT_RD)           = 0
> >> 09:24:09 dup2(16, 0)                    = 0
> >> 09:24:09 shutdown(1, SHUT_RD)           = 0
> >> 09:24:09 dup2(16, 1)                    = 1
> >> 09:24:09 shutdown(2, SHUT_RD)           = 0
> >> 09:24:09 dup2(16, 2)                    = 2
> >> 09:24:09 close(16)                      = 0
> >> 09:24:09 write(3, "\1\0\0\0\2\35\0\0", 8) = 8
> >> 09:24:09 rt_sigaction(SIGALRM, {0x7f56c284db80, [],
> >> SA_RESTORER|SA_RESETHAND|SA_SIGINFO, 0x7f56c307eeb0}, NULL, 8) = 0
> >> 09:24:09 rt_sigaction(SIGQUIT, {0x7f56c284db80, [],
> >> SA_RESTORER|SA_RESETHAND|SA_SIGINFO, 0x7f56c307eeb0}, NULL, 8) = 0
> >> 09:24:09 rt_sigaction(SIGINT, {0x7f56c284db80, [],
> >> SA_RESTORER|SA_RESETHAND|SA_SIGINFO, 0x7f56c307eeb0}, NULL, 8) = 0
> >> 09:24:09 rt_sigaction(SIGTERM, {0x7f56c284db80, [],
> >> SA_RESTORER|SA_RESETHAND|SA_SIGINFO, 0x7f56c307eeb0}, NULL, 8) = 0
> >> 09:24:09 rt_sigaction(SIGHUP, {0x7f56c284db80, [],
> >> SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f56c307eeb0}, NULL, 8) = 0
> >> 09:24:09 alarm(109)                     = 0
> >> 09:24:09 fcntl(15, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET,
> >> l_start=0, l_len=0}) = ? ERESTARTSYS (To be restarted if SA_RESTART is
> >> set)
> >> 09:25:58 --- SIGALRM {si_signo=SIGALRM, si_code=SI_KERNEL} ---
> >> 09:25:58 rt_sigreturn({mask=[]})        = -1 EINTR (Interrupted system
> >> call)
> >> 09:25:58 close(10)                      = 0
> >> 09:25:58 munmap(0x7f56c3ca8000, 40960)  = 0
> >> 09:25:58 close(11)                      = 0
> >> 09:25:58 munmap(0x7f56c3cc8000, 24576)  = 0
> >> 09:25:58 close(12)                      = 0
> >> 09:25:58 munmap(0x7f56c3ca4000, 16384)  = 0
> >> 09:25:58 close(14)                      = 0
> >> 09:25:58 munmap(0x7f56c3ca0000, 16384)  = 0
> >> 09:25:58 close(13)                      = 0
> >> 09:25:58 unlink("/var/lib/cyrus/socket/idle.7426") = 0
> >> 09:25:58 shutdown(0, SHUT_RD)           = -1 ENOTSOCK (Socket operation
> >> on non-socket)
> >> 09:25:58 close(0)                       = 0
> >> 09:25:58 shutdown(1, SHUT_RD)           = -1 ENOTSOCK (Socket operation
> >> on non-socket)
> >> 09:25:58 close(1)                       = 0
> >> 09:25:58 shutdown(2, SHUT_RD)           = -1 ENOTSOCK (Socket operation
> >> on non-socket)
> >> 09:25:58 close(2)                       = 0
> >> 09:25:58 exit_group(0)                  = ?
> >> 09:25:58 +++ exited with 0 +++
> >>
> >> Just FYI, my tcp_keepalive settings are here:
> >>
> >> tcp_keepalive: 1
> >> tcp_keepalive_idle: 900
> >> tcp_keepalive_cnt: 5
> >> tcp_keepalive_intvl: 75
> >>
> >> So I expect at this point my options are:
> >>
> >> - Tell this user to NOT use their BlackBerry
> >> OR
> >> - plan to restart cyrus-imap every 24 hours on the 3 servers that the
> >> Blackberry is checking until this user's Blackberry dies or Cyrus 2.5.x
> >> is updated to deal with it (whatever "it" is).
> >>
> >> If it will help I am still willing to set up wireshark to collect port
> >> 143 traffic from the BB to see if we can see what the BB is doing (or
> >> not doing) to cause this.
> >>
> >> --
> >> Andy Dorman
> >> Ironic Design, Inc.
> >> AnteSpam.com
> >>
> >> ----
> >> Cyrus Home Page: http://www.cyrusimap.org/
> >> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
> >> To Unsubscribe:
> >> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
> > ----
> > Cyrus Home Page: http://www.cyrusimap.org/
> > List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
> > To Unsubscribe:
> > https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
> >
> 
> Ellie, thank you again for responding. To answer your questions:
> 
> * We have this in START {
> ...
> # this is only necessary if idlemethod is set to "idled" in imapd.conf
> idle            cmd="idled"
> ...
> }
> 
> * I checked the cyrus syslog entries for the past 8 hours and did not 
> see anything like that.
> 
> Here are the log entries over the past 8 hours for one of the 
> "misbehaving" email accounts being checked by this old Blackberry. The 
> only time we logged any errors at all was during a restart I did this 
> morning at 0734 after the process count for these two mailboxes had 
> climbed to over 100 in about 10 hours.
> 
> 2016-10-03T00:08:24.612798-05:00 dorcas cyrus/imap[7380]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-7380-1475471304-1-6911887814836643318>
> 2016-10-03T00:24:14.391568-05:00 dorcas cyrus/imap[7831]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-7831-1475472254-1-5820101441045651623>
> 2016-10-03T00:28:59.679112-05:00 dorcas cyrus/imap[7831]: USAGE 
> donations at cogift.co user: 0.012000 sys: 0.008000
> 2016-10-03T00:29:01.906754-05:00 dorcas cyrus/imap[8066]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-8066-1475472541-1-4242711481019243188>
> 2016-10-03T00:38:19.818027-05:00 dorcas cyrus/imap[9277]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-9277-1475473099-1-4936125750805326048>
> 2016-10-03T00:54:28.759669-05:00 dorcas cyrus/imap[9852]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-9852-1475474068-1-12165950019524758937>
> 2016-10-03T01:09:28.689482-05:00 dorcas cyrus/imap[10690]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-10690-1475474968-1-15742350519470797417>
> 2016-10-03T01:13:25.916203-05:00 dorcas cyrus/imap[11003]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-11003-1475475205-1-8476452722367884553>
> 2016-10-03T01:33:28.762142-05:00 dorcas cyrus/imap[11583]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-11583-1475476408-1-8653809589351466426>
> 2016-10-03T01:41:21.636360-05:00 dorcas cyrus/imap[11692]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-11692-1475476881-1-12970257928111959648>
> 2016-10-03T02:21:04.896098-05:00 dorcas cyrus/imap[11692]: USAGE 
> donations at cogift.co user: 0.036000 sys: 0.000000
> 2016-10-03T02:21:06.061706-05:00 dorcas cyrus/imap[13400]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-13400-1475479266-1-12497454465692310397>
> 2016-10-03T02:34:51.827624-05:00 dorcas cyrus/imap[13623]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-13623-1475480091-1-5919053177859451233>
> 2016-10-03T02:54:28.365411-05:00 dorcas cyrus/imap[14236]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-14236-1475481268-1-16059406662734805679>
> 2016-10-03T03:09:27.911036-05:00 dorcas cyrus/imap[15348]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-15348-1475482167-1-6651971256242559735>
> 2016-10-03T03:20:18.101926-05:00 dorcas cyrus/imap[15557]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-15557-1475482818-1-12999714951485474830>
> 2016-10-03T03:35:26.076582-05:00 dorcas cyrus/imap[16027]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-16027-1475483726-1-17484312272884286688>
> 2016-10-03T03:54:28.151894-05:00 dorcas cyrus/imap[16678]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-16678-1475484868-1-7730339889453383253>
> 2016-10-03T03:59:27.520312-05:00 dorcas cyrus/imap[16866]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-16866-1475485167-1-11448772539541097453>
> 2016-10-03T04:10:54.348631-05:00 dorcas cyrus/imap[17156]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-17156-1475485854-1-8010797533615008746>
> 2016-10-03T04:24:36.884778-05:00 dorcas cyrus/imap[17705]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-17705-1475486676-1-12868681160384341448>
> 2016-10-03T04:32:41.259179-05:00 dorcas cyrus/imap[17862]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-17862-1475487161-1-6273312087185643950>
> 2016-10-03T04:47:18.375677-05:00 dorcas cyrus/imap[18572]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-18572-1475488038-1-15483399098754499531>
> 2016-10-03T05:05:40.304037-05:00 dorcas cyrus/imap[19546]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-19546-1475489140-1-11796786166884717875>
> 2016-10-03T05:17:02.062617-05:00 dorcas cyrus/imap[20038]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-20038-1475489822-1-16709716609084832908>
> 2016-10-03T05:31:17.310397-05:00 dorcas cyrus/imap[20349]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-20349-1475490677-1-15234240411390132694>
> 2016-10-03T05:48:11.928860-05:00 dorcas cyrus/imap[21014]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-21014-1475491691-1-11754979809267352059>
> 2016-10-03T05:52:29.057422-05:00 dorcas cyrus/imap[21213]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-21213-1475491949-1-2595590127926842652>
> 2016-10-03T06:09:32.527403-05:00 dorcas cyrus/imap[22210]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-22210-1475492972-1-205848532342279172>
> 2016-10-03T06:24:28.098286-05:00 dorcas cyrus/imap[22403]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-22403-1475493868-1-16779192082609503452>
> 2016-10-03T06:39:28.185876-05:00 dorcas cyrus/imap[22960]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-22960-1475494768-1-3435864451273408603>
> 2016-10-03T06:53:35.488585-05:00 dorcas cyrus/imap[23325]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-23325-1475495615-1-1765830263139952779>
> 2016-10-03T07:09:28.469822-05:00 dorcas cyrus/imap[24904]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-24904-1475496568-1-17727629097969100823>
> 2016-10-03T07:24:27.973398-05:00 dorcas cyrus/imap[25551]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-25551-1475497467-1-15780735317522922191>
> 
> *** initiated a restart at 0734 ***
> 
> 2016-10-03T07:34:37.746933-05:00 dorcas cyrus/imap[3991]: IDLE: error 
> sending message DONE to idled for mailbox cogift.co!user.donations: 
> Resource temporarily unavailable.
> 2016-10-03T07:34:37.746972-05:00 dorcas cyrus/imap[16866]: IDLE: error 
> sending message DONE to idled for mailbox cogift.co!user.donations: 
> Resource temporarily unavailable.
> 2016-10-03T07:39:28.226240-05:00 dorcas cyrus/imap[25991]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-25991-1475498368-1-3231332521834286919>
> 2016-10-03T07:54:35.401648-05:00 dorcas cyrus/imap[27066]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-27066-1475499275-1-764446623378088852>
> 2016-10-03T08:00:25.354125-05:00 dorcas cyrus/imap[27103]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-27103-1475499625-1-9978294241760492136>
> 
> As for monitoring pid 16168, I am afraid that boat sailed several days 
> ago because we have to keep the process count under pretty tight 
> control.  However, I can (and will) coordinate a time with that user 
> this week to do what you request. I will initiate a new strace, 
> establish a baseline similar to 16618, and then have her manipulate the 
> account as you suggest and finish by power cycling the BB.
> 
> I do have one question about running the strace above if the account has 
> multiple pids. For example, here an hour after a restart that account 
> has 4 pids already (the BB with two accounts has 8 pids).
> 
> -rw------- 1 cyrus mail  96 Oct  3 07:42 25991
> -rw------- 1 cyrus mail  96 Oct  3 07:54 27066
> -rw------- 1 cyrus mail  96 Oct  3 08:00 27103
> -rw------- 1 cyrus mail  96 Oct  3 08:20 27507
> 
> Do you have any idea of which pid to run strace on?  Should I run it on 
> the oldest or newest? Or should I run a restart, start tracing the first 
> pid that pops up for this user and stay with that one?
> 
> Thanks for all your help.
> 
> -- 
> Andy Dorman
> Ironic Design, Inc.
> AnteSpam.com
> 
> ----
> Cyrus Home Page: http://www.cyrusimap.org/
> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
> To Unsubscribe:
> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus

Hi Andy,

> 2016-10-03T00:24:14.391568-05:00 dorcas cyrus/imap[7831]: login: 
> hermione.ironicdesign.com [192.168.0.17] donations at cogift.co plaintext 
> User logged in SESSIONID=<cyrus-7831-1475472254-1-5820101441045651623>
> 2016-10-03T00:28:59.679112-05:00 dorcas cyrus/imap[7831]: USAGE 
> donations at cogift.co user: 0.012000 sys: 0.008000

That's interesting -- the "USAGE ..." line is logged when the user
session logs out (or the server is shutting down).  So we can see from
the presence of the "USAGE" lines that at least some of these sessions
get logged out normally, and don't dangle around indefinitely.

> Do you have any idea of which pid to run strace on?  

Not really, unfortunately...

> Should I run it on 
> the oldest or newest? Or should I run a restart, start tracing the first 
> pid that pops up for this user and stay with that one?

Oldest would be best, I think, especially if you can get the first one
that pops up after a restart.  Maybe a selection of several, some very
old, some fairly new, some in between -- if it can be managed without
putting too much additional strain on the system (I don't actually know
how resource intensive strace is).

I don't know if it's possible to "exit" the mail application on the
blackberry (like you could on a PC), or if it's deeply embedded and
always running while the device is on (like the SMS app on a phone). 
But if you *can* exit the mail application, such that it completely
stops running and is no longer checking for mail, then we might get
another interesting data point if you first exit the mail application,
wait a while like that (maybe 10-30 mins?) and then power off the device
(and maybe leave it off for a while too if the user can abide that).

I expect that at some point during this test, some or all of those
connections will disconnect.  It will be interesting to see exactly when
this occurs: it might be when the mail application exits, a little while
after the mail application exits, when the device is powered off, a
little while after the device is powered off, or something else.  And
also, which ones go away vs hang around.

Cheers,

ellie


More information about the Info-cyrus mailing list