prevent stuck processes with large folder manipulations

Paul Dekkers Paul.Dekkers at surfnet.nl
Sun Jan 3 13:35:07 EST 2010


Hi Brian,

On 03-01-10 03:29, Brian Awood wrote:

> We used to run into this fairly frequently when we were running 2.2 proxy 
> hosts.  Although it never reached the point where it caused memory 
> exhaustion, usually we would catch it at an early stage because of a 

Are you running 32-bit or 64-bit? We run 64-bit, and I realized that this
allows a single imapd process to consume a considerable amount of memory
(e.g. all of it) instead of "just" 2G or so per process. (The server I'm
talking about has 6G of memory and 2G swap, which should be OK for ~50 active
users.) But now there's nothing limiting the memory consumption of a
single user/process, I guess.
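
One way to get a per-process ceiling back on a 64-bit host might be an
address-space rlimit that is set just before the daemon is exec'd, since
rlimits are inherited by child processes. The sketch below only
illustrates the mechanism: the 2 GiB value simply mirrors the 32-bit
figure, the helper names are made up, and whether the Cyrus master
behaves well when started this way is untested.

```python
import resource
import subprocess

TWO_GIB = 2 * 1024 ** 3  # assumed ceiling, mirroring the 32-bit figure


def effective_limit(requested: int, hard: int) -> int:
    """Clamp the requested soft limit so it never exceeds the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def cap_address_space(requested: int = TWO_GIB) -> None:
    """Lower the soft RLIMIT_AS of the calling process; keep the hard limit."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    resource.setrlimit(resource.RLIMIT_AS,
                       (effective_limit(requested, hard), hard))


def run_capped(argv: list, requested: int = TWO_GIB) -> int:
    """Run argv with the cap applied in the child just before exec."""
    proc = subprocess.run(argv,
                          preexec_fn=lambda: cap_address_space(requested))
    return proc.returncode
```

A process that hits the cap gets allocation failures rather than pushing
the box into swap, so the failure stays with one session instead of
triggering the kernel's OOM killer.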

> slowdown in replication.  It always appeared to be a Thunderbird client, 
> and at one time I was able to reproduce the behavior.
> 
> Do you have any proxies in your configuration, if so what version are 
> they?  I believe there is some sort of weird interaction between cyrus 
> 2.2 and thunderbird (at least 2.x).  I had posted some info to the 
> mozilla bugtracker;  https://bugzilla.mozilla.org/show_bug.cgi?id=340265  
> But before I really had the time to investigate we upgraded our proxies to 
> 2.3 and as far as I know the problem seems to have stopped. 

We are not running any proxies, well, other than stunnel doing SSL
offloading on another box. So apart from SSL the clients are talking
directly to imapd. I do believe it's just Thunderbird timing out. This
box is running 2.3.13 (Simon's RPM on 64-bit RHEL 4), but I recall
seeing it with older versions before (of both Cyrus and TB).

Actually, the mozilla bugtracker reference sounds very similar. It
happens with large deletions too, just like with large copies, as
described in this bug. But we see this with much more recent versions:
Thunderbird 3 and the recent 2.3 cyrus.

Regarding the last suggestion in this bug: for the deletes I did
consider the delayed expunge and/or delete, but that wouldn't help with
the large copies.

(While scripting something for moves myself in python, I indeed did it
batch-wise rather than "1:*", but this is clearly not something TB does;
it rather does all messages at the same time, resulting in these timeouts.)
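
For illustration, a batch-wise move could look roughly like the sketch
below. It is not the script mentioned above; the function names and the
batch size of 500 are assumptions, and it uses only plain UID COPY,
UID STORE and EXPUNGE through Python's imaplib. Mailbox names containing
spaces would need to be quoted by the caller.

```python
import imaplib
from typing import Iterator, List


def batches(uids: List[int], size: int = 500) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` UIDs."""
    if size < 1:
        raise ValueError("batch size must be at least 1")
    for start in range(0, len(uids), size):
        yield uids[start:start + size]


def move_in_batches(conn: imaplib.IMAP4, source: str, target: str,
                    size: int = 500) -> int:
    """Copy every message from source to target in small batches.

    Each batch is copied, flagged \\Deleted and expunged before the next
    one starts, so no single command has to touch the whole folder.
    """
    typ, _ = conn.select(source)
    if typ != "OK":
        raise RuntimeError("cannot select %s" % source)
    typ, data = conn.uid("SEARCH", "ALL")
    if typ != "OK":
        raise RuntimeError("UID SEARCH failed")
    uids = [int(u) for u in data[0].split()]
    moved = 0
    for chunk in batches(uids, size):
        uid_set = ",".join(str(u) for u in chunk)
        typ, _ = conn.uid("COPY", uid_set, target)
        if typ != "OK":
            # Stop here: nothing from this batch has been deleted yet.
            raise RuntimeError("UID COPY failed for %s" % uid_set)
        conn.uid("STORE", uid_set, "+FLAGS.SILENT", r"(\Deleted)")
        # Note: EXPUNGE removes every \Deleted message in the folder.
        conn.expunge()
        moved += len(chunk)
    return moved
```

Because a batch is only flagged for deletion after its copy succeeded, a
timeout or kill in the middle leaves at most one batch duplicated
instead of the whole folder.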

Thanks for your dupseek suggestion; I will definitely take a look at that.

Regards,
Paul


> On Friday 01 January 2010 @ 15:45, Paul Dekkers wrote:
>> Hi,
>>
>> From time to time (but mostly at the start of the year ;-)), I notice a
>> lot of load caused by people archiving their mail-folders. Maybe this
>> is mostly caused by Thunderbird going mad, but I was wondering if I
>> could do anything on the server-side to prevent things from going bad.
>> Because now I see memory (and swap) exhaustion and the side-effects of
>> that (Linux kernel killing processes)...
>>
>> One example: someone was moving tens of thousands of messages from 2009
>> to a new "2009 folder". Apparently Thunderbird was stuck, maybe because
>> these things don't happen "instantly" moving this number of messages so
>> the server doesn't finish quickly: but Thunderbird created a lot (~100)
>> of sessions / imapd-processes for this user, maybe after timeouts.
>>
>> (I think) Only one process was active doing the links, it looked like
>> the others were mostly waiting for a write lock (fortunately), waiting
>> to do the same thing. (Inspected with strace.) But when the process
>> that hogged the CPU was killed, the next process took over, until all
>> similar processes were killed. And the new archive-folder now ended up
>> with several duplicates, talking about millions instead of tens of
>> thousands. (We'll have to see how to dedup that, any ideas are
>> appreciated otherwise I'll write something for that.)
>>
>> It just happened, but it happened before. This mail-server is not that
>> busy, <100 users, but it happens at least a few times per year.
>>
>> Any idea how to prevent things like this? Judging from the man-pages I
>> don't think I could do this from within cyrus, but that I would have to
>> prevent from linux's ulimit or so and tune that (sounds like a tough
>> job)... or could I actually do this with cyrus parameters?
>>
>> Curious if people have similar experiences :-)
>>
>> Regards,
>> Paul
>>
>> P.S. This specific machine is running Red Hat 4 and a version of
>> Simon's (s)rpm.
>>
>> ----
>> Cyrus Home Page: http://cyrusimap.web.cmu.edu/
>> Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
>> List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
> 
> 


