sync_server "memory leak" with giant new mailbox first sync

Bron Gondwana brong at fastmail.fm
Sun Sep 10 11:15:26 EDT 2006


Ok, so this isn't a memory leak as such, but...

When sync_client has a large folder to send (for
the sake of far too many hours of me trying to
make this work let's just say it's 180,000
messages), then it just sends a single
"UPLOAD [lastuid] [lastappenddate]" followed by
every single message one after the other.

There's logic on the server end to send a [RESTART]
back after 1000 new files arrive, but it doesn't
get called until all 180,000 messages have
arrived... or at least it would be then, if the sync_server
process didn't receive a SIGABRT somewhere around
102,000 messages in.  I tried all sorts of things
to find the underlying cause, then finally just
watched 'top' on the sync_server machine as it ran.

This machine has 8GB of memory, and over 30% of
it was being used by this one sync_server before
it died!

Well, the attached isn't the most elegant patch in
the world, and may not be the best way to solve the
problem, but at least it got that user replicated
and happy.  The first time we had to deal with it
was moving the user off a corrupted filesystem that
I could only mount read-only, and it took about 3
hours for each run to fail thanks to the insanely
high IO load on that drive unit, so debugging was
more of a pain than you'd hope.

I hope something inspired by this can be merged
upstream to solve the "spam sync_server until it
falls over" failure mode.
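
For anyone who wants the idea without reading the
diff: below is a minimal standalone sketch of the
batching, not the Cyrus code itself. BATCH_LIMIT,
send_batch_fn, upload_in_batches, batch_stats and
count_batch are all made-up names for illustration.
It also batches purely by message number, whereas
the patch only counts messages that actually need
uploading, so treat it as an approximation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical batch limit; the attached patch hard-codes 1000. */
#define BATCH_LIMIT 1000UL

/* Hypothetical callback: send messages first..last as one UPLOAD
 * command and wait for the server's reply. Returns 0 on success. */
typedef int (*send_batch_fn)(unsigned long first, unsigned long last,
                             void *ctx);

/* Upload messages 1..exists in batches of at most BATCH_LIMIT, so the
 * server never has more than one batch queued up in memory.
 * Returns 0 on success, or the first non-zero callback result. */
static int upload_in_batches(unsigned long exists, send_batch_fn send,
                             void *ctx)
{
    unsigned long msgno = 1;

    assert(send != NULL);

    while (msgno <= exists) {
        unsigned long remaining = exists - msgno + 1;
        unsigned long n = remaining < BATCH_LIMIT ? remaining : BATCH_LIMIT;
        int r = send(msgno, msgno + n - 1, ctx);

        if (r) return r;   /* stop at the first failed batch */
        msgno += n;
    }
    return 0;
}

/* Example callback that sends nothing and just records batch sizes. */
struct batch_stats {
    unsigned long batches;
    unsigned long messages;
    unsigned long largest;
};

static int count_batch(unsigned long first, unsigned long last, void *ctx)
{
    struct batch_stats *s = ctx;
    unsigned long n = last - first + 1;

    s->batches++;
    s->messages += n;
    if (n > s->largest) s->largest = n;
    return 0;
}
```

Called as upload_in_batches(180000, count_batch, &stats),
this makes 180 round trips of 1000 messages each
instead of one enormous UPLOAD.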

Bron.
-- 
  Bron Gondwana
  brong at fastmail.fm

-------------- next part --------------
diff -ur --new-file cyrus-imapd-cvs/imap/sync_client.c cyrus-imapd-cvs.new/imap/sync_client.c
--- cyrus-imapd-cvs/imap/sync_client.c	2006-08-26 10:48:27.000000000 -0400
+++ cyrus-imapd-cvs.new/imap/sync_client.c	2006-09-10 10:51:06.000000000 -0400
@@ -1198,7 +1198,7 @@
 static int upload_messages_list(struct mailbox *mailbox,
 				struct sync_msg_list *list)
 {
-    unsigned long msgno;
+    unsigned long msgno = 1;
     int r = 0;
     struct index_record record;
     struct sync_msg *msg;
@@ -1212,8 +1212,11 @@
         return(IMAP_IOERROR);
     }
 
+repeatupload:
+
     msg = list->head;
-    for (msgno = 1 ; msgno <= mailbox->exists ; msgno++) {
+    count = 0;
+    for (; count < 1000 && msgno <= mailbox->exists ; msgno++) {
         r = mailbox_read_index_record(mailbox, msgno, &record);
 
         if (r) {
@@ -1272,6 +1275,12 @@
 	syslog(LOG_INFO, "UPLOAD: received RESTART");
     }
 
+    /* don't overload the server with too many uploads at once! */
+    if (count >= 1000) {
+	syslog(LOG_INFO, "UPLOAD: hit %d uploads at msgno %lu", count, msgno);
+	goto repeatupload;
+    }
+
     return(0);
 }
 

