switch to cyrus murder (aggregator) feedback

Nic Bernstein nic at onlight.com
Mon Sep 22 09:24:41 EDT 2014


Michael,
I'd like to thank you for having written up such a succinct and 
reasonable description of a well-thought-out murder installation. Lots
of good information here, especially for people who may be considering a 
move like yours.  This could be the bones of a good Wiki article.

Cheers,
     -nic

On 09/22/2014 06:20 AM, Michael Menge wrote:
> Hi,
>
> Three weeks ago we changed our Cyrus IMAP servers from standalone
> systems to a Cyrus murder cluster. We have ~44000 accounts,
> ~457000 mailboxes, and 2x6.5 TB of mail.
>
> In our previous setup we had 6 Cyrus IMAP 2.4.17 servers running as KVM
> VMs with 8 GB memory and 4 cores each, on an HP blade center (G7 blades).
> Each server was running 2 Cyrus instances: one master system and one
> replica of one of the other servers. We used DNS CNAMEs to distribute our
> users to our servers. The filesystems are stored on two Infortrend iSCSI
> RAIDs, so that the replica is not on the same iSCSI system as the master.
>
> In our new setup each server is running 3 - 4 Cyrus instances:
> one frontend, one backend, one replica and, on one of the servers,
> the Cyrus mupdate master. ClusterIP is used to distribute the access
> to our frontend instances. The backends and replicas are only listening
> on private IPs.
>
> If one server goes down, we will switch that ClusterIP bucket to one
> of the other servers, and we will restart the replica as a backend by
> changing the config and swapping the IP of the replica with the IP of the
> backend. This is much faster than updating the mailbox location of all
> the affected mailboxes.
>
> If the mupdate master is down we start it on one of the other servers,
> using the mailboxdb of the frontend and running "ctl_mboxlist -m -a"
> on all backend instances.
>
> Since the migration we discovered some small issues and some bugs.
>
> 1. Usually Cyrus is not CPU bound. One exception is the mupdate master:
>    keeping encrypted connections to all frontends and establishing
>    new encrypted connections from the backends for every mailbox creation,
>    rename and removal was too much for the 4 cores, so we added 4
>    additional cores to the VMs.
>
> 2. Our frontend instances use IMAPS and POP3S and don't allow STARTTLS.
>    But we had to use IMAP and POP3 with STARTTLS on our backends, as
>    the frontends will always use STARTTLS over IMAP and POP3 to proxy
>    the connection.
>
>
> 3. We see more IOERRORs in our Cyrus logs. On the standalone
>    Cyrus IMAP servers an IOERROR indicated corruption in one of the
>    Cyrus files, but that is not the case for the new errors we have found:
>
>    a) "reading message: unexpected end of file": as far as I can tell,
>       this is triggered by the IMAP APPEND command. I suspect it occurs
>       when the connection between frontend and backend is lost or the
>       frontend dies during upload of the message.
>
>    b) "opening index %s: Invalid mailbox name": the mailbox name seems
>       to be fine in most cases. I have figured out why the mailbox name
>       was considered invalid in only one case (the string "Posteingang"
>       was translated by the client, and the name "INBOX" is reserved).
>
>    It would help if the string IOERROR were not used in these cases,
>    and if the mailbox name were always logged consistently with the
>    unixhierarchysep option.
>
>
> 4. Deleting a mailbox with delete_mode: delayed can create a corrupt
>    mailbox in the DELETED tree. In the logs we found the following:
>
>    be/beimap[62020]: Rename: user.LoginID.Mail.drafts -> DELETED.user.LoginID.Mail.drafts.5416CD11
>    be/beimap[62020]: MUPDATE: can't commit mailbox entry for 'DELETED.user.LoginID.Mail.drafts.5416CD11'
>    be/beimap[62020]: Deleted mailbox DELETED.user.LoginID.Mail.drafts.5416CD11
>
>    and on the next cyr_expire run
>
>    be/cyr_expire[144388]: IOERROR: opening index DELETED.user.LoginID.Mail.drafts.5416CD11: System I/O error
>
>    In the filesystem, DELETED/user/LoginID/Mail/drafts was an empty
>    directory. I couldn't find any hints why the mupdate master couldn't
>    commit the mailbox entry, but as "5416CD11" is the timestamp of the
>    action, I am certain that the mailbox did not exist in the mailboxdb
>    before. And as this only happens in some rare cases, I suspect a race
>    condition.
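
The hexadecimal suffix can be checked by hand. A minimal sketch in Python, assuming the suffix is the Unix time in seconds written in hexadecimal (the mail only calls it "the timestamp of the action"); deleted_suffix_to_utc is a made-up helper name:

```python
from datetime import datetime, timezone


def deleted_suffix_to_utc(mailbox_name):
    """Decode the trailing hex component of a DELETED.* name as Unix seconds."""
    suffix = mailbox_name.rsplit(".", 1)[-1]
    return datetime.fromtimestamp(int(suffix, 16), tz=timezone.utc)


print(deleted_suffix_to_utc("DELETED.user.LoginID.Mail.drafts.5416CD11"))
```

Under that assumption the suffix decodes to 15 September 2014, 11:27:13 UTC, one week before this mail, which fits the reading that the name was freshly generated at delete time.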
>
> 5. Some frontend imapd processes receive a SIGSEGV.
>    As this seems to happen in the OpenSSL library, I asked on their
>    mailing list, but didn't receive an answer yet. At the end you will
>    find a backtrace of the core dump.
>
> I would be glad if changes regarding the logging of IOERRORs
> and mailbox names would be included in Cyrus 2.5
>
> Regarding 4. and 5.: are these known bugs? I could not find any matching
> entries in the bug tracker. If they are not known, I will add them to
> the bug tracker.
>
> Regards
>
>     Michael Menge
>
> -----  ldd imapd ----
> linux-vdso.so.1 =>  (0x00007fff3ffed000)
> libsasl2.so.2 => /usr/lib64/libsasl2.so.2 (0x00007f40e62a8000)
> libssl.so.0.9.8 => /usr/lib64/libssl.so.0.9.8 (0x00007f40e6052000)
> libcrypto.so.0.9.8 => /usr/lib64/libcrypto.so.0.9.8 (0x00007f40e5cb2000)
> libz.so.1 => /lib64/libz.so.1 (0x00007f40e5a9c000)
> libwrap.so.0 => /lib64/libwrap.so.0 (0x00007f40e5891000)
> libnsl.so.1 => /lib64/libnsl.so.1 (0x00007f40e5678000)
> libc.so.6 => /lib64/libc.so.6 (0x00007f40e52ff000)
> libdl.so.2 => /lib64/libdl.so.2 (0x00007f40e50fb000)
> libresolv.so.2 => /lib64/libresolv.so.2 (0x00007f40e4ee3000)
> /lib64/ld-linux-x86-64.so.2 (0x00007f40e64f7000)
>
> --- bt on imapd core dump ----
>    #0  0x000000000080e130 in ?? ()
>    #1  0x00007fe5a839334f in ssl3_get_message (s=0x80e430, st1=8347825, stn=-1470427072, mt=<optimized out>, max=102400, ok=0x7fffcc974d08)
>     at s3_both.c:522
>    #2  0x00007fe5a838ba0d in ssl3_get_key_exchange (s=0x0) at s3_clnt.c:1103
>    #3  0x00007fe5a838dff8 in ssl3_connect (s=0x80e430) at s3_clnt.c:316
>    #4  0x000000000046a177 in tls_start_clienttls (readfd=16, writefd=16, layerbits=0x7fffcc975104, authid=0x7fffcc975108, ret=0x7e1fa0,
>     sess=0x7e1fa8) at tls.c:1311
>    #5  0x00000000004669f4 in do_starttls (s=0x7e16a0, tls_cmd=0x78a4d0 <imap_protocol+208>) at backend.c:201
>    #6  0x0000000000467217 in backend_authenticate (s=0x7e16a0, prot=0x78a400 <imap_protocol>, mechlist=0x7fffcc976468,
>     userid=0x7f5c90 "REPLACED_LOGINID", cb=0x80de30, status=0x7fffcc976460) at backend.c:378
>    #7  0x0000000000467a1a in backend_connect (ret_backend=0x7e16a0, server=0x7a8960 <partition.17660> "ma03.mail.localhost",
>     prot=0x78a400 <imap_protocol>, userid=0x7f5c90 "REPLACED_LOGINID", cb=0x0, auth_status=0x0) at backend.c:552
>    #8  0x0000000000426603 in proxy_findserver (server=0x7a8960 <partition.17660> "ma03.mail.localhost", prot=0x78a400 <imap_protocol>,
>     userid=0x7f5c90 "REPLACED_LOGINID", cache=0x7a3010 <backend_cached>, current=0x7a3008 <backend_current>, inbox=0x7a3000 <backend_inbox>,
>     clientin=0x7be450) at proxy.c:179
>    #9  0x0000000000426beb in proxy_findinboxserver (userid=0x7f5b20 "REPLACED_LOGINID") at imap_proxy.c:145
>    #10 0x00000000004197c8 in cmd_list (tag=0x7f3720 "42.117", listargs=0x7fffcc977510) at imapd.c:6036
>    #11 0x000000000040c9ee in cmdloop () at imapd.c:1574
>    #12 0x000000000040aea5 in service_main (argc=2, argv=0x7b9010, envp=0x7fffcc97b650) at imapd.c:946
>    #13 0x0000000000409ba4 in main (argc=6, argv=0x7fffcc97b618, envp=0x7fffcc97b650) at service.c:582
> -----------------------------
>
>
>
>
>
>
> -------------------------------------------------------------------------------- 
>
> M.Menge                                Tel.: (49) 7071/29-70316
> Universität Tübingen                   Fax.: (49) 7071/29-5912
> Zentrum für Datenverarbeitung          mail: michael.menge at zdv.uni-tuebingen.de
> Wächterstraße 76
> 72074 Tübingen
>
>

-- 
Nic Bernstein                             nic at onlight.com
Onlight, Inc.                             www.onlight.com
219 N. Milwaukee St., Suite 2a            v. 414.272.4477
Milwaukee, Wisconsin  53202
