another problem with conversations db

Michael Menge michael.menge at zdv.uni-tuebingen.de
Mon Feb 4 06:00:13 EST 2019


Quoting Bron Gondwana <brong at fastmail.fm>:

> On Mon, Feb 4, 2019, at 20:21, Michael Menge wrote:
>> Hi,
>>
>> Quoting Bron Gondwana <brong at fastmailteam.com>:
>>
>> > Hi Michael,
>> >
>> > Sorry about the delay in looking at this - I was mad crazy busy
>> > getting ready to go overseas. At Fosdem now, about to give a talk
>> > about JMAP!
>> >
>> > OK, let's start with the things that give me a little bit of hives...
>> >
>> > configdirectory: /srv/cyrus-be
>> > partition-default: /srv/cyrus-be
>> > partition-ssd: /srv/cyrus-be/ssd-part
>> >
>> > Ouch. There's a couple of things I wouldn't do there - having the
>> > partition be the same as the config directory, and having a separate
>> > partition be a subdirectory of a partition. They're both asking for
>> > trouble. I would probably lay my system out like:
>> >
>> > configdirectory: /srv/cyrus-be/conf
>> > partition-default: /srv/cyrus-be/default-part
>> > partition-ssd: /srv/cyrus-be/ssd-part
>> >
>>
>> partition-default isn't used any more. To use the metapartition we moved
>> all accounts from the default partition to the ssd partition, which is
>> the new default partition ("defaultpartition: ssd").
>
> Right - that makes sense.
>
>> > And then each tree would only have one type of thing in it.
>> >
>> > Anyway, I don't think that would break anything.
>> >
>> > metapartition-ssd: /srv/cyrus-ssd-be/meta/ssd-part
>> > metapartition_files: header index cache expunge squat annotations
>> > lock dav archivecache
>> >
>> > Ooh, I haven't tested having cache and archivecache on the same
>> > location. That's really interesting. Again, I'd be in favour of
>> > separation here, give them different paths. That might be tricky
>> > with ssd though, the way this is laid out. I assume you have some
>> > kind of symlink farm going on?
>> >
>>
>> I didn't know that there could be a problem with cache and archivecache.
>> At the time we decided on the configuration for cyrus 3.0 I looked at the
>> imapd.conf man page and for metapartition_files decided that I want all
>> meta files on the ssd storage. There was no indication in the man page
>> that there could be a problem.
>
> Fair. I'd have to test that to see if it works correctly. I would  
> hope so, but I haven't tested that configuration. This is the  
> downside with having lots of different ways to do things!
>
>> How do I separate the location of archivecache from the other
>> metapartition paths?
>> And how do I fix the cache and archivecache files?
>
> This I don't know a good answer for. I will test if having the same  
> path for cache and archivecache could fail. I THINK that I made the  
> code safe for it, but I'm not sure that it's been tested.
>
>> No, there is no symlink farm. We have mounted different iSCSI volumes at
>> /srv/cyrus-ssd-be, /srv/cyrus-hdd-be and /srv/cyrus-be
>
> Right. That makes sense.
>
>> > Otherwise it all looks OK. Are you getting other IOERRORs in your
>> > log files which could show things aborting? It really looks like
>> > your conversations DB is getting out of sync due to other failures.
>>
>> I found a few instances of 3 related errors.
>>
>> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR: opening
>> /srv/cyrus-be/ssd-part/L/user/XXXX/2185.: No such file or directory
>> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR: opening
>> /srv/cyrus-be/ssd-part/L/user/XXXX/2185.: No such file or directory
>> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR archive
>> user.XXXX 2185 failed to copyfile
>> (/srv/cyrus-be/ssd-part/L/user/XXXX/2185. =>
>> /srv/cyrus-hdd-be/archive/ssd-part/L/user/XXXX/2185.): Unknown code
>> ____ 255
>
>
> Ouch. Yeah, that could have been caused by a bug in delivery, and  
> would definitely cause conversations DB corruption if the index file  
> was updated but the conversations DB wasn't or vice versa.
>
>> The file was already at /srv/cyrus-hdd-be/archive/ssd-part/L/user/XXXX/2185.
>
> Right! I do wonder if there are some bugs in 3.0.x which are fixed  
> on master around delivery to archive partition. We definitely had  
> bugs on master, but I thought they were newly introduced on master  
> as well, which is why the fixes weren't backported. But if you're  
> having files be in the wrong location, maybe there are bugs on 3.0.x  
> as well.
>
> Do you have the syslog lines at the time that email was delivered?

I don't have the log for that message, but I will search for a
more recent example.
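
To find more recent examples, a small script could pull the affected paths out of syslog. The following is only a sketch: it assumes each syslog entry is on a single line in the same format as the IOERROR lines quoted above, and the names find_missing_files and MissingFile are made up for illustration.

```python
import re
from typing import Iterable, List, NamedTuple


class MissingFile(NamedTuple):
    """One 'IOERROR: opening ...: No such file or directory' log entry."""
    timestamp: str
    process: str
    path: str


# Assumed layout: "<Mon> <day> <hh:mm:ss> <host> <process>[<pid>]: IOERROR: opening <path>: ..."
PATTERN = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+"
    r"\S+\s+"
    r"(?P<proc>[^\s\[]+)\[\d+\]:\s+"
    r"IOERROR: opening (?P<path>\S+): No such file or directory"
)


def find_missing_files(lines: Iterable[str]) -> List[MissingFile]:
    """Return every log line that reports a message file that could not be opened."""
    hits = []
    for line in lines:
        match = PATTERN.match(line)
        if match:
            hits.append(MissingFile(match["ts"], match["proc"], match["path"]))
    return hits
```

The mailbox path and UID from each hit could then be used to look for earlier log lines that mention the same message.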


From the mail headers I know that it was not delivered to the archive
partition but moved there by cyr_expire. The conversations DB was not in
use at that time.

PS: The timestamp of the file is not the internal date but the time
the mail was moved to the archive partition. I was wondering about the reason.
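
One way to check this on other messages is to compare the file's modification time with the date in the topmost Received header. This is only a rough sketch: the Received date is used as a stand-in for the internal date (which Cyrus keeps in its index, not in the message file), and delivery_vs_mtime is a made-up helper name.

```python
import os
from datetime import datetime, timezone
from email import message_from_binary_file
from email.utils import parsedate_to_datetime


def delivery_vs_mtime(path):
    """Return (date from topmost Received header or None, file mtime in UTC)."""
    with open(path, "rb") as fh:
        msg = message_from_binary_file(fh)

    delivered = None
    received = msg.get_all("Received") or []
    if received:
        # The date of a Received header follows its last ';'.
        stamp = received[0].rsplit(";", 1)[-1].strip()
        try:
            delivered = parsedate_to_datetime(stamp)
        except (TypeError, ValueError):
            delivered = None  # unparsable date; leave it unknown

    mtime = datetime.fromtimestamp(os.stat(path).st_mtime, tz=timezone.utc)
    return delivered, mtime
```

A large gap between the two values for an archived message would be consistent with the file having been rewritten when it was moved.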

Michael


--------------------------------------------------------------------------------
M.Menge                                Tel.: (49) 7071/29-70316
Universität Tübingen                   Fax.: (49) 7071/29-5912
Zentrum für Datenverarbeitung          mail: michael.menge at zdv.uni-tuebingen.de
Wächterstraße 76
72074 Tübingen


