another problem with conversations db

Bron Gondwana brong at fastmail.fm
Mon Feb 4 05:12:46 EST 2019


On Mon, Feb 4, 2019, at 20:21, Michael Menge wrote:
> Hi,
> 
> Quoting Bron Gondwana <brong at fastmailteam.com>:
> 
> > Hi Michael,
> >
> > Sorry about the delay in looking at this - I was mad crazy busy 
> > getting ready to go overseas. At Fosdem now, about to give a talk 
> > about JMAP!
> >
> > OK, let's start with the things that give me a little bit of hives...
> >
> > configdirectory: /srv/cyrus-be
> > partition-default: /srv/cyrus-be
> > partition-ssd: /srv/cyrus-be/ssd-part
> >
> > Ouch. There's a couple of things I wouldn't do there - having the 
> > partition be the same as the config directory, and having a separate 
> > partition be a subdirectory of a partition. They're both asking for 
> > trouble. I would probably lay my system out like:
> >
> > configdirectory: /srv/cyrus-be/conf
> > partition-default: /srv/cyrus-be/default-part
> > partition-ssd: /srv/cyrus-be/ssd-part
> >
> 
> partition-default isn't used any more. To use the metapartition we moved
> all accounts from the default partition to the ssd partition, which is
> the new defaultpartition ("defaultpartition: ssd")

Right - that makes sense.
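
For what it's worth, here is a rough sketch of how you could check an imapd.conf for the kind of overlap I was worried about (one directory being the same as, or nested inside, another). This is not a Cyrus tool. The function names `parse_imapd_conf` and `find_overlaps` are made up for illustration, the parser only handles simple `option: value` lines, and the set of option prefixes it looks at is an assumption about which options name directories.

```python
from pathlib import PurePosixPath


def parse_imapd_conf(text):
    """Return {option: value} for simple 'option: value' lines.

    Blank lines and lines starting with '#' are skipped. This is a
    simplified reader for illustration, not a full imapd.conf parser.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            options[key.strip()] = value.strip()
    return options


def find_overlaps(options):
    """List pairs of directory options where one path equals or contains the other."""
    # Assumption: these are the options that name directories we care about.
    prefixes = ("partition-", "metapartition-", "archivepartition-")
    dirs = {
        key: PurePosixPath(value)
        for key, value in options.items()
        if key == "configdirectory" or key.startswith(prefixes)
    }
    overlaps = []
    names = sorted(dirs)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            pa, pb = dirs[a], dirs[b]
            if pa == pb or pa in pb.parents or pb in pa.parents:
                overlaps.append((a, b))
    return overlaps


if __name__ == "__main__":
    original = """
    configdirectory: /srv/cyrus-be
    partition-default: /srv/cyrus-be
    partition-ssd: /srv/cyrus-be/ssd-part
    """
    for a, b in find_overlaps(parse_imapd_conf(original)):
        print(f"overlap: {a} <-> {b}")
```

Run against the three lines from your original config it reports every pair as overlapping; run against the layout I suggested (conf, default-part and ssd-part as siblings) it reports nothing.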

> > And then each tree would only have one type of thing in it.
> >
> > Anyway, I don't think that would break anything.
> >
> > metapartition-ssd: /srv/cyrus-ssd-be/meta/ssd-part
> > metapartition_files: header index cache expunge squat annotations 
> > lock dav archivecache
> >
> > Ooh, I haven't tested having cache and archivecache on the same 
> > location. That's really interesting. Again, I'd be in favour of 
> > separation here, give them different paths. That might be tricky 
> > with ssd though, the way this is laid out. I assume you have some 
> > kind of symlink farm going on?
> >
> 
> I didn't know that there could be a problem with cache and archivecache.
> At the time we decided on the configuration for cyrus 3.0 I looked at the
> imapd.conf man page and for metapartition_files decided that I want all
> meta files on the ssd storage. There was no indication in the man page
> that there could be a problem.

Fair. I'd have to test that to see if it works correctly. I would hope so, but I haven't tested that configuration. This is the downside of having lots of different ways to do things!

> How do I separate location of archivecache from the other metapartition path?
> And fix the cache and archivecache files?

This I don't know a good answer for. I will test if having the same path for cache and archivecache could fail. I THINK that I made the code safe for it, but I'm not sure that it's been tested.

> No, there is no symlink farm. We have mounted different iSCSI volumes to
> /srv/cyrus-ssd-be, /srv/cyrus-hdd-be and /srv/cyrus-be

Right. That makes sense.

> > Otherwise it all looks OK. Are you getting other IOERRORs in your 
> > log files which could show things aborting? It really looks like 
> > your conversations DB is getting out of sync due to other failures.
> 
> I found a few instances of 3 related errors.
> 
> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR: opening 
> /srv/cyrus-be/ssd-part/L/user/XXXX/2185.: No such file or directory
> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR: opening 
> /srv/cyrus-be/ssd-part/L/user/XXXX/2185.: No such file or directory
> Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR archive 
> user.XXXX 2185 failed to copyfile 
> (/srv/cyrus-be/ssd-part/L/user/XXXX/2185. => 
> /srv/cyrus-hdd-be/archive/ssd-part/L/user/XXXX/2185.): Unknown code 
> ____ 255


Ouch. Yeah, that could have been caused by a bug in delivery, and would definitely cause conversations DB corruption if the index file was updated but the conversations DB wasn't or vice versa.
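
If it helps to see how widespread this is, something like the following would pull the IOERROR lines out of a syslog file and count them per service. It's only a sketch: the regular expression is written against the line format you quoted above (with each wrapped message joined back onto one line), the names `LINE_RE`, `ioerrors` and `count_by_service` are made up, and your syslog format may differ.

```python
import re
from collections import Counter

# Assumed line shape, taken from the quoted log lines:
#   "Feb 4 01:10:55 mailserv03 be/cyr_expire[7626]: IOERROR: opening ..."
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(?P<host>\S+)\s+"
    r"(?P<service>[^\[\s]+)\[(?P<pid>\d+)\]:\s+(?P<message>IOERROR.*)$"
)


def ioerrors(lines):
    """Yield a dict of fields for every line whose message starts with IOERROR."""
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            yield match.groupdict()


def count_by_service(lines):
    """Count IOERROR lines per syslog service name (e.g. 'be/cyr_expire')."""
    return Counter(entry["service"] for entry in ioerrors(lines))


if __name__ == "__main__":
    import sys

    # Usage: python ioerrors.py /path/to/syslog
    with open(sys.argv[1], errors="replace") as handle:
        for service, count in count_by_service(handle).most_common():
            print(f"{count:6d}  {service}")
```

The timestamps from `ioerrors` can then be lined up against the delivery log lines for the same message.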

> The file was already at /srv/cyrus-hdd-be/archive/ssd-part/L/user/XXXX/2185.

Right! I do wonder if there are some bugs in 3.0.x which are fixed on master around delivery to archive partition. We definitely had bugs on master, but I thought they were newly introduced on master as well, which is why the fixes weren't backported. But if you're having files be in the wrong location, maybe there are bugs on 3.0.x as well.

Do you have the syslog lines at the time that email was delivered?

Bron.
-- 
 Bron Gondwana
 brong at fastmail.fm
