Does anyone else see skiplist recovery errors?

Robert Mueller robm at fastmail.fm
Thu Jun 15 03:06:54 EDT 2006


> However, it may also depend on the way cyrus-imapd is stopped by the
> system. At least on RedHat/Fedora, the function used by the init scripts
> sends a TERM to the master, and if it doesn't die after some time, it sends
> KILL, which _could_ result in corrupt on-disk data if I understand it
> correctly. Maybe on very large and busy servers, the method used by
> RedHat/Fedora is not so good. Maybe the stop function is really important
> and should be optimized like those usually used with other slow stopping
> daemons like squid.
> How exactly do you stop cyrus?

We send a TERM signal to master, and if it doesn't clean up properly, we then
send a KILL signal. However, I don't remember ever seeing it need the KILL
signal; TERM normally seems fine.
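
For reference, the TERM-then-KILL sequence looks roughly like the sketch
below. This is only an illustration in Python, not our actual stop script:
the names stop_process and is_running and the 30 second grace period are
assumptions made up for the example.

```python
import os
import signal
import time


def is_running(pid: int) -> bool:
    """Return True if a process with this pid still exists."""
    try:
        # If the process is our own child, reap it so it does not linger as a zombie.
        os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        pass  # not our child (the usual case for an init script)
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def stop_process(pid: int, grace_seconds: float = 30.0, poll_seconds: float = 0.5) -> str:
    """Send TERM, wait up to grace_seconds, then fall back to KILL.

    Returns "TERM" if the process exited on its own, "KILL" if it was forced.
    """
    os.kill(pid, signal.SIGTERM)
    deadline = time.monotonic() + grace_seconds
    while time.monotonic() < deadline:
        if not is_running(pid):
            return "TERM"
        time.sleep(poll_seconds)
    if not is_running(pid):
        return "TERM"
    try:
        os.kill(pid, signal.SIGKILL)  # last resort: cannot be caught or ignored
    except ProcessLookupError:
        return "TERM"  # it exited just before we got here
    return "KILL"
```

On a large, busy server the grace period is the knob that matters: too short
and the KILL path gets taken while processes are still flushing data.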

Anyway, despite that, it still shouldn't corrupt the DB, should it? I thought
the point of a transactional/logging DB like skiplist is that killing
a process accessing it at any time should not corrupt the DB; it should just
"roll back" to the last transaction point. Maybe skiplists aren't designed to
be "kill" safe, but they definitely should be!
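
To make the "roll back to the last transaction point" idea concrete, here is
a toy append-only log in Python. It is NOT the Cyrus skiplist on-disk format
or its recovery code; the record layout ("SET key value" lines plus a COMMIT
marker) and the function names are invented purely for illustration, and
keys are assumed to contain no spaces.

```python
import os

COMMIT = b"COMMIT\n"


def append_transaction(path, records):
    """Append records followed by a commit marker, then force them to disk."""
    with open(path, "ab") as f:
        for key, value in records:
            f.write(f"SET {key} {value}\n".encode())
        f.write(COMMIT)
        f.flush()
        os.fsync(f.fileno())


def recover(path):
    """Drop everything after the last commit marker; return the committed state."""
    state, pending = {}, {}
    good_end = 0  # byte offset just past the last complete COMMIT
    offset = 0
    with open(path, "rb") as f:
        for line in f:
            offset += len(line)
            if line == COMMIT:
                state.update(pending)  # the transaction is now durable
                pending = {}
                good_end = offset
                continue
            parts = line.decode(errors="replace").rstrip("\n").split(" ", 2)
            if line.endswith(b"\n") and len(parts) == 3 and parts[0] == "SET":
                pending[parts[1]] = parts[2]
            else:
                break  # torn or unknown record: stop replaying here
    with open(path, "r+b") as f:
        f.truncate(good_end)  # discard the uncommitted tail
    return state
```

With a layout like this, a writer killed mid-transaction leaves only an
uncommitted tail, and recovery discards it without touching committed data.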

> Anyway, I'm not happy with how we can handle skiplist dbs. There are no
> easy recovery tools which can be used to fix things other than doing it by
> hand. I mean, something which can be automated easily.

Agreed on that one as well.

Rob


