Cyrus review
Greg Banks
gnb at fastmail.fm
Mon Jan 23 03:55:31 EST 2012
G'day,
Bron said we should have this discussion in public, which is a good idea, so...
On Mon, Jan 23, 2012, at 08:22 AM, Bron Gondwana wrote:
> On Mon, Jan 23, 2012 at 01:52:54PM +1100, Greg Banks wrote:
> > commit "dlist: unlink before writing temporary file"
> > commit "always mkdir actually"
> >
> > I don't get this - are you trying to protect against a race with some other code? Or does the filename get accidentally reused somehow?
>
> Heh... so consider a buggy sync_client which uploads the same file
> multiple times. The filename gets reused, because it's a direct
> product of pid and sha1. That's pretty bad.
>
> What is worse: RESERVE
>
> user/foo/5. has sha1 X
> RESERVE (user.foo) (X)
> $pid/X is now a hard link to user/foo/5.
> along comes
> user.foo.sub UID 2 SHA1 X - but the sync_client chooses to upload.
>
> It actually overwrites user/foo/5. with the new content. Now, this
> is always the same file - so nothing is lost at the end. But if
> ANOTHER process also hardlinks user/foo/5. into its own space and
> starts reading it, it can get a short read.
>
Ah, right. Ouch.
Doing an open(O_RDWR|O_CREAT|O_EXCL) followed by an fdopen() might be a better solution.
The O_EXCL should close the race.
> My suspicion is that the Linux kernel changed some behaviour which
> exposed this mistake, where previously the clients were always seeing
> the old cached version, because we parse the file just before linking
> the first time.
It could just be a CPU scheduler change.
> > commit "twoskip: don't crash during checkpoint on full disk"
> >
> > + int r2 = mycheckpoint(db);
> > + if (r2) {
> > + syslog(LOG_NOTICE, "twoskip: failed to checkpoint %s: %m",
> > + _fname(db));
> > + }
> >
> > Hmm, the error message uses %m when an error is returned from mycheckpoint(),
> > but errno is not reliable in mycheckpoint() as there is all sorts of potentially errno-trashing
> > system call activity in between the proximate error and the return.
> >
> > Otherwise, looks good.
>
> So should probably be error_message(r2) - except we don't have
> error_message in here.
Yeah, because of those wacky CYRUSDB_* errors.
> Damn I hate the error handling dual-framework.
We should just #define CYRUSDB_* to some new codes in imap/imap_err.et
> > commit "add Xmove command"
> >
> > I like the idea! This one is actually useful for loads of people. But, it's existence should
> > be announced in CAPABILITY.
>
> It does:
>
> { "XLIST", 2 },
> + { "XMOVE", 2 }, /* not standard */
> { "SPECIAL-USE", 2 },
>
> At least in the copy in my repository.
Woops, missed that :(
>
> > r = append_setup(&appendstate, name, state->userid,
> > - state->authstate, ACL_INSERT, qdiffs,
> > + state->authstate, ACL_INSERT,
> > + ismove ? NULL : qdiffs,
> > namespace, isadmin);
> >
> > This is only correct if both the source and destination mailboxes have the same
> > quotaroot, which isn't always the case. Contrary cases include a) moving to another
> > user's mailbox, b) moving to a mailbox in the shared space, c) moving between the
> > same user's mailboxes across the boundary of a nested quotaroot.
>
> True - we need to test for that case. Damn. It's not easy with all the
> layers either :(
Yeah, hairy in the extreme.
> > commit "mailbox: fixup delete"
> >
> > I'm deeply confused about this.
>
> That probably means it's not well described in the code. I'll see if I
> can
> clear it up a bit.
>
> The core bug was: mailbox_lock_index() now returns a "doesn't exist"
> response
> if the mailbox has an OPT_DELETED flag set in the index header. Which is
> correct in general. It means any client re-locking the mailbox after it
> has been deleted will get a "this mailbox is deleted, go away" rather
> than
> having to check the mailbox itself.
>
> The problem is - this codepath is during unlock, and it's to see if we
> need
> to do various cleanups under a mboxname exclusive lock. In this case
> we're
> basically setting the "index_locktype" field because the various assert
> checks elsewhere in the code aren't smart enough to know that we have an
> exclusive namelock, which gives the same protections.
Gah. That is definitely unobvious.
>
> > commit "Bug #3381 - make rehash tool 64-bit safe"
> >
> > Hmm. There's some really ugly Perl in there.
>
> I know. I didn't like it much, but Leena has a point - it was broken
> as it was. The whole of tools/rehash is awful. I did rewrite it once,
> a few years ago, to support a new format which we never went ahead with.
> It was a nice format which allowed a much better "fast rename", but it
> was yet another format.
>
> These days I'd be tempted to do directory names based on UNIQUEID,
Now that they're actually unique, sure.
> and
> never bloody rename files for folder renames. But it's a big change.
We could provide symlinks...
But really, I'm not sure it's worth optimising folder delete and rename, they're
relatively infrequent operations. Another argument is that with real folder names
as directory names then you do actually have a folder name when reconstructing
from nothing but a tar backup of messages.
>
> > + rename ($h_32/$f, "$h_64/$f") or ouch "couldn't move $h_32 into $h_64";
>
> > Why is the 1st argument of the second rename() not quoted?
>
> Now that's definitely a bug, thanks.
>
> > Also, it seems like the whole do-two-hashes-and-some-renames dance should
> > be in a sub.
> >
> > foreach my $b (split(/ */, $name)) {
> >
> > This was in the old code and copy-n-pasted into new code. I don't think it
> > does what the author expected. Like the equivalent C code it splits a string
> > into characters; unlike the C code it swallows spaces.
>
> If it was in the old code, then I don't care so much. The whole thing
> is ugly as sin.
>
> > my $uint32 = (2**32);
> > ...
> > $n = ((($n << 3) ^ ($n >> 5)) % $uint32) ^ ord($b);
> >
> > So, this should, as expected, yield the incorrect 32b-limited result when run on a 64 Perl.
> > Experiment indicates it also does that on a 32b Perl, but it had me twitching for a while.
>
> Yeah, ouch.
>
> > commit "Bug #3598 - auditlog sessionid through lmtpproxy in murder"
> >
> > strncpy() leaves an unterminated string on overflow.
> > Also, we should we extract the rsessionid outside the if() which guards the
> > only place it's used.
>
> Should forward these review comments to the authors :)
I've cc'd the authors of the above two patches.
>
>
> More seriously - should we actually take these discussions to the public
> mailing list? It would good for everyone to see them and be able to
> chime
> in.
>
Done.
--
Greg.
More information about the Cyrus-devel
mailing list