NFSv4, anyone?
Paul Dekkers
Paul.Dekkers at surfnet.nl
Sat Nov 25 09:55:38 EST 2006
Hi,
Simon Matter wrote:
>>Since the use of NFSv4 was mentioned again in the HA thread, I thought
>>I'd give this a shot using the imaptest utility from dovecot (the only
>>imap stress-testing tool I can think of; I've stressed replication with
>>it too) and NFSv4 mounts on RedHat.
>>Maybe it does not show any errors... maybe it does: better than no
>>testing at all. It's probably a far too simple test, with a lot left to tweak.
>>
>>It would be very interesting to see whether and how this works with
>>NetApp filers, for instance, instead of a RH server. Or on Solaris. I
>>have a feeling that this might show different results.
>>
>>I exported the filesystem simply with:
>> /usr/sbin/exportfs -o rw,fsid=0,no_root_squash haver:/data/vol2
>>and also tried with the secure_locks option added to it, and mounted on
>>the client with:
>> mount -t nfs4 gerst:/ mnt
>>so nothing really special. (Suggestions for further tests welcome!)
>>
>>I eventually tried insecure_locks too, and that seems to result in a
>>situation similar to NFSv3; apparently secure_locks is the default.
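(For completeness: the exportfs call above corresponds roughly to an
/etc/exports entry like the one below; with fsid=0 the exported directory
becomes the NFSv4 pseudo-root, which is why the client simply mounts
gerst:/. Option names are per exports(5), so treat this as a sketch.)

  # /etc/exports on the server (gerst)
  /data/vol2   haver(rw,fsid=0,no_root_squash,secure_locks)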
>>
>>The cyrus install is cyrus-imapd-2.3.7-3 from Simon/Invoca with mostly
>>default options: so skiplist is the default for all databases. I changed
>
>
> To be a bit more precise here: The rpm uses skiplist for all those
> databases that are berkeley by default.
>
>
>>to "flushseenstate: 0" though, because imaptest with a local filesystem
>>gave me errors if I didn't (!).
>
>
> That's a bit strange, at least I didn't see any difference running
> imaptest with "flushseenstate" on or off.
Hmm, I get a lot of "No matching messages" errors if I change this (and
more of them with more concurrent clients):
0 22 22 17 22 11 13 11 29 20 0
Error: FETCH BODY[] failed: f2 NO No matching messages (0.000 sec)
Auth Logi Sele Fetc Fet2 Stor Dele Expu Appe Logo Disc
0 24 23 21 19 14 18 20 32 25 0
0 18 18 22 25 8 22 18 25 20 0
Error: FETCH BODY[] failed: f2 NO No matching messages (0.000 sec)
0 24 26 26 21 8 15 18 24 20 0
0 24 21 22 24 7 17 17 25 22 0
0 22 22 21 25 15 21 17 28 24 0
I also tried this on an install compiled from source, both on Linux and
FreeBSD: same result...
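For reference, the relevant part of imapd.conf in my tests looks roughly
like the excerpt below; the exact *_db lines depend on the Invoca RPM
defaults, so take it as an illustration rather than a verbatim copy:

  # /etc/imapd.conf (excerpt, cyrus-imapd 2.3.x)
  configdirectory: /var/lib/imap
  partition-default: /var/spool/imap   # the spool that gets moved to NFSv4
  # 0 = write seen state back lazily, 1 = flush it after every change
  flushseenstate: 0
  # the RPM switches the databases that default to berkeley to skiplist, e.g.:
  duplicate_db: skiplist
  tlscache_db: skiplist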
>>With an ext3 filesystem I just get a normal output from imaptest:
>>
>>[root at haver dovecot-1.0.rc7]# ./imaptest
>>Auth Logi Sele Fetc Fet2 Stor Dele Expu Appe Logo Disc
>> 0 43 41 40 39 23 30 26 28 25 0
>> 0 35 36 36 37 17 26 24 39 33 0
>> 0 33 33 34 33 21 26 25 45 37 0
>> 0 38 39 38 38 11 23 24 43 34 0
>> 0 34 33 32 33 21 28 28 43 36 0
>>
>>If I mv the spool to the NFSv4 mount, and start cyrus with that
>>partition, I see a lot of errors, unfortunately:
>>
>>[root at haver dovecot-1.0.rc7]# ./imaptest
>>Auth Logi Sele Fetc Fet2 Stor Dele Expu Appe Logo Disc
>> 0 20 20 20 20 1 3 1 0 0 0
>>Error: STORE failed: s NO System I/O error
>
>
> I tried the same now with NFSv4 mounts between two RHEL4 XEN instances and
> I can confirm the same errors. Then I tried something really crazy which I
> expected not to work at all: I created a loop-mounted filesystem on the
> NFSv4 volume and mounted it as the cyrus spool. And guess what, it works fine.
>
> That's how filesystems are mounted:
> /dev/sda1 on /var/spool/imap type ext3 (rw)
> client128:/ on /var/spool/imap2 type nfs4 (rw,noac,addr=192.168.10.128)
> /var/spool/imap2/fs1 on /var/spool/imap3 type ext3 (rw,loop=/dev/loop0)
>
> Cyrus works fine on /var/spool/imap and /var/spool/imap3, but not on
> /var/spool/imap2.
> Of course performance is bad that way; it's just interesting that it works.
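(If someone wants to reproduce the loop trick: with the NFSv4 volume
already mounted on /var/spool/imap2, the setup is roughly the following.
The size and paths are just examples:)

  dd if=/dev/zero of=/var/spool/imap2/fs1 bs=1M count=2048  # backing file
  mke2fs -F -j /var/spool/imap2/fs1                         # ext3 inside it
  mount -o loop /var/spool/imap2/fs1 /var/spool/imap3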
I tried GFS yesterday, and that worked fine too, but I only mounted the
volume on one box (no real cluster), so it was a lousy test anyway... (I
could try it properly later, if I can find some time.)
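(For reference, a single-node GFS test without a real cluster boils down
to using the no-op lock protocol, roughly like this; the device name is
just an example:)

  gfs_mkfs -p lock_nolock -j 1 /dev/vg0/cyrus-test
  mount -t gfs /dev/vg0/cyrus-test /var/spool/imap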
> Regarding NFSv4, it would be nice if the problem turns out to be a
> configuration issue. Maybe someone with a current Solaris environment
> could try the same there?
I just tried the in-memory NFSv4 server from CITI (in newpynfs); it is
very, very slow, and I had to lower the number of concurrent users in
imaptest (to prevent 'stalling errors') - but it seems to work better! I
don't get any locking problems in the logs.
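(For reference, concurrency in imaptest is controlled with the clients=
parameter; a lower-concurrency run would look something like the line
below, with host/user/pass/mbox adjusted to the setup:)

  ./imaptest host=127.0.0.1 user=testuser pass=testpass \
      mbox=dovecot-crlf clients=5 secs=30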
So it could be just the NFS server on Linux that is causing problems...
and it might work against a Solaris server or a NAS like NetApp...
Paul