Cyrus - GFS slow start and poor performance
Bron Gondwana
brong at fastmail.fm
Wed May 14 06:30:44 EDT 2008
[reply number 2 - addressing bits I missed in the first reply...]
On Wed, 14 May 2008 10:59:33 +0200, "Maurizio Lo Bosco" <maurizio.lobosco at eng.it> said:
> The dump of the database takes 7 minutes but the disk usage is definitely low
> (less than 5%)
A dump of the database visits all records in alphabetical order. This can result
in somewhat random looking seeks around the file due to the layout of a skiplist,
but it will happen within the mmap.
> RedHat is saying that there is no way to optimise the performance on the GFS
> locking architecture and they will now take a look to the cyrus code.
You may want to pass on to the RedHat engineers that Cyrus uses an mmap of the entire
file to read all records, and uses seeks and direct writes on the same fd (or a different
fd, depending on compilation settings) to write. Skiplist appends entire records to
the file, but also seeks back and updates pointers (4-byte records) within the file
with each update.
That's writing.
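As a rough illustration of that write pattern (append a whole record, then seek
back and overwrite a 4-byte pointer), here is a minimal Python sketch. It is not
Cyrus code: the record layout (next pointer, length, payload), the big-endian
byte order and the name append_record are all assumptions made up for the
example, and the real skiplist format is more involved.

```python
import os
import struct

# Hypothetical layout: every pointer is a 4-byte big-endian file offset,
# and 0 means "no next record".
PTR = struct.Struct(">I")


def append_record(fd, prev_ptr_offset, payload):
    """Append one record, then patch the 4-byte pointer at prev_ptr_offset."""
    new_offset = os.lseek(fd, 0, os.SEEK_END)
    # Append the whole record: next pointer (0 for now), payload length, payload.
    os.write(fd, PTR.pack(0) + PTR.pack(len(payload)) + payload)
    # Seek back into the existing file and overwrite the earlier pointer in place.
    os.lseek(fd, prev_ptr_offset, os.SEEK_SET)
    os.write(fd, PTR.pack(new_offset))
    # The new record starts with its own next pointer, so this offset is also
    # where the following append has to patch.
    return new_offset
```

Each update is therefore one sequential append plus one small write somewhere
earlier in the file, which is the part a cluster filesystem's locking has to
cope with.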
Reading - it reads each record, gets a pointer to the location of the next one, and
reads from the memory location that corresponds to db->map_base + pointer_offset.
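The read side can be sketched the same way. This is only an illustration in
Python, assuming a made-up layout (a 4-byte header pointing at the first record,
then records of next pointer, length, payload); the function name walk is
hypothetical. The point is that every "read" is just an offset into the mapping,
the equivalent of db->map_base + pointer_offset.

```python
import mmap
import struct

# Hypothetical layout: 4-byte big-endian offsets, 0 ends the chain.
PTR = struct.Struct(">I")


def walk(path):
    """Yield each record payload by following next pointers inside one mmap."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # The header at offset 0 points at the first record.
            offset = PTR.unpack_from(m, 0)[0]
            while offset:
                # Read the record header straight out of the mapping.
                next_offset, length = struct.unpack_from(">II", m, offset)
                start = offset + 8
                yield bytes(m[start:start + length])
                # Jump to wherever the pointer says, which may be anywhere
                # in the file.
                offset = next_offset
```

No read() or lseek() calls happen while walking; the only I/O is whatever the
kernel does to page the mapped file in.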
Depending on your requirements, it may make sense to place your mailboxes.db on local
disk (it's pretty small) and regularly copy/rsync it onto your GFS partition. Worst
case you lose a couple of mailboxes.db records in a crash. Depends what you can afford
to lose. You could probably stat the file every second and copy it on any change pretty
cheaply and risk losing at most the last second's changes (it doesn't change often).
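A minimal sketch of the stat-and-copy loop, in Python. The function names and
the paths you would pass in are hypothetical, and this assumes it is acceptable
to copy the file without taking any Cyrus locks, so a copy made in the middle of
an update could be inconsistent; treat it as a starting point rather than a
finished tool.

```python
import os
import shutil
import time


def copy_if_changed(src, dst, last_seen):
    """Copy src to dst if its (mtime, size) differs from last_seen."""
    st = os.stat(src)
    stamp = (st.st_mtime_ns, st.st_size)
    if stamp != last_seen:
        # Copy to a temporary name in the destination directory, then rename
        # into place so readers never see a half-written copy.
        tmp = dst + ".tmp"
        shutil.copy2(src, tmp)
        os.replace(tmp, dst)
    return stamp


def watch(src, dst, interval=1.0):
    """Poll src forever, copying it to dst whenever it changes."""
    last_seen = None
    while True:
        last_seen = copy_if_changed(src, dst, last_seen)
        time.sleep(interval)
```

You would call watch() with the local mailboxes.db as src and a path on the GFS
partition as dst. One stat per second on a local file costs next to nothing.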
Regards,
Bron.
--
Bron Gondwana
brong at fastmail.fm