"re-singleinstancestore" a partition?

Simon Matter simon.matter at invoca.ch
Thu Jul 12 04:50:01 EDT 2007


> I need a sanity check here.
>
> I had a single storage partition that I've grown to ~400GB to house
> about 270,000 mailboxes.  I managed to reduce that number by around
> 115,000 by purging out accounts no one's logged into, still leaving
> quite a mess of accounts.  I've been migrating them with some perl and
> the mailbox move stuff in cyradm to four new partitions of 75GB each and
> I'm finding that I'm very quickly running out of space due to the
> breaking up of the singleinstancestore storage gains.
>
> To remedy this, I'm thinking about traversing the mailboxes on each
> partition building a database of checksums to identify identical
> messages, then replacing the duplicated content with hard links and
> reconstructing the user's mailbox (for good measure, although it
> shouldn't be necessary).
>
> I imagine the storage savings with this plan would be huge, but it
> screams danger and I'm wondering if I should bother.  However, by my
> calculations, I'll need another 600GB (for a total of 900GB) instead of
> the current 400GB (which is actually only at 41% right now) and I simply
> don't have the space.
>
> The code to do this seems pretty trivial, but has anyone had to do this
> before/is there a tool out there already that does it?

I ran some tests on this a long time ago and, IIRC, it worked without any
problems. Possible tools for the hard-linking step are here:
ftp://ftp.redhat.com/pub/redhat/mirror-tools/
http://code.google.com/p/hardlinkpy/
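
If you would rather roll your own, the checksum-and-hard-link idea from the
quoted message could look roughly like the Python below. This is only a
sketch under a few assumptions, not a tested tool: the names
relink_duplicates, file_digest and MESSAGE_NAME are made up for illustration,
and it assumes that message files are the ones named with digits followed by
a dot (e.g. "123."), so index and header files are left alone. It works
within one partition at a time, since hard links cannot cross filesystems,
and it defaults to a dry run that only reports what it would do.

```python
import hashlib
import os
import re
import stat

# Assumption: message files are named "<digits>." (for example "123.").
MESSAGE_NAME = re.compile(r"^\d+\.$")


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def relink_duplicates(root, dry_run=True):
    """Replace identical message files under root with hard links.

    Returns the total size in bytes of the duplicate files that were
    (or, in a dry run, would be) replaced by links.
    """
    seen = {}  # (device, size, digest) -> path of the first copy found
    replaced_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not MESSAGE_NAME.match(name):
                continue
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks and other special files
            key = (st.st_dev, st.st_size, file_digest(path))
            first = seen.setdefault(key, path)
            if first == path:
                continue  # first time this content has been seen
            if os.lstat(first).st_ino == st.st_ino:
                continue  # already a hard link to the first copy
            if not dry_run:
                # Link under a temporary name, then rename over the
                # duplicate so the message file never goes missing.
                tmp = path + ".relink-tmp"
                os.link(first, tmp)
                os.replace(tmp, path)
            replaced_bytes += st.st_size
    return replaced_bytes
```

Calling relink_duplicates("/path/to/partition") reports the candidate savings
without changing anything; pass dry_run=False to actually relink. It hashes
every message file, so expect it to take a while on a large partition, and I
would only run it with the server stopped and a backup at hand. Filesystems
also limit the number of links per file, which a very widely delivered
message could hit.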

Simon


More information about the Info-cyrus mailing list