Did calculating the quota change from 2.3 to 2.5?

Bron Gondwana brong at fastmail.fm
Tue Nov 29 17:12:38 EST 2016


On Wed, 30 Nov 2016, at 08:37, Jason L Tibbitts III via Info-cyrus wrote:
> >>>>> "BG" == Bron Gondwana via Info-cyrus <info-cyrus at lists.andrew.cmu.edu> writes:
> 
> BG> If you use imapsync, it doesn't know about that, and will upload the
> BG> same message twice. 2.5 doesn't have the smarts to recognise that
> BG> it's the same message.
> 
> Fun random question: Does anything blow up if you run hardlink on your
> mail spool?  (The hardlink program finds identical files and hardlinks
> them.)

No, that is fine.
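
For readers unfamiliar with it, a hardlink-style pass over a spool can be sketched roughly as below. This is only an illustration and not the hardlink program itself: `hardlink_identical` and `write_file` are hypothetical helpers, files are compared by SHA-1 digest alone, and the demo runs against a throwaway temporary directory rather than a real mail spool.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: write $content to $path, replacing any existing file.
sub write_file {
  my ($path, $content) = @_;
  open(my $fh, '>', $path) or die "open $path: $!";
  print $fh $content;
  close($fh) or die "close $path: $!";
}

# Hypothetical helper: hardlink together regular files under $root that have
# the same SHA-1 digest.  Returns how many files were replaced by links.
sub hardlink_identical {
  my ($root) = @_;
  my @files;
  find({ no_chdir => 1, wanted => sub { push @files, $_ if -f $_ && !-l $_ } }, $root);

  my %first_by_digest;
  my $linked = 0;
  foreach my $file (sort @files) {
    my $digest = Digest::SHA->new(1)->addfile($file, 'b')->hexdigest;
    my $src = $first_by_digest{$digest};
    if (!defined $src) {
      # first file seen with this content becomes the link source
      $first_by_digest{$digest} = $file;
      next;
    }
    my @a = stat($src);
    my @b = stat($file);
    # same device and inode: already linked, nothing to do
    next if $a[0] == $b[0] && $a[1] == $b[1];
    # link under a temporary name, then rename over the duplicate
    my $tmp = "$file.tmp";
    link($src, $tmp) or do { warn "link $src -> $tmp: $!\n"; next };
    rename($tmp, $file) or do { warn "rename $tmp -> $file: $!\n"; unlink $tmp; next };
    $linked++;
  }
  return $linked;
}

# Demo on a scratch directory using Cyrus-style "<uid>." file names.
my $dir = tempdir(CLEANUP => 1);
write_file("$dir/1.", "same message\n");
write_file("$dir/2.", "same message\n");
write_file("$dir/3.", "another message\n");
printf "linked %d file(s)\n", hardlink_identical($dir);
```

Linking to a temporary name and renaming means the duplicate is never missing from the directory, which is the same approach the script further down takes.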

> Given an index of message-id/filenames it should be possible to write a
> deduplicator that's orders of magnitude faster than hardlink, but I have
> a sneaking suspicion that someone's already done that.

Yep, I wrote something which can read 2.5 cyrus.index files and hardlink
matching files.  It depends on a ton of FastMail internals though.

3.0 will have much better support for deduplication when you upload via
IMAP, because it will know where all the other copies in the same user are.

There's no support for cross-user deduplication because we don't use it at
all: every user gets their own sieve script and their own lmtp pre-processing
at FastMail, so every message will have different headers and hence a
different GUID.  I have to prioritise designs that I actually use.
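
To illustrate why per-user headers defeat cross-user deduplication, here is a small sketch. It assumes the GUID is a SHA-1 digest over the complete message bytes (an assumption made for this example, not something stated above); `guid_for` and the `X-Delivered-To` header are hypothetical names used only for illustration.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical helper: digest of the full message, headers included.
sub guid_for {
  my ($message) = @_;
  return sha1_hex($message);
}

# The same message as delivered to two users, differing only in one
# header added during per-user delivery processing.
my $common    = "Subject: hello\r\n\r\nSame body for both users.\r\n";
my $for_alice = "X-Delivered-To: alice\r\n" . $common;
my $for_bob   = "X-Delivered-To: bob\r\n" . $common;

print guid_for($for_alice) eq guid_for($for_bob)
  ? "same GUID\n"
  : "different GUID\n";
```

Under that assumption a single differing header byte is enough to give the two copies unrelated digests, so a GUID match can only find duplicates that are byte-for-byte identical.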


#!/usr/bin/perl -w

# SETUP {{{
use strict;
use warnings;
use ME;
use Date::Manip;
use IO::File;
use ME::Machine;
use Cyrus::HeaderFile;
use Data::Dumper;
use Cyrus::IndexFile;
use Getopt::Std;
use Digest::SHA;
use ME::CyrusBackup;
use ME::User;
# }}}

my $sn = shift;

my (undef,undef,$uid,$gid) = getpwnam('cyrus');

foreach my $Slot (ME::Machine->ImapSlots()) {
  next if ($sn and $sn ne $Slot->Name());
  my $users = $Slot->AllMailboxes();
  my $conf = $Slot->ImapdConf();
  foreach my $user (sort keys %$users) {
    process($conf, $user, $users->{$user});
  }
}

# Hardlink together all copies of the same message (same GUID) held by one user.
sub process {
  my ($conf, $user, $folders) = @_;
  print "$user\n";
  # GUID => list of [folder, uid] pairs, one per copy found in the indexes
  my %ihave;
  foreach my $folder (@$folders) {
    my $meta = $conf->GetUserLocation('meta', $user, 'default', $folder);
    my $index = Cyrus::IndexFile->new_file("$meta/cyrus.index") || die "Failed to open $meta/cyrus.index";
    while (my $record = $index->next_record()) {
      push @{$ihave{$record->{MessageGuid}}}, [$folder, $record->{Uid}];
    }
  }

  foreach my $guid (keys %ihave) {
    next if @{$ihave{$guid}} <= 1;
    my ($inode, $srcname);
    my @others;
    foreach my $item (@{$ihave{$guid}}) {
      my $spool = $conf->GetUserLocation('spool', $user, 'default', $item->[0]);
      $spool =~ s{/$}{};
      my $file = "$spool/$item->[1].";
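      my (@sd) = stat($file);
      # skip copies whose spool file is missing or unreadable
      unless (@sd) {
        print "stat error $file: $!\n";
        next;
      }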
      if ($inode) {
        next if $sd[1] == $inode;
        push @others, $file;
      }
      else {
        $inode = $sd[1];
        $srcname = $file;
      }
    }
    next unless @others;
    print "fixing up files for $guid ($srcname)\n";
    foreach my $file (@others) {
      # link under a temporary name, then rename over the duplicate
      my $tmpfile = $file . "tmp";
      unless (link($srcname, $tmpfile)) {
        print "link error $tmpfile: $!\n";
        next;
      }
      chown($uid, $gid, $tmpfile);
      chmod(0600, $tmpfile);
      print "rename error $file: $!\n" unless rename($tmpfile, $file);
    }
  }
}
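
Without the FastMail-specific modules, the duplicate-detection step of the script above can be sketched on its own. This is a minimal dry-run illustration, assuming the GUID-to-path mapping has already been built from the index files by some other means; `plan_links` is a hypothetical helper and the GUID strings in the demo are placeholders.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: given { guid => [paths...] }, return [source, duplicate]
# pairs for every path that is not yet the same inode as the first readable
# path for that GUID.  Nothing on disk is modified.
sub plan_links {
  my ($paths_by_guid) = @_;
  my @plan;
  foreach my $guid (sort keys %$paths_by_guid) {
    my @paths = @{ $paths_by_guid->{$guid} };
    next if @paths <= 1;
    my ($src, $src_dev, $src_ino);
    foreach my $path (@paths) {
      my @sd = stat($path);
      next unless @sd;    # missing or unreadable file: skip it
      if (!defined $src) {
        ($src, $src_dev, $src_ino) = ($path, $sd[0], $sd[1]);
        next;
      }
      # already hardlinked to the source
      next if $sd[0] == $src_dev && $sd[1] == $src_ino;
      push @plan, [$src, $path];
    }
  }
  return @plan;
}

# Demo: three separate files, plus one name that is already a hardlink.
my $dir = tempdir(CLEANUP => 1);
foreach my $name (qw(1. 2. 3.)) {
  open(my $fh, '>', "$dir/$name") or die "open $dir/$name: $!";
  print $fh "message\n";
  close($fh) or die "close $dir/$name: $!";
}
link("$dir/1.", "$dir/4.") or die "link: $!";

my %paths_by_guid = (
  'guid-a' => ["$dir/1.", "$dir/2.", "$dir/4."],
  'guid-b' => ["$dir/3."],
);
foreach my $pair (plan_links(\%paths_by_guid)) {
  print "would link $pair->[0] -> $pair->[1]\n";
}
```

Because the grouping key comes from the index rather than from reading every file, only one stat per copy is needed, which is where the speed advantage over a content-comparing tool would come from.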





-- 
  Bron Gondwana
  brong at fastmail.fm

