Incorrect size calculations on bogus messages

Bron Gondwana brong at fastmail.fm
Wed Jun 24 23:34:36 EDT 2009


Here's a funny one.  I've recreated it as a simple test case which I'll
paste below.  Basically, a message with invalid MIME structure causes
Cyrus to record the wrong "size" value in its index record.

Seems some spammers have been generating these, and they show up as
replication errors because the index size doesn't match the file size.

[brong at imap3 hm]$ cat /mnt/data8/slot308/store23/data/b/user/brong/390978.
Return-Path: <brong at fastmail.fm>
Received: from compute2.internal (compute2.internal [10.202.2.42])
	 by store23m.internal (Cyrus v2.3.14-fmsvn18904-c7f26adc) with LMTPA;
	 Wed, 24 Jun 2009 21:53:09 -0400
X-Sieve: CMU Sieve 2.3
X-Spam-score: 1.4
X-Spam-hits: BAYES_20 -0.74, MISSING_MID 0.001, NO_RECEIVED -0.001, NO_RELAYS -0.001,
  TVD_SPACE_RATIO 2.219, BAYES_USED user
X-Spam-source: IP='127.0.0.1', Host='unk', Country='unk', FromHeader='fm', MailFrom='fm'
X-Spam-charsets: 
X-Attached: ForwardedMessage
X-Resolved-to: brong at fastmail.fm
X-Mail-from: brong at fastmail.fm
Received: from test ([10.202.2.231])
  by compute2.internal (LMTPProxy); Wed, 24 Jun 2009 21:53:08 -0400
Date: 20 Jun 2009 07:21:45 -0000
MIME-Version: 1.0
To: brong at fastmail.fm
Subject: bogusmessage
From: brong at fastmail.fm
Content-Type: multipart/mixed;
	boundary="=_31ff156115c676d4fc4fe82130032447"
Message-ID: <cmu-lmtpd-11464-1245894789-0 at store23m.internal>

--=_31ff156115c676d4fc4fe82130032447
Content-Transfer-Encoding: 
Content-Type: message/rfc822;
 name="ForwardedMessage"; 
Content-Disposition: inline;
 filename="ForwardedMessage"; 
--=_31ff156115c676d4fc4fe82130032447--

[brong at imap3 hm]$ ls -la /mnt/data8/slot308/store23/data/b/user/brong/390978.
-rw------- 1 cyrus mail 1189 Jun 24 21:53 /mnt/data8/slot308/store23/data/b/user/brong/390978.

[brong at imap3 hm]$ utils/oneoff/index_uids.pl -u 390978 -D /mnt/meta8/slot308/store23/meta/b/user/brong/cyrus.index
Uid: 390978
InternalDate: 1245894789
SentDate: 1245513600
Size: 1147
HeaderSize: 961
ContentOffset: 961
CacheOffset: 1066472
LastUpdated: 1245894810
SystemFlags: 00000000000000000000000000000000
UserFlags: 00000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
ContentLines: 5
CacheVersion: 2
MessageGuid: a8c26e46c4ce83fb5d77d360f024e3bbaa8d7371
Modseq: 14869

=======================

So, the file on disk is 1189 bytes long, but the
cyrus.index record says the size is 1147 bytes,
which is 42 bytes short.
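
A check along these lines would flag such a message.  This is only a
minimal sketch, not anything from the Cyrus tree: the helper name
size_discrepancy is made up for illustration, and the indexed size is
assumed to have been read out of cyrus.index already (the "Size" field
in the dump above).

```python
import os


def size_discrepancy(path, indexed_size):
    """Return the on-disk size minus the size recorded in the index.

    Hypothetical helper for illustration. A result of 0 means the two
    agree; anything else is the kind of mismatch that would surface as
    a replication error.
    """
    return os.stat(path).st_size - indexed_size


# With the numbers from the message above: 1189 - 1147 gives 42.
```

A non-zero result only says that the two sizes disagree; it doesn't
say which bytes the parser failed to account for.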

The reason is that Cyrus builds the bodystructure
and derives the size by adding up the sizes of
the component parts, rather than just using the
actual file size.

I guess my question is - is there any reason not
to just put the actual size-in-bytes of the file
into the index header record?  Envelope parsing
might be slightly messed up, but at least the
basics will be OK.
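
To make the suggestion concrete, here is a rough sketch of the idea in
Python.  It is not Cyrus code, and index_sizes is a hypothetical name:
the total size is taken straight from the byte count of the message,
and only the header/body split depends on looking inside it.  I'm
assuming the header ends at the first empty line, with either CRLF or
bare LF line endings.

```python
def index_sizes(raw: bytes) -> dict:
    """Hypothetical helper: derive the size fields from the raw bytes.

    "Size" never depends on the MIME structure, so a bogus multipart
    body cannot make it disagree with the file on disk.
    """
    # The top-level header ends at the first empty line.
    ends = []
    for sep in (b"\r\n\r\n", b"\n\n"):
        pos = raw.find(sep)
        if pos != -1:
            ends.append(pos + len(sep))
    # No empty line at all: treat the whole message as header.
    header_size = min(ends) if ends else len(raw)
    return {
        "Size": len(raw),
        "HeaderSize": header_size,
        "ContentOffset": header_size,
    }
```

Whatever the bodystructure parser makes of the parts, "Size" here
always equals the number of bytes actually stored.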

Bron.



More information about the Cyrus-devel mailing list