Xapian searches of the body of an email
Robert Stepanek
rsto at fastmailteam.com
Mon Jan 7 13:24:22 EST 2019
Hi Egon,
Yes, the slave should index in conversations.db automatically AFAIK.
You should run squatter in rolling mode on the master, too.
BTW: in 2014, Bron wrote a blog post about the search setup at FastMail:
https://fastmail.blog/2014/12/01/email-search-system/
It’s quite technical, but should give you a good idea of how it’s set up
for fast indexing and search.
Cheers, Robert
On Mon, Jan 7, 2019, at 5:54 PM, Egoitz Aurrekoetxea wrote:
> Hi Robert!
>
> Thank you so much for helping us (mainly me, the one boring the
> list with questions :) although I promise I've checked the docs before
> asking :) :) ).
>
> When you have a master/slave config... on the slave, when running
> squatter in rolling mode... does it update the conversations db too?
> By the way, squatter in rolling mode only makes sense on slave
> machines, doesn't it?
>
> Many thanks!
>
> ---
>
> sarenet
> *Egoitz Aurrekoetxea*
> Systems Department
> 944 209 470
> Parque Tecnológico. Edificio 103
> 48170 Zamudio (Bizkaia)
> egoitz at sarenet.es
> www.sarenet.es
>
> Before printing this email, please consider whether it is necessary
> to do so.
>
> On 07-01-2019 16:42, Robert Stepanek wrote:
>> Hi,
>>
>> Sebastian is right:
>>
>> On Mon, Jan 7, 2019, at 3:57 PM, Sebastian Hagedorn wrote:
>>>
>>> squatter is nowadays a bit of a misnomer, because it uses whatever index
>>> you have configured. In cyrus 2.4, squatter would always create a SQUAT
>>> index. When you run squatter with Xapian, it will build the index, but for
>>> the index to actually work, you also need the conversationsdb.
>>
>> conversations.db is indeed a misnomer now. The database was originally
>> only used to keep track of mail threads (hence the name), but its role
>> expanded. One of the indexes it stores is the SHA1 hash of every
>> message, plus separate hashes for each of that message's MIME parts.
>> Such a hash is named the GUID, and for each GUID we store a list of
>> all mailbox:UID[bodypart] pairs where this content occurs.
>>
>> For search, we keep track of the indexed messages by GUID, so we can
>> avoid reindexing duplicate mails. To return a search result, we can
>> then map that GUID back to its mailbox:message pairs. That's why we
>> need conversations.db for search.
>>
>> I can't help with upgrading from 2.4, unfortunately, but if you
>> reindex your mailboxes once in conversations.db, you should be all set.
>>
>> Cheers,
>> Robert
>>
>> ----
>> Cyrus Home Page: http://www.cyrusimap.org/
>> List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>> To Unsubscribe:
>> https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
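
To make the GUID bookkeeping described in the quoted message a bit more concrete, here is a minimal Python sketch. It is not Cyrus code and does not reflect the actual conversations.db format; the class GuidMap and its method names are made up for illustration. The only things taken from the thread are the ideas themselves: a GUID is a SHA1 hash of the content, each GUID maps to the mailbox:UID pairs where that content occurs, and the search indexer tracks GUIDs so duplicate mails are not indexed twice.

```python
import hashlib
from collections import defaultdict


def guid(content: bytes) -> str:
    """SHA1 hex digest of the raw content, standing in for the GUID."""
    return hashlib.sha1(content).hexdigest()


class GuidMap:
    """Hypothetical in-memory stand-in for the GUID index (illustration only)."""

    def __init__(self):
        # guid -> list of (mailbox, uid) pairs where this content occurs
        self.locations = defaultdict(list)
        # GUIDs the search indexer has already processed
        self.indexed = set()

    def record(self, mailbox: str, uid: int, content: bytes) -> str:
        """Remember that this content is stored at mailbox:uid."""
        g = guid(content)
        self.locations[g].append((mailbox, uid))
        return g

    def needs_indexing(self, g: str) -> bool:
        """True if no copy of this content has been indexed yet."""
        return g not in self.indexed

    def mark_indexed(self, g: str) -> None:
        self.indexed.add(g)

    def resolve(self, g: str) -> list:
        """Map a GUID from a search hit back to all mailbox:uid pairs."""
        return list(self.locations.get(g, []))


if __name__ == "__main__":
    m = GuidMap()
    g1 = m.record("INBOX", 1, b"same message body")
    g2 = m.record("Archive", 7, b"same message body")
    if m.needs_indexing(g1):
        m.mark_indexed(g1)          # index the content once
    print(m.needs_indexing(g2))     # the duplicate copy is skipped
    print(m.resolve(g1))            # both locations come back for a hit
```

The point of the sketch is only that the search index can be keyed by content hash, while a separate map is needed to turn a hit back into concrete messages, which is the role the thread attributes to conversations.db.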