Xapian searches of the body of an email

Robert Stepanek rsto at fastmailteam.com
Mon Jan 7 13:24:22 EST 2019


Hi Egon,

Yes, the slave should index in conversations.db automatically AFAIK. 

You should run squatter in rolling mode on the master, too.  

BTW: in 2014, Bron wrote a blog post about the search setup at FastMail:
https://fastmail.blog/2014/12/01/email-search-system/

It's quite technical, but should give you a good idea of how it's set up
for fast indexing and search.

Cheers,
Robert


On Mon, Jan 7, 2019, at 5:54 PM, Egoitz Aurrekoetxea wrote:
> Hi Robert!
>
> Thank you so much for helping us (mainly me, since I'm the one boring the
> list with questions :) although I promise I've checked the docs before
> asking :) :) ).
>
> When you have a master/slave config, and squatter is running in rolling
> mode on the slave, does it update the conversations db too?
> By the way, squatter in rolling mode only makes sense on slave
> machines, doesn't it?
>
> Many thanks!
>
> ---
> 
> sarenet
> *Egoitz Aurrekoetxea*
> Systems Department
> 944 209 470
> Parque Tecnológico. Edificio 103
> 48170 Zamudio (Bizkaia)
> egoitz at sarenet.es
> www.sarenet.es
> 
> Before printing this email, please consider whether it is necessary
> to do so.
>
> On 07-01-2019 16:42, Robert Stepanek wrote:
>
>> Hi,
>>  
>> Sebastian is right:
>>  
>> On Mon, Jan 7, 2019, at 3:57 PM, Sebastian Hagedorn wrote:
>>>  
>>> squatter is nowadays a bit of a misnomer, because it uses whatever index
>>> you have configured. In cyrus 2.4, squatter would always create a SQUAT
>>> index. When you run squatter with Xapian, it will build the index, but for
>>> the index to actually work, you also need the conversationsdb.
>>  
>> conversations.db is indeed a misnomer now. The database was originally
>> only used to keep track of mail threads (hence the name), but its role
>> has expanded. One of the indexes it stores holds the SHA1 hash of every
>> message, and separate hashes for each of that message's MIME parts.
>> Such a hash is named the GUID, and for each GUID we store a list of
>> all mailbox:UID[bodypart] pairs in which this content occurs.
>>  
>> For search, we keep track of the indexed messages by GUID, so we can
>> avoid reindexing duplicate mails. To return a search result, we can
>> now map that GUID back to its mailbox:message pairs. That's why we
>> need conversations.db for search.
>>  
>> I can't help with upgrading from 2.4, unfortunately, but if you
>> re-index your mailboxes once in conversations.db, you should be all set.
>>  
>> Cheers,
>> Robert
>> 
>> ----
>>  Cyrus Home Page: http://www.cyrusimap.org/
>>  List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
>>  To Unsubscribe:
>>  https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
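
To make the GUID bookkeeping described in the quoted message more concrete,
here is a minimal Python sketch. It is a toy model, not Cyrus code: the class
name GuidIndex, its methods, and the in-memory dict/set storage are all
assumptions made for illustration. It only shows the idea that content is
hashed with SHA1 into a GUID, each GUID maps to every mailbox:UID[bodypart]
location holding that content, and search indexing is skipped for GUIDs that
were already indexed.

```python
import hashlib
from collections import defaultdict


class GuidIndex:
    """Toy stand-in for the GUID records described above (hypothetical, not Cyrus code)."""

    def __init__(self):
        # guid -> list of (mailbox, uid, bodypart) locations sharing that content
        self.locations = defaultdict(list)
        # GUIDs whose content has already been handed to the search indexer
        self.indexed = set()

    @staticmethod
    def guid(content: bytes) -> str:
        # The GUID is the SHA1 hash of the content, here as a hex string.
        return hashlib.sha1(content).hexdigest()

    def record(self, mailbox: str, uid: int, bodypart: str, content: bytes) -> str:
        """Remember that this content occurs at mailbox:uid[bodypart]."""
        g = self.guid(content)
        self.locations[g].append((mailbox, uid, bodypart))
        return g

    def needs_indexing(self, g: str) -> bool:
        """Return True only the first time a GUID is seen, so duplicates are skipped."""
        if g in self.indexed:
            return False
        self.indexed.add(g)
        return True

    def resolve(self, g: str) -> list:
        """Map a GUID from a search hit back to all of its locations."""
        return list(self.locations.get(g, []))


if __name__ == "__main__":
    idx = GuidIndex()
    body = b"Same text delivered to two mailboxes"
    g = idx.record("user.alice", 10, "1", body)
    idx.record("user.bob", 7, "1", body)
    print(idx.needs_indexing(g))  # first sighting of this GUID
    print(idx.needs_indexing(g))  # duplicate content
    print(idx.resolve(g))
```

In this sketch a search hit on one GUID resolves to both mailboxes, even
though the text would only have been indexed once.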
