Future Ideas wiki page

David Lang david.lang at digitalinsight.com
Fri Jan 8 22:17:15 EST 2010


On Fri, 8 Jan 2010, Bron Gondwana wrote:

> On Fri, 08 Jan 2010 09:56 -0800, "David Lang" <david.lang at digitalinsight.com> wrote:
>> one thing that I saw mentioned elsewhere as a limitation of IMAP (and
>> therefore I don't know if there is a way to address it reasonably) is the
>> lack of a fuzzy search capability.
>
> Without a specification document, it's hard to add anything that you expect
> clients to actually use.

true, but without a sample implementation it's unlikely to become a standard, so
the discussion needs to start somewhere.

>> the IMAP search is an exact match search; it would be useful to have the
>> hooks to be able to use a search-engine-like search capability as well (not
>> just exact matches, but matches with only some of the search terms, matches
>> with plural versions of the search terms, etc)
>
> Yes, that would be lovely to have.  You'd probably run a separate search-engine
> process and have the IMAP server just send out a request and map the document
> IDs back to folder/uid on response.

sounds right.
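
To make sure I have the shape of that right, here is a rough sketch of the mapping step, assuming the search engine only knows opaque integer document IDs. Every name here (DocIdMap, register, resolve) is made up for illustration; none of it is existing Cyrus or search-engine code.

```python
# Hypothetical sketch: these names do not exist in Cyrus or in any search engine.
from typing import Dict, List, Tuple


class DocIdMap:
    """Maps search-engine document IDs to (folder, uid) pairs and back."""

    def __init__(self) -> None:
        self._by_doc: Dict[int, Tuple[str, int]] = {}
        self._by_msg: Dict[Tuple[str, int], int] = {}
        self._next_id = 1

    def register(self, folder: str, uid: int) -> int:
        """Return the document ID for a message, allocating one if needed."""
        key = (folder, uid)
        if key in self._by_msg:
            return self._by_msg[key]
        doc_id = self._next_id
        self._next_id += 1
        self._by_doc[doc_id] = key
        self._by_msg[key] = doc_id
        return doc_id

    def resolve(self, doc_ids: List[int]) -> List[Tuple[str, int]]:
        """Translate engine results back to (folder, uid), keeping their order.

        IDs the map does not know (for example, messages removed since
        indexing) are dropped rather than treated as an error.
        """
        return [self._by_doc[d] for d in doc_ids if d in self._by_doc]
```

The IMAP server would call register() when handing a message to the indexer and resolve() on whatever list of IDs the engine sends back.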

>> As I understand it this would require a slight variation of the search
>> request to indicate that you want the fuzzy match, and a variation of the
>> search response to be able to indicate the quality of each match returned.
>
> It would require a brand new spec for the search result - an ordered list of
> UIDs wouldn't cut it any more!
>
> While we're at it, I'm much more interested in cross-folder searching with sort
> order that doesn't require folder as the first item, but that's significantly more
> complex!

If we have to define a new search response, it may be reasonable to handle both. 
Does anyone who has been part of the IMAP standards work want to think/talk 
about what would work well and fit the flavor of IMAP?
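
As a strawman for what such a response would need to carry, here is a small sketch: each hit has a folder, a UID and a match quality, and the list is ordered by quality rather than by folder. The scoring (fraction of terms matched, with a crude trailing-"s" fold for plurals) and all the names are my own assumptions for illustration, not a proposal for the wire format and not how any real search engine ranks results.

```python
# Hypothetical sketch: the Hit record and the scoring rule are illustrative only.
from typing import Dict, List, NamedTuple, Tuple


class Hit(NamedTuple):
    folder: str
    uid: int
    score: float  # fraction of search terms matched, 0.0 to 1.0


def normalize(word: str) -> str:
    """Lowercase and strip one trailing 's' as a very crude plural fold."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s"):
        return word[:-1]
    return word


def fuzzy_search(terms: List[str],
                 messages: Dict[Tuple[str, int], str]) -> List[Hit]:
    """Return messages matching at least one term, best match first.

    `messages` maps (folder, uid) to the message text, so results from
    several folders are ranked together instead of grouped by folder.
    """
    wanted = {normalize(t) for t in terms}
    hits = []
    for (folder, uid), text in messages.items():
        words = {normalize(w) for w in text.split()}
        matched = len(wanted & words)
        if matched:
            hits.append(Hit(folder, uid, matched / len(wanted)))
    # Highest score first; folder and uid only break ties.
    hits.sort(key=lambda h: (-h.score, h.folder, h.uid))
    return hits
```

The point is only that an ordered list of UIDs cannot express this: the client needs the folder and the quality alongside each UID.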

On the same subject, what other wishes do people have related to search? (If
there are enough things people are interested in, possibly someone will get
motivated enough to write the code to make it work.)

> Thankfully, this is all pretty orthogonal to everything that I'm doing, so it's not
> a consideration I need to give much thought to at the moment.  Someone else
> who considers it worth putting effort into could do it pretty independently.

agreed.

> The charset changes would allow an initial pre-processing pass to spit out the
> "document" as UTF-8 rather than its original MIME encoding for processing by
> the search engine, but that's the only interaction it would have.  If the search
> engine supports a chunked input, it would probably be worth embedding that
> target into the lib/charset.c as a character filter sink, and chaining the documents
> into it rather than building an entire buffer at once.  There's already code that
> does that just using a standard buffer and sending it to the squatter callback
> whenever it reaches a fixed size, then resetting it.  Easy enough to do.

If that interface is sufficiently more efficient, it may be worth making it a
requirement and letting the external tool deal with splitting the input back up
if needed.
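
For anyone following along, the buffer-and-flush pattern being described looks roughly like the sketch below. It is written in Python purely for brevity; the class name, the callback signature and the default chunk size are assumptions of mine and are not taken from lib/charset.c or the squatter code.

```python
# Hypothetical sketch of the buffer-and-flush pattern; not the Cyrus code.
from typing import Callable


class ChunkedSink:
    """Collects UTF-8 text and hands it to a callback in fixed-size chunks."""

    def __init__(self, callback: Callable[[bytes], None],
                 chunk_size: int = 4096) -> None:
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self._callback = callback
        self._chunk_size = chunk_size
        self._buf = bytearray()

    def write(self, text: str) -> None:
        """Append text, flushing every full chunk to the callback."""
        self._buf.extend(text.encode("utf-8"))
        while len(self._buf) >= self._chunk_size:
            # Note: a chunk boundary may fall inside a multi-byte UTF-8
            # sequence, so the receiver must treat chunks as a byte stream.
            self._callback(bytes(self._buf[:self._chunk_size]))
            del self._buf[:self._chunk_size]

    def close(self) -> None:
        """Flush whatever is left at the end of the document."""
        if self._buf:
            self._callback(bytes(self._buf))
            self._buf.clear()
```

With this shape the indexer never sees more than chunk_size bytes at a time, and a tool that wants whole documents can reassemble them on its side.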

So, if I am understanding you correctly, the hooks into Cyrus to support
something like this are fairly easy to do; the hard part would be the IMAP
command and response.

David Lang

> Regards,
>
> Bron.
>

