Discussion:
Conversations and search
Sebastian Hagedorn
2018-08-16 12:04:22 UTC
Hi,

I have a question regarding the conversations db and how it affects
(Xapian) search. In this GitHub issue
(<https://github.com/cyrusimap/cyrus-imapd/issues/2376>) I was dealing with
Xapian search always failing. The underlying reason turned out to be an
empty conversations db. When I regenerate that user's conversations db,
Xapian search works fine. So far, so good. But then I noticed this piece of
documentation:

<https://www.cyrusimap.org/imap/concepts/deployment/databases.html?highlight=conversations#conversations-userid-conversations>

Quote: "This file contains all the message-id fields from every email that
has been seen in the ***past three months***, mapping to the conversation
IDs in which this message ID has been seen, and the timestamp when it was
last seen."

This raises a number of questions. What does "has been seen" mean in this
context? And does it mean that Xapian search will always fail to find
emails that arrived more than three months ago?? That doesn't sound very
useful ...
--
.:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
.:.Regionales Rechenzentrum (RRZK).:.
.:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.
----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
Albert Shih
2018-08-17 14:08:45 UTC
If I'm correct (I'm new to Cyrus too), the conversations DB has nothing to
do with the Xapian database.

The Xapian database consists of files with a .glass extension that contain
the index of all your mail. Those files are located in

t1searchpartition-default/FIRST_LETTER_OF_LOGIN/user/LOGIN/

The conversations database is an internal Cyrus database used by Cyrus to
*create* the Xapian index.

In other words, the three months means that if you stop Xapian indexing now,
you have three months to restart it, or you will need to regenerate the
Xapian database from scratch (and that takes a very long time).

If this is incorrect, someone please correct me.

Regards
--
Albert SHIH
DIO bâtiment 15
Observatoire de Paris
xmpp: ***@obspm.fr
Heure local/Local time:
Fri Aug 17 16:01:52 CEST 2018
Bron Gondwana
2018-08-18 10:42:08 UTC
That's incorrect - the message-ids are actually only used for thread
calculations - so if you get a new message more than 3 months later, it
won't be threaded with related messages (conversations.db threads only,
aka: JMAP and XCONV commands).
The G keys used for xapian are kept forever (at least: until the message
is deleted from the index - which is not when it's expunged, but when
cyr_expire cleans up the expunged record).
Bron.

--
Bron Gondwana, CEO, FastMail Pty Ltd
***@fastmailteam.com
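
The distinction Bron draws can be made concrete with a toy model. The sketch below is not Cyrus code and says nothing about the real layout of conversations.db: the class `ToyConversations`, its method names, and the 90-day constant (standing in for "three months") are all hypothetical, chosen only for illustration. It models two behaviours described above: message-id entries age out and therefore only affect threading, while the per-message keys used for search stay until an explicit cleanup step removes them.

```python
from datetime import datetime, timedelta

# Assumption: "three months" is approximated as 90 days in this toy model.
MSGID_RETENTION = timedelta(days=90)


class ToyConversations:
    """Illustrative stand-in for a per-user conversations db (not Cyrus code)."""

    def __init__(self):
        self.msgids = {}          # message-id -> (conversation id, last seen)
        self.search_keys = set()  # stand-in for the "G keys": no age-based expiry
        self._next_conv = 1

    def deliver(self, key, message_id, references, now):
        """Record a new message and return the conversation id it joins."""
        conv = None
        for ref in references:
            entry = self.msgids.get(ref)
            # A referenced message-id only helps if it was seen recently enough.
            if entry and now - entry[1] <= MSGID_RETENTION:
                conv = entry[0]
                break
        if conv is None:
            # No usable reference: start a new conversation.
            conv = self._next_conv
            self._next_conv += 1
        for mid in (message_id, *references):
            self.msgids[mid] = (conv, now)
        self.search_keys.add(key)
        return conv

    def prune_msgids(self, now):
        """Drop message-id entries older than the retention window."""
        self.msgids = {m: e for m, e in self.msgids.items()
                       if now - e[1] <= MSGID_RETENTION}

    def cleanup_expunged(self, key):
        """Stand-in for cyr_expire cleaning up an expunged record."""
        self.search_keys.discard(key)

    def searchable(self, key):
        """True while the message's search key is still present."""
        return key in self.search_keys


if __name__ == "__main__":
    db = ToyConversations()
    t0 = datetime(2018, 1, 1)
    first = db.deliver("k1", "<a@example.org>", [], t0)
    late = db.deliver("k2", "<b@example.org>", ["<a@example.org>"],
                      t0 + timedelta(days=120))
    print("same thread:", first == late)
    print("old message still searchable:", db.searchable("k1"))
```

In this model a reply that arrives 120 days after the original starts a new conversation, yet the original message remains searchable, which mirrors the point that the three-month window concerns threading and not search.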
Sebastian Hagedorn
2018-08-20 10:55:21 UTC
Thanks for the explanation. I will file a documentation bug so this is
stated there.
--
.:.Sebastian Hagedorn - Weyertal 121 (Gebäude 133), Zimmer 2.02.:.
.:.Regionales Rechenzentrum (RRZK).:.
.:.Universität zu Köln / Cologne University - ✆ +49-221-470-89578.:.