Cyrus IMAP and MySQL mailboxes (Building load-balancing cluster)
Igor Zhbanov
2006-11-16 16:24:32 UTC
Hello!

The short question is: can Cyrus IMAP take mail from MySQL tables? And
where can I read about how to set it up?

The long story follows. I need to set up a load-balanced mail system. I
couldn't find any open-source mail systems that have genuine cluster
support, so I have decided to build a cluster on my own.
The main problem is to build shared storage for the mail that can
survive server crashes. I have found that MySQL Cluster is reliable and
fast, so mail will be stored in MySQL tables. And I know that Postfix
can store mail in MySQL. So, I need a POP3 server that can take
messages from MySQL tables.

Is it possible to build such a system? Or do you know of better
solutions for a multi-node load-balancing cluster?

Thanks.
a***@morrison-ind.com
2006-11-16 18:39:20 UTC
Post by Igor Zhbanov
The short question is: can Cyrus IMAP take mail from MySQL tables?
I'm not aware of any such backend, and I think it would be a crazy
method to store mail and performance would suck. You may *THINK*
MySQL is fast, but it isn't - certainly not compared to a filesystem
[doubly so when dealing with BLOBs, which is essentially what a mail
message is].
Post by Igor Zhbanov
And where can I read about how to setup it?
There is a package called DbMail. Maybe you want to look at that.
Post by Igor Zhbanov
The long story is now. I need to setup load-balancing mail system. I
couldn't find open-source mail systems that have genuine cluster
support,
What is "genuine cluster support"?
Post by Igor Zhbanov
The main problem is to build a shared storage that can survive server
crashes, where mail will be stored.
Use a SAN.
Post by Igor Zhbanov
I have found that MySQL-cluster is
reliable and fast. So, mail will be stored in MySQL tables. And I know
that Postfix can store mail in MySQL. So, I need POP3 server that can
take letters from MySQL tables.
See DbMail.
Post by Igor Zhbanov
Is it possible to make such system? Or do you know better solutions
for multiple node load-balancing cluster?
Igor Zhbanov
2006-11-16 23:37:08 UTC
Post by a***@morrison-ind.com
Post by Igor Zhbanov
The short question is: can Cyrus IMAP take mail from MySQL tables?
I'm not aware of any such backend; and I think it would be a crazy
method to store mail and performance would suck. You may *THINK*
MySQL is fast, it isn't - certainly not compared to a filesystem
[doubly so when dealing with BLOBs, which is essentially what a mail
message is]
Post by Igor Zhbanov
And where can I read about how to setup it?
There is a package called DbMail. Maybe you want to look at that.
Post by Igor Zhbanov
The long story is now. I need to setup load-balancing mail system. I
couldn't find open-source mail systems that have genuine cluster
support,
What is "genuine cluster support"?
I mean clustering support built into the software itself.
Post by a***@morrison-ind.com
Post by Igor Zhbanov
The main problem is to build a shared storage that can survive server
crashes, where mail will be stored.
Use a SAN.
First of all, such a SAN must be very reliable itself. Second, it must
support some kind of global locking mechanism, so that several nodes
can use locks to protect a file from simultaneous writes. Third, Cyrus
IMAP must lock mailboxes, so that several instances on different
servers can work with one mailbox without conflicts. Whether Cyrus IMAP
uses locks or assumes that it is the only one accessing a mailbox, I
don't know. Whether it can safely access one mailbox from different
servers, I don't know either.
Post by a***@morrison-ind.com
Post by Igor Zhbanov
I have found that MySQL-cluster is
reliable and fast. So, mail will be stored in MySQL tables. And I know
that Postfix can store mail in MySQL. So, I need POP3 server that can
take letters from MySQL tables.
See DbMail.
---
Post by Bron Gondwana
Post by Igor Zhbanov
The main problem is to build a shared storage that can survive server
crashes, where mail will be stored. I have found that MySQL-cluster is
reliable and fast.
My
god.
You are aware that MySQL-cluster only supports in-memory databases in
all but the most recent development snapshots?
If your mail system is so small you can afford to put all your users
email in memory, then good for you. Otherwise, mysql replication won't
buy you much more than Cyrus replication with a few good monitoring
scripts (and yes, we have failed real cyrus replications off failed
machines now - it's never fun, but going through the logs we lost a
grand total of two messages, and they had both been sieved into the
Junk Mail folder anyway.)
Yes, I know. But the latest version of MySQL can keep (non-indexed)
table fields on disk. I don't think I need to search by message
content. If I do need it (via the web interface, of course), I think I
can build that feature.
Post by Bron Gondwana
Seriously, see the other response, DbMail might be what you want -
personally I'd put blobs in the filesystem (actually, my SHA1 based
VFS system, but that's a different story) and metadata in mysql... if
I was writing my perfect IMAP solution, which I'm not, yet. Cyrus
does the job just fine, and you work around the wrinkles. It's better
than anything else out there for a biggish system right now.
I will look at DbMail too. Generally, I don't need MySQL specifically.
I just want to build a load-balanced mail system, probably based on
Postfix + Cyrus IMAP + SquirrelMail. I have found some links that may
be useful, but I haven't looked at them yet.

http://cyrusimap.web.cmu.edu/ag.html
http://www-uxsup.csx.cam.ac.uk/~dpc22/cyrus/replication.html
http://cyrusimap.web.cmu.edu/imapd/install-replication.html

First of all, Cyrus IMAP has a feature called "Cyrus IMAP Aggregator":
"The Cyrus IMAP Aggregator transparently distributes IMAP and POP
mailboxes across multiple servers. Unlike other systems for load
balancing IMAP mailboxes, the aggregator allows users to access
mailboxes on any of the IMAP servers in the system." Probably, it's
what I want.

Also, there is Cyrus IMAP replication. But some people say that
messages can be lost during failover. And, of course, "use at your own
risk".

For now I see two ways to build a load-balanced mail system:
1) Some kind of shared storage. It may be an NFS-like global filesystem
or a MySQL database.
2) Mailbox replication. It can be done by Cyrus IMAP replication or by
some other software.

Perhaps, there are other ways. I will look...

Thanks.
Bron Gondwana
2006-11-16 21:46:35 UTC
Post by Igor Zhbanov
The main problem is to build a shared storage that can survive server
crashes, where mail will be stored. I have found that MySQL-cluster is
reliable and fast.
My

god.

You are aware that MySQL-cluster only supports in-memory databases in
all but the most recent development snapshots?

If your mail system is so small you can afford to put all your users'
email in memory, then good for you. Otherwise, mysql replication won't
buy you much more than Cyrus replication with a few good monitoring
scripts (and yes, we have failed real cyrus replications off failed
machines now - it's never fun, but going through the logs we lost a
grand total of two messages, and they had both been sieved into the
Junk Mail folder anyway.)

Seriously, see the other response, DbMail might be what you want -
personally I'd put blobs in the filesystem (actually, my SHA1 based
VFS system, but that's a different story) and metadata in mysql... if
I was writing my perfect IMAP solution, which I'm not, yet. Cyrus
does the job just fine, and you work around the wrinkles. It's better
than anything else out there for a biggish system right now.

Bron.
Rudy Gevaert
2006-11-17 13:15:04 UTC
Post by Bron Gondwana
If your mail system is so small you can afford to put all your users
email in memory, then good for you. Otherwise, mysql replication won't
buy you much more than Cyrus replication with a few good monitoring
scripts (and yes, we have failed real cyrus replications off failed
machines now - it's never fun, but going through the logs we lost a
grand total of two messages, and they had both been sieved into the
Junk Mail folder anyway.)
Bron,

would you be willing to share those monitoring scripts?

TIA

Rudy
--
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Rudy Gevaert ***@UGent.be tel:+32 9 264 4734
Directie ICT, afd. Infrastructuur Direction ICT, Infrastructure dept.
Groep Systemen Systems group
Universiteit Gent Ghent University
Krijgslaan 281, gebouw S9, 9000 Gent, Belgie www.UGent.be
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Attila Nagy
2006-11-26 21:14:02 UTC
Post by Bron Gondwana
Seriously, see the other response, DbMail might be what you want -
personally I'd put blobs in the filesystem (actually, my SHA1 based
VFS system, but that's a different story) and metadata in mysql... if
I was writing my perfect IMAP solution, which I'm not, yet. Cyrus
does the job just fine, and you work around the wrinkles. It's better
than anything else out there for a biggish system right now.
I've come to a slightly different conclusion after pondering the
"perfect IMAP solution" topic.

I've thought about the following types/groups of servers:
- "traditional" mailbox servers, which speak protocols like imap and pop
to the user
- object servers, which store key/value pairs and basically nothing more
- metadata servers, which store the needed metadata to serve the e-mails
(could be the same group as the object servers)
- directors, which would manage all the data (object and meta) on the
storage servers and direct the clients (the mailbox servers) to the
right one

The basic principles would be:
- object servers can be dumb; their whole purpose is to store a value (a
blob) under a key (its identifier to them)
- object servers report to the directors regularly, so the directors
know how much space is there, what keys are stored there and how loaded
the given server is at the moment
- metadata servers store everything that is now stored in a directory
structure or in different databases (bdb, skiplist, others) on cyrus
backends; they know which keys a folder in a user's mailbox consists
of, and store meta information about the e-mails (headers, maybe
keywords from the e-mail for faster searching, etc)
- delivery to the system would happen by first splitting the e-mail into
different parts (metadata and data, data into mime parts, etc). Then
each piece of data would be checksummed (for example with SHA1) and
stored on one (or more, depending on the design) object server, along
with the metadata on the appropriate servers.
- directors would then notice the new object (by the object servers
announcing it, or by the mailbox servers asking them where to store the
given object) and distribute it among themselves (for redundancy)
- fetching e-mails (like fetching them from the filesystem now) would
involve metadata and director lookups (where is the information) and
finally one or more object store get operations
- the mailbox servers could do some compression and/or encryption on the
contents they store in the metadata/object servers
- the directors play the role of a global broker for the data, so every
transaction would flow through them. This gives the ability to
implement storage hierarchies (where there are multiple levels of
servers and a given piece of information could be kept in
geographically different locations for redundancy, availability or
speed) and automatic leveling, so you could keep each of your object
servers busy (in terms of both disk space and CPU/IO capacity) and
equalize the load among them.

Obvious benefits:
- if there is a pdf attachment flowing around in 100,000 people's
mailboxes, it will be stored exactly once, regardless of the
surrounding e-mail
- you can choose to replicate a given object to any number of servers.
If you combine this with crypto, you can even store information on
untrustworthy computers (you can use the spare disk space on your dns
servers, for example)
- you can pull out a storage server or start a new one at any time; you
just have to tell the broker not to direct connections there (if the
data is replicated to at least one other box), or migrate the data to
other servers (if you don't do replication). Installing a new server is
as simple as adding its IP to the directors; they notice that there is
an effectively unused, empty system, so a slow migration (leveling)
starts, maybe according to the usage statistics of the object servers,
so "hot" objects would be moved first
- you could install an in-memory (or a local disk-backed) object cache
for each mail frontend, so heavily used objects would remain local to them

etc, etc

It seems simple in principle (of course there are a lot more details
inside), a little harder to code, but not impossible. The hardest part
seems to be the directors and the metadata servers, then the
modifications to the mail server (eg. cyrus) and the object store (which
is plainly simple).

Speaking of existing components, I think memcached could be used (the
protocol and the server implementation) for the object storage
(complemented with a disk based store on the object servers and a
transactional layer, which would ask for servers from the directors),
and maybe an SQL based DB for the metadata store (with some replication).

Any takers? :)
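
The checksum-keyed storage described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: everything lives in memory, there are no directors, and the names ObjectStore, MetadataStore and deliver are hypothetical, not part of Cyrus or memcached.

```python
import hashlib


class ObjectStore:
    """Toy in-memory object server: keeps each blob under its SHA-1 digest."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha1(data).hexdigest()
        # Identical content hashes to the same key, so it is stored once.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


class MetadataStore:
    """Toy metadata server: maps (user, folder) to a list of object keys."""

    def __init__(self):
        self._folders = {}

    def append(self, user: str, folder: str, key: str) -> None:
        self._folders.setdefault((user, folder), []).append(key)

    def keys(self, user: str, folder: str) -> list:
        return list(self._folders.get((user, folder), []))


def deliver(objects, meta, user, folder, parts):
    """Store every part of a message and record its keys for the user."""
    for part in parts:
        meta.append(user, folder, objects.put(part))
```

Delivering the same attachment to two users adds two metadata entries but only one stored blob, which is the "stored exactly once" benefit from the list above.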
Sarah Walters
2006-11-16 23:53:17 UTC
Post by Igor Zhbanov
Post by a***@morrison-ind.com
Post by Igor Zhbanov
The main problem is to build a shared storage that can
survive server
Post by a***@morrison-ind.com
Post by Igor Zhbanov
crashes, where mail will be stored.
Use a SAN.
First of all, such SAN must be very reliable itself. Second, it must
support some kind of global locking mechanism, so several nodes can
use lock to protect file from simultaneous writing. Third, Cyrus IMAP
must lock mailboxes, so several instances on different server can work
with one mailbox without conflicts. Whether Cyrus IMAP use locks or
assumes that he is the only one who access mail box, I don't know. Can
it safely access one mailbox from different servers, I don't know too.
A good commercial SAN costs a fortune, but it is very reliable. And did
you want a *cluster* or a group of servers? A cluster should operate as
if it is a single host, and thus avoid the locking issues. Have a look
at Red Hat's clustering product for an example. If our cyrus
installation was going to be clustered, RH cluster with a SAN backend is
what we would do.

Regards,
Sarah Walters
Igor Zhbanov
2006-11-17 00:31:52 UTC
Post by Sarah Walters
Post by Igor Zhbanov
Post by a***@morrison-ind.com
Post by Igor Zhbanov
The main problem is to build a shared storage that can
survive server
Post by a***@morrison-ind.com
Post by Igor Zhbanov
crashes, where mail will be stored.
Use a SAN.
First of all, such SAN must be very reliable itself. Second, it must
support some kind of global locking mechanism, so several nodes can
use lock to protect file from simultaneous writing. Third, Cyrus IMAP
must lock mailboxes, so several instances on different server can work
with one mailbox without conflicts. Whether Cyrus IMAP use locks or
assumes that he is the only one who access mail box, I don't know. Can
it safely access one mailbox from different servers, I don't know too.
A good commercial SAN costs a fortune, but it is very reliable. And did
you want a *cluster* or a group of servers? A cluster should operate as
if it is a single host, and thus avoid the locking issues. Have a look
at Red Hat's clustering product for an example. If our cyrus
installation was going to be clustered, RH cluster with a SAN backend is
what we would do.
Yes, a cluster is exactly what I need. If I had lots of domains, I
could store the mailboxes of each domain on a separate server. But I
have only one big domain, so I need to spread the mailboxes of one
domain across several servers. And then I need a very clever load
balancer that will send each request to the server where the particular
mailbox is. I don't know of any such balancers that are aware of mail
system structure.

Another way is to try something like GFS, built from several servers.
Each server would hold one portion of the data and run a Cyrus IMAP
instance. But GFS would be mounted on each server, so each Cyrus IMAP
instance would see the whole filesystem. Of course, here we have the
same possible locking issues as with NFS. And we have no redundancy, as
I understand it.
So, I need a cluster for reliability too. I want to have each mailbox
on at least two servers (or to have shared storage on a two-node
failover cluster), so I can survive the crash of any node (not only of
the disk system, which can be protected by RAID, but also CPU failure,
memory...) with minimal recovery time.
Simon Matter
2006-11-17 06:24:24 UTC
Post by Igor Zhbanov
Post by Sarah Walters
Post by Igor Zhbanov
Post by a***@morrison-ind.com
Post by Igor Zhbanov
The main problem is to build a shared storage that can
survive server
Post by a***@morrison-ind.com
Post by Igor Zhbanov
crashes, where mail will be stored.
Use a SAN.
First of all, such SAN must be very reliable itself. Second, it must
support some kind of global locking mechanism, so several nodes can
use lock to protect file from simultaneous writing. Third, Cyrus IMAP
must lock mailboxes, so several instances on different server can work
with one mailbox without conflicts. Whether Cyrus IMAP use locks or
assumes that he is the only one who access mail box, I don't know. Can
it safely access one mailbox from different servers, I don't know too.
A good commercial SAN costs a fortune, but it is very reliable. And did
you want a *cluster* or a group of servers? A cluster should operate as
if it is a single host, and thus avoid the locking issues. Have a look
at Red Hat's clustering product for an example. If our cyrus
installation was going to be clustered, RH cluster with a SAN backend is
what we would do.
Yes, I need the cluster exactly. Have I lots of domains, I could store
mailboxes of each domain on separate server. But I have only one big
domain. So, I need to spread mailboxes on one domain across several
servers. And than I need very clever load-balancer that will send
request to server where particular mailbox is. I don't know such
balancers aware of mail system structure.
The question here is what "one big domain" means exactly. How many
users, what's the expected mailbox size (quota), and what's the usage
pattern? That's what really matters here.
Post by Igor Zhbanov
Another way is to try to use something like GFS, constructed from
several servers. Each server will contain one portion of data and will
have Cyrus IMAP instance. But GFS will be mounted to each server, so
each Cyrus IMAP instance will see whole filesystem. Of course, here we
have same possible lock issues as with NFS. And we have no redundancy,
as I understand.
So, I need cluster for reliability to. I want to have each mailbox at
least on two servers (or to have shared storage upon two-nodes
failover cluster), so I can survive crash of any node (and not only of
disk system, that can be protected by RAID, but CPU crash, memory...)
with minimal recovery time.
You may also search the list archives for some more info on the topic.
There has been a lot of discussion recently.

Simon
Igor Zhbanov
2006-11-17 11:28:32 UTC
Post by Sarah Walters
-----Original Message-----
Igor Zhbanov
Sent: Friday, 17 November 2006 10:32 AM
Subject: Re: Cyrus IMAP and MySQL mailboxes (Building
load-balancing cluster)
Post by Sarah Walters
Post by Igor Zhbanov
Post by a***@morrison-ind.com
Post by Igor Zhbanov
The main problem is to build a shared storage that can survive server
crashes, where mail will be stored.
Use a SAN.
First of all, such SAN must be very reliable itself. Second, it must
support some kind of global locking mechanism, so several nodes can
use lock to protect file from simultaneous writing. Third, Cyrus IMAP
must lock mailboxes, so several instances on different server can work
with one mailbox without conflicts. Whether Cyrus IMAP use locks or
assumes that he is the only one who access mail box, I don't know. Can
it safely access one mailbox from different servers, I don't know too.
A good commercial SAN costs a fortune, but it is very reliable. And did
you want a *cluster* or a group of servers? A cluster should operate as
if it is a single host, and thus avoid the locking issues. Have a look
at Red Hat's clustering product for an example. If our cyrus
installation was going to be clustered, RH cluster with a SAN backend is
what we would do.
Yes, I need the cluster exactly. Have I lots of domains, I could store
mailboxes of each domain on separate server. But I have only one big
domain. So, I need to spread mailboxes on one domain across several
servers. And than I need very clever load-balancer that will send
request to server where particular mailbox is. I don't know such
balancers aware of mail system structure.
I believe that Sun's mail server can do this, but if you lost a
particular server you would lose the mailboxes on that server (or
rather, would have to put up a new server connected to that part of the
SAN storage).
Why don't you look at throwing two beefy boxes at this problem in a
hot-spare configuration? Have a single large box managing the mail and
a heartbeat so that if one goes down the other immediately takes over
its IP and just keeps going? You will lose anything that is actually in
memory, but that shouldn't be an issue as long as you are using a SAN
and immediately committing to disk rather than using a solution like
MySQL. There is no need for load balancing here as far as I can tell,
and what you lose in having to buy a chunkier server you will gain in
reduced power consumption and associated data centre costs.
Yes, I know how a failover cluster works. But what if one (active)
server can't handle such a load? Suppose we plan to have 100 000 users
working actively with mail. I understand that it is possible to use one
monstrous server to take all of the load, but I am interested in a
load-balancing solution on relatively inexpensive servers. And what
about slow anti-virus scanning for 100 000 users' mail? Or should we
use load-balanced front-ends connected to a single SAN and to a
load-balanced anti-virus cluster? :-)

---
Post by Sarah Walters
Post by Sarah Walters
Post by Igor Zhbanov
Post by a***@morrison-ind.com
Post by Igor Zhbanov
The main problem is to build a shared storage that can
survive server
Post by a***@morrison-ind.com
Post by Igor Zhbanov
crashes, where mail will be stored.
Use a SAN.
First of all, such SAN must be very reliable itself. Second, it must
support some kind of global locking mechanism, so several nodes can
use lock to protect file from simultaneous writing. Third, Cyrus IMAP
must lock mailboxes, so several instances on different server can work
with one mailbox without conflicts. Whether Cyrus IMAP use locks or
assumes that he is the only one who access mail box, I don't know. Can
it safely access one mailbox from different servers, I don't know too.
A good commercial SAN costs a fortune, but it is very reliable. And did
you want a *cluster* or a group of servers? A cluster should operate as
if it is a single host, and thus avoid the locking issues. Have a look
at Red Hat's clustering product for an example. If our cyrus
installation was going to be clustered, RH cluster with a SAN backend is
what we would do.
Yes, I need the cluster exactly. Have I lots of domains, I could store
mailboxes of each domain on separate server. But I have only one big
domain. So, I need to spread mailboxes on one domain across several
servers. And than I need very clever load-balancer that will send
request to server where particular mailbox is. I don't know such
balancers aware of mail system structure.
The question here is what is "one big domain" exactly? How many users,
what's the expected mailbox size (quota) and which usage pattern. That's
what really matters here.
By one big domain I mean a network domain. Consider gmail.com. You
cannot do load balancing by domain name.
About 100 000 users with 100 MB mailboxes.
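
For scale, the figures above can be turned into a rough worst-case storage estimate. The sketch below assumes every user fills the quota, uses decimal units (1 TB = 1 000 000 MB), and treats "each mailbox on two servers" as a factor of two; the function name is hypothetical.

```python
USERS = 100_000   # from the post above
QUOTA_MB = 100    # from the post above
COPIES = 2        # assumption: each mailbox is kept on two servers


def worst_case_tb(users: int, quota_mb: int, copies: int = 1) -> float:
    """Upper bound on storage in decimal terabytes if every quota is full."""
    return users * quota_mb * copies / 1_000_000


print(worst_case_tb(USERS, QUOTA_MB))          # 10.0
print(worst_case_tb(USERS, QUOTA_MB, COPIES))  # 20.0
```

So the upper bound is about 10 TB for one copy and about 20 TB with two copies, before any filesystem or index overhead.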
Post by Sarah Walters
Another way is to try to use something like GFS, constructed from
several servers. Each server will contain one portion of data and will
have Cyrus IMAP instance. But GFS will be mounted to each server, so
each Cyrus IMAP instance will see whole filesystem. Of course, here we
have same possible lock issues as with NFS. And we have no redundancy,
as I understand.
So, I need cluster for reliability to. I want to have each mailbox at
least on two servers (or to have shared storage upon two-nodes
failover cluster), so I can survive crash of any node (and not only of
disk system, that can be protected by RAID, but CPU crash, memory...)
with minimal recovery time.
You may also search the list archives for some more info on the topic.
There has been a lot of discussion recently.
Please give me a few keywords to search for. Cyrus + GFS?
Adam Tauno Williams
2006-11-17 12:56:08 UTC
Post by Igor Zhbanov
Yes, I know how failover cluster works. But what if one server
(active) can't process such a load? Suppose, we plan to have 100 000
users working actively with mail. I understand that it is possible to
use one monstrous server to take all of the load, but I am interested
in load-balancing solution on relatively inexpensive servers.
(a) SANs are not that expensive.
(b) SANs are *WAY* *WAY* more reliable than *ANY* storage solution you
can build yourself for the same amount of money. If you really don't
believe that, you need to lay off smoking the good stuff. And (b.1) - if
you have that many users but can't afford a SAN...
(c) Then there is Cyrus replication and there is GFS. There was a long
thread on Cyrus IMAP, HA, & GFS just back in October.
Post by Igor Zhbanov
And what
about slow anti-viruses for 100 000 users' mail? Or to use
load-balanced front-ends connected to single SAN and connected to
anti-virus load-balanced cluster? :-)
It doesn't require a cluster to load balance SMTP; traditional and
well-established technologies will do that for you. Set up multiple
SMTP servers and publish multiple MX records.
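
How multiple MX records spread inbound SMTP load can be sketched as follows: a sending server tries the lowest preference value first and picks randomly among hosts that share a preference. This is an illustrative Python sketch, not how any particular MTA is implemented; the hostnames and the function name are hypothetical.

```python
import random
from itertools import groupby


def delivery_order(mx_records, rng=random):
    """Return hosts in the order a sender would try them.

    mx_records is a list of (preference, hostname) tuples. Lower
    preference values are tried first; hosts with equal preference are
    shuffled, which is what spreads the load between them.
    """
    ordered = []
    by_pref = sorted(mx_records, key=lambda record: record[0])
    for _, group in groupby(by_pref, key=lambda record: record[0]):
        hosts = [host for _, host in group]
        rng.shuffle(hosts)
        ordered.extend(hosts)
    return ordered


# Hypothetical records: two equal-preference servers plus a fallback.
records = [
    (10, "mx1.example.com"),
    (10, "mx2.example.com"),
    (20, "backup.example.com"),
]
print(delivery_order(records))
```

Across many senders, roughly half of the connections land on each of the two preference-10 hosts, and the preference-20 host is only used when both are unreachable.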
Igor Zhbanov
2006-11-17 16:54:10 UTC
Post by Adam Tauno Williams
Post by Igor Zhbanov
Yes, I know how failover cluster works. But what if one server
(active) can't process such a load? Suppose, we plan to have 100 000
users working actively with mail. I understand that it is possible to
use one monstrous server to take all of the load, but I am interested
in load-balancing solution on relatively inexpensive servers.
(a) SANs are not that expensive.
(b) SANs are *WAY* *WAY* more reliable than *ANY* storage solution you
can build yourself for the same amount of money. If you really don't
believe that you need to lay off smoking the good stuff. And (b.1) - if
you have that many users but can't afford a SAN...
(c) Then there is Cyrus replication and there is GFS. There was long
thread on Cyrus IMAP, HA, & GFS just back in October.
Post by Igor Zhbanov
And what
about slow anti-viruses for 100 000 users' mail? Or to use
load-balanced front-ends connected to single SAN and connected to
anti-virus load-balanced cluster? :-)
It doesn't require a cluster to load balance SMTP, traditional and well
established technologies will do that for you. Setup multiple SMTP
servers and publish multiple MX records.
Yes, I can use DNS-based load balancing to spread the load across
several frontends. But what about the backends? :-) How do I balance
the load for them?
Andrew Morgan
2006-11-17 21:24:49 UTC
Post by Igor Zhbanov
Post by Adam Tauno Williams
Post by Igor Zhbanov
Yes, I know how failover cluster works. But what if one server
(active) can't process such a load? Suppose, we plan to have 100 000
users working actively with mail. I understand that it is possible to
use one monstrous server to take all of the load, but I am interested
in load-balancing solution on relatively inexpensive servers.
(a) SANs are not that expensive.
(b) SANs are *WAY* *WAY* more reliable than *ANY* storage solution you
can build yourself for the same amount of money. If you really don't
believe that you need to lay off smoking the good stuff. And (b.1) - if
you have that many users but can't afford a SAN...
(c) Then there is Cyrus replication and there is GFS. There was long
thread on Cyrus IMAP, HA, & GFS just back in October.
Post by Igor Zhbanov
And what
about slow anti-viruses for 100 000 users' mail? Or to use
load-balanced front-ends connected to single SAN and connected to
anti-virus load-balanced cluster? :-)
It doesn't require a cluster to load balance SMTP, traditional and well
established technologies will do that for you. Setup multiple SMTP
servers and publish multiple MX records.
Yes, I can use DNS-based load-balancing to spread load to several
frontends. But what about backends? :-) How to balance load for them?
There are basically 3 known solutions to building a scalable Cyrus system:

1. Cyrus murder + Cyrus replication. Cyrus murder distributes mailboxes
across multiple backend servers. The murder backend servers could use
local storage, or be connected to a SAN with a non-shared file system. The
murder frontends (you can run as many frontends as you want) accept
incoming IMAP and LMTP connections and route them to the correct murder
backend server. You could use DNS round-robin to load balance connections
between the murder frontends, or you could use something more
sophisticated like LVS or a hardware-based network load balancer. Use
Cyrus replication to keep a backup copy of each murder backend.

2. Cyrus replication + perdition/nginx. Manually distribute your
mailboxes between multiple Cyrus servers (in a non-murder configuration).
Use Perdition or nginx to route incoming IMAP connections to the correct
server. Use Cyrus replication to keep a backup copy of each Cyrus
server.

3. Cyrus + SAN + clustering. Use multiple servers in a cluster, connected
to a SAN. Several different people have attempted this according to
recent mailing list postings here. The only successful cluster I'm aware
of was a Tru64 cluster.

Andy
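
The routing step shared by options 1 and 2 (a frontend or proxy looks up which backend holds a mailbox and sends the connection there) can be sketched like this. It is a simplified illustration, not the actual murder or Perdition lookup mechanism; the class, the server names and the automatic fallback to a replica are all assumptions made for the example.

```python
class MailboxRouter:
    """Toy frontend lookup: user -> backend, with a replica as fallback."""

    def __init__(self, locations, replicas=None):
        self.locations = dict(locations)      # user -> primary backend
        self.replicas = dict(replicas or {})  # primary backend -> replica
        self.down = set()                     # backends known to be dead

    def mark_down(self, backend):
        self.down.add(backend)

    def backend_for(self, user):
        primary = self.locations[user]
        if primary not in self.down:
            return primary
        replica = self.replicas.get(primary)
        if replica is None or replica in self.down:
            raise LookupError(f"no live backend for {user}")
        return replica


# Hypothetical layout: two backends, one of which has a replica.
router = MailboxRouter(
    {"igor": "backend1", "rudy": "backend2"},
    {"backend1": "replica1"},
)
print(router.backend_for("igor"))  # backend1
router.mark_down("backend1")
print(router.backend_for("igor"))  # replica1
```

Because every mailbox lives on exactly one active backend at a time, the backends never write to the same mailbox concurrently, which sidesteps the shared-storage locking questions raised earlier in the thread.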
Marcelo Maraboli
2006-11-20 11:34:07 UTC
Andrew

thanks for the "scalable" Cyrus solutions, but I'm wondering
what can be done for availability?

What if an IMAP server dies (we had this happen)?

We have a Solaris server with a RAID5 disk array storing the
MBOX, but the server died... so the downtime was rather long.

I want to build a 100% available IMAP solution... is there any?

regards,
Post by Andrew Morgan
Post by Igor Zhbanov
Post by Adam Tauno Williams
Post by Igor Zhbanov
Yes, I know how failover cluster works. But what if one server
(active) can't process such a load? Suppose, we plan to have 100 000
users working actively with mail. I understand that it is possible to
use one monstrous server to take all of the load, but I am interested
in load-balancing solution on relatively inexpensive servers.
(a) SANs are not that expensive.
(b) SANs are *WAY* *WAY* more reliable than *ANY* storage solution you
can build yourself for the same amount of money. If you really don't
believe that you need to lay off smoking the good stuff. And (b.1) - if
you have that many users but can't afford a SAN...
(c) Then there is Cyrus replication and there is GFS. There was long
thread on Cyrus IMAP, HA, & GFS just back in October.
Post by Igor Zhbanov
And what
about slow anti-viruses for 100 000 users' mail? Or to use
load-balanced front-ends connected to single SAN and connected to
anti-virus load-balanced cluster? :-)
It doesn't require a cluster to load balance SMTP, traditional and well
established technologies will do that for you. Setup multiple SMTP
servers and publish multiple MX records.
Yes, I can use DNS-based load-balancing to spread load to several
frontends. But what about backends? :-) How to balance load for them?
1. Cyrus murder + Cyrus replication. Cyrus murder distributes mailboxes
across multiple backend servers. The murder backend servers could use
local storage, or be connected to a SAN, non-shared file system. The
murder frontends (you can run as many frontends as you want) accept
incoming IMAP and LMTP connections and route them to the correct murder
backend server. You could use DNS round-robin to load balance
connections between the murder frontends, or you could use something
more sophisticated like LVS or a hardware-based network load balancer.
Use Cyrus replication to keep a backup copy of each murder backend.
2. Cyrus replication + perdition/nginx. Manually distribute your
mailboxes between multiple Cyrus servers (in a non-murder
configuration). Use Perdition or nginx to route incoming IMAP
connections to the correct server. Use Cyrus replication to keep a
backup copy of each murder backend.
3. Cyrus + SAN + clustering. Use multiple servers in a cluster,
connected to a SAN. Several different people have attempted this
according to recent mailing list postings here. The only successful
cluster I'm aware of was a Tru64 cluster.
Andy
----
Cyrus Home Page: http://cyrusimap.web.cmu.edu/
Cyrus Wiki/FAQ: http://cyrusimap.web.cmu.edu/twiki
List Archives/Info: http://asg.web.cmu.edu/cyrus/mailing-list.html
--
Marcelo Maraboli Rosselott
Jefe Area de Redes y Comunicaciones (Network & UNIX Systems Engineer)
Ingeniero Civil Electronico, CISSP (Electronic Engineer, CISSP)

Direccion Central de Servicios Computacionales (DCSC)
Universidad Tecnica Federico Santa Maria phone: +56 32 2654071
Chile. http://www.usm.cl http://elqui.dcsc.utfsm.cl
Chris St. Pierre
2006-11-20 16:48:24 UTC
Permalink
Post by Marcelo Maraboli
Andrew
thanks for the "scalable" Cyrus solutions, but I'm wondering
what can be done for availability?
What if an IMAP server dies (we had this happen)?
We have a Solaris server with a RAID5 disk array storing the
MBOX, but the server died, so the downtime was considerable.
I want to build a 100% available IMAP solution. Is there any?
Andrew's options 1 and 3 both ensure very high availability (with a
failover mupdate server in the murder). We use a SAN for mail
storage, and then have two IMAP servers that do failover, so if one
dies the other picks up. This is probably the simplest high
availability solution. Add a second failover SAN (or use a clustering
SAN solution with multiple, failsafe nodes) and you'll have extremely
high uptime. Clients that use persistent IMAP connections will have
problems, but those are few and far between.

Even with all that, 100% availability might be setting your sights a
little high. :)

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University
Robert Banz
2006-11-20 17:15:50 UTC
Permalink
Post by Marcelo Maraboli
Andrew
thanks for the "scalable" Cyrus solutions, but I'm wondering
what can be done for availability?
What if an IMAP server dies (we had this happen)?
We have a Solaris server with a RAID5 disk array storing the
MBOX, but the server died, so the downtime was considerable.
I want to build a 100% available IMAP solution. Is there any?
Even with a couple dumptrucks full of money, nobody could pull off 100%.

...but you can get really close.

Consider housing your storage on *two* individual storage devices,
potentially in different buildings. Mirror between them at the host
level (cheap) or through multiple redundant storage virtualizers
(pricey).

Set up an active/passive cluster. Relatively "easy" to do nowadays.
Or go the clustered FS route (but then you'd probably need the
storage virtualizers).

Basically, for each '9' after 99% uptime, expect to double your cost
and complexity of implementation.

-rob
Marcelo Maraboli
2006-11-21 12:12:21 UTC
Permalink
Hi.

thanks for the input. I know 100% is only achievable
with a gooooogle-sized amount of money ;), but I am looking
for a Cyrus IMAP server solution similar to a load-balancing
web server farm, i.e.:

- a load-balancing server (PEN on FreeBSD if you like) that
will direct an IMAP session to ANY of a group of IMAP servers,
all of which have access to a central storage of user MBOXs.

So if any of the IMAP (backend) servers dies, the load balancer will
automatically stop forwarding new requests to that server
and users won't notice any downtime.

this is different from Andrew's solution number 1, since ANY of
the backend IMAP servers should accept connections for ANY user.

examples:
http://siag.nu/pen/vrrpd-linux.shtml
http://redundancy.org/fbsd_lb.html

can IMAP be set up this way ??
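
The "stop forwarding to a dead server" part of this idea can be
sketched independently of the storage question. The following is only
an illustration of a health probe such a balancer might run, assuming
the backends greet with the usual untagged "* OK" line on port 143; the
function names are hypothetical, and it says nothing about whether the
backends can safely share one mail store.

```python
import socket


def imap_is_up(host, port=143, timeout=2.0):
    """Return True if host answers on the IMAP port with an untagged OK greeting."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            greeting = conn.recv(128)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    return greeting.startswith(b"* OK")


def live_backends(backends, probe=imap_is_up):
    """Keep only the backends that currently pass the health probe."""
    return [host for host in backends if probe(host)]
```

A balancer would call live_backends() periodically and hand new
sessions only to the hosts it returns. The probe parameter exists so the
selection logic can be exercised without real servers.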

regards,
--
Marcelo Maraboli Rosselott
Jefe Area de Redes y Comunicaciones (Network & UNIX Systems Engineer)
Ingeniero Civil Electronico, CISSP (Electronic Engineer, CISSP)

Direccion Central de Servicios Computacionales (DCSC)
Universidad Tecnica Federico Santa Maria phone: +56 32 2654071
Chile. http://www.usm.cl http://elqui.dcsc.utfsm.cl
Bron Gondwana
2006-11-17 06:50:54 UTC
Permalink
Post by Igor Zhbanov
Yes, I need exactly a cluster. If I had lots of domains, I could store
the mailboxes of each domain on a separate server. But I have only one big
domain. So, I need to spread the mailboxes of one domain across several
servers. And then I need a very clever load-balancer that will send each
request to the server where the particular mailbox is. I don't know of any
balancers aware of mail system structure.
We use nginx with an authentication daemon which returns the backend
server name along with the authentication response. You can also use
perdition if you're willing to put up with the scalability hit of
process-per-connection. Similarly there's an authentication daemon that
speaks saslauthd protocol to Cyrus on the backend.
Post by Igor Zhbanov
So, I need a cluster for reliability too. I want to have each mailbox on at
least two servers (or to have shared storage behind a two-node
failover cluster), so I can survive the crash of any node (and not only of
the disk system, which can be protected by RAID, but a CPU crash, memory...)
with minimal recovery time.
That's what cyrus replication in 2.3.3+ is for, and why we're using that
in addition to the nginx frontends spreading the load out. This works
very nicely with having about 6 times as many "hosts" as real machines,
because otherwise you tend to run out of TCP port pairs on the frontend
eventually.
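
For readers unfamiliar with the nginx side of this: its mail proxy asks
an HTTP authentication service where to send each login, and the reply
headers name the backend. The sketch below is not our daemon, just a
minimal illustration of the reply logic. The header names (Auth-User,
Auth-Pass, Auth-Status, Auth-Server, Auth-Port, Auth-Wait) are those of
nginx's auth_http protocol; the placement table, the addresses and
check_password() are hypothetical placeholders.

```python
# Hypothetical mailbox placement table: user -> (backend IP, IMAP port).
BACKENDS = {
    "alice": ("10.0.0.11", 143),
    "bob": ("10.0.0.12", 143),
}


def check_password(user, password):
    """Placeholder credential check; a real daemon would ask LDAP, SQL, etc."""
    return bool(user) and bool(password)


def auth_reply(request_headers):
    """Build the response headers for one nginx auth_http request."""
    user = request_headers.get("Auth-User", "")
    password = request_headers.get("Auth-Pass", "")
    backend = BACKENDS.get(user)
    if backend is None or not check_password(user, password):
        # Any status other than OK is passed to the client as an error;
        # Auth-Wait asks nginx to delay before allowing another attempt.
        return {"Auth-Status": "Invalid login or password", "Auth-Wait": "3"}
    host, port = backend
    return {"Auth-Status": "OK", "Auth-Server": host, "Auth-Port": str(port)}
```

Wrapped in any small HTTP server, auth_reply() is all the "clever
load-balancer" needs: the mailbox-to-server knowledge lives in the
lookup table, not in the proxy.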

Bron.
Sarah Walters
2006-11-17 01:54:39 UTC
Permalink
-----Original Message-----
Igor Zhbanov
Sent: Friday, 17 November 2006 10:32 AM
Subject: Re: Cyrus IMAP and MySQL mailboxes (Building
load-balancing cluster)
Post by Sarah Walters
Post by Igor Zhbanov
Post by a***@morrison-ind.com
Post by Igor Zhbanov
The main problem is to build a shared storage that can survive server
crashes, where mail will be stored.
Use a SAN.
First of all, such SAN must be very reliable itself. Second, it must
support some kind of global locking mechanism, so several nodes can
use lock to protect file from simultaneous writing. Third, Cyrus IMAP
must lock mailboxes, so several instances on different server can work
with one mailbox without conflicts. Whether Cyrus IMAP use locks or
assumes that he is the only one who access mail box, I don't know. Can
it safely access one mailbox from different servers, I don't know too.
A good commercial SAN costs a fortune, but it is very reliable. And did
you want a *cluster* or a group of servers? A cluster should operate as
if it is a single host, and thus avoid the locking issues. Have a look
at Red Hat's clustering product for an example. If our cyrus
installation was going to be clustered, RH cluster with a SAN backend is
what we would do.
Yes, I need exactly a cluster. If I had lots of domains, I could store
the mailboxes of each domain on a separate server. But I have only one big
domain. So, I need to spread the mailboxes of one domain across several
servers. And then I need a very clever load-balancer that will send each
request to the server where the particular mailbox is. I don't know of any
balancers aware of mail system structure.
I believe that Sun's mail server can do this, but if you lost a particular
server you would lose the mailboxes on that server (or rather, would have
to put up a new server connected to that part of the SAN storage).

Why don't you look at throwing two beefy boxes at this problem in a hot-spare
configuration? Have a single large box managing the mail and a heartbeat so
that if one goes down the other immediately takes over its IP and just keeps
going? You will lose anything that is actually in memory, but that shouldn't
be an issue as long as you are using a SAN and immediately committing to disk
rather than using a solution like MySQL. There is no need for load balancing
here as far as I can tell, and what you lose in having to buy a chunkier server
you will gain in reduced power consumption and associated data centre costs.

Regards,
Sarah Walters
Bron Gondwana
2006-11-17 06:54:34 UTC
Permalink
Post by Sarah Walters
Why don't you look at throwing two beefy boxes at this problem in a hot-spare
configuration? Have a single large box managing the mail and a heartbeat so
that if one goes down the other immediately takes over its IP and just keeps
going? You will lose anything that is actually in memory, but that shouldn't
be an issue as long as you are using a SAN and immediately committing to disk
rather than using a solution like MySQL. There is no need for load balancing
here as far as I can tell, and what you lose in having to buy a chunkier server
you will gain in reduced power consumption and associated data centre costs.
That doesn't scale forever unfortunately, though you can get pretty
beefy, eventually you need to scale sideways. Once you're scaling to
multiple machines you need a proxying frontend of some sort (murder or
our perdition/nginx solution, take your pick) and this all becomes a lot
easier.

Also it means you don't have hot spare hardware (or cold spare hardware)
sitting in your datacentre doing nothing if you're willing to take the
time to make multiple cyrus instances work on the same machine. Then
you can have both masters and replicas on the same host, and just switch
one up to be a master when the other dies.
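
As a rough illustration of that layout (a sketch only; the store names,
the ring-style pairing and both function names are assumptions made for
this example, not our actual tooling): every host runs one master
instance plus the replica of a neighbour's store, so no machine sits
idle and any single host can die.

```python
def place_instances(hosts):
    """Pair each host's master instance with a replica on the next host."""
    if len(hosts) < 2:
        raise ValueError("need at least two hosts to replicate between")
    layout = {}
    for i, host in enumerate(hosts):
        replica_host = hosts[(i + 1) % len(hosts)]  # wrap around at the end
        layout[f"store{i + 1}"] = {"master": host, "replica": replica_host}
    return layout


def fail_over(layout, dead_host):
    """Promote the replica of every store whose master lived on dead_host."""
    for roles in layout.values():
        if roles["master"] == dead_host:
            # The surviving replica becomes the master; no replica is left.
            roles["master"], roles["replica"] = roles["replica"], None
        elif roles["replica"] == dead_host:
            roles["replica"] = None
    return layout
```

After fail_over() the affected stores run without a replica until a
replacement is set up, so the sketch also shows where redundancy is
temporarily lost.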

Bron.
Igor Zhbanov
2006-11-17 12:46:19 UTC
Permalink
Post by Bron Gondwana
That doesn't scale forever unfortunately, though you can get pretty
beefy, eventually you need to scale sideways. Once you're scaling to
multiple machines you need a proxying frontend of some sort (murder or
our perdition/nginx solution, take your pick) and this all becomes a lot
easier.
Also it means you don't have hot spare hardware (or cold spare hardware)
sitting in your datacentre doing nothing if you're willing to take the
time to make multiple cyrus instances work on the same machine. Then
you can have both masters and replicas on the same host, and just switch
one up to be a master when the other dies.
I have thought about the LVS (Linux Virtual Server) load-balancer. As I
understand it, with some kind of shared storage I can build a system
without spare servers. All frontends will be equal to each other, and
all of them will be loaded equally by the load-balancer.
Sarah Walters
2006-11-23 00:10:13 UTC
Permalink
Marcelo et al,
-----Original Message-----
Marcelo Maraboli
thanks for the input. I know 100% is only achievable
with a gooooogle-sized amount of money ;), but I am looking
for a Cyrus IMAP server solution similar to a load-balancing
web server farm, i.e.:
- a load-balancing server (PEN on FreeBSD if you like) that
will direct an IMAP session to ANY of a group of IMAP servers,
all of which have access to a central storage of user MBOXs.
So if any of the IMAP (backend) servers dies, the load balancer will
automatically stop forwarding new requests to that server
and users won't notice any downtime.
this is different from Andrew's solution number 1, since ANY of
the backend IMAP servers should accept connections for ANY user.
examples:
http://siag.nu/pen/vrrpd-linux.shtml
http://redundancy.org/fbsd_lb.html
can IMAP be set up this way ??
regards,
This need is why I suggested beefy servers rather than the Murder, which I don't consider sufficiently highly available due to actually being a number of discrete servers at the back end. Great for load balancing, useless for instant failover in case of server loss.

In short, as I understand it Cyrus cannot be set up this way. Only a single machine can have write privileges to the mailboxes database at a time. The only way I can see to do this is to use NFSv4 which is supposed to get the locking correct. Then, assuming the database is closed between changes (can a developer please confirm whether it is kept open by master or not?) you should be able to run multiple IMAP servers over the same filesystem stored on a NAS (network-attached storage, as opposed to SAN). That is the only way I can think of to do what you are after. You would need two NAS boxes, ideally in separate buildings, with live mirroring (10 Gb fibre or copper connection between) and a bunch of cheap servers in each building all load-balanced. You should be able to lose a complete data centre and just keep running at 50% capacity as long as your network is properly routed (with redundancy in case of an idiot with a spade cutting through your fibre of course).

It's expensive, but it should work if the database is not held open. If it is, then you need to look at a different email product. Cyrus is a great server, but if you need five 9s reliability then you have to pay for it. You could always look at an appliance - dedicated hardware is often more reliable and at least if it goes down you can scream at the vendor and cover your butt that way.

Regards,
Sarah Walters
Marcelo Maraboli
2006-11-23 01:48:26 UTC
Permalink
Dear Sarah

thank you for your thorough answer !

maybe we can wait and see if Cyrus 2.3.7 and mupdate
can do the job along with FreeBSD+PEN+VRRPD. I'll test it.

best regards,
--
Marcelo Maraboli Rosselott
Jefe Area de Redes y Comunicaciones (Network & UNIX Systems Engineer)
Ingeniero Civil Electronico, CISSP (Electronic Engineer, CISSP)

Direccion Central de Servicios Computacionales (DCSC)
Universidad Tecnica Federico Santa Maria phone: +56 32 2654071
Chile. http://www.usm.cl http://elqui.dcsc.utfsm.cl
Wesley Craig
2006-11-23 03:17:23 UTC
Permalink
Post by Sarah Walters
Only a single machine can have write privileges to the mailboxes
database at a time.
Actually, only a single process can be writing to a mailboxes
database at a time.
Post by Sarah Walters
Then, assuming the database is closed between changes (can a
developer please confirm whether it is kept open by master or not?)
you should be able to run multiple IMAP servers over the same
filesystem stored on a NAS (network-attached storage, as opposed to
SAN).
I'm not sure why closing the database between changes might be
important. Multiple imapd processes on backends all access the same
mailboxes database. Writes are serialized with locks. If a network
filesystem implements these locks correctly, there's no reason why
you couldn't run multiple "backends" against a single redundant
filesystem. There are a number of examples of people doing this.
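
To make "serialized with locks" concrete, here is a minimal sketch of
the general pattern using POSIX advisory locks through Python's fcntl
module. It is not Cyrus code: the file name and record format are made
up, and whether the lock is honoured across machines depends entirely
on the network filesystem, which is the point under discussion.

```python
import fcntl
import os


def append_record(path, record):
    """Append one line to path while holding an exclusive advisory lock."""
    with open(path, "a") as handle:
        fcntl.lockf(handle, fcntl.LOCK_EX)  # blocks until other writers release
        try:
            handle.write(record + "\n")
            handle.flush()
            os.fsync(handle.fileno())  # push the write out to the filesystem
        finally:
            fcntl.lockf(handle, fcntl.LOCK_UN)
```

Every writer that goes through append_record() waits its turn, so two
processes never interleave a write. A process that ignores the lock is
not stopped, which is why all writers have to cooperate.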

:wes
Simon Matter
2006-11-23 07:42:16 UTC
Permalink
Post by Sarah Walters
This need is why I suggested beefy servers rather than the Murder, which I
don't consider sufficiently highly available due to actually being a
number of discrete servers at the back end. Great for load balancing,
useless for instant failover in case of server loss.
In short, as I understand it Cyrus cannot be set up this way. Only a
single machine can have write privileges to the mailboxes database at a
time. The only way I can see to do this is to use NFSv4 which is supposed
to get the locking correct. Then, assuming the database is closed between
changes (can a developer please confirm whether it is kept open by master
or not?) you should be able to run multiple IMAP servers over the same
filesystem stored on a NAS (network-attached storage, as opposed to SAN).
That is the only way I can think of to do what you are after. You would
need two NAS boxes, ideally in separate buildings, with live mirroring (10
Gb fibre or copper connection between) and a bunch of cheap servers in
each building all load-balanced. You should be able to lose a complete
data centre and just keep running at 50% capacity as long as your network
is properly routed (with redundancy in case of an idiot with a spade
cutting through your fibre of course).
It's expensive, but it should work if the database is not held open. If it
is, then you need to look at a different email product. Cyrus is a great
server, but if you need five 9s reliability then you have to pay for it.
You could always look at an appliance - dedicated hardware is often more
reliable and at least if it goes down you can scream at the vendor and
cover your butt that way.
Hi Sarah,

I'm really confused now.

1) You are talking about NAS as a possible solution and I don't know how
that should work if NFS doesn't work. Until now I thought a NAS device is
an embedded fileserver which can be accessed using different network
filesystems like SMB/CIFS, NFS or whatever. As long as it doesn't provide
proper locking (which may only be true if it provides NFSv4), it will
never work as a shared storage among more than one cyrus-imapd server.

2) Several people on this list have confirmed that they are running
cyrus-imapd clusters on shared storage (SAN) which works fine with a
cluster filesystem. That tells me that shared access to cyrus databases
works fine as long as the filesystem used provides proper locking, which
means in case of a cluster that the cluster filesystem has to coordinate
locking among all cluster members. Isn't that the main reason why those
filesystems exist?

3) I don't think the reliability of an appliance hardware is any better
than good server and SAN components. In fact most appliances I saw are
simply relabeled DELL or Intel OEM hardware. What you say may be true for
the installed software but I don't think it's true for the hardware.

The whole high availability thing has been discussed on this list so many
times and I'm a bit confused again. It's a really interesting topic.

Simon
Janne Peltonen
2006-11-23 08:30:49 UTC
Permalink
Hi all,

just to comment on one isolated point:

On Thu, Nov 23, 2006 at 08:42:16AM +0100, Simon Matter wrote:
[...]
Post by Simon Matter
2) Several people on this list have confirmed that they are running
cyrus-imapd clusters on shared storage (SAN) which works fine with a
cluster filesystem. That tells me that shared access to cyrus databases
works fine as long as the filesystem used provides proper locking, which
means in case of a cluster that the cluster filesystem has to coordinate
locking among all cluster members. Isn't that the main reason why those
filesystems exist?
[...]

You're completely correct, this is the kind of system I'm building and
others have had in production use (see the thread on clusters, GFS and
HA, for instance - begins at
http://www.mail-archive.com/info-***@lists.andrew.cmu.edu/msg30675.html
(couldn't open the official info-cyrus archive)). So you /can/ run Cyrus
on multiple servers accessing the same mailboxes database. Just don't
use BDB. Don't even compile it in.


--Janne Peltonen
IMAP admin
Univ. of Helsinki
Amos
2006-11-23 15:50:23 UTC
Permalink
Post by Janne Peltonen
You're completely correct, this is the kind of system I'm building and
others have had in production use (see the thread on clusters, GFS and
HA, for instance - begins at
(couldn't open the official info-cyrus archive)). So you /can/ run Cyrus
on multiple servers accessing the same mailboxes database. Just don't
use BDB. Don't even compile it in.
It's a shame that apparently the only true HA/LB cluster appears to
have been done with Veritas cluster. Actually, we're hoping to get
away from VxFS/VxVM and use ZFS. Not that VxFS/VxVM are bad products,
but they can be a bear to manage.

Amos
Sarah Walters
2006-11-23 03:58:38 UTC
Permalink
Wesley et al,
-----Original Message-----
Sent: Thursday, 23 November 2006 1:17 PM
To: Sarah Walters
Subject: Re: Cyrus IMAP and MySQL mailboxes (Building
load-balancing cluster)
Post by Sarah Walters
Only a single machine can have write privileges to the mailboxes
database at a time.
Actually, only a single process can be writing to a mailboxes
database at a time.
I think you misinterpreted me. I said write "privileges". All the
processes have that privilege, they just respect the locks.
Post by Sarah Walters
Then, assuming the database is closed between changes (can a
developer please confirm whether it is kept open by master or not?)
you should be able to run multiple IMAP servers over the same
filesystem stored on a NAS (network-attached storage, as opposed to
SAN).
I'm not sure why closing the database between changes might be
important. Multiple imapd processes on backends all access the same
mailboxes database. Writes are serialized with locks. If a network
filesystem implements these locks correctly, there's no reason why
you couldn't run multiple "backends" against a single redundant
filesystem. There are a number of examples of people doing this.
Because database locks are an operating system or program-dependent
feature and don't operate over NFS. It's built into the BDB library.
Filesystem locks are on *files*. Obviously it would be possible to
do exclusive write locks over NFS as required and otherwise just
have the file open. The issue then becomes one of operating system
caching. You would need to label the filesystem as volatile so that
there is no local caching of the disk, or you would risk database
changes made on one host being overwritten when the next host writes
to the database. You are also at risk of not seeing changes as they
are made. I haven't looked into this lately, so it may be the case
that no caching is the default for NFSv4. Closing and re-opening
the file is a reliable way to ensure that the file is re-read in its
entirety. Of course, there's a performance hit there. That is an
argument for a database server that can do entry-by-entry locking
of the mailboxes database as requested by networked clients. Which
takes us right back to the "MySQL" angle on this conversation! While
I wouldn't want to see the actual mail stored in MySQL, you could
make an argument for storing the mailboxes database in that way.
There are lighter-weight solutions that would carry all the needed
functionality so you wouldn't need MySQL. Of course, then you need
a cluster for the database server too!

This is an interesting discussion, I'm keen for comments on the
above.

Regards,
Sarah Walters
Amos
2006-11-23 04:55:22 UTC
Permalink
Post by Sarah Walters
takes us right back to the "MySQL" angle on this conversation! While
I wouldn't want to see the actual mail stored in MySQL, you could
make an argument for storing the mailboxes database in that way.
I seem to recall that's what they do with Zimbra. Not only something
like the mailboxes.db in MySQL, but also the Cyrus cache files within
each folder, and the squat index. I believe they now also leverage
this for fast cross-folder searching. However, like Cyrus, the
messages remain as individual files in the filesystem.
Post by Sarah Walters
There are lighter-weight solutions that would carry all the needed
functionality so you wouldn't need MySQL. Of course, then you need
a cluster for the database server too!
I believe Berkeley DB now supports clustering, but I suspect that would
cause some to shudder at the thought of such a thing. ;-)

We also have had good experience with a PostgreSQL cluster that we use
in conjunction with our RADIUS servers.

Amos
Sarah Walters
2006-11-24 01:48:59 UTC
Permalink
All,
-----Original Message-----
[...]
Post by Simon Matter
2) Several people on this list have confirmed that they are running
cyrus-imapd clusters on shared storage (SAN) which works fine with a
cluster filesystem. That tells me that shared access to cyrus databases
works fine as long as the filesystem used provides proper locking, which
means in case of a cluster that the cluster filesystem has to coordinate
locking among all cluster members. Isn't that the main reason why those
filesystems exist?
[...]
You're completely correct, this is the kind of system I'm building and
others have had in production use (see the thread on clusters, GFS and
HA, for instance - begins at
g30675.html
(couldn't open the official info-cyrus archive)). So you /can/ run Cyrus
on multiple servers accessing the same mailboxes database. Just don't
use BDB. Don't even compile it in.
Ah, didn't think of that. Yes, that would work just fine.

Regards,
Sarah Walters
Sarah Walters
2006-11-24 02:00:02 UTC
Permalink
-----Original Message-----
Hi Sarah,
I'm really confused now.
1) You are talking about NAS as a possible solution and I don't know how
that should work if NFS doesn't work. Until now I thought a NAS device is
an embedded fileserver which can be accessed using different network
filesystems like SMB/CIFS, NFS or whatever. As long as it doesn't provide
proper locking (which may only be true if it provides NFSv4), it will
never work as a shared storage among more than one cyrus-imapd server.
As I said in an earlier post, NFSv4 only. And I haven't tested this, I
am just saying that v4 is *supposed* to sort out the locking issues. I
used to work in an NAS environment, am in a SAN environment now (though
Cyrus is on local disk for budgetary reasons I won't go into here). All
of the above is correct.
2) Several people on this list have confirmed that they are running
cyrus-imapd clusters on shared storage (SAN) which works fine with a
cluster filesystem. That tells me that shared access to cyrus databases
works fine as long as the filesystem used provides proper locking, which
means in case of a cluster that the cluster filesystem has to coordinate
locking among all cluster members. Isn't that the main reason why those
filesystems exist?
If it works, great. I haven't worked with this kind of cluster before,
so I don't have any experience with it. What I said is that *database*
locking is in the OS, not the filesystem, and as such the clustering
software wouldn't actually work. As another post (by Janne) pointed
out, if you avoid BDB then this isn't an issue because you would be
using filesystem locking.
3) I don't think the reliability of an appliance hardware is any better
than good server and SAN components. In fact most appliances I saw are
simply relabeled DELL or Intel OEM hardware. What you say may be true for
the installed software but I don't think it's true for the hardware.
I have used an excellent appliance in a prior workplace, but as I
haven't been there for a couple of years now I don't have recent data
and as such won't comment on the brand. But as it is known hardware,
you can have a much leaner operating system with all components
properly tested in a professional laboratory to ensure that they
work well together. Every patch is tested for impact on the entire
system. The appliance I had in mind is still Cyrus at its guts I
believe. It was merely another option to consider. The nice thing
about appliances is that management types can't hijack the spare
CPU cycles for their "project of the week" :)

Regards,
Sarah Walters
Simon Matter
2006-11-24 07:08:05 UTC
Permalink
Post by Sarah Walters
Post by Simon Matter
2) Several people on this list have confirmed that they are running
cyrus-imapd clusters on shared storage (SAN) which works fine with a
cluster filesystem. That tells me that shared access to cyrus databases
works fine as long as the filesystem used provides proper locking, which
means in case of a cluster that the cluster filesystem has to coordinate
locking among all cluster members. Isn't that the main reason why those
filesystems exist?
If it works, great. I haven't worked with this kind of cluster before,
so I don't have any experience with it. What I said is that *database*
locking is in the OS, not the filesystem, and as such the clustering
software wouldn't actually work. As another post (by Janne) pointed
out, if you avoid BDB then this isn't an issue because you would be
using filesystem locking.
While I agree that avoiding berkeley DB is a good thing, are you really
sure that the cluster won't work with it? What is really different between
using berkeley or skiplist DB for the cluster? I'm also running without
BDB but I want to understand how berkeley databases are different to
skiplist databases in a cluster situation.

Regards,
Simon
Igor Zhbanov
2006-11-25 20:56:20 UTC
Permalink
So, what is the best way to build load-balancing Cyrus IMAP cluster?
Nginx, perdition, Cyrus IMAP Aggregator, Cyrus IMAP murder, Cyrus IMAP
replication?

And can you tell me what problems to high-availability can happen when
using Cyrus IMAP replication? In what situations it is unreliable? And
is it possible to avoid it?

Thanks to all!
Janne Peltonen
2006-11-27 08:27:30 UTC
Permalink
Post by Igor Zhbanov
So, what is the best way to build load-balancing Cyrus IMAP cluster?
Nginx, perdition, Cyrus IMAP Aggregator, Cyrus IMAP murder, Cyrus IMAP
replication?
You forgot the simplest one: Cyrus IMAP on a cluster with no
replication, no murder, no nothing, but on a (really working) clustered
FS and no BDB. Oh yeah, you still have to have some way to make the
system appear to be a single system to the users. I'm probably going to
use just a simple round-robin DNS, but you could use an LVS frontend or
something similar, if you want real load-balancing.

It really all depends upon what you need. I prefer the aforementioned
solution because I think it is very simple at the application level (if
not the FS level).
Post by Igor Zhbanov
And can you tell me what problems to high-availability can happen when
using Cyrus IMAP replication? In what situations it is unreliable? And
is it possible to avoid it?
According to the Cambridge deployment doc, you can possibly lose a
couple of messages that arrive to the master that fails before the
messages get replicated. The replication is asynchronous. And the
problem can be avoided e.g. by setting the sendmail (or equiv.) to store
the delivered messages for a while (on another machine than the master).
So if the master fails, the queue can be replayed... see

http://www-uxsup.csx.cam.ac.uk/~dpc22/cyrus/replication.html


--Janne Peltonen
Igor Zhbanov
2006-11-28 08:54:55 UTC
Permalink
Post by Janne Peltonen
Post by Igor Zhbanov
So, what is the best way to build load-balancing Cyrus IMAP cluster?
Nginx, perdition, Cyrus IMAP Aggregator, Cyrus IMAP murder, Cyrus IMAP
replication?
You forgot the simplest one: Cyrus IMAP on a cluster with no
replication, no murder, no nothing, but on a (really working) clustered
FS and no BDB. Oh yeah, you still have to have some way to make the
system appear to be a single system to the users. I'm probably going to
use just a simple round-robin DNS, but you could use an LVS frontend or
something similar, if you want real load-balancing.
It really all depends upon what you need. I prefer the aforementioned
solution because I think it is very simple at the application level (if
not the FS level).
I need to build mail system without using uncommon hardware such as
shared storage connected via SCSI or fiber channel. I need
"software-only" solution. Can you recommend me what kind of clustered
filesystem and/or network block devices to use? GFS, DRBD-0.8,
something else? What is better, reliable and more tested?
Simon Matter
2006-11-28 10:01:47 UTC
Permalink
Post by Igor Zhbanov
Post by Janne Peltonen
Post by Igor Zhbanov
So, what is the best way to build load-balancing Cyrus IMAP cluster?
Nginx, perdition, Cyrus IMAP Aggregator, Cyrus IMAP murder, Cyrus IMAP
replication?
You forgot the simplest one: Cyrus IMAP on a cluster with no
replication, no murder, no nothing, but on a (really working) clustered
FS and no BDB. Oh yeah, you still have to have some way to make the
system appear to be a single system to the users. I'm probably going to
use just a simple round-robin DNS, but you could use an LVS frontend or
something similar, if you want real load-balancing.
It really all depends upon what you need. I prefer the aforementioned
solution because I think it is very simple at the application level (if
not the FS level).
I need to build mail system without using uncommon hardware such as
shared storage connected via SCSI or fiber channel. I need
"software-only" solution. Can you recommend me what kind of clustered
filesystem and/or network block devices to use? GFS, DRBD-0.8,
something else? What is better, reliable and more tested?
Gary Mills
2006-11-28 14:01:19 UTC
Permalink
Post by Igor Zhbanov
I need to build mail system without using uncommon hardware such as
shared storage connected via SCSI or fiber channel. I need
"software-only" solution. Can you recommend me what kind of clustered
filesystem and/or network block devices to use? GFS, DRBD-0.8,
something else? What is better, reliable and more tested?
In that case, you should consider iSCSI for shared storage. It has
the advantage that the host connections are ordinary ethernet.
--
-Gary Mills- -Unix Support- -U of M Academic Computing and Networking-
Adam Tauno Williams
2006-11-28 15:13:01 UTC
Permalink
Post by Gary Mills
Post by Igor Zhbanov
I need to build mail system without using uncommon hardware such as
shared storage connected via SCSI or fiber channel. I need
"software-only" solution. Can you recommend me what kind of clustered
filesystem and/or network block devices to use? GFS, DRBD-0.8,
something else? What is better, reliable and more tested?
In that case, you should consider iSCSI for shared storage. It has
the advantage that the host connections are ordinary ethernet.
And you can set up an ordinary Linux box as an iSCSI target.

http://linux-iscsi.sourceforge.net/
http://www.cuddletech.com/articles/iscsi/index.html
http://www.cs.uml.edu/~mbrown/iSCSI/

Of course I really don't understand an attitude of I-really-need-HA but
an unwillingness to buy real SAN hardware. Real SANs are rigorously
tested and optimized to do exactly what they do more so than what you
will be able to accomplish with an ordinary host box.

"What is better, reliable and more tested?"

A real SAN. On the software side GFS is widely used, maybe not for
Cyrus, but it is stable.

Chaskiel M Grundman
2006-11-27 19:10:32 UTC
Permalink
--On Friday, November 24, 2006 08:08:05 AM +0100 Simon Matter
Post by Simon Matter
What is really different between
using berkeley or skiplist DB for the cluster? I'm also running without
BDB but I want to understand how berkeley databases are different to
skiplist databases in a cluster situation.
Berkeley DB does interprocess coordination by mmapping files. Among the
things it can put in the mmapped files are locks (either custom
test-and-set spinlocks or pthread mutexes). It is not possible for these
locks to work correctly across NFS (or even GFS), since changes do not
appear instantaneously on all clients/nodes.

Skiplists are much simpler data structures, as they are only appended to.
As long as file locks and O_APPEND work correctly, skiplists should work
correctly.
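
To illustrate why an append-only file needs nothing beyond file locks
and O_APPEND, here is a rough Python sketch of a toy append-only
key/value log. It is not the Cyrus skiplist format. The record layout
("key<TAB>value" lines) and the function names are assumptions made
up for this example.

```python
import fcntl
import os


def append_record(path, key, value):
    """Append one 'key<TAB>value' line under an exclusive file lock."""
    line = f"{key}\t{value}\n".encode()
    # O_APPEND makes every write land at the current end of file,
    # regardless of where other writers have moved it in the meantime.
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # one writer at a time
        os.write(fd, line)
        os.fsync(fd)
    finally:
        os.close(fd)                     # closing releases the lock


def load(path):
    """Replay the log from the start; later records override earlier ones."""
    table = {}
    with open(path, "rb") as f:
        fcntl.lockf(f, fcntl.LOCK_SH)    # wait for any writer to finish
        for raw in f:
            key, _, value = raw.decode().rstrip("\n").partition("\t")
            table[key] = value
    return table
```

Nothing here relies on memory shared between processes, so the only
things the filesystem has to get right are the lock and the append.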