Wednesday, June 20, 2012

HornetQ high-availability cluster, step by step

Unlike a JBoss cluster, a HornetQ cluster works on shared storage, which can be a SAN shared disk or even an NFS mount, though a SAN disk is the best option. Keep in mind that the two cluster members do not write to the shared disk simultaneously, as Oracle RAC nodes do; they use it to check the health of the master/live server. As soon as the master/live server dies, the slave/backup server replaces it: it moves from the passive started state to the active started state, and all persistent messages in the queues are served from the slave/backup server. This failover is automatic. When the master/live server comes back online, the slave/backup server automatically returns to passive started mode and lets the master/live server take control of the queues and client requests/responses again.
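To make the state changes above concrete, here is a tiny toy model in Java. It is only a sketch of the behaviour described in this post, not HornetQ code: the class SharedStorePair and all of its methods are hypothetical names made up for illustration, and the real servers coordinate through the shared journal directory rather than through method calls.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not part of the HornetQ API.
class SharedStorePair {
    enum State { ACTIVE, PASSIVE, DOWN }

    // The single shared journal: only the currently active server writes to it.
    private final List<String> sharedJournal = new ArrayList<>();
    private State live = State.ACTIVE;
    private State backup = State.PASSIVE;

    State liveState() { return live; }
    State backupState() { return backup; }

    // Name of the server that currently serves clients.
    String activeServer() {
        if (live == State.ACTIVE) return "live";
        if (backup == State.ACTIVE) return "backup";
        return "none";
    }

    // A persistent message is stored in the shared journal by the active server.
    void send(String message) {
        if (activeServer().equals("none")) {
            throw new IllegalStateException("no active server");
        }
        sharedJournal.add(message);
    }

    // Live server dies: the backup moves from passive to active.
    void liveDies() {
        live = State.DOWN;
        if (backup == State.PASSIVE) backup = State.ACTIVE;
    }

    // Live server returns: the backup steps back to passive and live takes over.
    void liveReturns() {
        if (backup == State.ACTIVE) backup = State.PASSIVE;
        live = State.ACTIVE;
    }

    // Messages visible to the active server; they survive failover
    // because both servers read the same journal.
    List<String> pendingMessages() { return new ArrayList<>(sharedJournal); }
}
```

In this model, a message sent while the live server is active is still visible after liveDies(), because the backup reads the same journal; after liveReturns() the backup is passive again.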

HornetQ offers two types of cluster solution: load balancing plus failover, and failover only without load balancing. In our case we used failover only. I have slightly edited the official HornetQ cluster diagram below, because for no good reason it showed the shared storage in two separate places with no link between them, whereas it should be a single storage linked to both the Live and Backup servers.


For more details, consult the official guide or follow my step-by-step instructions for HornetQ cluster installation below.

The startup script was written by our development team, as they like to pass JVM arguments inside the script, though you can use the default run.sh, which is easier to understand.

Find below the step-by-step guide for the HornetQ high-availability cluster.

Download

Cheers

4 comments:

xue said...

Hi Nayyar,

I found your blog when searching for HornetQ cluster + failover configuration.

I have created a similar configuration to yours, but with load balancing.

But my backup machine does not seem to work well. After failover, it announced itself to be live, but it doesn't handle the messages originally stored in the queue.

Do you have any idea what the reason could be?

PS: I use the embedded HornetQ as the live server, but the standalone HornetQ as the backup.

Many thanks in advance!

Xue

nayyares said...

Hi Xue,

Is your journal directory on NFS or on external storage? Does it have synchronous disk updates?

xue said...

Hi Nayyar,

I am using NFS at the moment, with the following setting in /etc/exports:
/tmp/share *(rw,fsid=0,no_root_squash)

Thus I assume that by default there are no synchronous disk updates.

But after the backup machine became live, I checked the journal file with XmlDataExporter and PrintData from HornetQ, and I could see the message present in the journal, but it was not sent to the other live machine.