[Ocfs-users] cluster with 2 nodes - heartbeat problem fencing

g.digiambelardini at fabaris.it
Fri Mar 7 01:16:14 PST 2008


Hi,
My real problem is that if the communication link between the two nodes
breaks, especially on node 0, node 1 should not be fenced, because it is
still working fine; it would make more sense to fence node 0 rather than
node 1.
For this reason I would like to stop the heartbeat, so that everything
keeps working. Is there a method to do this?



-----Sunil Mushran <Sunil.Mushran at oracle.com> wrote: -----

To: g.digiambelardini at fabaris.it
From: Sunil Mushran <Sunil.Mushran at oracle.com>
Date: 06/03/2008 19.13
cc: ocfs-users at oss.oracle.com
Subject: Re: [Ocfs-users] cluster with 2 nodes - heartbeat problem fencing

What that note says is that in a 2 node setup, if the communication link
between the two nodes breaks, the node with the higher node number will be
fenced.
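
So if the node that must survive a link failure is virtual2, one option
(untested here, just following that quorum rule) would be to stop the
cluster stack on both nodes and swap the node numbers in cluster.conf so
that virtual2 becomes number 0 and wins the tie:

node:
        ip_port = 7777
        ip_address = 1.1.1.2
        number = 0
        name = virtual2
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 1.1.1.1
        number = 1
        name = virtual1
        cluster = ocfs2

The same file must be copied to both nodes before bringing o2cb back up.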

In your case, you are shutting down the network on node 0. The cluster
stack sees this as a comm link down between the two nodes. At this stage,
even if you umount the volume on node 1, node 1 will still have node 0 in
its domain and will want to ping it to migrate lockres', leave the domain,
etc. As in, umount is a clusterwide event and not an isolated one.

Forcibly shutting down hb won't work because the vol is still mounted
and all those inodes are still cached and maybe still in use.
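
If the goal is simply to take a node out of the cluster cleanly, the usual
order (assuming the stock ocfs2-tools init scripts on Debian; paths may
differ elsewhere) is to unmount first and stop the stack after:

    # unmount all ocfs2 volumes, then take the cluster stack offline
    /etc/init.d/ocfs2 stop
    /etc/init.d/o2cb stop

With the volume unmounted, stopping heartbeat is safe; with it still
mounted, it is not.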

I am unclear as to what your real problem is.

Sunil

g.digiambelardini at fabaris.it wrote:
> Hi, thanks for your help.
> We read your link and tried many solutions, but nothing works well for us.
> The situation is that when we take down the eth link on the server with
> node number = 0 (virtual1) while the shared partition is mounted, we
> cannot manage, in the few seconds available, to manually umount the
> partition (or shut down the server) before the second node (virtual2)
> goes into a kernel panic (the partition seems locked).
>
> this is our /etc/default/o2cb:
>
> # O2CB_ENABLED: 'true' means to load the driver on boot.
> O2CB_ENABLED=true
>
> # O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
> O2CB_BOOTCLUSTER=ocfs2
>
> # O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
> O2CB_HEARTBEAT_THRESHOLD=30
>
> # O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is
> considered dead.
> O2CB_IDLE_TIMEOUT_MS=50000
>
> # O2CB_KEEPALIVE_DELAY_MS: Max. time in ms before a keepalive packet is
> sent.
> O2CB_KEEPALIVE_DELAY_MS=5000
>
> # O2CB_RECONNECT_DELAY_MS: Min. time in ms between connection attempts.
> O2CB_RECONNECT_DELAY_MS=5000
> -----------------------------------------------------------------------
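>
> If we read the FAQ correctly, these values should translate to roughly:
>
>   O2CB_HEARTBEAT_THRESHOLD=30  ->  (30 - 1) * 2 = 58 seconds of missed
>       disk heartbeats before a node is declared dead
>   O2CB_IDLE_TIMEOUT_MS=50000   ->  50 seconds of network silence before
>       the connection is considered dead
>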
> We tried changing the values many times, but nothing helped.
>
> I think the easiest way would be to stop the heartbeat, but we cannot
> manage to do it.
>
> HELP ME
>
> -----Sunil Mushran <Sunil.Mushran at oracle.com> wrote: -----
>
> To: g.digiambelardini at fabaris.it
> From: Sunil Mushran <Sunil.Mushran at oracle.com>
> Date: 05/03/2008 18.55
> cc: ocfs-users at oss.oracle.com
> Subject: Re: [Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
>
>
> http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_faq.html#QUORUM
>
>
> g.digiambelardini at fabaris.it wrote:
>
>> Hi,
>> now the problem is different.
>> This is my cluster.conf:
>>
>> ----------------------------------------------------------
>> node:
>>         ip_port = 7777
>>         ip_address = 1.1.1.1
>>         number = 0
>>         name = virtual1
>>         cluster = ocfs2
>>
>> node:
>>         ip_port = 7777
>>         ip_address = 1.1.1.2
>>         number = 1
>>         name = virtual2
>>         cluster = ocfs2
>>
>> cluster:
>>         node_count = 2
>>         name = ocfs2
>> -----------------------------------------------------
>> Now it seems that one node of the cluster is a master, or rather that
>> virtual1 is the master: when we shut down the heartbeat interface (eth0,
>> with the partition mounted) on virtual1, virtual2 goes into a kernel
>> panic. Instead, if we shut down eth0 on virtual2, virtual1 keeps working
>> fine.
>> Can somebody help us?
>> Obviously, if we reboot either server, so that the partition gets
>> unmounted before the network goes down, everything works fine.
>> THANKS
>>
>>
>>
>>
>> -----ocfs-users-bounces at oss.oracle.com wrote: -----
>>
>> To: ocfs-users at oss.oracle.com
>> From: g.digiambelardini at fabaris.it
>> Sent by: ocfs-users-bounces at oss.oracle.com
>> Date: 05/03/2008 13.51
>> Subject: [Ocfs-users] cluster with 2 nodes - heartbeat problem fencing
>>
>>
>>
>> Hi to all, this is my first time on this mailing list.
>> I have a problem with OCFS2 on Debian etch 4.0.
>> When a node goes down or freezes without unmounting the OCFS2 partition,
>> I would like the heartbeat not to fence (kernel panic) the server that
>> is still working fine.
>> I would like to disable either the heartbeat or the fencing, so that we
>> can keep working with only one node.
>> Thanks
>>
>>
>> _______________________________________________
>> Ocfs-users mailing list
>> Ocfs-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs-users
>>
>>
>




