[Ocfs2-users] "another node is heartbeating in our slot"

Sunil Mushran sunil.mushran at oracle.com
Wed Sep 23 11:36:04 PDT 2009


You cannot share a device between two different clusters.
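
To illustrate why the message shows up: the o2cb disk heartbeat uses the
node number from cluster.conf as the node's slot on the device, so a
machine from another cluster with number = 1 writes into the same slot as
serv2. Below is a minimal Python sketch, not an OCFS2 tool, that compares
two cluster.conf files and reports node numbers claimed by different
machines. The second configuration (serv3.example, serv4.example, cluster
"other", and their addresses) is hypothetical, since the 3rd machine's
config was not posted; the helper names are also made up for this example.

```python
def parse_cluster_conf(text):
    """Parse cluster.conf-style text into a list of (kind, fields) stanzas."""
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and separator/comment lines
        if line.endswith(":") and "=" not in line:
            # A stanza header such as "node:" or "cluster:"
            current = (line[:-1], {})
            stanzas.append(current)
        elif "=" in line and current is not None:
            key, value = line.split("=", 1)
            current[1][key.strip()] = value.strip()
    return stanzas


def node_slots(text):
    """Map node number -> (node name, cluster name) for each node stanza."""
    return {
        int(fields["number"]): (fields["name"], fields["cluster"])
        for kind, fields in parse_cluster_conf(text)
        if kind == "node"
    }


def slot_collisions(conf_a, conf_b):
    """Return node numbers used by different machines in the two configs."""
    a, b = node_slots(conf_a), node_slots(conf_b)
    return sorted((n, a[n], b[n]) for n in a.keys() & b.keys() if a[n] != b[n])


# Configuration quoted in the original post.
SERV_CONF = """
node:
         ip_port = 7777
         ip_address = 10.10.20.64
         number = 0
         name = serv1.foobar
         cluster = ocfs2

node:
         ip_port = 7777
         ip_address = 10.10.20.65
         number = 1
         name = serv2.foobar
         cluster = ocfs2

cluster:
         node_count = 2
         name = ocfs2
"""

# HYPOTHETICAL config for the other cluster; names and addresses are invented.
OTHER_CONF = """
node:
         ip_port = 7777
         ip_address = 10.10.30.1
         number = 1
         name = serv3.example
         cluster = other

node:
         ip_port = 7777
         ip_address = 10.10.30.2
         number = 2
         name = serv4.example
         cluster = other

cluster:
         node_count = 2
         name = other
"""

for number, first, second in slot_collisions(SERV_CONF, OTHER_CONF):
    print(f"node number {number} is claimed by {first[0]} and {second[0]}")
```

A clean result from a check like this would not make the setup valid. The
two clusters do not coordinate with each other, so the device still must
not be mounted from both, whatever the node numbers are.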

Florin Andrei wrote:
> OCFS2 cluster, two nodes, nothing fancy:
>
> #####################################
> [root@serv1 ~]# cat /etc/ocfs2/cluster.conf
> node:
>          ip_port = 7777
>          ip_address = 10.10.20.64
>          number = 0
>          name = serv1.foobar
>          cluster = ocfs2
>
> node:
>          ip_port = 7777
>          ip_address = 10.10.20.65
>          number = 1
>          name = serv2.foobar
>          cluster = ocfs2
>
> cluster:
>          node_count = 2
>          name = ocfs2
> #####################################
>
> A filesystem shared by these two machines got mounted on a 3rd machine, 
> which is part of another cluster, and the 3rd machine happens to share 
> the same node number with serv2.
> Some files were deleted on the 3rd machine, then the fs was unmounted 
> from it (but remained mounted on 1 and 2).
> As a result, a bunch of messages like this appeared in the logs:
>
> serv2 kernel: (21146,1):o2hb_do_disk_heartbeat:982 ERROR: Device "dm-3": another node is heartbeating in our slot!
>
> And now there's a discrepancy between the disk usage indicated by df 
> (it's pretty high) and du (it's much lower). Also, ls -l generates weird 
> output for some files (which were supposedly deleted on the 3rd machine):
>
> ?--------- ? ?        ?              ?            ? access_log.20090601
> ?--------- ? ?        ?              ?            ? access_log.20090602
> ?--------- ? ?        ?              ?            ? access_log.20090603
> ?--------- ? ?        ?              ?            ? access_log.20090604
>
> I unmounted the fs on serv2 then mounted it back, but that didn't help. 
> Didn't try to unmount serv1 yet.
>
> Any suggestions?
>