[Ocfs2-users] Previously working cluster - now one node cannot connect

Joseph Qi joseph.qi at huawei.com
Wed Aug 5 17:58:07 PDT 2015


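Please check the node configuration in configfs on each node:
/sys/kernel/config/cluster/<your_cluster_name>/node/

The running cluster stack keeps its own copy of the node list there, so it
can still hold the old address after cluster.conf has been edited. If it is
not the same on every node, take the cluster offline and then bring it back
online, which reloads the configuration from cluster.conf.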
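
If it helps, here is a minimal Python sketch of that comparison, to be run on
each node. It assumes every node directory under configfs exposes the
attribute files num, ipv4_address and ipv4_port; please confirm those names
with ls on your kernel before relying on it. The function names are only
illustrative and are not part of ocfs2-tools.

```python
from pathlib import Path

# cluster.conf key -> configfs attribute file name.
# ASSUMPTION: these attribute names match your kernel's o2nm configfs layout.
CONF_TO_CONFIGFS = {
    "number": "num",
    "ip_address": "ipv4_address",
    "ip_port": "ipv4_port",
}


def parse_cluster_conf(text):
    """Return {node_name: {key: value}} for every 'node:' stanza."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith(":") and "=" not in line:
            # A stanza header such as 'cluster:' or 'node:'.
            stanzas.append((line[:-1], {}))
        elif "=" in line and stanzas:
            key, _, value = line.partition("=")
            stanzas[-1][1][key.strip()] = value.strip()
    return {
        attrs["name"]: attrs
        for kind, attrs in stanzas
        if kind == "node" and "name" in attrs
    }


def read_configfs_nodes(node_dir):
    """Return {node_name: {attribute: value}} as seen under node_dir."""
    nodes = {}
    node_dir = Path(node_dir)
    if not node_dir.is_dir():
        return nodes
    for entry in sorted(node_dir.iterdir()):
        if entry.is_dir():
            nodes[entry.name] = {
                attr: (entry / attr).read_text().strip()
                for attr in CONF_TO_CONFIGFS.values()
                if (entry / attr).is_file()
            }
    return nodes


def find_mismatches(conf_nodes, live_nodes):
    """List every difference between cluster.conf and configfs."""
    problems = []
    for name in sorted(set(conf_nodes) | set(live_nodes)):
        if name not in live_nodes:
            problems.append(f"{name}: in cluster.conf but not in configfs")
            continue
        if name not in conf_nodes:
            problems.append(f"{name}: in configfs but not in cluster.conf")
            continue
        for conf_key, fs_attr in CONF_TO_CONFIGFS.items():
            want = conf_nodes[name].get(conf_key)
            have = live_nodes[name].get(fs_attr)
            if want != have:
                problems.append(
                    f"{name}: {conf_key} is {want} in cluster.conf "
                    f"but {fs_attr} is {have} in configfs"
                )
    return problems
```

For the cluster in this thread, the call would be
find_mismatches(parse_cluster_conf(Path("/etc/ocfs2/cluster.conf").read_text()),
read_configfs_nodes("/sys/kernel/config/cluster/saturn/node")). An empty
list on every node means the loaded configuration matches the file; any
entry points at a node that still has stale settings loaded.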

On 2015/8/6 0:38, Jonathan Ramsay wrote:
> Hello,
> 
> We have a four-node cluster that was working fine until our DHCP/DNS went down.
> The cluster will no longer show up on one host. This host did change IP address, as it was not static.
> I updated /etc/ocfs2/cluster.conf on all four nodes. They are all identical.
> 
> cluster:
>         node_count = 4
>         name = saturn
> 
> node:
>         number = 0
>         cluster = saturn
>         ip_port = 7777
>         ip_address = 10.0.0.11
>         name = nile
> 
> node:
>         number = 1
>         cluster = saturn
>         ip_port = 7777
>         ip_address = 10.0.0.32
>         name = rio
> 
> node:
>         number = 2
>         cluster = saturn
>         ip_port = 7777
>         ip_address = 10.0.0.30
>         name = mekong
> 
> node:
>         number = 3
>         cluster = saturn
>         ip_port = 7777
>         ip_address = 10.0.0.13
>         name = volga
> 
> When mounting, either via fstab or directly, I get:
> 
> root@mekong:~# service ocfs2 reload
> Starting Oracle Cluster File System (OCFS2) mount.ocfs2: Invalid argument while mounting /dev/sda on /titan. Check 'dmesg' for more information on this error.
> 
> mount -t ocfs2 /dev/sda /titan
> mount.ocfs2: Invalid argument while mounting /dev/sda on /titan. Check 'dmesg' for more information on this error.
> 
> And in dmesg I get:
> 
> [Wed Aug  5 12:20:27 2015] o2net: Connected to node nile (num 0) at 10.0.0.11:7777
> [Wed Aug  5 12:20:31 2015] (mount.ocfs2,25447,3):dlm_send_nodeinfo:1291 ERROR: node mismatch -22, node 0
> [Wed Aug  5 12:20:31 2015] (mount.ocfs2,25447,3):dlm_try_to_join_domain:1675 ERROR: status = -22
> [Wed Aug  5 12:20:31 2015] (mount.ocfs2,25447,3):dlm_join_domain:1945 ERROR: status = -22
> [Wed Aug  5 12:20:31 2015] (mount.ocfs2,25447,1):dlm_register_domain:2204 ERROR: status = -22
> [Wed Aug  5 12:20:31 2015] (mount.ocfs2,25447,1):o2cb_cluster_connect:368 ERROR: status = -22
> [Wed Aug  5 12:20:31 2015] (mount.ocfs2,25447,1):ocfs2_dlm_init:3004 ERROR: status = -22
> [Wed Aug  5 12:20:31 2015] (mount.ocfs2,25447,1):ocfs2_mount_volume:1881 ERROR: status = -22
> [Wed Aug  5 12:20:31 2015] ocfs2: Unmounting device (8,0) on (node 0)
> [Wed Aug  5 12:20:31 2015] (mount.ocfs2,25447,1):ocfs2_fill_super:1229 ERROR: status = -22
> [Wed Aug  5 12:20:33 2015] o2net: No longer connected to node nile (num 0) at 10.0.0.11:7777
> [Wed Aug  5 12:21:13 2015] o2net: Connected to node nile (num 0) at 10.0.0.11:7777
> [Wed Aug  5 12:21:17 2015] (mount.ocfs2,25470,2):dlm_send_nodeinfo:1291 ERROR: node mismatch -22, node 0
> [Wed Aug  5 12:21:17 2015] (mount.ocfs2,25470,2):dlm_try_to_join_domain:1675 ERROR: status = -22
> [Wed Aug  5 12:21:17 2015] (mount.ocfs2,25470,2):dlm_join_domain:1945 ERROR: status = -22
> [Wed Aug  5 12:21:17 2015] (mount.ocfs2,25470,2):dlm_register_domain:2204 ERROR: status = -22
> 
> 
> I have found several reports of "node mismatch" and "ERROR: status = -22", but none seem applicable.
> 
> Any suggestions are welcome.
> 
> Thanks,
> 
> J.R. 