[Ocfs2-users] Odd error on FC12 with ocfs2

David Murphy david at icewatermedia.com
Mon Mar 29 15:01:46 PDT 2010

Some additional data:
From web1 (new Fedora machine) to web2:
	[root at web1 /etc/sysconfig/network-scripts]# nmap

	Starting Nmap 5.21 ( http://nmap.org ) at 2010-03-29 16:56 CDT
	Nmap scan report for
	Host is up (0.000076s latency).
	Not shown: 993 closed ports
	22/tcp   open  ssh
	80/tcp   open  http
	81/tcp   open  hosts2-ns
	111/tcp  open  rpcbind
	5666/tcp open  nrpe
	7777/tcp open  unknown
	9102/tcp open  jetdirect
	MAC Address: 00:50:56:A3:58:5D (VMware)
	Nmap done: 1 IP address (1 host up) scanned in 1.18 seconds

From web2 to web1 (new Fedora machine):
	[root at web2 ~]# nmap
	Starting Nmap 5.00 ( http://nmap.org ) at 2010-03-29 16:40 CDT
	Interesting ports on
	Not shown: 994 closed ports
	22/tcp   open  ssh
	80/tcp   open  http
	81/tcp   open  hosts2-ns
	111/tcp  open  rpcbind
	443/tcp  open  https
	7777/tcp open  unknown
	MAC Address: 00:50:56:A3:14:62 (VMWare)

	Nmap done: 1 IP address (1 host up) scanned in 1.31 seconds

		node_count = 6
		name = appshare

		ip_port = 7777
		ip_address =
		number = 1
		name = web1
		cluster = appshare

		ip_port = 7777
		ip_address =
		number = 2
		name = web2
		cluster = appshare

		ip_port = 7777
		ip_address =
		number = 3
		name = web3
		cluster = appshare

		ip_port = 7777
		ip_address =
		number = 4
		name = rgapp1
		cluster = appshare

		ip_port = 7777
		ip_address =
		number = 5
		name = deploy
		cluster = appshare

		ip_port = 7777
		ip_address =
		number = 6
		name = app1
		cluster = appshare

	OCFS2 1.5.0
	(1199,0):o2net_connect_expired:1656 ERROR: no connection established with node 2 after 30.0 seconds, giving up and returning errors.
	(1199,0):o2net_connect_expired:1656 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors.
	(1199,0):o2net_connect_expired:1656 ERROR: no connection established with node 4 after 30.0 seconds, giving up and returning errors.
	(1199,0):o2net_connect_expired:1656 ERROR: no connection established with node 5 after 30.0 seconds, giving up and returning errors.
	(1199,0):o2net_connect_expired:1656 ERROR: no connection established with node 6 after 30.0 seconds, giving up and returning errors.
	(1262,0):dlm_request_join:1035 ERROR: status = -107
	(1262,0):dlm_try_to_join_domain:1209 ERROR: status = -107
	(1262,0):dlm_join_domain:1487 ERROR: status = -107
	(1262,0):dlm_register_domain:1753 ERROR: status = -107
	(1262,0):o2cb_cluster_connect:313 ERROR: status = -107
	(1262,0):ocfs2_dlm_init:2963 ERROR: status = -107
	(1262,0):ocfs2_mount_volume:1788 ERROR: status = -107
	ocfs2: Unmounting device (253,1) on (node 0)
	(1199,0):o2net_connect_expired:1656 ERROR: no connection established with node 2 after 30.0 seconds, giving up and returning errors.
	(1199,0):o2net_connect_expired:1656 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors.
	(1199,0):o2net_connect_expired:1656 ERROR: no connection established with node 5 after 30.0 seconds, giving up and returning errors.
	(1199,0):o2net_connect_expired:1656 ERROR: no connection established with node 6 after 30.0 seconds, giving up and returning errors.
	(1323,0):dlm_request_join:1035 ERROR: status = -107
	(1323,0):dlm_try_to_join_domain:1209 ERROR: status = -107
	(1323,0):dlm_join_domain:1487 ERROR: status = -107
	(1323,0):dlm_register_domain:1753 ERROR: status = -107
	(1323,0):o2cb_cluster_connect:313 ERROR: status = -107
	(1323,0):ocfs2_dlm_init:2963 ERROR: status = -107
	(1323,0):ocfs2_mount_volume:1788 ERROR: status = -107
	ocfs2: Unmounting device (253,1) on (node 0)
	VMCI: Major device number is: 249
	VMware memory control driver initialized
	vmmemctl: started kernel thread pid=1522
	ocfs2: Unregistered cluster interface o2cb
	OCFS2 Node Manager 1.5.0
	OCFS2 DLM 1.5.0
	ocfs2: Registered cluster interface o2cb
	OCFS2 DLMFS 1.5.0
	OCFS2 User DLM kernel interface loaded
	OCFS2 1.5.0
	(1810,0):o2net_connect_expired:1656 ERROR: no connection established with node 4 after 30.0 seconds, giving up and returning errors.
	(1810,0):o2net_connect_expired:1656 ERROR: no connection established with node 5 after 30.0 seconds, giving up and returning errors.
	(1810,0):o2net_connect_expired:1656 ERROR: no connection established with node 6 after 30.0 seconds, giving up and returning errors.
	(1810,0):o2net_connect_expired:1656 ERROR: no connection established with node 2 after 30.0 seconds, giving up and returning errors.
	(1810,0):o2net_connect_expired:1656 ERROR: no connection established with node 3 after 30.0 seconds, giving up and returning errors.
	(1839,0):dlm_request_join:1035 ERROR: status = -107
	(1839,0):dlm_try_to_join_domain:1209 ERROR: status = -107
	(1839,0):dlm_join_domain:1487 ERROR: status = -107
	(1839,0):dlm_register_domain:1753 ERROR: status = -107
	(1839,0):o2cb_cluster_connect:313 ERROR: status = -107
	(1839,0):ocfs2_dlm_init:2963 ERROR: status = -107
	(1839,0):ocfs2_mount_volume:1788 ERROR: status = -107
	ocfs2: Unmounting device (253,1) on (node 0)
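
For reference, the "status = -107" values in the log are negated kernel errno codes. A minimal Python sketch for decoding them follows; the helper name describe_status is only illustrative, and errno numbering is platform specific (on Linux, 107 is ENOTCONN).

```python
import errno
import os


def describe_status(status):
    """Map a negative kernel status code to (errno name, message).

    Assumes the status follows the usual kernel convention of
    returning -errno, as the ocfs2/dlm log lines above appear to do.
    """
    code = abs(status)
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # Run this on a Linux host so the numbering matches the kernel log.
    name, message = describe_status(-107)
    print("status = -107 -> %s: %s" % (name, message))
```

This only names the error; it does not explain why the o2net connection was never established.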

So clearly the ocfs2 service thinks it cannot connect to the other nodes, but
nmap sees port 7777 as open. web2 can also see the port on web1, so no
firewall appears to be blocking the connections.
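
As a cross-check on the nmap results, a full TCP connect to port 7777 can be attempted from the affected node. The sketch below is only an illustration: the function name can_connect is made up, and it assumes the node names from the cluster configuration resolve to the addresses o2cb uses. A successful connect shows only that the port accepts connections, not that the o2net handshake itself succeeds.

```python
import socket


def can_connect(host, port=7777, timeout=5.0):
    """Return True if a full TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Node names taken from the cluster configuration above; it is an
    # assumption that they resolve to the interconnect addresses.
    for node in ["web2", "web3", "rgapp1", "deploy", "app1"]:
        state = "reachable" if can_connect(node) else "NOT reachable"
        print("%s:7777 %s" % (node, state))
```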

I think it might be because Fedora 12 uses 1.5.0 for the OCFS2 kernel module
while CentOS 5.3/5.4 use 1.4.4-1. Am I correct in thinking this?

-----Original Message-----
From: Sunil Mushran [mailto:sunil.mushran at oracle.com]
Sent: Thursday, March 25, 2010 6:46 PM
To: David Murphy
Cc: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] Odd error on FC12 with ocfs2

hmm.. o2cb_ctl makes no connections. It just reads the cluster.conf and
populates configfs. AFAIK.
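
To see what was actually populated, the node entries can be read back from configfs. This is a rough sketch under assumptions: the mount point /sys/kernel/config, the cluster/<name>/node/<node> layout, and the attribute file names (num, ipv4_address, ipv4_port, local) should be verified on the machine in question, and read_configfs_nodes is just an illustrative name.

```python
from pathlib import Path


def read_configfs_nodes(cluster, root="/sys/kernel/config"):
    """Return {node_name: {attribute: value}} for one o2cb cluster.

    The directory layout and attribute names are assumptions; check
    them against the configfs tree on the running kernel.
    """
    node_dir = Path(root) / "cluster" / cluster / "node"
    nodes = {}
    if not node_dir.is_dir():
        return nodes
    for entry in sorted(node_dir.iterdir()):
        if not entry.is_dir():
            continue
        attrs = {}
        for attr in ("num", "ipv4_address", "ipv4_port", "local"):
            path = entry / attr
            if path.is_file():
                attrs[attr] = path.read_text().strip()
        nodes[entry.name] = attrs
    return nodes


if __name__ == "__main__":
    for name, attrs in read_configfs_nodes("appshare").items():
        print(name, attrs)
```

Comparing this output with cluster.conf on each node would show whether the addresses and node numbers in the kernel match what the other nodes expect.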

David Murphy wrote:
> We had 6 nodes running CentOS 5.4 using 1.4.3 ocfs2-tools.
> I decided to rebuild one node with FC12, which is working fine. However,
> nmap shows 7777 as open, and o2cb_ctl is timing out when trying to connect
> to that node, which then causes a 107 error. This happens with all nodes,
> and all nodes have 7777 open via nmap from the FC machine.
> Is there a way to further debug this to see what exactly o2cb_ctl is
> seeing when trying to connect?
> David
