[Ocfs2-users] Error message while booting system

Raheel Akhtar rakhtar at ryerson.ca
Wed Jul 29 10:17:47 PDT 2009


Hi,

 

While the system is booting, I get the error message "modprobe: FATAL: Module
ocfs2_stackglue not found" in the messages log. Some nodes reboot without any
error message.

-------------------------------------------------

Jul 27 10:02:19 alf3 kernel: ip_tables: (C) 2000-2006 Netfilter Core Team

Jul 27 10:02:19 alf3 kernel: Netfilter messages via NETLINK v0.30.

Jul 27 10:02:19 alf3 kernel: ip_conntrack version 2.4 (8192 buckets, 65536
max) - 304 bytes per conntrack

Jul 27 10:02:19 alf3 kernel: e1000: eth0: e1000_watchdog_task: NIC Link is
Up 1000 Mbps Full Duplex, Flow Control: None

Jul 27 10:02:20 alf3 setroubleshoot: [server.ERROR] cannot start systen DBus
service: Failed to connect to socket /var/run/dbus/system_bus_socket: No such
file or directory

Jul 27 10:02:20 alf3 kernel: VMware memory control driver initialized

Jul 27 10:02:20 alf3 kernel: e1000: eth0: e1000_set_tso: TSO is Enabled

Jul 27 10:02:21 alf3 modprobe: FATAL: Module ocfs2_stackglue not found. 

Jul 27 10:02:21 alf3 kernel: OCFS2 Node Manager 1.4.2 Wed Jul  1 19:55:44
PDT 2009 (build 0b9eb999c4d39c0d4b66219a2752cda6)

Jul 27 10:02:21 alf3 kernel: OCFS2 DLM 1.4.2 Wed Jul  1 19:55:44 PDT 2009
(build 0faae8d4263a8c594749be558d8d7edd)

Jul 27 10:02:21 alf3 kernel: OCFS2 DLMFS 1.4.2 Wed Jul  1 19:55:44 PDT 2009
(build 0faae8d4263a8c594749be558d8d7edd)

Jul 27 10:02:21 alf3 kernel: OCFS2 User DLM kernel interface loaded

Jul 27 10:02:25 alf3 kernel: o2net: connected to node alf0 (num 0) at
172.25.29.10:7777

Jul 27 10:02:25 alf3 kernel: o2net: connected to node alf2 (num 2) at
172.25.29.12:7777

Jul 27 10:02:25 alf3 kernel: o2net: accepted connection from node alf5 (num
5) at 172.25.29.15:7777

Jul 27 10:02:26 alf3 kernel: o2net: accepted connection from node alf4 (num
4) at 172.25.29.14:7777

Jul 27 10:02:27 alf3 kernel: o2net: connected to node alf1 (num 1) at
172.25.29.11:7777

Jul 27 10:02:31 alf3 kernel: OCFS2 1.4.2 Wed Jul  1 19:55:41 PDT 2009 (build
966fd2793489955b2271e7bb7e691088)

Jul 27 10:02:31 alf3 kernel: ocfs2_dlm: Nodes in domain
("7BE7E9E2026A40F8801B56257D805C88"): 0 1 2 3 4 5

 

The kernel log on another node, alf1, shows the following for node alf3:

 

Jul 29 10:15:57 alf1 kernel: o2net: connection to node alf3 (num 3) at
172.25.29.13:7777 has been idle for 30.0 seconds, shutting it down.

Jul 29 10:15:57 alf1 kernel: (0,1):o2net_idle_timer:1506 here are some times
that might help debug the situation: (tmr 1248876927.861591 now
1248876957.858464 dr 1248876927.861556 adv
1248876927.861622:1248876927.861623 func (0ffa2aed:506)
1248876927.861592:1248876927.861604)

Jul 29 10:15:57 alf1 kernel: o2net: no longer connected to node alf3 (num 3)
at 172.25.29.13:7777

Jul 29 10:16:27 alf1 kernel: (2600,1):o2net_connect_expired:1667 ERROR: no
connection established with node 3 after 30.0 seconds, giving up and
returning errors.

Jul 29 10:17:27 alf1 last message repeated 2 times

Jul 29 10:17:30 alf1 kernel: (2618,0):ocfs2_dlm_eviction_cb:98 device
(8,33): dlm has evicted node 3

Jul 29 10:17:32 alf1 kernel: (2629,2):dlm_get_lock_resource:844
7BE7E9E2026A40F8801B56257D805C88:$RECOVERY: at least one node (3) to recover
before lock mastery can begin

Jul 29 10:17:32 alf1 kernel: (2629,2):dlm_get_lock_resource:878
7BE7E9E2026A40F8801B56257D805C88: recovery map is not empty, but must master
$RECOVERY lock now

Jul 29 10:17:32 alf1 kernel: (2629,1):dlm_do_recovery:524 (2629) Node 1 is
the Recovery Master for the Dead Node 3 for Domain
7BE7E9E2026A40F8801B56257D805C88

Jul 29 10:17:34 alf1 kernel: o2net: accepted connection from node alf3 (num
3) at 172.25.29.13:7777

Jul 29 10:17:38 alf1 kernel: ocfs2_dlm: Node 3 joins domain
7BE7E9E2026A40F8801B56257D805C88

Jul 29 10:17:38 alf1 kernel: ocfs2_dlm: Nodes in domain
("7BE7E9E2026A40F8801B56257D805C88"): 1 2 3 4 5 

Jul 29 11:09:42 alf1 kernel: o2net: connected to node alf0 (num 0) at
172.25.29.10:7777

Jul 29 11:09:45 alf1 kernel: ocfs2_dlm: Node 0 joins domain
7BE7E9E2026A40F8801B56257D805C88

Jul 29 11:09:45 alf1 kernel: ocfs2_dlm: Nodes in domain
("7BE7E9E2026A40F8801B56257D805C88"): 0 1 2 3 4 5
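
As a cross-check on the o2net_idle_timer line in the alf1 log, the gap between its "tmr" and "now" values can be computed directly. This is only a sketch: the function name idle_gap is made up for illustration, and reading "tmr" as the time the idle timer was last armed is an assumption, not something the log states.

```python
import re
from decimal import Decimal

# The o2net_idle_timer entry from the alf1 log, joined into one string.
LINE = (
    "Jul 29 10:15:57 alf1 kernel: (0,1):o2net_idle_timer:1506 here are some "
    "times that might help debug the situation: (tmr 1248876927.861591 "
    "now 1248876957.858464 dr 1248876927.861556 "
    "adv 1248876927.861622:1248876927.861623 "
    "func (0ffa2aed:506) 1248876927.861592:1248876927.861604)"
)


def idle_gap(line):
    """Return now - tmr in seconds as a Decimal, or None if either is missing.

    idle_gap is a hypothetical helper; Decimal is used so the microsecond
    digits of the epoch timestamps are not lost to float rounding.
    """
    tmr = re.search(r"\btmr (\d+\.\d+)", line)
    now = re.search(r"\bnow (\d+\.\d+)", line)
    if tmr is None or now is None:
        return None
    return Decimal(now.group(1)) - Decimal(tmr.group(1))


if __name__ == "__main__":
    print(idle_gap(LINE))
```

For the line above this prints 29.996873, which agrees with the "idle for 30.0 seconds" message logged at the same moment.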

 

 

OS = Red Hat 5.2 

[root@alf3 /]# uname -a

Linux alf3 2.6.18-128.1.16.el5 #1 SMP Fri Jun 26 10:53:31 EDT 2009 x86_64
x86_64 x86_64 GNU/Linux

 

[root@alf3 /]# rpm -qa | grep ocfs2

ocfs2-tools-1.4.2-1.el5

ocfs2-2.6.18-128.1.16.el5-1.4.2-1.el5

ocfs2console-1.4.2-1.el5
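
To see whether a file for ocfs2_stackglue exists at all under the running kernel's module tree, something like the following could be run on each node. It is a minimal sketch, not a diagnosis: find_module_files is a hypothetical helper, and it assumes modules live under /lib/modules/<kernel release> as .ko files (possibly compressed).

```python
import os
from pathlib import Path


def find_module_files(name, release=None, modules_root="/lib/modules"):
    """List module files named `name` under one kernel release's tree.

    Hypothetical helper. The default release is the running kernel's;
    '-' and '_' are treated as the same character in module names.
    """
    release = release or os.uname().release
    tree = Path(modules_root) / release
    if not tree.is_dir():
        return []
    wanted = name.replace("-", "_")
    hits = []
    for path in tree.rglob("*.ko*"):
        # Strip ".ko" and any compression suffix to get the module name.
        stem = path.name.split(".ko", 1)[0].replace("-", "_")
        if stem == wanted:
            hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    for module in ("ocfs2_stackglue", "ocfs2"):
        files = find_module_files(module)
        print(module, "->", [str(p) for p in files] or "not found")
```

An empty result for ocfs2_stackglue would only confirm what modprobe reports; it does not say whether the module is actually needed by this OCFS2 version.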

 

Any help would be appreciated; the OCFS2 cluster is not stable. We mount the
file system for file sharing with Alfresco.

 

 

Thanks

Raheel

 

 

 

 

 

 
