[Ocfs2-users] Kernel panic due to ocfs2
Ramappa, Ravi (NSN - IN/Bangalore)
ravi.ramappa at nsn.com
Mon Feb 25 22:07:01 PST 2013
Hi,
In a 13-node cluster, the first four nodes went into a kernel panic. /var/log/messages on one of them (prod152) contained the following:
Feb 25 22:02:46 prod152 kernel: (o2net,9470,5):dlm_assert_master_handler:1817 ERROR: DIE! Mastery assert from 4, but current owner is 2! (O000000000000000c36706200000000)
Feb 25 22:02:46 prod152 kernel: lockres: O000000000000000c36706200000000, owner=2, state=0
Feb 25 22:02:46 prod152 kernel: last used: 0, refcnt: 3, on purge list: no
Feb 25 22:02:46 prod152 kernel: on dirty list: no, on reco list: no, migrating pending: no
Feb 25 22:02:46 prod152 kernel: inflight locks: 0, asts reserved: 0
Feb 25 22:02:46 prod152 kernel: refmap nodes: [ ], inflight=0
Feb 25 22:02:46 prod152 kernel: granted queue:
Feb 25 22:02:46 prod152 kernel: type=3, conv=-1, node=2, cookie=2:222205, ref=2, ast=(empty=y,pend=n), bast=(empty=y,pend=n), pending=(conv=n,lock=n,cancel=n,unlock=n)
Feb 25 22:02:46 prod152 kernel: converting queue:
Feb 25 22:02:46 prod152 kernel: blocked queue:
Feb 25 22:02:46 prod152 kernel: ----------- [cut here ] --------- [please bite here ] ---------
Feb 25 22:02:46 prod152 kernel: Kernel BUG at .../build/BUILD/ocfs2-1.4.10/fs/ocfs2/dlm/dlmmaster.c:1819
Feb 25 22:02:46 prod152 kernel: invalid opcode: 0000 [1] SMP
Feb 25 22:02:46 prod152 kernel: last sysfs file: /block/cciss!c0d0/cciss!c0d0p1/stat
Feb 25 22:02:46 prod152 kernel: CPU 5
Feb 26 09:50:27 prod152 syslogd 1.4.1: restart.
The kernel and OCFS2 RPM versions in use are as follows:
[root at prod152 ~]# uname -r
2.6.18-308.1.1.el5
[root at prod152 ~]# rpm -qa| grep ocfs
ocfs2-2.6.18-308.1.1.el5xen-1.4.10-1.el5
ocfs2-tools-devel-1.6.3-2.el5
ocfs2-2.6.18-308.1.1.el5-1.4.10-1.el5
ocfs2-tools-debuginfo-1.6.3-2.el5
ocfs2-tools-1.6.3-2.el5
ocfs2console-1.6.3-2.el5
ocfs2-2.6.18-308.1.1.el5debug-1.4.10-1.el5
[root at prod152 ~]# cat /etc/ocfs2/cluster.conf
node:
ip_port = 7777
ip_address = 10.10.10.150
number = 0
name = prod150
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.151
number = 1
name = prod151
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.152
number = 2
name = prod152
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.153
number = 3
name = prod153
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.106
number = 4
name = prod106
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.107
number = 5
name = prod107
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.155
number = 6
name = prod155
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.156
number = 7
name = prod156
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.157
number = 8
name = prod157
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.158
number = 9
name = prod158
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.51
number = 10
name = prod51
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.52
number = 11
name = prod52
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.154
number = 12
name = prod154
cluster = ocfs2
cluster:
node_count =13
name = ocfs2
[root at prod152 ~]#
Is this a known issue? Is there anything wrong with the configuration?
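On the configuration question, a basic consistency check of cluster.conf could be sketched as below. This is only a minimal Python sketch under stated assumptions: the function names are made up for illustration, the parser assumes the simple "stanza:" plus "key = value" layout shown above, and the checks (node_count matches the number of node stanzas, no duplicate number/name/ip_address, every node refers to the declared cluster) are guesses at what a sane file looks like, not rules taken from the OCFS2 tools. Passing these checks would not by itself explain or rule out the panic.

```python
def parse_cluster_conf(text):
    """Parse cluster.conf text into a list of (stanza_type, {key: value})."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A header such as "node:" or "cluster:" starts a new stanza.
            stanzas.append((line[:-1], {}))
        elif "=" in line and stanzas:
            # "key = value" lines belong to the most recent stanza.
            key, _, value = line.partition("=")
            stanzas[-1][1][key.strip()] = value.strip()
    return stanzas


def check_cluster_conf(text):
    """Return a list of problems found; an empty list means none were found."""
    stanzas = parse_cluster_conf(text)
    nodes = [kv for kind, kv in stanzas if kind == "node"]
    clusters = [kv for kind, kv in stanzas if kind == "cluster"]
    problems = []

    if len(clusters) != 1:
        problems.append(f"expected 1 cluster stanza, found {len(clusters)}")
    for c in clusters:
        declared = int(c.get("node_count", -1))
        if declared != len(nodes):
            problems.append(
                f"node_count = {declared}, but {len(nodes)} node stanzas found"
            )

    # Each of these fields is assumed to be unique across node stanzas.
    for field in ("number", "name", "ip_address"):
        values = [n.get(field) for n in nodes]
        dups = sorted({str(v) for v in values if values.count(v) > 1})
        if dups:
            problems.append(f"duplicate {field}: {dups}")

    cluster_names = {c.get("name") for c in clusters}
    for n in nodes:
        if n.get("cluster") not in cluster_names:
            problems.append(f"node {n.get('name')} refers to an undeclared cluster")
    return problems


def node_names(text):
    """Map node number -> node name, to decode node numbers seen in the logs."""
    return {
        int(kv["number"]): kv["name"]
        for kind, kv in parse_cluster_conf(text)
        if kind == "node"
    }


# Trimmed sample: two of the thirteen node stanzas, with node_count adjusted
# to 2 so that the sample is self-consistent. Not the full file.
SAMPLE = """\
node:
ip_port = 7777
ip_address = 10.10.10.152
number = 2
name = prod152
cluster = ocfs2
node:
ip_port = 7777
ip_address = 10.10.10.106
number = 4
name = prod106
cluster = ocfs2
cluster:
node_count = 2
name = ocfs2
"""

print(check_cluster_conf(SAMPLE))
print(node_names(SAMPLE))
```

To run it against the real file, pass the contents of /etc/ocfs2/cluster.conf to check_cluster_conf() on each node. The node_names() mapping is mainly useful for reading the log: in "Mastery assert from 4, but current owner is 2", node 4 is prod106 and node 2 is prod152 according to the file above.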
Thanks,
Ravi