[Ocfs2-users] strange fencing behavior
Sunil Mushran
sunil.mushran at oracle.com
Thu Sep 24 14:24:11 PDT 2009
You have to look at the logs of the fenced nodes. Set up netconsole
to capture their kernel logs.
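
netconsole sends kernel messages as UDP datagrams to a remote host, so
something has to be listening on the receiving side. Below is a minimal
sketch of such a receiver, not a tested production tool. It assumes the
fenced nodes send to UDP port 6666 (netconsole's default target port).
The function names and the log file name are made up for illustration.
On each sending node the module would be loaded with something along the
lines of "modprobe netconsole netconsole=@/eth0,6666@192.168.1.10/",
where the interface and IP address are placeholders for your own setup.

```python
import socket


def format_record(payload: bytes, sender: str) -> str:
    """Decode one netconsole datagram and prefix it with the sender's IP."""
    text = payload.decode("utf-8", errors="replace").rstrip("\n")
    return f"{sender} {text}"


def listen(host: str = "0.0.0.0", port: int = 6666,
           logfile: str = "netconsole.log") -> None:
    """Append every datagram received on (host, port) to logfile.

    Port 6666 is assumed to match the target port configured on the
    sending nodes; logfile is a hypothetical name.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(logfile, "a") as out:
        sock.bind((host, port))
        while True:
            payload, (sender, _port) = sock.recvfrom(65535)
            out.write(format_record(payload, sender) + "\n")
            out.flush()  # keep the file current in case this host dies too
```

Call listen() on a machine outside the set of nodes being fenced, then
reproduce the problem and read the last messages from each node.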
On a side note, the logs show you are hitting the following bug:
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1053
Upgrade ocfs2-tools to 1.4.2.
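
To check other nodes for entries like the ocfs2_hb_ctl segfault lines in
the log quoted below, a short script can pull them out of a saved kernel
log. This is only a sketch: the function name and the regular expression
are my own, and the pattern assumes the segfault message layout seen in
this thread, which may differ on other kernel versions.

```python
import re

# Matches e.g. "ocfs2_hb_ctl[12083]: segfault at 0 ip 7f20... in libc-2.7.so["
SEGFAULT = re.compile(
    r"(?P<prog>[\w.-]+)\[(?P<pid>\d+)\]:\s+segfault at (?P<addr>[0-9a-f]+)"
    r"\s+ip\s+(?P<ip>[0-9a-f]+).*?\sin\s+(?P<lib>[^\s\[]+)\["
)


def find_segfaults(log_text: str) -> list:
    """Return one dict per segfault record found in log_text.

    Mail clients wrap long log lines and prefix quoted text with ">",
    so strip the quote markers and rejoin the lines before matching.
    """
    lines = [line.lstrip("> ").strip() for line in log_text.splitlines()]
    flat = " ".join(line for line in lines if line)
    return [m.groupdict() for m in SEGFAULT.finditer(flat)]
```

Each returned dict carries the program name, pid, faulting address,
instruction pointer and the library the fault occurred in.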
Sean Thon wrote:
> I have 10 servers in a cluster running Debian Etch with 2.6.26-bpo.2
> and a backport of ocfs2-tools-1.4.1-1.
> I'm using AoE to export the drives from a Debian Lenny server in the
> cluster.
>
> My problem is that if I mount the ocfs2 partition on the server that is
> exporting it via AoE, the entire cluster gets fenced. The logs on the
> server exporting the ocfs2 partition don't give much information. Is
> this a known limitation?
>
> Sep 24 14:06:20 storage0 kernel: [650570.916574] ocfs2_hb_ctl[12083]:
> segfault at 0 ip 7f20d9c97a90 sp 7fffe238cf98 error 4 in
> libc-2.7.so[7f20d9c1d000+14a000]
> Sep 24 14:06:20 storage0 kernel: [650570.916932] ocfs2: Unmounting
> device (8,17) on (node 0)
> Sep 24 14:07:04 storage0 kernel: [650622.608756] ocfs2_dlm: Nodes in
> domain ("E042658E558940E6B05EFE2DBA548DFD"): 0
> Sep 24 14:07:04 storage0 kernel: [650622.608818] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:07:04 storage0 kernel: [650622.609924] ocfs2: Mounting device
> (8,17) on (node 0, slot 0) with ordered data mode.
> Sep 24 14:07:29 storage0 kernel: [650651.285363] ocfs2_hb_ctl[12160]:
> segfault at 0 ip 7f17160f3a90 sp 7fff1e7e93e8 error 4 in
> libc-2.7.so[7f1716079000+14a000]
> Sep 24 14:07:29 storage0 kernel: [650651.285638] ocfs2: Unmounting
> device (8,17) on (node 0)
> Sep 24 14:07:57 storage0 kernel: [650683.535472] ocfs2_dlm: Nodes in
> domain ("1C1694511D464F16971020CE53D45401"): 0
> Sep 24 14:07:57 storage0 kernel: [650683.566388] JBD: Ignoring recovery
> information on journal
> Sep 24 14:07:57 storage0 kernel: [650683.566388] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:07:57 storage0 kernel: [650683.566388] ocfs2: Mounting device
> (8,18) on (node 0, slot 10) with ordered data mode.
> Sep 24 14:07:57 storage0 kernel: [650683.566388]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 10 from slot 0 on
> device (8,18)
> Sep 24 14:08:00 storage0 kernel: [650687.138110] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:08:00 storage0 kernel: [650687.268898]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 2 from slot 1 on
> device (8,18)
> Sep 24 14:08:02 storage0 kernel: [650690.547309] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:08:03 storage0 kernel: [650690.581554]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 3 from slot 2 on
> device (8,18)
> Sep 24 14:08:05 storage0 kernel: [650693.934489] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:08:05 storage0 kernel: [650693.983941]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 4 from slot 3 on
> device (8,18)
> Sep 24 14:08:08 storage0 kernel: [650697.403252] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:08:08 storage0 kernel: [650697.478297]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 5 from slot 4 on
> device (8,18)
> Sep 24 14:08:11 storage0 kernel: [650700.979480] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:08:11 storage0 kernel: [650701.049023]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 6 from slot 5 on
> device (8,18)
> Sep 24 14:08:14 storage0 kernel: [650704.525081] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:08:14 storage0 kernel: [650704.569578]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 7 from slot 6 on
> device (8,18)
> Sep 24 14:08:17 storage0 kernel: [650708.610043] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:08:17 storage0 kernel: [650708.642044]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 8 from slot 7 on
> device (8,18)
> Sep 24 14:08:20 storage0 kernel: [650712.306538] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:08:20 storage0 kernel: [650712.366600]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 9 from slot 8 on
> device (8,18)
> Sep 24 14:08:24 storage0 kernel: [650716.491634] kjournald starting.
> Commit interval 5 seconds
> Sep 24 14:08:24 storage0 kernel: [650716.587527]
> (12231,1):ocfs2_replay_journal:1149 Recovering node 1 from slot 9 on
> device (8,18)
> Sep 24 14:08:27 storage0 kernel: [650720.039214] kjournald starting.
> Commit interval 5 seconds